  1. Jul 2023
    1. Reviewer #4 (Public Review):

      This is potentially a landmark study with far-reaching consequences for archaeology, palaeoanthropology, and more widely. The antiquity of intentional human mark-making is a hot topic but this study – understood as initial – has as yet incomplete sources of evidence and methods; and it will be interesting to follow how the study develops in subsequent studies.

      Strengths and points to build on:

      * Heuristic potential: As knowledge advances it poses a risk to accepted knowledge – and we should accept that one such risk is moving on from long-held disciplinary tenets. In this case, there has been a growing quantum of evidence – all hotly debated – for the deep antiquity of mark-making and even symbolism by species other than ourselves. Most researchers now accept Neanderthal symbolic capacity actualised in burials, intentional mark-making and the like. The evidence here presented is not unequivocal but is very suggestive and an ideal test case for applying multi-disciplinary techniques of analysis and interpretation beyond the expertise of the listed authors (see comments in 'weaknesses'). This work by itself may be equivocal but when taken together with other such work, points to a 'human' sensu lato past that is as complex as it is long. This work then helps all researchers to at least be alive to the possibility of things like anthropic marks and residues in a context not normally thought to have them.

      * Decentering speciesism: As per the above comment, I appreciate empirical studies that erode speciesism – in particular studies that open up our minds to the possibility that multiple members of the Genus Homo were capable of intentional mark-making and even 'symbolic' behaviour, though this latter term is not well understood or uniformly used. This is probably because of continuous unconscious bias on our part as currently the only exemplar of our genus living - in contrast to most of the past in which different species and genera co-existed - if not on the same landscape and/or at exactly the same time, then with enough overlap that people would have realised 'others' were about either by sight and/or by encountering their physical remains and artefacts.

      * Problematising 'firsts' and deep time: A strength – but which needs to be developed in this manuscript – is our understanding of time and change. We have a plethora of dating techniques but relatively few substantive monographs, articles, and think tanks on time – and especially on how change comes about and what causes it. This leads us to privilege 'firsts' and the 'oldest' finds in 'deep' time above those that are more recent and in 'shallow' time. I would suggest in addition to the claims for the oldest of the reported marks, the authors develop nascent remarks on the possibility the suite of marks may have been made over time. This will help counter criticism that these marks – if established to be anthropic – were not just a singularity, but part of patterned behaviour, which would move it towards the realm of 'symbolic' cognitive behaviour. And indeed, it would be good to hear more about why in this place, these marks were made to establish a replicable model for identifying early anthropic marks.

      Ultimately, this manuscript presents evidence that those who are pro the deep antiquity of intentional mark-making by Homo (and possibly even other genera) will find enough evidence to support; while those sceptical of such claims will find enough methodological flaws and evidential limits to refute those claims. The next decade of work will likely be definitive and this article makes a key contribution to the debate.

      Weaknesses and points to attend to:

      * Definitions: The term 'rock engraving' is used rather uncritically and also the term 'etching' – and it would be useful to have a short definition of how the authors understand the term. Rock art scholars regularly debate these terms and whether they are or are not 'rock art' with its overwhelmingly visual bias; which this discovery may usefully help overthrow and advance.

      * Dating: There is no evidence provided for dating the marks found in the cave system. They could, for example, have been made more recently than the dates claimed – and by another species (if we accept their anthropogenic authorship). This is a perennial problem of much rock art research – especially when it comes to understanding the wider archaeological/palaeoanthropological context. More crucially, accurate dating allows a more reliable understanding of authorship and who/what was responsible for a particular artefact or feature. This has not been demonstrated in this case, though we do have fossil evidence of Homo naledi in the cave system. The article title is thus an incorrect and unsupported claim, as the marks, if they are anthropic, have not been dated and are of unknown age. The authors allow that there may have been multiple episodes, but not that the marks can belong to a time other than they posit – either earlier, later, or distributed over a long period as the authors allow for in their concluding remarks.

      * Authorship: The study does not utilise either a geoscientist as one of the authorial team, or a rock art specialist. These are key oversights as the former would help better contextualise the dating of the marks reported on, as well as explore alternative non-anthropogenic agents that may have created the marks reported on. For example, the marks and 'pitting' etc may be the result of water bringing abrasive agents during times of flooding, hitting prominent rock features in the cave system. Some explanation is given in lines 114-124, but it is uncited. The overlying 'sediment' may be similar to the mondmilch found in cave systems and which is of natural origin. It may be that these non-anthropogenic causes are easy to discount; but the arguments do need to be made. Or, that the polishing was made by Homo naledi brushing against the surfaces as they moved in the cave system, independent of any mark-making. A Table showing the pros and cons of intentional anthropic versus natural authorship would be very effective - as well as showing some of the natural linear marks in the cave system to avoid any confirmation or similar bias. FTIR analysis of the panel A-C would be more than useful to determine whether an additional layer of material has been added. This is mentioned for future work, but this seems a rather post-hoc research programme.

      * Use-wear analysis: If the marks are anthropic in origin, they are likely to have been made by a stone tool, which would leave characteristic marks, directionality and sequencing, distinct from natural causes. It is vital this work – such as was done on the Blombos engraved ochre – is done here – for example, linking to the chert and other tools described on lines 152-158. Note Figure 19, of such a tool, is very hard to make out. The Blombos – and Klasies River Mouth engraved ochres (curiously not referenced) – have very similar geometric markings and there is a real opportunity to compare these in securely dated contexts of 70-120 kya – which could support the argument made here for Homo naledi's cognitive capacity. On Figure 16 it would be good to know on what basis some marks were selected as anthropic – and why others were not; this would help demonstrate the methodology and ability to distinguish between the two kinds of marks.

      * Viewshed: The rock art specialist would have added essential expertise on how to study anthropic marks. For example, the images of the marks shown are all of individual or small collections of motifs rather than showing each panel as well as all panels together, to help understand the iconographic context as an ensemble – a 'feature' rather than isolated 'artefacts' or 'motifs'. Line 60 mentions being able to see these as a 'triptych' but the reader is not able to have this view in this manuscript. From the cave map, it is not clear whether all three 'panels' (an unfortunate art historical term that suggests a framed entity - better to use a term like 'cluster') can be viewed simultaneously or in sequence. The viewshed in relation to the area where the bodies were recovered is vaguely stated as 'only a few metres away' and is worth developing. I understand 3D scans have been made so it would be useful to have a version showing the marks in relation to where the bodies were recovered and as a 3-cluster ensemble.

      * Image enhancements: Also, in addition to polarised images, have colour enhancement tools like DStretch been tried to see if, for example, attempts at colouring with different coloured sands were made? Similarly, a 3D scan of the motif and panel (Metashape is mentioned but not shown) might assist in understanding how the marks and the rock they are on might relate to each other, as research in European Upper Palaeolithic contexts has shown. Here, has experimenting with different kinds of lighting been considered, or, in the absence of lighting, with tactility and how these marks and their rock support may have been experienced by those who may have made and interacted with them? As a note, it would be useful to have a scale in each image of the 'engravings' and it is a pity the one in situ photograph with the scale is not a standard rock art colour-corrected scale as is commonly used in rock art research.

    2. Author Response:

      We would like to thank the eLife reviewers for the considerable time and effort they have invested to review these manuscripts. We have also benefited from a previous round of review of the manuscript describing the proposed burial features, which underwent two rounds of revisions in a high-impact journal over a period of approximately 8 months during 2022 and early 2023. Both sets of reviews have reflected mixed responses to the evidence we have presented, with one reviewer recommending acceptance with minor editorial revisions, two recommending acceptance with minor revisions and the fourth recommending rejection based upon similar arguments to those reflected by some of the reviewers in this current round of reviews in eLife. Ultimately the managing editor of this first journal took the decision that the review process could not be completed in a timely manner and rejected the manuscript, although the submission here reflected our consideration of these reviewers’ suggestions.

      We have chosen in this initial response to the eLife reviews to include some references to the previous anonymous reviews in order to illustrate differences of opinion and differences in revision suggestions within the review process. Our goal is to offer maximal insight into our decision-making process and to acknowledge the considerable time and effort put into the assessment of these manuscripts by reviewers (for eLife and in the case of the earlier review process). We hope that this approach will assist the readers, and reviewers, of our manuscripts in understanding why we are proceeding with certain decisions during the revision process.

      This is a new process for us and the reviewers, and one way in which it significantly differs from more traditional review is that both the reviews and our reply will be public well in advance of our revisions to the manuscript. Indeed, considering the scope of the reviews, some of those revisions may take considerable time, although many can be accomplished fairly easily. Thus, we are not in a position to say that we have solved every issue raised by the reviewers. Instead, we will examine what appear to be the key critical issues raised regarding the data and the analyses and how we propose to address these as we revise the papers. We will also address several philosophical and ethical issues raised by the reviews and our proposal for dealing with these. More specific editorial and citational recommendations will be dealt with on a case-by-case basis, and we do not address these point-by-point in this reply. Please note, this response to the reviewers is not the revision of the manuscript and is only the initial opinion of the corresponding authors with some guidance from the larger group of authors of all three papers. Our final submitted revision will reflect the input of all authors included on those submissions.

      We took the decision to submit three separate papers consciously. The two different categories of evidence, burials and engravings, involve different kinds of analysis and different (although overlapping) teams of researchers, and we recognized that each deserved their own presentation and assessment. Meanwhile, together they inform the context of H. naledi in a way that requires some synthetic discussion, in which both kinds of evidence are relevant, leading to a third paper. But the mutual relevance of these different kinds of evidence and their review by a common set of reviewers naturally raises cross-cutting issues, and the reviewers have cross-referenced the three articles. This has sometimes led to suggestions about one manuscript based on the contents of another. Considering the situation, we accepted the recommendation that it would be clearer to consider all three articles in a single reply. Thus, while each of the three papers will proceed separately during the revision process, it will be necessary to highlight across all three papers occasionally in our responses.

      Scientific Issues:

      In reading the reviews, we feel there are 9 critical points/assertions raised by one or more of the reviewers that present a problem for, or challenge to, our hypothesis that the observed evidence (bone accumulations and engravings) described in the Dinaledi subsystem are of intentional naledigenic origin. These are:

      1. The evidence presented does not demonstrate a clear interruption of the floor sediments, thus failing to demonstrate excavated holes.

      2. The sediments infilling the holes where the skeletal remains are found have not been demonstrated to originate from the disruption of the floor sediments and thus could be part of a natural geological process (e.g. water movement, slumping) or carnivore accumulations.

      3. Previous geological interpretations by our research group have given alternative geological explanations for formation of the bony accumulations that contradict the present evidence presented here and result in alternative origins hypotheses.

      4. Burial cannot be effectively assessed without complete excavation of the features and site.

      5. The skeletal remains as presented do not conform clearly to typical body arrangement/positions associated with human (Homo sapiens) burials.

      6. There is no evidence of grave goods or lithic scatters that are typically associated with human burials.

      7. Humans may have been involved with the creation of either the Homo naledi bone accumulations, the engravings, or both.

      8. Without a date of the engravings, the null hypothesis should be the engravings were created by Homo sapiens.

      9. The null hypothesis for explanation of the skeletal remains in this situation should be “natural accumulation”.

      Our analysis of the Dinaledi Feature 1 leads us to accept that the laminated orange-red mudstone (LORM) sedimentary layer is interrupted, indicating a non-natural intervention, and that the hole created by the interruption was then filled by a fleshed body (and perhaps parts of other bodies), which was then covered by sediment that originated from the hole that was dug. We recognize that the four eLife reviewers are not convinced that our presentation is sufficient to establish this. Interestingly, this was not the universal opinion of earlier reviewers of the initial manuscript, several of whom felt we had adequately supported this hypothesis. The lack of clarity in this current version of the burial manuscript is our responsibility. In the upcoming revision of this paper to be submitted, we will take the reviewers’ critiques to heart and add additional figures that illustrate better the disruption of the LORM and clarify the sedimentological data showing that the material covering the skeletal remains in the hole is the disrupted sediment excavated from the same hole. We are proposing to isolate this most critical evidence for burial into a separate section in the revised submission based on the reviewers’ comments. The fact that the LORM layer is disrupted, a fleshed body was placed in the hole created by this disruption, and the body (and perhaps parts of other bodies) was/were then covered by the same sediments from the hole is the central feature of our hypothesis that the bone accumulations observed reflect a burial and not a natural process.

      The possibility of fluvial transport or involvement in the subsystem is a topic that we have addressed extensively in past work, and it is clear from these reviews that we must enhance our current manuscript to discuss this issue at greater length. Our previous work (Dirks et al. 2015; Dirks et al. 2017) emphasized that fluvial transport of whole bodies into the subsystem was precluded by several lines of sedimentological evidence. We excavated a rich accumulation of skeletal remains, including articulated limbs and other elements in subvertical orientations inconsistent with slow sedimentary infill, which were difficult to explain without positing either a large and dense pile of bodies and/or sediment movement. We encountered fractured chunks of laminated orange-red mudstone (LORM) in random orientations within our excavation area, within and among skeletal remains, which directly refuted that the remains were inundated with water at the time of burial, and this limited the possibility of fluvial transport. Water flow sufficient to displace bodies or complete skeletal evidence would also transport large and coarse sediment, which is absent from the subsystem, and would sort the commingled skeletal material that we found by size, which we do not observe. But our excavation only covered less than a square meter at very limited depth, and this was the limit to our knowledge of subsurface sediment. We thus were left with uncertainty that led us to suggest the possibility of sediment slumping or movement into subsurface drains, although these were not observed near our excavation. Our current work expands our knowledge of the subsurface and presents an alternative explanation for the disposition of skeletal remains from our earlier excavation. But we acknowledge that this new explanation is vulnerable to our own previous published proposals, and we must do a better job of explaining how the new information addresses our previous suggestions. By not clearly creating a section where we explained how these previous hypotheses were now nullified by new evidence, we clearly confused the reviewers with our own previous work. We will revise the manuscript by enhancing the review of the significant geological evidence demonstrating that there is no significant fluvial action in the system and making it clear how the burial hypothesis provides a clearer explanation for the situation of skeletal remains from our previous excavation work.

      One of the central issues raised by reviewers has been a perceived need to excavate these features completely, totally exhuming all skeletal remains from them. Reviewers have written that it is necessary to identify every skeletal element that is present and account for any missing elements. On this point, we have both ethical and scientific differences from these reviewers. We express our ethical concerns first. Many of the best-preserved possible burials ever discovered by archaeologists were subjected to total excavation and exhumation. Cases like La Chapelle-aux-Saints, La Ferrassie, and Skhūl were fully excavated at a time when data recording and excavation methods did not include the range of spatial and geomorphological approaches that later became routine. The judgment of early investigators that these situations were intentional burials was challenged by later workers, and the kind of information that might enable better tests had been irrevocably lost (Gargett 1999; Dibble et al. 2015; Rendu et al. 2014).

      Later, improved excavation standards have not sufficed to remove uncertainty or debate about possible burials. For example, it was long presumed that well-preserved remains of young children were by themselves diagnostic of intentional burial, such as those from Dederiyeh, Border Cave, or Roc de Marsal. Such cases were also fully excavated, with adequate documentation of the positioning of skeletal remains and their surrounding stratigraphic situation, but such cases were later challenged on several bases and the complete exhumation of material has confused or precluded testing of new hypotheses (e.g. Gargett 1999). The case of Roc de Marsal is one in which data from the initial excavation combined with re-excavation and geoarchaeological analysis led to a naturalistic interpretation of the skeletal material (Sandgathe et al. 2011; Goldberg et al. 2017). But even in this case, the researchers erred in their interpretation of the skeleton’s situation due to a lack of identification of parts of the infant’s skeleton (Gómez-Olivencia and García-Martinez 2019). That is to say, it is not only the burial hypothesis but other hypotheses that suffer from complete excavation. Researchers concerned with preserving all possible information have sometimes taken extraordinary measures to remove and study possible burials at high resolution in the laboratory. Such was the case of the Shanidar IV burial removed from the site and transported in a plaster jacket by Solecki, which led to the disruption and loss of internal stratigraphic information (Pomeroy et al. 2020). Arguably, the current state of the art is full excavation with partial preparation, such as that undertaken at Panga ya Saidi (Martinón-Torres et al. 2021). But again, any future attempt to reinterpret or test the hypothesis of burial must rely on the adequacy of documentation as the original context has been removed.

      In our decision to leave material in place as much as possible, we are expanding upon standard practice to leave witness sections and unexcavated areas for future research. The situation is novel, representing possible burials by a nonhuman species, and that makes it doubly important in our opinion to be conservative in not fully exhuming the skeletal material from its context. We anticipate that many other researchers, including future investigators, will suggest additional methods to further test the hypothesis of burial, something that would be impossible if we had excavated the features in their entirety prior to publishing a description of our work. We believe strongly that our ethical responsibility is to publish the work and the most likely interpretation while leaving as much evidence in place as possible to enable further testing and replication. We welcome the suggestions of additional methods/analyses to test the H. naledi burial hypothesis.

      This being said, we also observe that total exhumation would not resolve the concerns raised by the reviewers. The recommendation of total exhumation is in pursuit of a full account of all skeletal material present and its preservation and spatial situation, in order to demonstrate that they conform to body positions comparable to human burials. As has been highlighted in forensic casework, the excavation of an inhumation feature does not necessarily provide an accurate spatial or anatomical manifest of the stratigraphical relationships between the body, encapsulating matrix, and any cut present due to preservational, taphonomic and operational factors (Dirkmaat and Cabo, 2016; Hunter, 2014). In particular, in cases where skeletal elements are highly fragmented, friable, or degraded (such as through bioerosion) then complete excavation—even under controlled laboratory conditions—may destroy bone and severely limit skeletal identification (Henderson, 1997; Hochrein, 2002; Owsley and Compton, 1997), particularly in elements where the ratio of trabecular to cortical bone is high (Darwent and Lyman, 2002; Lyman, 1994). As such, non-invasive methods of 3D and 4D modelling (preservation in situ) are often considered preferable to complete necropsy or excavation (preservation by record) where appropriate (Bolliger and Thali, 2009; Dell’Unto and Landeschi, 2022; Randolph-Quinney et al., 2018; Silver, 2016). 

      The test of burial is not primarily positional, but taphonomic and geological. The position and number of bones can elaborate on process-driven questions of decay and destruction in the burial environment, or post-mortem modification, but are not singularly indicative of whether the remains were intentionally buried – the post-mortem narrative of all the processes affecting the cadaveric island is required (Knüsel and Robb, 2016). In previous cases, researchers have disputed or accepted the hypothesis of intentional hominin burial based upon assumptions about how modern humans or Neandertals would have positioned bodies, with the idea that some positions reflect ritual intent while others do not. But applying such assumptions is unjustifiable, particularly for a species like H. naledi, whose culture may have differed fundamentally from our own. Our work acknowledges that the present evidence does not enable a full reconstruction of the burial positions, but it does show that fleshed remains were encased in sediment prior to decomposition of soft tissue, and that subsequent spatial changes can be most parsimoniously explained by natural decomposition within sedimentary matrix contained within a burial feature (after Green, 2022; Mickleburgh and Wescott, 2018; Mickleburgh et al., 2022). If the argument is that extraordinary claims require extraordinary evidence, we feel that the evidence documents excavation and interment (and will do so more clearly in the revision) and the fact that the remains do not match a “typical” human burial in body positioning is not in itself evidence that these are not H. naledi burials.

      We feel that the reviewers (in keeping with many palaeoanthropologists) have a clear idea of what they “think” a burial should look like in an idealised sense, but this platonic ideal of burial form is not matched by the extensive literature in archaeothanatology, funerary archaeology and forensic science which indicates enormous variability in the activity, morphology and post-mortem system experienced by the human body in cases of interment and body disposal (e.g. Aspöck, 2008; Boulestin and Duday, 2005 and 2006; Connelly et al., 2005; Channing and Randolph-Quinney, 2006; Cherryson, 2008; Donnelly et al., 1995; Finley, 2000; Hunter, 2014; Parker Pearson, 1999; Randolph-Quinney, 2013). Decades of experience in the identification, recovery and interpretation of clandestine, deviant, and non-formal burials indicates the platonic ideal is rare, and in many contexts, the exception (Cherryson, 2008; Parker Pearson, 1999). This variability is particularly relevant to morphological traits in burial context, such as the informal nature of the grave cut in plan and section, shallow burial depth, and initial disposition of body (placement) during the early post-mortem period. These might run counter to the expectations of reviewers or others referencing the fossil hominin record, but are well accepted within the communities of researchers investigating Holocene archaeological sites and forensic contexts.

      It is encouraging to see reviewers beginning to incorporate the extensive (often experimentally derived) literature from archaeothanatology and forensic taphonomy in their deliberations, and we will be taking these comments on board going forward. In particular, we acknowledge reviewers’ comments and the need to construct a more detailed post-mortem narrative, accounting for joint disarticulation (labile versus persistent joints etc), displacement, and final disposition of elements within the burial space. As such we will incorporate the hierarchy of decomposition (rank order disarticulation), associations between regions of anatomical association, areas of disassociation, and the voids produced during decomposition (after Mickleburgh and Wescott, 2018; Mickleburgh et al., 2022) into our narrative. In doing so we acknowledge the tensions between the inductive archaeothanatological narrative-driven approach (e.g. Duday, 2005 & 2009) versus robust decomposition data derived from human forensic taphonomic experimentation recently articulated by Schotsmans and colleagues (2022) - noting that we will highlight comparative data based on forensic experimental casework and actualistic modelling over inductive intuitive approaches which come with significant evidential shortcomings (Bristow et al. 2011).

      Finally, from a taphonomic perspective it is worth pointing out to reviewers that we have already addressed the issue of lack of taphonomic evidence for carnivore involvement in the formation of the Dinaledi assemblage (Dirks, et al., 2016). Absence of any carnivore-induced bone surface modifications, patterns of skeletal part representation, and a total absence of any carnivore remains found within the Dinaledi chamber (following Kuhn and colleagues, 2010) lead us to reject carnivores as possible vectors of body accumulation within the Dinaledi Chamber and Hill Antechamber.

      Reviewers suggest that without a date derived from geochronological methods, the engravings cannot be associated with H. naledi, and that it is possible (or probable) that the engravings were done in the recent past by H. sapiens. This suggestion neglects the context of the site. We have previously documented the structure and extremely limited accessibility of the Dinaledi subsystem. This subsystem was not recorded on maps of the documented Rising Star Cave system prior to our work and its discovery by our teams. Furthermore, there is no evidence of prehistoric human activity in the areas of the cave related to possible subterranean entrances. There is no evidence that humans in the past typically ventured into such extreme spaces like those of Rising Star. It is clear from the presence of the remains of many individuals that H. naledi ventured into these spaces again and again. It is likely that H. naledi moved through these spaces more easily than humans do based on their physique. We show that the engravings overlay each other, suggesting multiple engraving events. These engravings took time and effort and the only evidence for use of the Dinaledi subsystem by any hominin is by H. naledi. The context leads to the null hypothesis that H. naledi made the marks. In our revision, we will elaborate on this argument to clarify the evidence for our stance on this hypothesis. Several reviewers took issue with the title of the engraving paper as we did not insert a qualifier in front of the suggested date range for the engravings. We deliberately left out qualifying language so that the title took the form of a testable hypothesis rather than a weak assertion. Should future work find the engravings were not produced within this time range, then we will restate this hypothesis.

      Finally, with regard to the engravings, we have chosen to report them because they exist. Not reporting the presence of engraved marks on the walls of a cave above hypothesized burials would be tantamount to leaving relevant evidence out of the description of an archaeological context. We recognize and state in our manuscript that these markings require substantial further study, including attempts at geochronological dating. But the current evidence is clearly relevant to the archaeological context of the subsystem. We take a similar stance with reporting the presence of the tool-shaped artefact near the hand of the H. naledi skeleton in the Hill Antechamber. It is evident that this object requires further study, as we stated in our manuscript, but again omitting it from our study would be leaving out relevant evidence.

      Some have suggested that the null hypothesis should be that all of these observed circumstances are of natural origin. Our team took this approach in our early investigation of the Dinaledi subsystem (Dirks et al. 2015). We adopted the null hypothesis that the geological processes involved in the accumulation of H. naledi skeletal remains were “natural” (e.g., non-naledigenic involvement), and we were able to reject many alternative explanations for the assemblage, including carnivore accumulation, “death trap” accumulation, and fluvial transport of bodies or bones (Dirks et al. 2015). This led us to the hypothesis that H. naledi were involved in bringing the bodies into the spaces where they were found. But we did not hypothesize their involvement in the formation of the deposit itself beyond bringing the bodies to the location.

      This approach seems conservative. It followed the traditional view that small-brained hominins do not engage in cultural practices. But we recognize in hindsight that this null hypothesis approach did harm to our analyses. It impeded us from recognizing within our initial excavations of the puzzle box area and other excavations between 2014 – 2017 that we might be encountering remains that were intrusive in the sedimentary floor of the chamber. If we had approached the accumulation of a large number of hominins from the perspective of the null hypothesis being that the situation was likely cultural, we perhaps would have collected evidence in a slightly different manner. We certainly note that if the Dinaledi system had been full of the remains of modern humans, there would have been little doubt that the null hypothesis would have been that this was a cultural space and not a “natural space”.  We therefore respectfully disagree with the reviewers who continue to support the idea that we should approach hominin excavations with the null hypothesis that they will be natural (specifically non-cultural) in origins. If excavations continue with this mindset we believe that potential cultural evidence is almost certain to be lost.

      There has been a gradient across paleoanthropological excavations, archaeological work, and forensic investigation, with increasing precision of context. The reality is that the recording precision and frame of approach are typically different in most paleontological excavations than in those related to contemporary human remains. If anything comes from the present discussion of whether the Dinaledi system is a burial site for H. naledi or not, we hope that by taking seriously the possibility of deep cultural dynamics of hominins, we will encourage other teams to meet the highest standards of excavation in order to preserve potential cultural evidence. Given H. naledi’s cranial capacity, we suggest that even very early hominin skeletal assemblages should be re-examined, if there is sufficient evidence or records available. These would include examples such as the A.L. 333 Au. afarensis site (the so-called First Family site in Hadar, Ethiopia), the Dikika infant skeleton, WT 15000 (Turkana Boy) and even A.L. 288 (Lucy), as such unusual taphonomic situations where skeletons are preserved cannot be simply explained away as “natural” in origin, based solely on the cranial capacity and assumed lack of cognitive and cultural complexity of the hominins, as emphasized by us in Fuentes et al. (2023). We are not the first to observe that some very early hominin situations may represent early mortuary activity (Pettitt 2013), but we would advocate a step further. We suggest it may be damaging to take “natural accumulation” as the standard null hypothesis for hominin paleoanthropology, and that it is more conservative in practice to engage remains with the null hypothesis of possible cultural formation.

      We are deeply grateful for the time and effort all of the 8 reviewers (across three reviews) have taken with this work. We also acknowledge the anonymous reviewers from previous submissions whose opinions and comments will have made the final iterations of these manuscripts better for their efforts. As this process is rather public and includes commentary outside of the eLife forum, we ask that the efforts of all 37 authors and 8 reviewers involved be respected and that the discourse remain professional in all venues as we study this fascinating and quite complex occurrence. We appreciate also the efforts of members of the public who have engaged with this relatively new process where preprints are posted prior to the reviews, allowing comments and interactions from colleagues and the public who are normally not part of the internal peer review process. We believe these interactions will make for better final papers. We feel we have met the standards of demonstrating burials in H. naledi and that the engravings are most likely associated with H. naledi. However, given the reviews, we see many areas where our clarity and context, and analyses, were less strong than they can be. With the clarifications and additions taken on board through these review processes, the final papers will be stronger and clearer. We recognize that this is an ongoing process of scientific investigation and further work will allow continued, and possibly better, evaluation of these hypotheses and others.

      Lee R Berger, Agustín Fuentes, John Hawks, Tebogo Makhubela

      Works cited:

      • Aspöck, E. (2008). What Actually is a ‘Deviant Burial’?: Comparing German-Language and Anglophone Research on ‘Deviant Burials.’ In E. M. Murphy (Ed.). Deviant Burial in the Archaeological Record. Oxford: Oxbow Books.  pp 17–34.

      • Bolliger, S.A. & Thali, M.J. (2009). Thanatology. In S.A. Bolliger and M.J. Thali (eds) Virtopsy Approach:  3D Optical and Radiological Scanning and Reconstruction in Forensic Medicine. Boca Raton: CRC Press. pp 187-218.

      • Boulestin, B. & Duday, H. (2005). Ethnologie et archéologie de la mort: de l’illusion des références à l’emploi d’un vocabulaire. In: C. Mordant and G. Depierre (eds) Les Pratiques Funéraires à l’Âge du Bronze en France. Actes de la table ronde de Sens-en-Bourgogne. Paris: Éditions du Comité des Travaux Historiques et Scientifiques. pp. 17–30.

      • Boulestin, B. & Duday, H. (2006). Ethnology and archaeology of death: from the illusion of references to the use of a terminology. Archaeologia Polona 44: 149–169.

      • Bristow, J., Simms, Z. & Randolph-Quinney, P.S. (2011). Taphonomy. In S. Black and E. Ferguson (eds.) Forensic Anthropology 2000-2010. Boca Raton, FL: CRC Press. pp 279-318.

      • Channing, J. & Randolph-Quinney, P.S. (2006). Death, decay and reconstruction: the archaeology of Ballykilmore Cemetery, County Westmeath. In J. O’Sullivan and M. Stanley (eds.) Settlement, Industry and Ritual: Archaeology. National Roads Authority Monograph Series No. 3. Dublin: NRA/Four Courts Press. pp 113-126.

      • Cherryson, A. K. (2008). Normal, Deviant and Atypical: Burial Variation in Late Saxon Wessex, c. AD 700–1100. In E. M. Murphy (Ed.). Deviant Burial in the Archaeological Record. Oxford: Oxbow Books. pp 115–130.

      • Connolly, M., F. Coyne & L. G. Lynch (2005). Underworld : Death and Burial in Cloghermore Cave, Co. Kerry. Bray, Co. Wicklow: Wordwell.

      • Darwent, C. M. & R. L. Lyman (2002). Detecting  the postburial fragmentation of carpals, tarsals and phalanges. In M. H. Sorg and W. D. Haglund (eds). Advances in Forensic Taphonomy: Method, Theory and Archeological Perspectives. Boca Raton, FL, CRC Press. pp 355-378.

      • d’Errico, F., & Backwell, L. (2016). Earliest evidence of personal ornaments associated with burial: The Conus shells from Border Cave. Journal of Human Evolution, 93, 91–108.

      • De Villiers. H. (1973). Human skeletal remains from Border Cave, Ingwavuma District, KwaZulu, South Africa. Annals of the Transvaal Museum, 28(13), 229–246.

      • Dell’Unto, N. and Landeschi, G. (2022). Archaeological 3D GIS. London: Routledge.

      • Dibble, H. L., Aldeias, V., Goldberg, P., McPherron, S. P., Sandgathe, D., & Steele, T. E. (2015). A critical look at evidence from La Chapelle-aux-Saints supporting an intentional Neandertal burial. Journal of Archaeological Science, 53, 649–657.

      • Dirkmaat, D. C., & Cabo, L. L. (2016). Forensic archaeology and forensic taphonomy: basic considerations on how to properly process and interpret the outdoor forensic scene. Academic Forensic Pathology 6, 439–454.

      • Dirks, P. H., Berger, L. R., Roberts, E. M., Kramers, J. D., Hawks, J., Randolph-Quinney, P. S., Elliott, M., Musiba, C. M., Churchill, S. E., de Ruiter, D. J., Schmid, P., Backwell, L. R., Belyanin, G. A., Boshoff, P., Hunter, K. L., Feuerriegel, E. M., Gurtov, A., Harrison, J. du G., Hunter, R., … Tucker, S. (2015). Geological and taphonomic context for the new hominin species Homo naledi from the Dinaledi Chamber, South Africa. ELife, 4, e09561.

      • Dirks, P.H.G.M., Berger, L.R., Hawks, J., Randolph-Quinney, P.S., Backwell, L.R., and Roberts, E.M. (2016). Comment on “Deliberate body disposal by hominins in the Dinaledi Chamber, Cradle of Humankind, South Africa?” [J. Hum. Evol. 96 (2016) 145-148]. Journal of Human Evolution 96:  149-153.

      • Dirks, P. H., Roberts, E. M., Hilbert-Wolf, H., Kramers, J. D., Hawks, J., Dosseto, A., Duval, M., Elliott, M., Evans, M., Grün, R., Hellstrom, J., Herries, A. I., Joannes-Boyau, R., Makhubela, T. V., Placzek, C. J., Robbins, J., Spandler, C., Wiersma, J., Woodhead, J., & Berger, L. R. (2017). The age of Homo naledi and associated sediments in the Rising Star Cave, South Africa. ELife, 6, e24231.

      • Donnelly, S., C. Donnelly & E. Murphy (1999). The forgotten dead: The cíllíní and disused burial grounds of Ballintoy, County Antrim. Ulster Journal of Archaeology 58, 109-113.

      • Duday, H. (2005). L’archéothanatologie ou l’archéologie de la mort. In: O. Dutour, J.-J. Hublin and B. Vandermeersch (eds) Objets et Méthodes en Paléoanthropologie. Paris: Comité des Travaux Historiques et Scientifiques. pp. 153–215.

      • Duday, H. (2009). Archaeology of the Dead: Lectures in Archaeothanatology. Oxford: Oxbow Books.

      • Finley, N. (2000). Outside of life: Traditions of infant burial in Ireland from cillin to cist.  World Archaeology 31, 407-422.

      • Gargett, R. H. (1999). Middle Palaeolithic burial is not a dead issue: The view from Qafzeh, Saint-Césaire, Kebara, Amud, and Dederiyeh. Journal of Human Evolution, 37(1), 27–90.

      • Goldberg, P., Aldeias, V., Dibble, H., McPherron, S., Sandgathe, D., & Turq, A. (2017). Testing the Roc de Marsal Neandertal “Burial” with Geoarchaeology. Archaeological and Anthropological Sciences, 9(6), 1005–1015.

      • Gómez-Olivencia, A., & García-Martínez, D. (2019). New postcranial remains from the Roc de Marsal Neandertal child. PALEO. Revue d’archéologie Préhistorique, 30–1, 30–1.

      • Green, E.C. (2022). An archaeothanatological approach to the identification of late Anglo-Saxon burials in wooden containers. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 436-455.

      • Henderson, J. (1987). Factors determining the state of preservation of human remains. In A. Boddington, A. Garland and R. Janaway (eds). Death, Decay and Reconstruction: Approaches to Archaeology and Forensic Science. Manchester: Manchester University Press. pp 43-54.

      • Hunter, J. R. (2014). Human remains recovery: archaeological and forensic perspectives. In C. Smith (ed). Encyclopedia of Global Archaeology. New York: Springer New York. pp 3549-3556.

      • Hochrein, M. (2002). An Autopsy of the Grave: Recognizing, Collecting and Preserving Forensic Geotaphonomic Evidence. In M. H. Sorg and W. D. Haglund (eds). Advances in Forensic Taphonomy: Method, Theory and Archeological Perspectives. Boca Raton, FL, CRC Press: 45-70.

      • Knüsel, C.K. & Robb, J. (2016). Funerary taphonomy: An overview of goals and methods. Journal of Archaeological Science: Reports 10, 655-673.

      • Kuhn, B.F., Berger, L.R. & Skinner, J.D. (2010). Examining criteria for identifying and differentiating fossil faunal assemblages accumulated by hyenas and hominins using extant hyenid accumulations. International Journal of Osteoarchaeology 20, 15-35.

      • Lyman, R. (1994). Vertebrate Taphonomy. Cambridge, Cambridge University Press.

      • Martinón-Torres, M., d’Errico, F., Santos, E., Álvaro Gallo, A., Amano, N., Archer, W., Armitage, S. J., Arsuaga, J. L., Bermúdez de Castro, J. M., Blinkhorn, J., Crowther, A., Douka, K., Dubernet, S., Faulkner, P., Fernández-Colón, P., Kourampas, N., González García, J., Larreina, D., Le Bourdonnec, F.-X., … Petraglia, M. D. (2021). Earliest known human burial in Africa. Nature, 593(7857), 7857.

      • Mickleburgh, H.L & Wescott, D.J. (2018). Controlled experimental observations on joint disarticulation and bone displacement of a human body in an open pit: implications for funerary archaeology. Journal of Archaeological Science: Reports 20: 158-167.

      • Mickleburgh, H.L., Wescott, D.J., Gluschitz, S. & Klinkenberg, V.M. (2022). Exploring the use of actualistic forensic taphonomy in the study of (forensic) archaeological human burials: An actualistic experimental research programme at the Forensic Anthropology Center at Texas State University (FACTS), San Marcos, Texas. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 542-562.

      • Owsley, D. & B. Compton (1997). Preservation in late 19th Century iron coffin burials. In W. Haglund and M. Sorg (eds). Forensic Taphonomy: The Postmortem Fate of Human Remains. Boca Raton, FL, CRC Press: 511-526.

      • Parker Pearson, M. (1999). The Archaeology of Death and Burial. College Station: Texas A&M University Press.

      • Pettitt, P. (2013). The Palaeolithic Origins of Human Burial. Routledge.

      • Pomeroy, E., Bennett, P., Hunt, C. O., Reynolds, T., Farr, L., Frouin, M., Holman, J., Lane, R., French, C., & Barker, G. (2020). New Neanderthal remains associated with the ‘flower burial’ at Shanidar Cave. Antiquity, 94(373), 11–26.

      • Randolph-Quinney, P.S. (2013). From the cradle to the grave: the bioarchaeology of Clonfad 3 and Ballykilmore 6. In N. Brady, P. Stevens and J. Channing (eds.). Settlement and Community in the Fir Tulach Kingdom. Dublin: National Roads Authority Press. pp A2.1-48.

      • Randolph-Quinney, P.S., Haines, S. and Kruger, A. (2018). The use of three-dimensional scanning and surface capture methods in recording forensic taphonomic traces: issues of technology, visualisation, and validation. In: W.J. M. Groen and P. M. Barone (eds). Multidisciplinary Approaches to Forensic Archaeology. Berlin: Springer International Publishing, pp. 115-130.

      • Rendu, W., Beauval, C., Crevecoeur, I., Bayle, P., Balzeau, A., Bismuth, T., Bourguignon, L., Delfour, G., Faivre, J.-P., Lacrampe-Cuyaubère, F., Tavormina, C., Todisco, D., Turq, A., & Maureille, B. (2014). Evidence supporting an intentional Neandertal burial at La Chapelle-aux-Saints. Proceedings of the National Academy of Sciences, 111(1), 81–86.

      • Sandgathe, D. M., Dibble, H. L., Goldberg, P., & McPherron, S. P. (2011). The Roc de Marsal Neandertal child: A reassessment of its status as a deliberate burial. Journal of Human Evolution, 61(3), 243–253.

      • Silver, M. (2016). Conservation Techniques in Cultural Heritage. In E. Stylianidis and F. Remondino (eds) 3D Recording, Documentation and Management of Cultural Heritage. Dunbeath: Whittles Publishing. pp 15-106.

      • Schotsmans, E.M.J., Georges-Zimmermann, P., Ueland, M. and Dent, B.B. (2022). From flesh to bone: Building bridges between taphonomy, archaeothanatology and forensic science for a better understanding of mortuary practices. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 501-541.

    1. Author Response:

      We would like to thank the eLife reviewers for the considerable time and effort they have invested to review these manuscripts. We have also benefited from a previous round of review of the manuscript describing the proposed burial features, which underwent two rounds of revisions in a high-impact journal over a period of approximately 8 months during 2022 and early 2023. Both sets of reviews have reflected mixed responses to the evidence we have presented, with one reviewer recommending acceptance with minor editorial revisions, two recommending acceptance with minor revisions, and the fourth recommending rejection based upon similar arguments to those reflected by some of the reviewers in this current round of reviews in eLife. Ultimately the managing editor of this first journal took the decision that the review process could not be completed in a timely manner and rejected the manuscript, although the submission here reflected our consideration of these reviewers’ suggestions.

      We have chosen in this initial response to the eLife reviews to include some references to the previous anonymous reviews in order to illustrate differences of opinion and differences in revision suggestions within the review process. Our goal is to offer maximal insight into our decision-making process and to acknowledge the considerable time and effort put into the assessment of these manuscripts by reviewers (for eLife and in the case of the earlier review process). We hope that this approach will assist the readers, and reviewers, of our manuscripts in understanding why we are proceeding with certain decisions during the revision process.

      This is a new process for us and the reviewers, and one way in which it significantly differs from more traditional review is that both the reviews and our reply will be public well in advance of our revisions to the manuscript. Indeed, considering the scope of the reviews, some of those revisions may take considerable time, although many can be accomplished fairly easily. Thus, we are not in a position to say that we have solved every issue raised by the reviewers. Instead, we will examine what appear to be the key critical issues raised regarding the data and the analyses and how we propose to address these as we revise the papers. We will also address several philosophical and ethical issues raised by the reviews and our proposal for dealing with these. More specific editorial and citational recommendations will be dealt with on a case-by-case basis, and we do not address these point-by-point in this reply. Please note, this response to the reviewers is not the revision of the manuscript and is only the initial opinion of the corresponding authors with some guidance from the larger group of authors of all three papers. Our final submitted revision will reflect the input of all authors included on those submissions.

      We took the decision to submit three separate papers consciously. The two different categories of evidence, burials and engravings, involve different kinds of analysis and different (although overlapping) teams of researchers, and we recognized that each deserved its own presentation and assessment. Meanwhile, together they inform the context of H. naledi in a way that requires some synthetic discussion, in which both kinds of evidence are relevant, leading to a third paper. But the mutual relevance of these different kinds of evidence and their review by a common set of reviewers naturally raises cross-cutting issues, and the reviewers have cross-referenced the three articles. This has sometimes led to suggestions about one manuscript based on the contents of another. Considering the situation, we accepted the recommendation that it would be clearer to consider all three articles in a single reply. Thus, while each of the three papers will proceed separately during the revision process, it will occasionally be necessary to refer across all three papers in our responses.

      Scientific Issues:

      In reading the reviews, we feel there are 9 critical points/assertions raised by one or more of the reviewers that present a problem for, or challenge to, our hypothesis that the observed evidence (bone accumulations and engravings) described in the Dinaledi subsystem are of intentional naledigenic origin. These are:

      1. The evidence presented does not demonstrate a clear interruption of the floor sediments, thus failing to demonstrate excavated holes.

      2. The sediments infilling the holes where the skeletal remains are found have not been demonstrated to originate from the disruption of the floor sediments and thus could be part of a natural geological process (e.g. water movement, slumping) or carnivore accumulations.

      3. Previous geological interpretations by our research group have given alternative geological explanations for formation of the bony accumulations that contradict the present evidence presented here and result in alternative origins hypotheses.

      4. Burial cannot be effectively assessed without complete excavation of the features and site.

      5. The skeletal remains as presented do not conform clearly to typical body arrangement/positions associated with human (Homo sapiens) burials.

      6. There is no evidence of grave goods or lithic scatters that are typically associated with human burials.

      7. Humans may have been involved with the creation of either the Homo naledi bone accumulations, the engravings, or both.

      8. Without a date of the engravings, the null hypothesis should be the engravings were created by Homo sapiens.

      9. The null hypothesis for explanation of the skeletal remains in this situation should be “natural accumulation”.

      Our analysis of the Dinaledi Feature 1 leads us to accept that the laminated orange-red mudstone (LORM) sedimentary layer is interrupted, indicating a non-natural intervention, and that the hole created by the interruption was then filled by a fleshed body (and perhaps parts of other bodies), which was then covered by sediment that originated from the hole that was dug. We recognize that the four eLife reviewers are not convinced that our presentation is sufficient to establish this. Interestingly, this was not the universal opinion of earlier reviewers of the initial manuscript, several of whom felt we had adequately supported this hypothesis. The lack of clarity in this current version of the burial manuscript is our responsibility. In the upcoming revision of this paper to be submitted, we will take the reviewers’ critiques to heart and add additional figures that better illustrate the disruption of the LORM and clarify the sedimentological data showing that the material covering the skeletal remains in the hole is the disrupted sediment excavated from the same hole. We are proposing to isolate this most critical evidence for burial into a separate section in the revised submission based on the reviewers’ comments. The fact that the LORM layer is disrupted, a fleshed body was placed in the hole created by this disruption, and the body (and perhaps parts of other bodies) was/were then covered by the same sediments from the hole is the central feature of our hypothesis that the bone accumulations observed reflect a burial and not a natural process.

      The possibility of fluvial transport or involvement in the subsystem is a topic that we have addressed extensively in past work, and it is clear from these reviews that we must enhance our current manuscript to discuss this issue at greater length. Our previous work (Dirks et al. 2015; Dirks et al. 2017) emphasized that fluvial transport of whole bodies into the subsystem was precluded by several lines of sedimentological evidence. We excavated a rich accumulation of skeletal remains, including articulated limbs and other elements in subvertical orientations inconsistent with slow sedimentary infill, which were difficult to explain without positing either a large and dense pile of bodies and/or sediment movement. We encountered fractured chunks of laminated orange-red mudstone (LORM) in random orientations within our excavation area, within and among skeletal remains, which directly refuted that the remains were inundated with water at the time of burial, and this limited the possibility of fluvial transport. Water flow sufficient to displace bodies or complete skeletal evidence would also transport large and coarse sediment, which is absent from the subsystem, and would sort the commingled skeletal material that we found by size, which we do not observe. But our excavation covered less than a square meter at very limited depth, and this was the limit to our knowledge of subsurface sediment. We thus were left with uncertainty that led us to suggest the possibility of sediment slumping or movement into subsurface drains, although these were not observed near our excavation. Our current work expands our knowledge of the subsurface and presents an alternative explanation for the disposition of skeletal remains from our earlier excavation. But we acknowledge that this new explanation is vulnerable to our own previous published proposals, and we must do a better job of explaining how the new information addresses our previous suggestions. By not clearly creating a section where we explained how these previous hypotheses were now nullified by new evidence, we clearly confused the reviewers with our own previous work. We will revise the manuscript by enhancing the review of the significant geological evidence demonstrating that there is no significant fluvial action in the system and making it clear how the burial hypothesis provides a clearer explanation for the situation of skeletal remains from our previous excavation work.

      One of the central issues raised by reviewers has been a perceived need to excavate these features completely, totally exhuming all skeletal remains from them. Reviewers have written that it is necessary to identify every skeletal element that is present and account for any missing elements. On this point, we have both ethical and scientific differences from these reviewers. We express our ethical concerns first. Many of the best-preserved possible burials ever discovered by archaeologists were subjected to total excavation and exhumation. Cases like La Chapelle-aux-Saints, La Ferrassie, and Skhūl were fully excavated at a time when data recording and excavation methods did not include the range of spatial and geomorphological approaches that later became routine. The judgment of early investigators that these situations were intentional burials was challenged by later workers, and the kind of information that might enable better tests had been irrevocably lost (Gargett 1999; Dibble et al. 2015; Rendu et al. 2014).

      Later, improved excavation standards have not sufficed to remove uncertainty or debate about possible burials. For example, it was long presumed that well-preserved remains of young children were by themselves diagnostic of intentional burial, such as those from Dederiyeh, Border Cave, or Roc de Marsal. Such cases were also fully excavated, with adequate documentation of the positioning of skeletal remains and their surrounding stratigraphic situation, but such cases were later challenged on several bases and the complete exhumation of material has confused or precluded testing of new hypotheses (e.g. Gargett 1999). The case of Roc de Marsal is one in which data from the initial excavation combined with re-excavation and geoarchaeological analysis led to a naturalistic interpretation of the skeletal material (Sandgathe et al. 2011; Goldberg et al. 2017). But even in this case, the researchers erred in their interpretation of the skeleton’s situation due to a lack of identification of parts of the infant’s skeleton (Gómez-Olivencia and García-Martínez 2019). That is to say, it is not only the burial hypothesis but other hypotheses that suffer from complete excavation. Researchers concerned with preserving all possible information have sometimes taken extraordinary measures to remove and study possible burials at high resolution in the laboratory. Such was the case of the Shanidar IV burial, removed from the site and transported in a plaster jacket by Solecki, which led to the disruption and loss of internal stratigraphic information (Pomeroy et al. 2020). Arguably, the current state of the art is full excavation with partial preparation, such as that undertaken at Panga ya Saidi (Martinón-Torres et al. 2021). But again, any future attempt to reinterpret or test the hypothesis of burial must rely on the adequacy of documentation, as the original context has been removed.

      In our decision to leave material in place as much as possible, we are expanding upon standard practice to leave witness sections and unexcavated areas for future research. The situation is novel, representing possible burials by a nonhuman species, and that makes it doubly important in our opinion to be conservative in not fully exhuming the skeletal material from its context. We anticipate that many other researchers, including future investigators, will suggest additional methods to further test the hypothesis of burial, something that would be impossible if we had excavated the features in their entirety prior to publishing a description of our work. We believe strongly that our ethical responsibility is to publish the work and the most likely interpretation while leaving as much evidence in place as possible to enable further testing and replication. We welcome the suggestions of additional methods/analyses to test the H. naledi burial hypothesis.

      This being said, we also observe that total exhumation would not resolve the concerns raised by the reviewers. The recommendation of total exhumation is in pursuit of a full account of all skeletal material present and its preservation and spatial situation, in order to demonstrate that they conform to body positions comparable to human burials. As has been highlighted in forensic casework, the excavation of an inhumation feature does not necessarily provide an accurate spatial or anatomical manifest of the stratigraphical relationships between the body, encapsulating matrix, and any cut present due to preservational, taphonomic and operational factors (Dirkmaat and Cabo, 2016; Hunter, 2014). In particular, in cases where skeletal elements are highly fragmented, friable, or degraded (such as through bioerosion), complete excavation, even under controlled laboratory conditions, may destroy bone and severely limit skeletal identification (Henderson, 1987; Hochrein, 2002; Owsley and Compton, 1997), particularly in elements where the ratio of trabecular to cortical bone is high (Darwent and Lyman, 2002; Lyman, 1994). As such, non-invasive methods of 3D and 4D modelling (preservation in situ) are often considered preferable to complete necropsy or excavation (preservation by record) where appropriate (Bolliger and Thali, 2009; Dell’Unto and Landeschi, 2022; Randolph-Quinney et al., 2018; Silver, 2016).

      The test of burial is not primarily positional, but taphonomic and geological. The position and number of bones can elaborate on process-driven questions of decay and destruction in the burial environment, or post-mortem modification, but are not singularly indicative of whether the remains were intentionally buried: the post-mortem narrative of all the processes affecting the cadaveric island is required (Knüsel and Robb, 2016). In previous cases, researchers have disputed or accepted the hypothesis of intentional hominin burial based upon assumptions about how modern humans or Neandertals would have positioned bodies, with the idea that some positions reflect ritual intent while others do not. But applying such assumptions is unjustifiable, particularly for a species like H. naledi, whose culture may have differed fundamentally from our own. Our work acknowledges that the present evidence does not enable a full reconstruction of the burial positions, but it does show that fleshed remains were encased in sediment prior to decomposition of soft tissue, and that subsequent spatial changes can be most parsimoniously explained by natural decomposition within sedimentary matrix contained within a burial feature (after Green, 2022; Mickleburgh and Wescott, 2018; Mickleburgh et al., 2022). If the argument is that extraordinary claims require extraordinary evidence, we feel that the evidence documents excavation and interment (and will do so more clearly in the revision), and the fact that the remains do not match a “typical” human burial in body positioning is not in itself evidence that these are not H. naledi burials.

      We feel that the reviewers (in keeping with many palaeoanthropologists) have a clear idea of what they “think” a burial should look like in an idealised sense, but this platonic ideal of burial form is not matched by the extensive literature in archaeothanatology, funerary archaeology and forensic science, which indicates enormous variability in the activity, morphology and post-mortem system experienced by the human body in cases of interment and body disposal (e.g. Aspöck, 2008; Boulestin and Duday, 2005 and 2006; Connolly et al., 2005; Channing and Randolph-Quinney, 2006; Cherryson, 2008; Donnelly et al., 1999; Finley, 2000; Hunter, 2014; Parker Pearson, 1999; Randolph-Quinney, 2013). Decades of experience in the identification, recovery and interpretation of clandestine, deviant, and non-formal burials indicate that the platonic ideal is rare and, in many contexts, the exception (Cherryson, 2008; Parker Pearson, 1999). This variability is particularly relevant to morphological traits in burial context, such as the informal nature of the grave cut in plan and section, shallow burial depth, and initial disposition of the body (placement) during the early post-mortem period. These might run counter to the expectations of reviewers or others referencing the fossil hominin record, but are well accepted within the communities of researchers investigating Holocene archaeological sites and forensic contexts.

      It is encouraging to see reviewers beginning to incorporate the extensive (often experimentally derived) literature from archaeothanatology and forensic taphonomy in their deliberations, and we will be taking these comments on board going forward. In particular, we acknowledge reviewers’ comments and the need to construct a more detailed post-mortem narrative, accounting for joint disarticulation (labile versus persistent joints, etc.), displacement, and final disposition of elements within the burial space. As such, we will incorporate the hierarchy of decomposition (rank order disarticulation), associations between regions of anatomical association, areas of disassociation, and the voids produced during decomposition (after Mickleburgh and Wescott, 2018; Mickleburgh et al., 2022) into our narrative. In doing so we acknowledge the tensions between the inductive archaeothanatological narrative-driven approach (e.g. Duday, 2005 and 2009) and robust decomposition data derived from human forensic taphonomic experimentation, recently articulated by Schotsmans and colleagues (2022). We will highlight comparative data based on forensic experimental casework and actualistic modelling over inductive intuitive approaches, which come with significant evidential shortcomings (Bristow et al., 2011).

      Finally, from a taphonomic perspective it is worth pointing out to reviewers that we have already addressed the lack of taphonomic evidence for carnivore involvement in the formation of the Dinaledi assemblage (Dirks et al., 2016). The absence of any carnivore-induced bone surface modifications, the patterns of skeletal part representation, and the total absence of any carnivore remains within the Dinaledi Chamber (following Kuhn and colleagues, 2010) lead us to reject carnivores as possible vectors of body accumulation within the Dinaledi Chamber and Hill Antechamber.

      Reviewers suggest that without a date derived from geochronological methods, the engravings cannot be associated with H. naledi, and that it is possible (or probable) that the engravings were done in the recent past by H. sapiens. This suggestion neglects the context of the site. We have previously documented the structure and extremely limited accessibility of the Dinaledi subsystem. This subsystem was not recorded on maps of the documented Rising Star Cave system prior to our work and its discovery by our teams. Furthermore, there is no evidence of prehistoric human activity in the areas of the cave related to possible subterranean entrances. There is no evidence that humans in the past typically ventured into extreme spaces such as those of Rising Star. It is clear from the presence of the remains of many individuals that H. naledi ventured into these spaces again and again. It is likely that H. naledi, given their physique, moved through these spaces more easily than humans do. We show that the engravings overlie each other, suggesting multiple engraving events. These engravings took time and effort, and the only evidence for use of the Dinaledi subsystem by any hominin is by H. naledi. The context leads to the null hypothesis that H. naledi made the marks. In our revision, we will elaborate on this argument to clarify the evidence for our stance on this hypothesis. Several reviewers took issue with the title of the engraving paper as we did not insert a qualifier in front of the suggested date range for the engravings. We deliberately left out qualifying language so that the title took the form of a testable hypothesis rather than a weak assertion. Should future work find the engravings were not produced within this time range, then we will restate this hypothesis.

      Finally, with regard to the engravings, we have chosen to report them because they exist. Not reporting the presence of engraved marks on the walls of a cave above hypothesized burials would be tantamount to leaving relevant evidence out of the description of an archaeological context. We recognize and state in our manuscript that these markings require substantial further study, including attempts at geochronological dating. But the current evidence is clearly relevant to the archaeological context of the subsystem. We take a similar stance with reporting the presence of the tool-shaped artefact near the hand of the H. naledi skeleton in the Hill Antechamber. It is evident that this object requires further study, as we stated in our manuscript, but again omitting it from our study would be leaving out relevant evidence.

      Some have suggested that the null hypothesis should be that all of these observed circumstances are of natural origin. Our team took this approach in our early investigation of the Dinaledi subsystem (Dirks et al. 2015). We adopted the null hypothesis that the geological processes involved in the accumulation of H. naledi skeletal remains were “natural” (e.g., non-naledigenic involvement), and we were able to reject many alternative explanations for the assemblage, including carnivore accumulation, “death trap” accumulation, and fluvial transport of bodies or bones (Dirks et al. 2015). This led us to the hypothesis that H. naledi were involved in bringing the bodies into the spaces where they were found. But we did not hypothesize their involvement in the formation of the deposit itself beyond bringing the bodies to the location.

      This approach seems conservative. It followed the traditional view that small-brained hominins do not engage in cultural practices. But we recognize in hindsight that this null hypothesis approach did harm to our analyses. It impeded us from recognizing, within our initial excavations of the puzzle box area and other excavations between 2014 and 2017, that we might be encountering remains that were intrusive in the sedimentary floor of the chamber. If we had approached the accumulation of a large number of hominins with the null hypothesis that the situation was likely cultural, we perhaps would have collected evidence in a slightly different manner. We certainly note that if the Dinaledi system had been full of the remains of modern humans, there would have been little doubt that the null hypothesis would have been that this was a cultural space and not a “natural space”. We therefore respectfully disagree with the reviewers who continue to support the idea that we should approach hominin excavations with the null hypothesis that they will be natural (specifically non-cultural) in origin. If excavations continue with this mindset, we believe that potential cultural evidence is almost certain to be lost.

      There has been a gradient across paleoanthropological excavations, archaeological work, and forensic investigation, with increasing precision of context. The reality is that the recording precision and frame of approach are typically different in most paleontological excavations than in those related to contemporary human remains. If anything comes from the present discussion of whether the Dinaledi system is a burial site for H. naledi or not, we hope that by taking seriously the possibility of deep cultural dynamics of hominins, we will encourage other teams to meet the highest standards of excavation in order to preserve potential cultural evidence. Given H. naledi’s cranial capacity, we suggest that even very early hominin skeletal assemblages should be re-examined, if there is sufficient evidence or records available. These would include examples such as the A.L. 333 Au. afarensis site (the so-called First Family site in Hadar, Ethiopia), the Dikika infant skeleton, WT 15000 (Turkana Boy) and even A.L. 288 (Lucy), as such unusual taphonomic situations where skeletons are preserved cannot be simply explained away as “natural” in origin, based solely on the cranial capacity and assumed lack of cognitive and cultural complexity of the hominins, as emphasized by us in Fuentes et al. (2023). We are not the first to observe that some very early hominin situations may represent early mortuary activity (Pettitt 2013), but we would advocate a step further. We suggest it may be damaging to take “natural accumulation” as the standard null hypothesis for hominin paleoanthropology, and that it is more conservative in practice to engage remains with the null hypothesis of possible cultural formation.

      We are deeply grateful for the time and effort all 8 reviewers (across three reviews) have taken with this work. We also acknowledge the anonymous reviewers from previous submissions, whose opinions and comments will have made the final iterations of these manuscripts better for their efforts. As this process is rather public and includes commentary outside of the eLife forum, we ask that the efforts of all 37 authors and 8 reviewers involved be respected and that the discourse remain professional in all venues as we study this fascinating and quite complex occurrence. We also appreciate the efforts of members of the public who have engaged with this relatively new process, in which preprints are posted prior to the reviews, allowing comments and interactions from colleagues and the public who are normally not part of the internal peer review process. We believe these interactions will make for better final papers. We feel we have met the standards of demonstrating burials in H. naledi and that the engravings are most likely associated with H. naledi. However, given the reviews, we see many areas where our clarity, context, and analyses were less strong than they can be. With the clarifications and additions taken on board through these review processes, the final papers will be stronger and clearer. We recognize that this is an ongoing process of scientific investigation and that further work will allow continued, and possibly better, evaluation of these hypotheses and others.

      Lee R Berger, Agustín Fuentes, John Hawks, Tebogo Makhubela

      Works cited:

      • Aspöck, E. (2008). What Actually is a ‘Deviant Burial’?: Comparing German-Language and Anglophone Research on ‘Deviant Burials.’ In E. M. Murphy (Ed.). Deviant Burial in the Archaeological Record. Oxford: Oxbow Books.  pp 17–34.

      • Bolliger, S.A. & Thali, M.J. (2009). Thanatology. In S.A. Bolliger and M.J. Thali (eds) Virtopsy Approach:  3D Optical and Radiological Scanning and Reconstruction in Forensic Medicine. Boca Raton: CRC Press. pp 187-218.

      • Boulestin, B. & Duday, H. (2005). Ethnologie et archéologie de la mort: de l’illusion des références à l’emploi d’un vocabulaire. In: C. Mordant and G. Depierre (eds) Les Pratiques Funéraires à l’Âge du Bronze en France. Actes de la table ronde de Sens-en-Bourgogne. Paris: Éditions du Comité des Travaux Historiques et Scientifiques. pp. 17–30.

      • Boulestin, B. & Duday, H. (2006). Ethnology and archaeology of death: from the illusion of references to the use of a terminology. Archaeologia Polona 44: 149–169.

      • Bristow, J., Simms, Z. & Randolph-Quinney, P.S. (2011). Taphonomy. In S. Black and E. Ferguson (eds.) Forensic Anthropology 2000-2010. Boca Raton, FL: CRC Press. pp 279-318.

      • Channing, J. & Randolph-Quinney, P.S. (2006). Death, decay and reconstruction: the archaeology of Ballykilmore Cemetery, County Westmeath. In J. O’Sullivan and M. Stanley (eds.) Settlement, Industry and Ritual: Archaeology. National Roads Authority Monograph Series No. 3. Dublin: NRA/Four Courts Press. pp 113-126.

      • Cherryson, A. K. (2008). Normal, Deviant and Atypical: Burial Variation in Late Saxon Wessex, c. AD 700–1100. In E. M. Murphy (Ed.). Deviant Burial in the Archaeological Record. Oxford: Oxbow Books. pp 115–130.

      • Connolly, M., F. Coyne & L. G. Lynch (2005). Underworld: Death and Burial in Cloghermore Cave, Co. Kerry. Bray, Co. Wicklow: Wordwell.

      • Darwent, C. M. & R. L. Lyman (2002). Detecting the postburial fragmentation of carpals, tarsals and phalanges. In M. H. Sorg and W. D. Haglund (eds). Advances in Forensic Taphonomy: Method, Theory and Archeological Perspectives. Boca Raton, FL, CRC Press. pp 355-378.

      • d’Errico, F., & Backwell, L. (2016). Earliest evidence of personal ornaments associated with burial: The Conus shells from Border Cave. Journal of Human Evolution, 93, 91–108.

      • De Villiers, H. (1973). Human skeletal remains from Border Cave, Ingwavuma District, KwaZulu, South Africa. Annals of the Transvaal Museum, 28(13), 229–246.

      • Dell’Unto, N. and Landeschi, G. (2022). Archaeological 3D GIS. London: Routledge.

      • Dibble, H. L., Aldeias, V., Goldberg, P., McPherron, S. P., Sandgathe, D., & Steele, T. E. (2015). A critical look at evidence from La Chapelle-aux-Saints supporting an intentional Neandertal burial. Journal of Archaeological Science, 53, 649–657.

      • Dirkmaat, D. C., & Cabo, L. L. (2016). Forensic archaeology and forensic taphonomy: basic considerations on how to properly process and interpret the outdoor forensic scene. Academic Forensic Pathology 6, 439–454.

      • Dirks, P. H., Berger, L. R., Roberts, E. M., Kramers, J. D., Hawks, J., Randolph-Quinney, P. S., Elliott, M., Musiba, C. M., Churchill, S. E., de Ruiter, D. J., Schmid, P., Backwell, L. R., Belyanin, G. A., Boshoff, P., Hunter, K. L., Feuerriegel, E. M., Gurtov, A., Harrison, J. du G., Hunter, R., … Tucker, S. (2015). Geological and taphonomic context for the new hominin species Homo naledi from the Dinaledi Chamber, South Africa. ELife, 4, e09561.

      • Dirks, P.H.G.M., Berger, L.R., Hawks, J., Randolph-Quinney, P.S., Backwell, L.R., and Roberts, E.M. (2016). Comment on “Deliberate body disposal by hominins in the Dinaledi Chamber, Cradle of Humankind, South Africa?” [J. Hum. Evol. 96 (2016) 145-148]. Journal of Human Evolution 96:  149-153.

      • Dirks, P. H., Roberts, E. M., Hilbert-Wolf, H., Kramers, J. D., Hawks, J., Dosseto, A., Duval, M., Elliott, M., Evans, M., Grün, R., Hellstrom, J., Herries, A. I., Joannes-Boyau, R., Makhubela, T. V., Placzek, C. J., Robbins, J., Spandler, C., Wiersma, J., Woodhead, J., & Berger, L. R. (2017). The age of Homo naledi and associated sediments in the Rising Star Cave, South Africa. ELife, 6, e24231.

      • Donnelly, S., C. Donnelly & E. Murphy (1999). The forgotten dead: The cíllíní and disused burial grounds of Ballintoy, County Antrim. Ulster Journal of Archaeology 58, 109-113.

      • Duday, H. (2005). L’archéothanatologie ou l’archéologie de la mort. In: O. Dutour, J.-J. Hublin and B. Vandermeersch (eds) Objets et Méthodes en Paléoanthropologie. Paris: Comité des Travaux Historiques et Scientifiques. pp. 153–215.

      • Duday, H. (2009). Archaeology of the Dead: Lectures in Archaeothanatology. Oxford: Oxbow Books.

      • Finley, N. (2000). Outside of life: Traditions of infant burial in Ireland from cillin to cist.  World Archaeology 31, 407-422.

      • Gargett, R. H. (1999). Middle Palaeolithic burial is not a dead issue: The view from Qafzeh, Saint-Césaire, Kebara, Amud, and Dederiyeh. Journal of Human Evolution, 37(1), 27–90.

      • Goldberg, P., Aldeias, V., Dibble, H., McPherron, S., Sandgathe, D., & Turq, A. (2017). Testing the Roc de Marsal Neandertal “Burial” with Geoarchaeology. Archaeological and Anthropological Sciences, 9(6), 1005–1015.

      • Gómez-Olivencia, A., & García-Martínez, D. (2019). New postcranial remains from the Roc de Marsal Neandertal child. PALEO. Revue d’archéologie Préhistorique, 30–1, 30–1.

      • Green, E.C. (2022). An archaeothanatological approach to the identification of late Anglo-Saxon burials in wooden containers. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 436-455.

      • Henderson, J. (1987). Factors determining the state of preservation of human remains. In A. Boddington, A. Garland and R. Janaway (eds). Death, Decay and Reconstruction: Approaches to Archaeology and Forensic Science. Manchester: Manchester University Press. pp 43-54.

      • Hunter, J. R. (2014). Human remains recovery: archaeological and forensic perspectives. In C. Smith (ed). Encyclopedia of Global Archaeology. New York: Springer New York. pp 3549-3556.

      • Hochrein, M. (2002). An Autopsy of the Grave: Recognizing, Collecting and Preserving Forensic Geotaphonomic Evidence. In M. H. Sorg and W. D. Haglund (eds). Advances in Forensic Taphonomy: Method, Theory and Archeological Perspectives. Boca Raton, FL, CRC Press: 45-70.

      • Knüsel, C.J. & Robb, J. (2016). Funerary taphonomy: An overview of goals and methods. Journal of Archaeological Science: Reports 10, 655-673.

      • Kuhn, B.F., Berger, L.R. & Skinner, J.D. (2010). Examining criteria for identifying and differentiating fossil faunal assemblages accumulated by hyenas and hominins using extant hyenid accumulations. International Journal of Osteoarchaeology 20, 15-35.

      • Lyman, R. (1994). Vertebrate Taphonomy. Cambridge, Cambridge University Press.

      • Martinón-Torres, M., d’Errico, F., Santos, E., Álvaro Gallo, A., Amano, N., Archer, W., Armitage, S. J., Arsuaga, J. L., Bermúdez de Castro, J. M., Blinkhorn, J., Crowther, A., Douka, K., Dubernet, S., Faulkner, P., Fernández-Colón, P., Kourampas, N., González García, J., Larreina, D., Le Bourdonnec, F.-X., … Petraglia, M. D. (2021). Earliest known human burial in Africa. Nature, 593(7857), 95–100.

      • Mickleburgh, H.L & Wescott, D.J. (2018). Controlled experimental observations on joint disarticulation and bone displacement of a human body in an open pit: implications for funerary archaeology. Journal of Archaeological Science: Reports 20: 158-167.

      • Mickleburgh, H.L., Wescott, D.J., Gluschitz, S. & Klinkenberg, V.M. (2022). Exploring the use of actualistic forensic taphonomy in the study of (forensic) archaeological human burials: An actualistic experimental research programme at the Forensic Anthropology Center at Texas State University (FACTS), San Marcos, Texas. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 542-562.

      • Owsley, D. & B. Compton (1997). Preservation in late 19th Century iron coffin burials. In W. Haglund and M. Sorg (eds). Forensic Taphonomy: The Postmortem Fate of Human Remains. Boca Raton, FL, CRC Press: 511-526.

      • Parker Pearson, M. (1999). The Archaeology of Death and Burial. College Station: Texas A&M University Press.

      • Pettitt, P. (2013). The Palaeolithic Origins of Human Burial. Routledge.

      • Pomeroy, E., Bennett, P., Hunt, C. O., Reynolds, T., Farr, L., Frouin, M., Holman, J., Lane, R., French, C., & Barker, G. (2020). New Neanderthal remains associated with the ‘flower burial’ at Shanidar Cave. Antiquity, 94(373), 11–26.

      • Randolph-Quinney, P.S. (2013). From the cradle to the grave: the bioarchaeology of Clonfad 3 and Ballykilmore 6. In N. Brady, P. Stevens and J. Channing (eds.). Settlement and Community in the Fir Tulach Kingdom. Dublin: National Roads Authority Press. pp A2.1-48.

      • Randolph-Quinney, P.S., Haines, S. and Kruger, A. (2018). The use of three-dimensional scanning and surface capture methods in recording forensic taphonomic traces: issues of technology, visualisation, and validation. In: W.J. M. Groen and P. M. Barone (eds). Multidisciplinary Approaches to Forensic Archaeology. Berlin: Springer International Publishing, pp. 115-130.

      • Rendu, W., Beauval, C., Crevecoeur, I., Bayle, P., Balzeau, A., Bismuth, T., Bourguignon, L., Delfour, G., Faivre, J.-P., Lacrampe-Cuyaubère, F., Tavormina, C., Todisco, D., Turq, A., & Maureille, B. (2014). Evidence supporting an intentional Neandertal burial at La Chapelle-aux-Saints. Proceedings of the National Academy of Sciences, 111(1), 81–86.

      • Sandgathe, D. M., Dibble, H. L., Goldberg, P., & McPherron, S. P. (2011). The Roc de Marsal Neandertal child: A reassessment of its status as a deliberate burial. Journal of Human Evolution, 61(3), 243–253.

      • Silver, M. (2016). Conservation Techniques in Cultural Heritage. In E. Stylianidis and F. Remondino (eds) 3D Recording, Documentation and Management of Cultural Heritage. Dunbeath: Whittles Publishing. pp 15-106.

      • Schotsmans, E.M.J., Georges-Zimmermann, P., Ueland, M. and Dent, B.B. (2022). From flesh to bone: Building bridges between taphonomy, archaeothanatology and forensic science for a better understanding of mortuary practices. In C.J. Knüsel and E.M.J. Schotsmans (eds.) The Routledge Handbook of Archaeothanatology. London: Routledge. pp 501-541.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Point-by-Point Response (author’s replies in plain text)


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Silao et al. make the intriguing observation that, unlike Candida albicans, yeasts that are generally considered less pathogenic are unable to catabolize proline. They then, in Candida albicans, construct mutants defective for the two key enzymes (Put1, Put2) required to convert proline to glutamate, which they show to be essential for proline utilization as an energy (carbon) and nitrogen source. The authors proceed to untangle the regulatory aspects of proline degradation, including the respective cellular localization of its key enzymes. They then make the important discovery that strains lacking either Put1 or Put2 suffer from a proline-dependent growth defect, which they attribute to resulting defects in mitochondrial metabolism.

      The manuscript then goes on to analyze a broad range of infection models, including a reconstituted human epithelial skin model, Drosophila, mouse systemic infections, organ colonization in these mice (kidney, spleen, brain, liver and histochemistry of the kidneys), as well as survival when incubated with cultured human neutrophils. Finally, they use yeast cells constitutively expressing yEmRFP (so that yeasts can be distinguished from other host cells) and coated with FITC before incubation with the host cells (which coats the wall of the original cells, but does not spread to progeny), and they go on to perform an impressive set of analyses of C. albicans growth within mouse kidneys both in vivo and ex vivo, exploiting an implanted window together with intravital imaging with a two-photon microscope at different time points. The system is impressive and visualizes tissue invasion by hyphal cells beautifully. Finally, they compare the intravital images from WT and put2-/- cells and show that, as in vitro, put2-/- cells do not form filaments and do not show extensive invasion of the kidney tissue. While the in vivo aspect of the study includes many different models, it finds defects in virulence for different subsets of put mutants, and the relative importance of filamentation vs proline utilization for virulence is not conclusively resolved.

      Overall, this is an important and timely manuscript, which significantly contributes to the understanding of how proline metabolism intersects with yeast fitness in the context of infections. However, there are several major concerns regarding some of the conclusions drawn from the study. In addition, some general recommendations that would improve the manuscript are provided.

      Specifically, the manuscript provides a very detailed description of experiments and observations. However, in several parts it is difficult to follow, and the reader needs more guidance about the logic involved in reaching conclusions. Specifically, several aspects of the paper are written for experts in Candida (yeast) metabolism. Here, explaining the rationale for some of the experiments, and providing more background information that is not obvious to a non-expert, is required.

      In particular, writing a clear and measured summary sentence at the end of each paragraph and a conclusion paragraph that summarizes key findings in simple terms would help make the manuscript more digestible for readers.

      In addition, the impressive microscopy and broad range of in vivo experiments are comprehensive but add only incremental information relevant to proline metabolism: that filamentous growth in vivo and virulence are reduced in cells carrying some mutations in one or more put genes. However, this broad sweep of model systems and the development of the in vivo imaging system might have more impact in a separate paper focused on the real-time in vivo visualization of kidney invasion.

      We thank Reviewer 1 for the extensive list of comments and have endeavored to adjust the manuscript to address all of the major and minor concerns. It is evident that Reviewer 1 clearly understood the significance of the work and we appreciate that the comments are presented in a positive manner intended to improve our manuscript.

      Major comments:

      1. The main finding that impressed this reviewer is that "removing the ability to catabolize proline, in an organism that evolved to catabolize it, leads to (growth) defects". This point could be better highlighted throughout the manuscript.

      Thanks for the comment. We will adjust the text to reflect this suggestion.

      2. The authors show that deletion strains for proline metabolism have defects that are important for in vivo pathogenicity. This is an important finding. However, as the manuscript reads now, it suggests that the main findings are that the ability to use proline in the respective host niche is key. Mechanistically, the manuscript revolves primarily around defects that arise when deleting PUT1 and/or PUT2 (i.e., an "unknown" toxicity of proline in the case of put1-/- (or put1-/- put2-/-) and the additional P5C-dependent toxicity for put2-/- mutants; see below).

      Yes, the reviewer is correct in that we believe that proline catabolism is necessary to initiate and power hyphal growth, which is coupled to virulence. We have previously shown that upon phagocytosis by macrophages, the expression of Put1, Put2 and even Gdh2 is induced in phagocytized C. albicans cells, which is consistent with the analysis shown in Fig. 2D and Fig. S2B. Consequently, proline, or an amino acid that is metabolized via the proline catabolic pathway, must be present in the phagosomal compartment. However, as we now report, proline inhibits growth of cells lacking the capacity to catabolize it. Although we cannot differentiate the cause of reduced virulence in put mutants, i.e., the lack of energy due to the inability to catabolize proline vs proline toxicity, proline catabolism is clearly important and a robust indicator of virulence. As with point 1, we have adjusted the text to make this clearer.

      3. In order to claim that catabolizing proline promotes pathogenicity (as opposed to the alternative hypothesis that the inability to catabolize proline leads to the observed defects), additional experiments would be required. For example, the put mutants would need to be compared with mutants that significantly reduce/impair proline uptake, such as the referenced gnp2 mutant (Garbe et al 2022). While the finding that less pathogenic yeast species are unable to catabolize proline is both intriguing and important, it remains, as presented, a loose, non-quantitative correlation that only tangentially addresses the question of whether "proline catabolism is key for pathogenicity".

      We have in fact already shown that proline uptake is required to induce filamentation (Martínez and Ljungdahl 2003, Fig. 6). The main point of our current work, which we believe is important and of general interest, is that C. albicans is adapted to use proline as a sole energy source, which reflects the environment (humans) in which it evolved. See the response to point 2. Interestingly, the differences in the expression levels of Put1 (off in the absence of proline, induced robustly by proline) and Put2 (low level of constitutive expression, induced robustly by proline) suggest that cells are primed to decrease the likelihood of becoming inhibited by P5C, i.e., the constitutive expression of Put2 is able to ameliorate the potential toxicity of P5C. Regardless, the finding that put1 and put2 mutants exhibit significantly reduced virulence in two host models provides clear support for proline catabolism being key for C. albicans pathogenicity.

      4. Line 238 onwards: The conclusion that "the primary growth inhibitory effect of proline is linked to catabolic intermediates formed by Put1 and that are metabolized further by Put2" does not appear to be fully supported by the evidence. Addition of proline to put1 mutants already reduced OD600 by ~50% (Figure 2), and it is further reduced to ~10% when put2 is deleted. This implies that there are two inhibitory effects of proline, not one primary one. At the least, this option should be discussed, including why deletion of PUT1 leads to proline toxicity. The latter is not clear: is it that too much proline accumulates in the cell and this accumulation is toxic? If this is the case, the effect would be expected to be proline concentration dependent. Performing a relatively simple experiment, as performed for the put2 mutant (Fig. 3 / S3F), may clarify this issue, particularly if the experiment were coupled with intracellular quantification of proline.

      Precisely! Proline toxicity is evident even in put1 mutants, clearly suggesting that proline, without being further catabolized, exerts a growth inhibitory effect (Fig. 3A). We traced this inhibitory effect to decreased mitochondrial respiration (Fig. 3E). There are two parameters to consider regarding the inhibitory effects of proline in put2 mutants. First, the presence of proline induces the expression of Put1 independent of Put2 (Fig. S2C); consequently, the levels of the toxic intermediate P5C increase (Fig. 3B). P5C has previously been postulated to inhibit mitochondrial respiration, which is well aligned with our analysis (Fig. 3E; see response to Point 5). We initially tested whether a proline-P5C cycle, suggested by work in mammalian cells, would play a role in proline-mediated toxicity; however, the expected increase in cytoplasmic pools of proline upon supplying high levels of glutamate (which, according to work in mammalian cells, should be efficiently converted to cytoplasmic proline) did not occur; we did not see glutamate-enhanced Put1 expression (Fig. 2D, S2A, S2B). We agree with the reviewer with respect to the suggested experiment, and have monitored growth of put1 in media with different proline concentrations. The results are incorporated in the revised Fig. 3.

      1. The caption "P5C mediates a respiratory block" is misleading, as the evidence is not that compelling: Although P5C increases in put2, but not in put1 mutants, and given that both single mutants experience a proline-dependent respiratory defect (Fig. 3E), the results suggest a more complex relationship.

      Previous work using pure P5C (Ref. 36; Nishimura et al.) showed that it targets respiration, hence the caption “respiratory block” in the header. In mammals, PRODH (Put1) physically interacts with mitochondrial respiratory complex II in the inner mitochondrial membrane (line 89-90), while P5CDH (Put2) is in the matrix. The put1 mutation might affect basal activity of the respiratory chain resulting in lowered respiration, which may compound when proline accumulates in the mitochondria. The inhibitory mechanism remains unknown, and going forward we have begun characterizing various GFP-tagged respiratory complex components in put1 mutants and in strains co-expressing Put1-RFP (for interaction studies). The results are outside the scope of the current work.

      1. The virulence assays and in vivo experiments do not present a unifying view: in Drosophila put2∆∆ is less virulent than put1∆∆, which appears similar to put3∆∆. Given that put2 mutants grow slowly, likely because of P5C inhibition, this seems logical. However, in mice, put3∆∆ remains highly virulent while put1∆∆ and put2∆∆ results for survival are mixed. Furthermore, in 4 mouse organs, put1∆∆ and put2∆∆ are not significantly different from one another but are different from wt, while put3∆∆ has no significant reduction in CFU. Kidney histology shows very little invasion by put1 and put2 and more by put3, but visually put3 appears to invade much less than the WT, and the human neutrophil experiment shows effects of put2 or put3 but not put1. This leaves the reader rather confused. It may be worth discussing the reasons for different results in different models. Is the availability of proline in each of the organisms and organs similar?

      We thank the reviewer for these thoughtful observations; however, we note that all of the diverse assay systems employed provide a clear and consistent indication that the inability to completely catabolize proline significantly reduces virulence. This is well-aligned with our previous data regarding the need for proline catabolism to escape macrophages (Silao et al., 2019). The requirement for Put3 may not be very strict since the Put enzymes are still expressed in the absence of Put3 (Fig. 2D/S2A/S2B), indicating the activity of additional regulatory factors; hence, this may explain why the put3 strain behaves like wildtype in the murine model (Fig. 5B). The dispensability of Put3 in the murine model could be due to a lower neutrophil count and to murine neutrophils exhibiting a lower affinity for fungal cells as compared to human blood (Machata et al., 2020, Front Immunol). The more pronounced requirement of Put3 to survive in whole human blood and when co-cultured with human neutrophils could indeed be linked to the need to rapidly derepress PUT1/PUT2 (and even other target genes) as suggested by the global RNASeq analysis that shows that proline catabolism is a core response of C. albicans during neutrophil interaction (Niemiec MJ et al., 2017, BMC Genomics). In Drosophila, a well-established model to study innate immunity, the presence of hemocytes that fulfill the equivalent functions of neutrophils and macrophages could explain the increased requirement for Put3. In summary, although it is impossible to know the precise mechanistic basis underlying the observed differences, we believe it unreasonable to expect that all mutations behave identically in each virulence model. In fact, differences considered trivial, such as the choice of mouse background, can have profound effects on virulence. Presumably the differences we report are due to the specific nutrient composition (proline and metabolites feeding into the proline catabolic network) and physical parameters intrinsic to each model. For instance, Lionakis et al. (2013) suggested that filamentation occurs faster in the kidney compared to other organs, such as the liver/spleen, indicating the presence of kidney-specific cues that drive infections of this organ.

      1. The ex vivo and in vivo analysis of the dynamics of C. albicans growth in the host is visually impressive, but it distracts from the focus of the paper and the metabolic findings. Showing that put mutant cells do not form filaments in vivo (as in vitro) does not add much conceptually to the paper. Furthermore, this lovely advance in in vivo visualization is lost at the end of this paper and the authors should consider whether it might fit better in a manuscript that could really highlight the in vivo visualization approach.

      We appreciate this comment. Indeed, our lab is at an advanced stage of completing a manuscript focused on the use of intravital and clearing microscopy to follow the onset of an upper urinary tract infection (UTI) in a murine candidemia model. However, our ability to visualize in 3D the onset of an infection in a living host is not a trivial achievement and we were impressed that it provided a clear answer as to whether a single C. albicans cell can initiate an infection and undergo morphogenesis leading to hyphal growth. Furthermore, we tested a put2 strain, the growth of which is highly sensitive to the presence of proline, and found that it did not exhibit filamentous growth. This clearly shows that cells colonizing the kidney are exposed to an environment that requires a functional proline catabolic network to exhibit filamentous growth, a characteristic of renal infections. Our results are consistent with the kidney being a metabolic hub for arginine/proline biosynthesis, which likely increases the levels of these amino acids in this organ.

      1. The discussion of cells stained with FITC and expressing yEmRFP does not clearly point out that the FITC is only an indicator for those cells that were used to inoculate the tissue and that finding cells without FITC indicates that they are mitotic progeny, indicating that they have been dividing. The authors clearly understand this, but a naive reader may miss this important point if it is not stated explicitly.

      We have adjusted the text to explicitly clarify this.

      Minor comments:

      1. Throughout: what is the distinction between utilization of proline for C or for energy? These terms seem to be used interchangeably.

      C. albicans is a heterotroph that can use proline to generate biomass (gluconeogenesis, etc.), and its catabolism generates sufficient amounts of ATP to power growth. Thus, when proline is used as the sole carbon source, it also serves as the sole energy source. In the text, we have tried to be consistent, using “carbon source” when discussing proline as a component of growth media, and “energy source” when discussing proline catabolism.

      1. Introducing the schematic in Fig. 2A at the beginning of Figure 1, would help explain proline catabolism before delving into the growth experiments that rely upon this framework. This should include an explanation, for readers less familiar with the metabolic issues, of the main limitations to catabolizing proline, and the key issues for being able to use proline for nitrogen, carbon, and energy (potentially indicated in the overview figure, e.g. pointing towards gluconeogenesis etc.).

      We have considered the reviewer's suggestion; however, we believe that the placement of the schematic in Fig. 2 is appropriate as is, where it will hopefully enable readers to more readily grasp the strain construction and experiments documented in Fig. 2.

      1. Saccharomyces can only grow on proline as a nitrogen source, but not as energy/carbon source. Could the authors briefly mention or discuss why this is the case? This is not clearly apparent after reading the manuscript and it leaves the reader confused and trying to understand if the fact that proline is required for carbon utilization is a new finding of this paper or was already known. Do the authors think this is tied to the presence of complex 1 components in C. albicans that are not found in S. cerevisiae? Is this consistent for the pathogenic, but not the non-pathogenic yeasts analyzed in figure 1?

      We have adjusted the text to clarify our thoughts regarding this. Indeed, we do believe that a major reason for the ability of C. albicans to efficiently grow using proline as a sole energy source is the presence of Complex I. However, C. glabrata appears to be able to grow well using proline as sole energy source despite apparently lacking Complex I. Consequently, alternative NADH dehydrogenases must exist in C. glabrata, but how this is coupled to energy metabolism will require additional work that is beyond the scope of the present work.

      1. 100: While Gdh2 is apparently an important enzyme for generating ammonium, why is it not necessary for macrophage escape and virulence as shown in reference 18? A recent paper from Garbe et al (ref 12) suggests that Gnp2 is the major proline permease in C. albicans and what is known, and not known, about proline uptake would be good to mention, given that PUT gene functions require that proline enters the cells.

      We have recently shown that ammonia generation by Gdh2 is dispensable for macrophage escape and documented that phagosome alkalinization is not a requisite for the induction of hyphal growth (Silao et al. 2020). We have referred to the work of Garbe et al., which is consistent with our previous work (Martínez and Ljungdahl, 2004) where we reported that proline-dependent filamentation is dependent on Csh3. Csh3 is an ER membrane-localized chaperone responsible for catalyzing the proper folding of amino acid permeases; in csh3 null mutant strains, amino acid permeases accumulate in the ER as non-functional unfolded aggregates. Consistently, we have tested and found that proline-induced Put2-GFP expression is dependent on Csh3 (unpublished), clearly establishing that the regulatory effects of proline are dependent on its uptake. We have not generated a gnp2-/- strain, but suspect that we could find growth conditions where such a mutant would be refractory to proline induction. We have adjusted the text to include this information.

      1. 116: Is the "low sugar environment of the host" referring to a specific niche, such as the GI tract, or human blood? Compared to most natural environments, glucose is abundant in the host, e.g., at ~5 mM, it is the most abundant metabolite in blood, and similarly, in the GI tract, levels can go beyond 50 mM glucose (see e.g. PMIDs 34371983, 21359215). Or is this comment indicating that the in vivo sugar concentration is lower than that in common lab growth media? Please spell out the niche/concentration for clarification - and compare that to other niches that are considered "high sugar environments".

      We have adjusted the text to clarify our statement. The natural environment of C. albicans is the human host. Virulent infections do not occur within the GI tract, with its high sugar content, but rather result when C. albicans cells successfully cross into the blood, which has a relatively low glucose level (5 mM) that, importantly, does not effectively repress mitochondrial function. A major point of our recent work is that laboratory experiments with C. albicans growing on YPD or SD with 2% glucose (111 mM) examine growth of cells with repressed mitochondrial functions.
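
For reference, the conversion behind the "2% glucose (111 mM)" figure can be sketched as follows. This is only an illustration: it assumes that 2% means 2 g per 100 mL (w/v) and uses a molar mass of 180.16 g/mol for anhydrous glucose; the function name is hypothetical and not part of the manuscript.

```python
# Illustrative conversion of a % (w/v) glucose concentration to millimolar.
# Assumptions (not from the manuscript): % means g per 100 mL, and the
# molar mass of anhydrous glucose (C6H12O6) is 180.16 g/mol.

GLUCOSE_MOLAR_MASS_G_PER_MOL = 180.16


def percent_wv_to_millimolar(percent_wv: float,
                             molar_mass: float = GLUCOSE_MOLAR_MASS_G_PER_MOL) -> float:
    """Convert a % (w/v) concentration to mM."""
    grams_per_litre = percent_wv * 10.0           # 1% w/v = 10 g/L
    mol_per_litre = grams_per_litre / molar_mass  # g/L divided by g/mol
    return mol_per_litre * 1000.0                 # mol/L to mmol/L


if __name__ == "__main__":
    for pct in (2.0, 0.2):
        print(f"{pct}% glucose is about {percent_wv_to_millimolar(pct):.1f} mM")
```

Under these assumptions, standard 2% glucose media sit roughly 20-fold above the ~5 mM blood level quoted above, while 0.2% glucose is within about 2-fold of it.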

      1. 123: "proline as sole energy source" - suggest "is the source of carbon, nitrogen, and energy"

      The text is adjusted (see response to Minor Point 1).

      1. 142: it is worth noting to readers that C. neoformans is a basidiomycete and thus VERY distant from the other yeasts studied here-it is in a different major phylum of fungi.

      Again, thanks for this suggestion, the text is adjusted. We included C. neoformans since the role of proline catabolism has been characterized and linked to its pathogenicity (reviewed in Christgen and Becker, 2018, Antioxi Redox Signal, Ref. 1).

      1. 143: Here it is implied that put1 and put2 mutant strains do not grow on SPD, but this is not stated explicitly.

      The put1 and put2 mutants are unable to grow in/on all media containing proline as sole nitrogen source. The phenotype is so tight that we were able to exploit it as a selection phenotype for reconstitution (Fig. 1A). We have adjusted the text to make this clear.

      1. 151: The abbreviation SPG is not explained in main text. This was explained in the methods (1% glycerol as primary carbon source).

      As suggested, we have defined SPG in the main text.

      1. Paragraph 156 onwards: this section is particularly hard to read and very dense. Also, it is difficult to understand the significance of these experiments for the overall findings of the paper. Please at least provide a small conclusion / summary at the end of the paragraph that puts the findings into perspective.

      We have adjusted the text to make it more accessible.

      1. Figure 2 C: simplifying the scheme (e.g. lots of redundant information, P2 and Mito - just give it one name) would help. This figure may be better in the supplementary material.

      The schematic of our subcellular fractionation study uses standard designations routinely used by the cell biology community. We believe that its inclusion will help readers judge how we mapped the intracellular localization of the reporter proteins, which is essential to understand the proline catabolic network.

      1. Figure 2B: It is not directly apparent from the micrographs that Put1-RFP localisation is mitochondrial. Co-localisation of the RFP with a mitochondrial dye (e.g., mitotracker) or something similar is required to validate it.

      We have previously reported that Put2 is a bona fide mitochondrial protein (by confocal microscopy, subcellular fractionation, and co-localization with Mitotracker (Far Red); Silao et al., Ref 17). The fact that the Put1-RFP associated fluorescence exhibits a distinct mitochondrial signature, is spatially exclusive and exhibits no overlap with the cytosolic pattern of Gdh2-GFP, and co-fractionates with Put2-HA and the mitochondrial marker Atp1 should suffice to confirm that Put1-RFP is a mitochondrial localized protein.

      1. Throughout the manuscript (figure legends): Suggest using "mean" instead of "Ave."

      We have adjusted the legends.

      1. 175: According to the 'Yeasttract' and 'Pathoyeasttract' databases, Put1 regulates at least 36 and 22 genes, in S. cerev. and C. alb., respectively (based on DNA binding and/or regulatory changes). The only gene in common between these two lists of genes is PUT1. Thus, it is quite likely that Put3 regulates many other processes that explain its function and that its major function may not be only to regulate Put1.

      We assume that the reviewer is referring to Put3 (instead of Put1). Yes, Tebung et al. (2017) suggested that Put3 also regulates other genes. However, their data show that C. albicans put3 mutant was unable to grow in medium (YCB+Pro) compared to SPD (2% glucose as carbon source) where proline is used merely as a nitrogen source (Tebung et al., Fig. 3A). Our data in Fig. 1C shows that a put3 null strain exhibits residual growth on SPD, which aligns well with the expressed levels of PUT enzymes (Fig. 2D). Our conclusion is that despite being essential for rapid proline-dependent derepression of proline catabolic genes, Put3 is not the only transcription factor operating at the promoters of the PUT genes.

      1. 175: Is it clear whether the Put3-independent mechanisms are positive or negative with respect to Put1?

      We have accumulated evidence that an additional transcription factor positively regulates PUT1 expression and have a manuscript in preparation to describe this factor. The manuscript will focus on the Put3-independent regulation of PUT1, PUT2, and GDH2 expression.

      1. 218: Suggestion: "growth was indistinguishable". Unless growth curves or growth rates are provided, and if one time-point data are the basis for this point, then "rates" is not a relevant term.

      The reviewer is correct; we will adjust the text accordingly. We have performed growth assays in a multi-well microplate format (Bioscreen) and found that the growth rates are not statistically different between WT, put1, put2, and put1 put2 strains in the presence and absence of proline in SD with 2% glucose. This is consistent with glucose repression of mitochondrial function, i.e., proline toxicity depends on derepression of mitochondrial function.

      1. 256 onwards: did the authors test if the ROS scavenging effectively reduced ROS? i.e. does the luminol-HRP assay yield less ROS in +proline +scavenger treatment? This is necessary to effectively conclude that the growth inhibitory effect of proline is due to blocking respiration.

      Indeed, we used NAC as a control in the luminol-HRP system and we saw reduction in ROS formation. In fact, this is the underlying reason why we used high levels of NAC for growth rescue (in Fig. 3D). We include the control data as Fig S3F.

      1. The Figure captions are extremely lengthy and detailed, making it cumbersome to find the relevant information. Suggest moving some of the information, such as additional experimental details, into the methods section.

      We have streamlined the figure legends.

      1. 277-301: Phloxine is not exclusively a live/dead cell indicator-it is an indicator of metabolic activity. In Scerev. and Calb. it also indicates slower growth, opaque growth, and it has been used as an indicator of aneuploidy in C. glabrata (https://journals.asm.org/doi/10.1128/msphere.00260-22) and of diploids vs haploids in S. pombe. The colonies illustrated are made up of many live cells, and thus the section "Defective proline utilization is linked to cell death" needs to be presented more carefully. In addition, it appears that this section shifts from using defined medium to using rich medium and 37C instead of 30C. Why was this shift necessary?

      The reviewer is correct that phloxine (PXB) has been used to identify opaque growth (EFG1-dependent). However, the fact that the accumulation of PXB in the put mutants is evident in both SC5314 and cph1 efg1 backgrounds (Fig. 3G and Fig. S4C) suggests that we are not assaying opaque switching. We mention that we have observed an increase in the number of PI+ cells in put mutants under similar conditions, but as we pointed out, we were unable to reliably quantitate this by FACS due to the clumping of put mutants. Zheng et al. 2022, the paper cited by the reviewer, used PXB to assess the ploidy of C. glabrata strains, but their assay was developed using 5 μg/ml PXB, half of the concentration we used. The homogeneous accumulation of PXB as the macrocolonies grow (Fig. 3G) suggests that the accumulation is not a consequence of spontaneously occurring ploidy variations. Thus, we believe that the accumulation of PXB does indeed reflect enhanced cell death. The point here is to trace the consequences of proline toxicity and to test the dependency on mitochondrial function. We used complex media, which contains multiple nitrogen sources (amino acids, peptides), to specifically highlight the contribution of proline catabolism in the fitness of C. albicans. The put1, put2 and put1 put2 mutants grow normally on YPD+PXB (30 °C) without accumulating the dye; we only observed visible PXB uptake in put2 after 2-3 days in mature macrocolonies. We attribute the gradual increase in PXB accumulation to glucose becoming limiting, derepressing mitochondrial functions, a requisite for proline toxicity. Consistently, the accumulation is more evident in cells grown on non-fermentable C-sources (Fig. 3G and Fig S4C).

      1. 295-301: Related to the point above, these results are hard to interpret due to the switch from defined medium in all prior experiments to rich growth medium here. Also, it is not clear why a 48h old YPD culture was chosen to show that the degree of PI staining correlates with mitochondrial activity - is this due to the culture age? It would be clearer to image cells grown on glucose vs. glycerol/lactate, or under repressive / de-repressive glucose concentrations (e.g., as shown in Fig. S4C where a PI+ difference is apparent for 0.2% glucose vs. 2% glucose at 30 °C).

      See response to Point 19 for our rationale to switch to rich medium. We have adjusted the text to enhance its readability. In liquid YPD, all strains grow, however, we noticed that the put mutants tend to flocculate (sign of stress in yeast) when cells enter stationary phase, giving rise to erratic OD readings, particularly evident in the put1 mutant. At 48h, the cultures become dense and cells experience glucose limitation, derepress mitochondrial functions and exhibit maximal flocculation (Fig. S4D). In put mutants, the derepression of mitochondrial function results in proline sensitivity. We tested the notion that this would also increase cell death, which it does, see Fig. S4E.

      1. 313-14: The statement 'the invasion process was dependent on the ability of cells to catabolize proline' doesn't take into account that put mutant cells are defective in filamentous growth irrespective of their utilization of proline...and like the efg1 cph1 double mutant.

      Proline-induced filamentous growth is dependent on the catabolism of proline, which activates Efg1 and consequently the hyphal growth program. In Fig. 4A we show that put mutants grown on Spider media initiate filamentation (as evidenced by wrinkled colonies) but do not grow invasively (no halo). In Fig. 4B we developed and used a novel invasion assay to assess growth through a collagen plug. Similar to the control cph1 efg1 mutant, the put mutants exhibit drastically reduced capacity to penetrate through the plug and reach the D10 media in the transwell (D10 = DMEM with 10% FBS). However, it is important to note that these results are linked to two distinct processes: the filamentation defect of cph1 efg1 is due to the inability to respond to multiple filamentation cues (e.g., CO2, 10% FBS, etc.), whereas the filamentation defect of the put mutants is linked to the inability to catabolize proline and to its toxicity. Clearly, the WT strain relies on proline catabolism, with the proline coming from one of three possible sources (see response to Reviewer 3): 1) DMEM/F-12 medium used in the PureCol EZ Gel; 2) diffusion of nutrients up through the collagen from the recovery medium DMEM supplemented with 10% FBS; and 3) the proteolytic breakdown of collagen. Also, in contrast to the put mutants, WT cells are refractory to inhibition by proline.

      1. 316-327: The results of the experiment described can only be interpreted as an effect of proline catabolism if the three strains (efg1 cph1; put1; put2) have similar growth rates as yeast cells in vitro. Why weren't the cells competed directly (efg1 cph1 vs put cells)?

      We believe that the relevant comparisons are to WT. We recovered cells from the top of the collagen (see Fig. 4B inset) to monitor their ability to survive and grow on top of the collagen. We found that the ability to catabolize proline enables WT and cph1 efg1 cells to grow equally well (recovered similar ratio as starting input). This was not the case with the put mutants; they did not grow as well and almost 100% of the cells recovered were WT.

      23. Fig 6: The logical order of the experiments, and in the text, is: 1) 4 h window, 2) 26 h window and then 3) ex vivo. The cartoon in 6B should be in this order as well.

      Thanks for bringing this issue up. We have adjusted the figure and text placing the schematic time-lines in proper order.

      1. 337: it is not clear what the 'direct exposure...' is trying to tell us. Can this be made more explicit?

      The direct exposure means that the fungal cells are in contact with the culture media at the edges/border of the 3D skin model (see schematic diagram). Hence, fungal cells are in direct contact with 10% FBS, facilitating the observed filamentous growth. The inability of the put mutants to invade the skin model should be evaluated at the center of the artificial epithelium where there is likely a local increased concentration of proline stemming from the proteolytic activities associated with fibroblasts and keratinocytes.

      1. 340-346: Here proteins with high proline content were used to ask if they could induce transcription of PUT1 or PUT2 RNA and protein. This experiment is designed only to test the role of these proteins to induce utilization of nitrogen, as glucose is included in the medium. Given that these proline-rich proteins need to be lysed by proteases before they can be imported, and since no import pathways were tested, the results appear to tell us that mucin is more readily digested to peptides that contain proline-but why that is the case is not clear and how it relates to proline utilization is also not clear.

      We thank the reviewer for raising this important point. First, we monitored protein not mRNA levels. We will adjust the text to provide better context for this experiment. Briefly, these experiments were initiated as we were perplexed as to why the wildtype cells took such a long time (14 days) to fully invade the collagen matrix (Fig. 4B); we naïvely assumed that fungal cells would secrete proteases to degrade the collagen and assimilate the liberated proline. Going forward, our experimental strategy was to incubate various proteins with a dense culture of cells in HBSS medium (pH 7.4) supplemented with low glucose (3.8 mM) and lactate (0.83 mM). This condition mimics interstitial fluid, where most broad range proteolytic enzymes are inactive or at least operating suboptimally. The results were clear; with the exception of mucin, the proteins did not stimulate Put1 or Put2 expression. We conclude that host-dependent processes play an important role in the release of the amino acids/peptides from these high-proline content proteins (see line 531-553 for discussion). The capacity of mucin to efficiently induce Put1 expression is interesting since mucin is abundant in the gut where systemic infections are thought to originate. It is important to be cautious here; we used a commercial mucin preparation (Sigma, 2 batches) that may contain degradation products, e.g., proline-rich peptides, that can easily be assimilated by C. albicans. Put1 expression is an excellent readout for proline uptake since its expression responds tightly to the presence of proline derived from exogenous supply or from intracellular conversion (Fig. 2D, S2A, S2B).

      1. 363-369 An alternative is that Put3 induces different proteins important for growth.

      We included this possibility in the revised text.

      1. 379-380-the conclusion for this paragraph is somewhat of an overstatement as there is no analysis of the degree to which proline utilization is a predictor of virulence. It simply shows that put mutants affect the ability to survive in neutrophils.

      We have adjusted the text.

      1. Discussion: The statement that "S. cerevisiae" evolved in high sugar environments is debatable. The natural niche could well be forest soil and tree bark, or insect/wasp guts with arguably little glucose around.

      The reviewer is correct; S. cerevisiae can be isolated from diverse environments with variable sugar contents, but it is the capacity to deal with high sugar environments that makes this yeast stand out in comparison to Candida spp. The unique attributes of S. cerevisiae have been exploited and have truly benefited humankind in making alcohol and bread. We have amended the text to state this more accurately.

      1. 469-470-how strong is the 'correlation' between the ability to utilize proline and virulence? Given that different mutants had different effects in different models, this seems like a very loose 'correlation'; it would be good to have some quantitative measures to make this claim.

      We have used directed genetic approaches to determine whether a gene/protein is essential for virulence by testing them in currently available infection models. It is important to note that all virulence assays provided a consistent and clear read-out, namely that the inability to catabolize proline significantly reduced the expression of virulence characteristics. Presumably the differences we report are due to the specific nutrient composition (proline and metabolites feeding into the proline catabolic network) and physical parameters intrinsic to each model. In fact, the expression of virulence factors (i.e., hyphal growth) can significantly differ in different organs within the same mouse model (Lionakis et al., 2013), and virulence outcomes can change depending on mouse background. We fail to see how this can be viewed as loose. This has not been shown before. Please refer to our response to major point 6.

      1. 500: Was the experiment done in larvae, and not in adult Drosophila? Fig 5 legend says flies and shows a picture of a fly and larvae are only mentioned much later in the text.

      These experiments were performed using adult flies. We now include a reference regarding the levels of arginine in hemolymph in both larvae and adult Drosophila (Priyankage et al., 2012; Anal Chem).

      1. 512:Why is it presumed that proline accumulates in the mitochondria in put1 mutants? How strong is the presumption?

      Despite a great deal of effort in many labs, the mechanism of proline transport across the mitochondrial membrane is not known. What has been shown in mammalian and plant systems is that proline can readily enter and accumulate in mitochondria where it is catabolized (https://link.springer.com/article/10.1007/s00425-005-0166-z; https://www.sciencedirect.com/science/article/pii/0003986177902089). Our presumption that proline accumulates in the mitochondria is based on our finding that proline inhibits mitochondrial respiration when Put1, catalyzing the first oxidation reaction, is absent.

      1. 539: why are MMPs important for digestion of collagen? This is not clear at this point of the Discussion.

      In mammalian cells, some secreted MMPs have collagenase activity (e.g., MMP-1) and degrade proteins comprising the extracellular matrix, which releases proline. We emphasize this since the 3D skin model is comprised of dermal fibroblasts and keratinocytes that are known to secrete MMPs (Ref. 69).

      1. 574: Concluding sentence of this paragraph seems unsubstantiated. There are at least two defects in put2 strains-hyphal growth and growth in general, presumably because of P5C accumulation.

      See response to point 21. Proline-induced filamentous growth is dependent on its catabolism, which activates Efg1 and consequently the hyphal growth program. However, there are many potential cues in hosts that could induce hyphal growth in situ. Our finding that strains unable to catabolize proline do not filament, indicates that proline is a key modulator of virulence.

      1. Fewer abbreviations would make the manuscript easier for non-experts to read. For example, P5C is not defined in the abstract. Furthermore, if an abbreviation is not used more than 3 times, it is not necessary to provide it (e.g., mammalian proteins in the last paragraph).

      We have adjusted the text.

      typos:

      1. 82: should read 'is restricted to the mitoch...'

      2. 102-103: should read 'to evade macrophages'

      3. Fig. S4F is mislabelled as Fig. S4G.

      Thanks!

      **Referees cross-commenting**

      Overall, we stand by our initial assessment of the study. However, we were not aware of previous studies that investigated proline utilization in yeasts, as noted by Rev # 2 (https://onlinelibrary.wiley.com/doi/epdf/10.1002/yea.1845). The current study suggests that using proline as an energy/carbon source is more wide-spread, beyond pathogenic yeasts. Further, the C. albicans strain they used for this study (ATCC 10231) was apparently unable to grow on proline in the quoted paper. In light of this, we think the authors should reference this study, tone down the claims about the clear correlation of pathogenicity and proline utilization, and address this apparent discrepancy with the indicated Candida albicans isolate. We note that our review considered this a paper mostly of interest to specialists.

      Although other non-pathogenic fungi have been shown to use proline as pointed out by Reviewer 2, this metabolic attribute has not been previously tested in members of the pathogenic Candida spp. complex. We have included the reference and included a statement that many fungi, isolated from diverse environmental niches, can use proline as a carbon source.

      Reviewer #1 (Significance (Required)):

      1. The advance in this paper is conceptual for the proline utilization connection to virulence in a range of species and technical for the in vivo microscopy. Limitations are that the conceptual advance is based only on qualitative work in figure 1 and that the animal studies do not provide a conceptual advance, although the technical advance of in vivo visualization of kidney tissue is impressive and (to the knowledge of this reviewer) quite new as the only prior work was in mouse ears.

      In response to the reviewer’s comment regarding Fig. 1, although it is qualitative, it is very reproducible. We even tried several clinical isolates of S. cerevisiae and observed behavior consistent with that of the standard laboratory strains (i.e., they do not grow on SP medium where proline is used as sole carbon/nitrogen/energy source). We tried to quantify growth of all strains in liquid SP medium at 30 °C using a TECAN microplate reader, but the readings were very erratic among species (and replicates) as each behaves differently; C. tropicalis, C. krusei, and C. parapsilosis form pseudohyphae and clump readily, while C. albicans forms hyphae and pseudohyphae.

      2. The work fits well as an extension of the body of work from the corresponding author's lab with additions from the labs with expertise in models of infection.

      1. People interested in yeast metabolism and pathogenic yeast virulence will be the audience for this paper and as written it is for a specialized audience interested in pathogenic yeast metabolism and, perhaps, (although not mentioned at all in the text) for those who want to try PUT gene products as new drug targets.

      This was actually mentioned in the last paragraph of the discussion (line 581-582).

      1. Reviewer expertise is in pathogenic yeast biology and yeast metabolism. Little expertise in high tech microscopy.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The study is part of the continuous work by the authors to dissect the mechanism of utilization of proline as a carbon source in Candida spp. In particular, this work shows that the inability to process proline leads to accumulation of the toxic intermediate P5C and subsequent inhibition of mitochondrial respiration and toxic effect on the cells. Furthermore, the study demonstrates that proline utilization is important for C. albicans kidney colonization. The experiments are meticulously designed and the study adds to the overall understanding of the metabolic utilization of proline as a carbon source and its potential relevance for infection.

      I find this work interesting, but the role of Put1 and Put2 in proline utilization is not particularly novel. The novelty here is the subcellular localization of the two proteins. Also, the importance of proline utilization for infection is unclear. The host-pathogen interaction assays are ambiguous as each assay gives a different result. Lastly, the authors try to generalize the importance of the use of proline as an energy source by other Candida spp. This is not very surprising, given that it has been reported previously by others (example DOI: 10.1002/yea.1845) and that many species that are pathogenic or closely related to C. albicans use various amino acids, not only proline, as a carbon source.

      Yes, like reviewer 2, we are not surprised that many of the pathogenic members of the Candida spp. complex are able to use proline, but this needed to be checked. The fact that proline can be used as a sole carbon/nitrogen/energy source clearly sets them apart from the paradigm yeast S. cerevisiae. A major question is what amino acids are important in the context of the host? To assess this, we have used mutations that specifically block proline utilization. Our past studies demonstrating that proline catabolism is rapidly activated in C. albicans cells phagocytized by macrophages indicate that proline is present in the phagosomal compartment. Furthermore, put mutations clearly affect virulence in flies and murine systems. We are at a loss to understand why the reviewer believes that our data, which consistently show that proline catabolism is important, are ambiguous.

      The expectation that all three mutant strains, i.e., put1, put2 and put3, would behave identically in the different infection models reflects an unnuanced view of how infection works. In fact, differences considered trivial, such as the choice of mouse background, can have profound effects on virulence. Consequently, it is striking how the diverse infection models consistently and unequivocally demonstrate that proline catabolism affects virulence. Also, it should be appreciated that we are not testing mutations affecting proteins with many overlapping functions, where it may be appropriate to challenge claims as to their direct role in virulence. Here we tested mutants that lack the enzymes that catalyze proline utilization. A more reasonable expectation is that the virulence is commensurate with the specific nutrient composition of model systems (as asked by reviewer #1), which can fluctuate among models (see our response to the major comment 6 of reviewer 1). As it is not practical to precisely test the proline levels in the models, we have worked to identify and focus on critical phenotypes that can be analyzed in vitro. Our findings provide the basis for understanding the virulence and growth properties of the mutants in the context of the complex infection models.

      Moreover, the authors take C. albicans as an example to demonstrate the role of PUT in invasion and infection. Proline is a known stimulus for hyphal growth in this species, but many other Candida spp., including C. auris, do not filament. So how, aside from supporting growth, is proline linked to infection in these species? I think the authors oversell the importance of proline in Candida spp. pathogenesis and should tone this part down or remove it completely. A new story that validates the importance of PUT in non-albicans species can bring clarity to why and where proline is critical for survival and infection.

      The fact that proline supports growth in the host environment is one of the critical aspects of our work. The lack of appreciation for this finding represents a common misconception in infection biology. It is not just the ability to gain access to a host and initiate an infection that counts; it is equally important to sustain growth and to thrive within the host. Thus, the adaptation to the host environment is critical. Here we document that proline catabolism not only initiates but also sustains an infection, acting as a critical carbon/energy source. The inability of the put1 and put2 mutants, which are sensitive to proline, to grow and infect multiple models clearly suggests that a substantial quantity of proline is accessible. Also, we have constructed C. glabrata (Fig. S1C) and C. auris (not shown) strains that lack the ability to catabolize proline, and are currently characterizing the virulence properties of these strains. This is out of the scope of the present study.

      Major comments: I am not convinced by the data that proline is important to initiate infection. Candida infections of the kidney occur only at late stages of sepsis. The authors need more compelling data to prove that proline is important for infection in the host.

      Again, we are not sure why there is such skepticism here. Regardless of whether kidney infections occur late, the fact that, in contrast to WT, we do not observe put mutants filamenting clearly suggests that the capacity to catabolize proline plays a role in the expression of virulence characteristics of C. albicans. Based on our findings using IVM, which provides 3D information, we can at least conclude that a single isolated C. albicans cell can initiate hyphal growth, initiating a point of infection. In addition, our newly added whole human blood data suggest that proline catabolism is required for survival in the blood; human blood contains high amounts of proline, arginine, and ornithine that are all catabolized via the proline catabolic network.

      Minor comments: I find the manuscript difficult to read and the discussion part is overly long. Some streamlining and adding a bit more explanation for the rationale of each experiment will make the work easier to follow. Some language/style needs refining as well.

      We have attempted to take this critique into account during the revision of the manuscript and have streamlined the text and added explanations regarding the rationale underlying our experimental approaches.

      **Referees cross-commenting**

      In this manuscript, the authors clarify the cellular compartmentalization of steps in proline catabolism. However, it is not novel that proline is a valuable carbon source. The role of proline utilization in the establishment or progression of infection remains ambiguous even after the authors provide different in vivo results. The overall significance of the study is limited.

      Please refer to our comments below. We do not understand why the reviewers apparently question the obvious role of proline utilization in facilitating virulence.

      Reviewer #2 (Significance (Required)):

      The strengths of this study are in the experimental design and variety. The data is well presented and visualized. The limitations are as pointed out above: I find it especially difficult to figure out where, in a real infection scenario (e.g. breach of the gut barrier and entry into the bloodstream), proline will be the primary energy source. To me the significance of this work is minor.

      C. albicans is the primary human fungal pathogen placed under the “Critical Priority Group” by WHO and yet our understanding of nutrient assimilation in this fungal pathogen is only a fraction of what is known in the model yeast S. cerevisiae, which has proven not to be the best paradigm for understanding the regulatory circuits operating in human fungal pathogens. This manuscript, as well as other recent publications, has revisited and corrected earlier assumptions regarding C. albicans growth, providing novel information that reflects important regulatory differences specifically relevant to the life of C. albicans in the host. For example, had it not been for the recent findings (Ref. 10, 18, 31) that show that proline utilization in C. albicans is not subject to nitrogen catabolite repression (NCR) and that glucose represses mitochondrial function, the perception in the field would remain that C. albicans cannot utilize proline as a carbon and/or nitrogen source in the presence of a “preferred” source of nitrogen, which is applicable in the blood that contains high concentrations of possible sources of carbon and nitrogen. Furthermore, the low but constitutive expression of Put2 and the tight, highly responsive Put1 expression in response to proline (Fig. 2D, S2A, S2B) suggest that C. albicans is well equipped to productively anticipate proline availability depending on the host status, entirely consistent with its “opportunistic” character. The many incorrect and previously held assumptions regarding C. albicans, uncritically propagated in several influential reviews, have likely hampered efforts to develop novel antifungal therapies. We neither understand nor accept the view that a more precise understanding of proline catabolism is incremental.

      The type of question raised by the reviewer is exactly what we hope to address in the future, but to get there we have to have correct assumptions in place, and this is only possible if we have a more thorough understanding of the regulatory mechanisms driving proline utilization in C. albicans. The idea that certain proteins are refractory to degradation by C. albicans suggests that other external factors are triggering the release of amino acids from these proteins. This work, however, suggests that proline is likely accessible in the gut due to the presence of proline-rich proteins like mucin (Fig. S5A/B).

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript of Silao et al. describes an in-depth investigation of the role of Put1 and Put2 enzymes in proline catabolism and virulence in Candida albicans. This is an extension of previous work in this system. The basic biochemistry and genetics are solid and support the role of these enzymes in the proposed pathway and provide evidence that the build-up of a toxic intermediate in the absence of Put2 is likely involved in the poor growth of the strain when proline is the only carbon source.

      Note that we observe the toxic effects of proline even when it is not the sole carbon source, however, and importantly, toxicity is dependent on mitochondrial function, which is repressed by high levels of glucose. Proline toxicity is observed when glycerol/lactate are present as carbon sources in addition to proline. Under these conditions, mitochondria are not repressed and exogenous proline impairs growth, particularly evident in put2 cells that accumulate the toxic intermediate P5C.

      The conclusions regarding its role in virulence are less convincing, particularly the data derived from the collagen invasion assay, the ex vivo skin model and the ex vivo/in vivo imaging. The survival and fungal burden assays support a modest role in virulence and a modest reduction in infectivity (although the presented survival data do not have statistical significance reported for the Kaplan-Meier analysis).

      See below for response regarding collagen assay. We have included the significance values derived from Kaplan analysis in the revised Fig. 5B.

      The manuscript is clearly written. The methods are well described.

      **Referees cross-commenting**

      I remain unconvinced of the broad significance of the advances and stand by my assessment that this is for the most part a reasonable study but does not move the field forward. The novel technical aspects are either extensions of previous in vivo imaging or are not well controlled (collagen invasion assay).

      See below for response.

      Reviewer #3 (Significance (Required)):

      This is a detailed study of an area that is fairly mature and thus will be of interest to those in the field but does not represent a large advance and is thus truly incremental.

      See below for response.

      Major limitations of the work are as follows. First, the collagen invasion assay may be flawed. The recovery media is made with DMEM which is a medium that lacks proline and is fairly stringent. Control experiments need to be done to be sure that the mutants grow in the recovery medium. Second, the data from the RHE model are hard to interpret since so few cells are present in the tissue. It is hard to see if there are few filaments or if there are just too few cells to assess in the tissue. Third, in vitro experiments assessing the filamentation of the mutants in the medium in which these assays are performed need to be done as controls. Candida albicans filaments in many conditions such as tissue culture medium. Spider medium is a strong inducer of filamentation but is very different than in vivo/ex vivo conditions.

      Related to the collagen invasion assay, there is a misunderstanding. The reviewer appears to confuse the put mutations with proline auxotrophy. The put mutants are proline prototrophs and can synthesize proline as they possess a full repertoire of biosynthetic enzymes. In contrast, the put mutants cannot utilize proline to obtain nitrogen or energy. In fact, the presence of excess proline imposes toxicity on the put mutants. There are three possible sources of proline. 1) PureCol EZ Gel is a ready-to-use collagen solution that forms a firm gel when warmed to 37 °C. It contains purified Type I bovine collagen (5 mg/ml) dissolved in DMEM/F-12 medium, which has multiple amino acids, including a substantial amount of arginine. 2) The recovery medium is DMEM supplemented with 10% FBS. The presence of FBS provides amino acids and induces filamentous growth. As the reviewer points out, C. albicans grows in this medium and exhibits filamentous growth. 3) The proteolytic breakdown of collagen is expected to liberate proline. Consequently, the poor growth of the mutants clearly demonstrates the importance of proline catabolism. Also, the fact that we recovered put mutants surviving on top of the collagen (Fig. 4B, inset) suggests that they remain viable but simply are unable to efficiently invade the collagen. Consistently, microscopic inspection of the wells of the put mutants showed extremely few or even a complete absence of invading cells in the recovery medium. We will adjust the text and provide a more detailed description of the experimental set-up. In summary, the main concern of the reviewer with respect to lack of proline is not relevant.

      Regarding the 3D-skin model, equal numbers of fungal cells were applied on top of the RHE. To avoid overgrowth, only low numbers (100 C. albicans cells) can be applied for the WT strain, and consequently for all other strains. In contrast to WT, which clearly proliferates, the apparent low level of put1 and put2 cells at the center of the 3D skin model is the consequence of poor growth. The upper layer of the RHE consists of stratified keratinocytes. To grow, WT fungal cells obtain proline either directly from the keratinocytes, from secreted proteases that liberate proline from keratin (proline is not as abundant in keratin as in collagen, the main component of the dermis), or from the medium that basolaterally feeds the RHE. At the border of the model, leakage from the medium can occur. Our results, showing poor growth of the mutants in the center of the 3D-skin model, are entirely consistent with the collagen plug experiments and indicate that proline catabolism plays a determinant role in enabling invasive growth.

      Lastly, the imaging experiments are highly problematic. First, reference must be made to previous ex vivo imaging reported by the Lionakis lab in 2013. Second, the number of cells imaged is so low that there is no power to make any conclusions. At 24 hr, the mutants may be delayed in filamentation or they may be delayed in establishing infection. There is no way to know what is causing the apparent lack of filaments. This technique as presented is not any higher resolution than traditional histology and in fact histology would provide a more convincing case for reduced filamentation.

      These considerations significantly reduce the overall significance of the work.

      I work on Candida albicans.

      We thank the reviewer for highlighting the beautiful study by Lionakis et al., which documents the host response, specifically the role of macrophages in mitigating C. albicans infection of the kidney. However, the reviewer apparently failed to recognize that their method is completely different from ours. Lionakis et al. performed ex vivo imaging of kidney slices using regular confocal imaging, and the authors express an awareness regarding the limitations of this approach. In fact, these authors even state in their discussion that intravital microscopy should be pursued in the future to further investigate Candida-macrophage interactions in the kidney. Also, they point out that kidney-specific factors seem to facilitate rapid filamentous growth of C. albicans. In our work, we have experimentally addressed both of these astute statements. To our knowledge, our work is the first report of imaging a Candida cell infecting a kidney in a living mouse, which on its own is a major development and achievement considering the complexity of the kidney microenvironment. The finding that the put2 mutant does not exhibit filamentous growth in the kidney of a living mouse (24 h) is striking and strongly suggests that a substantial quantity of proline, or amino acids (e.g., arginine) that are metabolized via the proline catabolic network, is present in the kidney. This is clear based on the finding that WT C. albicans cells respond accordingly to initiate hyphal growth. Consistent with this, it is well documented that the kidney is a major metabolic hub for arginine and proline metabolism. The work by Lionakis aligns remarkably well with our previous and current work in that put mutants exhibit greatly reduced survivability in co-culture with macrophages and do not evade these primary immune cells due to their inability to induce filamentous growth within the phagosome (Silao et al., 2019).
      We have adjusted the text to include a discussion that places our work in the context of the Lionakis work.

      We have added a Fig. 6C showing an example of the scanned area of the kidney. Further we added the following in the revised legend to indicate that large areas of kidneys were imaged in our assessment of fungal growth and filamentation:

      “Sites of colonization were localized using a spiral scan in the Las-X Navigator-module in the FITC channel. The entire area of the renal surface attached to the glass imaging window was scanned; circles highlight examples of regions of interest (ROI) exhibiting stronger and deviating fluorescence from the background. Each ROI was examined in detail using FITC, yEmRFP and autofluorescence. Scale bar, 500 µm.”

      CONCLUDING STATEMENT – SUMMARY RESPONSE:

      Our current work is based on our previous discovery that proline metabolism provides energy to induce and support filamentous growth (PLoS Genetics, 2019). This turned out to be important since we also discovered that C. albicans cells depend on mitochondrial proline metabolism to evade engulfing macrophages, implicating this process as being an important virulence determinant. Consistently, using time-lapse microscopy, we subsequently found that proline catabolic enzymes are rapidly induced in C. albicans cells upon phagocytosis by macrophages. These results demonstrated that proline is present within phagosomes. As exciting as these findings are, they focused on a single phenotype, i.e., filamentation, and were obtained using in vitro experimental approaches. These results demanded that we pursue additional avenues to further characterize and test their in vivo relevance, and they merely provide a solid background for the current work.

      In contrast to reviewers 2 and 3, we do not regard our finding that proline catabolism plays such a critical role in virulence as merely “incremental”. We also could not have foreseen that the ability to use proline as an energy source is a common feature of multiple fungal pathogens capable of causing human disease. This is conceptually very important in that human fungal pathogens, unlike the well-studied yeast Saccharomyces cerevisiae, are not readily found out in nature, and thus have evolved to use a similar spectrum of nutrients as host cells, including cancer cells. It is important for the fungal pathogen community to realize that regulatory switches operating in C. albicans are wired substantially differently to those in S. cerevisiae, and are likely optimized to reflect the actual condition in the host environment. The growing appreciation that diverse cancers are able to shift metabolism to exploit proline as an energy source is strikingly and fascinatingly similar to our findings with pathogenic fungi. This represents a conceptual advance in that it points to the wealth of proline stored within extracellular matrix proteins as providing a potential and significant source of energy for virulent fungal and cancerous growth.

      Finally, we strongly believe it is improper to extrapolate virulence properties based on in vitro findings, and that it is essential to actually test host-microbial pathogen interactions using refined in vivo models. Our successful use of advanced intravital microscopy goes beyond traditional and accepted murine infection models and has provided us with a unique state-of-the-art vantage point. Our finding that a single C. albicans cell is able to initiate and establish a site of infection in a kidney within a living mouse is itself important. Coupled with it, the novel finding that hyphal development at sites of infection depends on the ability of the fungal cells to catabolize proline must reflect the physiological conditions in the kidney. This is not an incremental finding, and we do not understand why reviewers 2 and 3 diminish the significance of these findings. Clearly, our manuscript provides a strong foundation for more detailed and advanced studies.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Silao et al. make the intriguing observation that yeasts that are generally considered less pathogenic than Candida albicans are unable to catabolize proline. They then, in Candida albicans, construct mutants defective for the two key enzymes (Put1, Put2) required to convert proline to glutamate, which they show to be essential for proline utilization as an energy (carbon) and nitrogen source. The authors proceed to untangle the regulatory aspects of proline degradation, including the respective cellular localization of its key enzymes. They then make the important discovery that strains lacking either Put1 or Put2 suffer from a proline-dependent growth defect, which they attribute to resulting defects in mitochondrial metabolism.

      The manuscript then goes on to analyze a broad range of infection models including: reconstituted human epithelial skin model, Drosophila, mouse systemic infections, organ colonization in these mice (kidney, spleen, brain, liver and histochemistry of the kidneys) as well as survival when incubated with cultured human neutrophils. Next, they use yeast cells constitutively expressing yEmRFP (so that yeasts can be distinguished from other host cells) and coated with FITC before incubation with the host cells (which coats the wall of the original cells, but does not spread to progeny) and they go on to perform an impressive set of analyses of C. albicans growth within mouse kidneys both in vivo and ex vivo, exploiting an implanted window together with intravital imaging with a two-photon microscope at different time points. The system is impressive and visualizes tissue invasion by hyphal cells beautifully. Finally, they compare the intravital images from WT and put2-/- cells and show that, as in vitro, put2-/- cells do not form filaments and do not show extensive invasion of the kidney tissue. While the in vivo aspect of the study includes many different models, it finds defects in virulence for different subsets of put mutants and the relative importance of filamentation vs proline utilization for virulence is not conclusively resolved.

      Overall, this is an important and timely manuscript, which significantly contributes to the understanding of how proline metabolism intersects with yeast fitness in the context of infections. However, there are several major concerns regarding some of the conclusions drawn from the study. In addition, some general recommendations that would improve the manuscript are provided.

      Specifically, the manuscript provides a very detailed description of experiments and observations. However, in several parts it is difficult to follow and the reader needs more guidance about the logic involved in reaching conclusions. Specifically, several aspects of the paper are written for experts in Candida (yeast) metabolism. Here, explaining the rationale for some of the experiments, and providing more background information that is not obvious to a non-expert, is required.

      In particular, writing a clear and measured summary sentence at the end of each paragraph and a conclusion paragraph that summarizes key findings in simple terms would help make the manuscript more digestible for readers.

      In addition, the impressive microscopy and broad range of in vivo experiments are comprehensive but only add incremental information relevant to proline metabolism: that filamentous growth in vivo and virulence are reduced in cells carrying some mutations in one or more put genes. However, this broad sweep of model systems and the development of the in vivo imaging system might have more impact in a separate paper focused on the real-time in vivo visualization of kidney invasion.

      Major comments:

      1. The main finding that impressed this reviewer is that "removing the ability to catabolize proline, in an organism that evolved to catabolize it, leads to (growth) defects". This point could be better highlighted throughout the manuscript.
      2. The authors show that deletion strains for proline metabolism have defects that are important for in vivo pathogenicity. This is an important finding. However, as the manuscript reads now, it suggests that the main findings are that the ability to use proline in the respective host niche is key. Mechanistically, the manuscript revolves primarily around defects that arise when deleting PUT1 and/or PUT2 (i.e., an "unknown" toxicity of proline in the case of put1-/- (or put1-/- put2-/-) and the additional P5C-dependent toxicity for put2-/- mutants; see below).
      3. In order to claim that catabolizing proline promotes pathogenicity (as opposed to the alternative hypothesis that the inability to catabolize proline leads to the observed defects), additional experiments would be required. For example, the put mutants would need to be compared with mutants that significantly reduce/impair proline uptake, such as the referenced gnp2 mutant (Garbe et al 2022). While the finding that less pathogenic yeast species are unable to catabolize proline is both intriguing and important, as presented it remains a loose, non-quantitative correlation that only tangentially addresses the question of whether "proline catabolism is key for pathogenicity".
      4. 238 onwards: The conclusion that "the primary growth inhibitory effect of proline is linked to catabolic intermediates formed by Put1 and that are metabolized further by Put2" does not appear to be fully supported by the evidence. Addition of proline to put1 mutants already reduced OD600 by ~50% (Figure 2); and is further reduced to ~10% when put2 is deleted. This implies that there are two inhibitory effects of proline, not one primary one. At the least, this option should be discussed, including why deletion of PUT1 leads to proline toxicity. The latter is not clear: is it that too much proline accumulates in the cell and this accumulation is toxic? If this is the case, the effect would be expected to be proline concentration dependent. Performing a relatively simple experiment as performed for the put2 mutant (Fig. 3 / S3F) may clarify this issue, particularly if the experiment were coupled with intracellular quantification of proline.
      5. The caption "P5C mediates a respiratory block" is misleading, as the evidence is not that compelling: P5C increases in put2 but not in put1 mutants, yet both single mutants experience a proline-dependent respiratory defect (Fig. 3E), so the results suggest a more complex relationship.
      6. The virulence assays and in vivo experiments do not present a unifying view: in Drosophila put2∆∆ is less virulent than put1∆∆, which appears similar to put3∆∆. Given that put2 mutants grow slowly, likely because of P5C inhibition, this seems logical. However, in mice, put3∆∆ remains highly virulent while put1∆∆ and put2∆∆ results for survival are mixed. Furthermore, in 4 mouse organs, put1∆∆ and put2∆∆ are not significantly different from one another but are different from wt, while put3∆∆ has no significant reduction in CFU. Kidney histology shows very little invasion by put1 and put2 and more by put3, but visually put3 appears to invade much less than the WT, and the human neutrophil experiment shows effects of put2 or put3 but not put1. This leaves the reader rather confused. It may be worth discussing the reasons for different results in different models. Is the availability of proline in each of the organisms and organs similar?
      7. The ex vivo and in vivo analysis of the dynamics of C. albicans growth in the host is visually impressive, but it distracts from the focus of the paper and the metabolic findings. Showing that put mutant cells do not form filaments in vivo (as in vitro) does not add much conceptually to the paper. Furthermore, this lovely advance in in vivo visualization is lost at the end of this paper and the authors should consider whether it might fit better in a manuscript that could really highlight the in vivo visualization approach.
      8. The discussion of cells stained with FITC and expressing yEmRFP does not clearly point out that the FITC is only an indicator for those cells that were used to inoculate the tissue and that finding cells without FITC indicates that they are mitotic progeny, indicating that they have been dividing. The authors clearly understand this, but a naive reader may miss this important point if it is not stated explicitly.

      Minor comments:

      1. Throughout: what is the distinction between utilization of proline for C or for energy? These terms seem to be used interchangeably.
      2. Introducing the schematic in Fig. 2A at the beginning of Figure 1, would help explain proline catabolism before delving into the growth experiments that rely upon this framework. This should include an explanation, for readers less familiar with the metabolic issues, of the main limitations to catabolizing proline, and the key issues for being able to use proline for nitrogen, carbon, and energy (potentially indicated in the overview figure, e.g. pointing towards gluconeogenesis etc.).
      3. Saccharomyces can only grow on proline as a nitrogen source, but not as an energy/carbon source. Could the authors briefly mention or discuss why this is the case? This is not clearly apparent after reading the manuscript and it leaves the reader confused and trying to understand if the fact that proline is required for carbon utilization is a new finding of this paper or was already known. Do the authors think this is tied to the presence of complex 1 components in C. albicans that are not found in S. cerevisiae? Is this consistent for the pathogenic, but not the non-pathogenic yeasts analyzed in figure 1?
      4. 100: While Gdh2 is apparently an important enzyme for generating ammonium, why is it not necessary for macrophage escape and virulence as shown in reference 18? A recent paper from Garbe et al (ref 12) suggests that Gnp2 is the major proline permease in C. albicans and what is known, and not known, about proline uptake would be good to mention, given that PUT gene functions require that proline enters the cells.
      5. 116: Is the "low sugar environment of the host" referring to a specific niche, such as the GI tract, or human blood? Compared to most natural environments, glucose is abundant in the host, e.g., at ~5 mM, it is the most abundant metabolite in blood, and similarly, in the GI tract, levels can go beyond 50 mM glucose (see e.g. PMIDs 34371983, 21359215). Or is this comment indicating that the in vivo sugar concentration is lower than that in common lab growth media? Please spell out the niche/concentration for clarification - and compare that to other niches that are considered "high sugar environments".
      6. 123: "proline as sole energy source" - suggest "is the source of carbon, nitrogen, and energy"
      7. 142: it is worth noting to readers that C. neoformans is a basidiomycete and thus VERY distant from the other yeasts studied here; it is in a different major phylum of fungi.
      8. 143: Here it is implied that put1 and put2 mutant strains do not grow on SPD, but this is not stated explicitly.
      9. 151: The abbreviation SPG is not explained in main text.
      10. Paragraph 156 onwards: this section is particularly hard to read and very dense. Also, it is difficult to understand the significance of these experiments for the overall findings of the paper. Please at least provide a small conclusion / summary at the end of the paragraph that puts the findings into perspective.
      11. Figure 2 C: simplifying the scheme (e.g. lots of redundant information, P2 and Mito - just give it one name) would help. This figure may be better in the supplementary material.
      12. Figure 2B: It is not directly apparent from the micrographs that Put1-RFP localisation is mitochondrial. Co-localisation of the RFP with a mitochondrial dye (e.g., mitotracker) or something similar is required to validate it.
      13. Throughout the manuscript (figure legends): Suggest using "mean" instead of "Ave."
      14. 175: According to the 'Yeasttract' and 'Pathoyeasttract' databases, Put3 regulates at least 36 and 22 genes, in S. cerev. and C. alb., respectively (based on DNA binding and/or regulatory changes). The only gene in common between these two lists of genes is PUT1. Thus, it is quite likely that Put3 regulates many other processes that explain its function and that its major function may not be only to regulate Put1.
      15. 175: Is it clear whether the Put3-independent mechanisms are positive or negative with respect to Put1?
      16. 218: Suggestion: "growth was indistinguishable". Unless growth curves or growth rates are provided, and if single time-point data are the basis for this point, then "rates" is not a relevant term.
      17. 256 onwards: did the authors test if the ROS scavenging effectively reduced ROS? i.e. does the luminol-HRP assay yield less ROS in +proline +scavenger treatment? This is necessary to effectively conclude that the growth inhibitory effect of proline is due to blocking respiration.
      18. The Figure captions are extremely lengthy and detailed, making it cumbersome to find the relevant information. Suggest moving some of the information, such as additional experimental details, into the methods section.
      19. 277-301: Phloxine is not exclusively a live/dead cell indicator; it is an indicator of metabolic activity. In S. cerev. and C. alb. it also indicates slower growth, opaque growth, and it has been used as an indicator of aneuploidy in C. glabrata (https://journals.asm.org/doi/10.1128/msphere.00260-22) and of diploids vs haploids in S. pombe. The colonies illustrated are made up of many live cells, and thus the section "Defective proline utilization is linked to cell death" needs to be presented more carefully. In addition, it appears that this section shifts from using defined medium to using rich medium and 37°C instead of 30°C. Why was this shift necessary?
      20. 295-301: Related to the point above, these results are hard to interpret due to the switch from defined medium in all prior experiments to rich growth medium here. Also, it is not clear why a 48h old YPD culture was chosen to show that the degree of PI staining correlates with mitochondrial activity - is this due to the culture age? It would be clearer to image cells grown on glucose vs. glycerol/lactate, or under repressive / de-repressive glucose concentrations (e.g., as shown in Fig. S4C where a PI+ difference is apparent for 0.2% glucose vs. 2% glucose at 30°C).
      21. 313-14: The statement 'the invasion process was dependent on the ability of cells to catabolize proline' doesn't take into account that put mutant cells are defective in filamentous growth irrespective of their utilization of proline...and like the efg1 cph1 double mutant.
      22. 316-327: The results of the experiment described can only be interpreted as an effect of proline catabolism if the three strains (efg1 cph1; put1; put2) have similar growth rates as yeast cells in vitro. Why weren't the cells competed directly (efg1 cph1 vs put cells)?
      23. Fig 6: The logical order of the experiments, and in the text, is: 1) 4 h window, 2) 26 h window and then 3) ex vivo. The cartoon in 6B should be in this order as well.
      24. 337: it is not clear what the 'direct exposure...' is trying to tell us. Can this be made more explicit?
      25. 340-346: Here proteins with high proline content were used to ask if they could induce transcription of PUT1 or PUT2 RNA and protein. This experiment is designed only to test the role of these proteins to induce utilization of nitrogen, as glucose is included in the medium. Given that these proline-rich proteins need to be lysed by proteases before they can be imported, and since no import pathways were tested, the results appear to tell us that mucin is more readily digested to peptides that contain proline, but why that is the case is not clear and how it relates to proline utilization is also not clear.
      26. 363-369: An alternative is that Put3 induces different proteins important for growth.
      27. 379-380: The conclusion for this paragraph is somewhat of an overstatement as there is no analysis of the degree to which proline utilization is a predictor of virulence. It simply shows that put mutants affect the ability to survive in neutrophils.
      28. Discussion: The statement that "S. cerevisiae" evolved in high sugar environments is debatable. The natural niche could well be forest soil and tree bark, or insect/wasp guts with arguably little glucose around.
      29. 469-470: How strong is the 'correlation' between the ability to utilize proline and virulence? Given that different mutants had different effects in different models, this seems like a very loose 'correlation'; it would be good to have some quantitative measures to make this claim.
      30. 500: Was the experiment done in larvae, and not in adult Drosophila? Fig 5 legend says flies and shows a picture of a fly and larvae are only mentioned much later in the text.
      31. 512: Why is it presumed that proline accumulates in the mitochondria in put1 mutants? How strong is the presumption?
      32. 539: why are MMPs important for digestion of collagen? This is not clear at this point of the Discussion.
      33. 574: Concluding sentence of this paragraph seems unsubstantiated. There are at least two defects in put2 strains: hyphal growth and growth in general, presumably because of P5C accumulation.
      34. Fewer abbreviations would make the manuscript easier for non-experts to read. For example, P5C is not defined in the abstract. Furthermore, if an abbreviation is not used more than 3 times, it is not necessary to provide it (e.g., mammalian proteins in the last paragraph).

      Typos: 1. 82: should read 'is restricted to the mitoch...' 2. 102-103: should read 'to evade macrophages' 3. Fig. S4F is mislabelled as Fig. S4G.

      Referees cross-commenting

      Overall, we stand by our initial assessment of the study. However, we were not aware of previous studies that investigated proline utilization in yeasts, as noted by Rev # 2 (https://onlinelibrary.wiley.com/doi/epdf/10.1002/yea.1845). The current study suggests that using proline as an energy/carbon source is more wide-spread, beyond pathogenic yeasts. Further, the C. albicans strain they used for this study (ATCC 10231) was apparently unable to grow on proline in the quoted paper. In light of this, we think the authors should reference this study, tone down the claims about the clear correlation of pathogenicity and proline utilization, and address this apparent discrepancy with the indicated Candida albicans isolate. We note that our review considered this a paper mostly of interest to specialists.

      Significance

      1. The advance in this paper is conceptual for the proline utilization connection to virulence in a range of species and technical for the in vivo microscopy. Limitations are that the conceptual advance is based only on qualitative work in figure 1 and that the animal studies do not provide a conceptual advance, although the technical advance of in vivo visualization of kidney tissue is impressive and (to the knowledge of this reviewer) quite new as the only prior work was in mouse ears.
      2. The work fits well as an extension of the body of work from the corresponding author's lab with additions from the labs with expertise in models of infection.
      3. People interested in yeast metabolism and pathogenic yeast virulence will be the audience for this paper and as written it is for a specialized audience interested in pathogenic yeast metabolism and, perhaps, (although not mentioned at all in the text) for those who want to try PUT gene products as new drug targets.
      4. Reviewer expertise is in pathogenic yeast biology and yeast metabolism. Little expertise in high tech microscopy.
    1. Author Response

      Reviewer #1 (Public Review):

      Various parts of the premotor cortex have been implicated in choices underlying decision-making tasks. Further, norepinephrine has been implicated in modulating behavior during various decision-making tasks. Less work has been done on how noradrenergic modulation would affect M2 activity to alter decision-making, nor is it clear whether noradrenergic modulation effects on activity would differ between the male and female sexes.

      This manuscript addresses some of these questions.

      • In particular, clear sex differences in task engagement are seen.

      • May also show some interesting differences and distributions of β2 adrenergic receptors in M2 between males and females.

      We thank the reviewer for their summary of our findings and thoughtful critique of our manuscript. In our revised manuscript we have taken measures to address the reviewer’s comments in line (blue edits in text and revised figures) with direct responses outlined below. We believe these revisions improve the scientific rigor of our findings and provide relevant context for our studies. We hope that they have sufficiently addressed the reviewer’s concerns.

      Less clear is the specificity of systemic antagonism of β adrenergic receptors on the changes in M2 activity reported. As propranolol was given systemically, changes in M2 firing rates could also be due to broader circuit (indirect) activity changes. As it was not given locally, nor were local receptor populations manipulated, one is unable to make the conclusion that changes in neural activity are due to the direct effects of adrenergic receptors within M2 populations.

      We agree that propranolol driven changes in anterior M2 activity may arise via multiple mechanisms, including direct action on the adrenoreceptors within M2, and indirect action via other regions that project to M2. Although locally activating inhibitory interneurons within M2 is sufficient to disrupt cue-guided action plans and behavior in a 2AFC task (Inagaki et al., 2018), our noradrenergic manipulation was not restricted to M2. We have clarified our conclusions and provided additional discussion to highlight that propranolol actions were multifaceted and that direct actions in M2 are likely working in concert with propranolol mediated actions in other regions.

      Also not clear, is the contribution of M2 to this task, and whether the changes in M2 activity patterns observed are directly responsible for the behavioral disruptions measured.

      We have revised our introduction and discussion to more clearly outline the critical role of cue-guided action plans in M2 for successful behavior in 2AFC tasks. Suppression of cue-guided activity in M2 results in behavioral performance at near chance levels, similar to what we saw in females after propranolol (Guo et al., 2017; Inagaki et al., 2018; Li et al., 2016). Furthermore, targeted photostimulation of action plan encoding neurons in M2 is sufficient to drive behavioral responses (Daie et al., 2021). In our investigations it is plausible to expect propranolol related disruptions in other cognitive, sensory or motor regions. Based on the strong foundational evidence for M2 activity in 2AFC, the propranolol driven changes in anterior M2 in females, whether direct or indirectly mediated, are likely sufficient to drive behavioral disruptions in accuracy and/or trial completion.

      Reviewer #2 (Public Review):

      This paper by Rodbarg et al describes an interesting study on the role of beta noradrenergic receptors in action-related activity in the premotor cortex of behaving rats. This work is precious because even if the action of neuromodulatory systems in the cortex is thought to be critical for cognition, there is very little data to actually substantiate the theories. The study is well conducted and the paper is well written. I think, however, that the paper could benefit from several modifications since I can see 3 major issues:

      We thank the reviewer for their generous comments on the potential impact of our manuscript as well as their suggestions to improve this work. Below we outline responses to specific comments raised by the reviewer in addition to addressing them in the revised manuscript. We hope these responses sufficiently address the reviewer’s concerns.

      Both from a theoretical and from a practical point of view, the emphasis on 'cue-related' activity and the potential influence of NA on sensory processing is problematic. First, recent studies in rodents and primates have clearly demonstrated that LC activation is more closely related to actions than to stimulus processing (see Poe et al, 2020 for review).

      Indeed, during optimal performance the peaks of LC activity are larger when PETHs are aligned to action initiation rather than the cue itself (Clayton et al., 2004). This alignment resolves variability in decision processing times and omitted cues. Although LC responses align with action, they are evoked by, and occur after, cue presentation, with LC responses to visual cues occurring ~60 ms after presentation (Aston-Jones & Bloom, 1981). The same behavioral action without preceding task relevant cues does not evoke an LC response (Rajkowski et al., 2004).

      In our current study cues initiate activity in anterior M2, this is our primary interest and where our electrodes are placed. The window between cue delivery and action completion hones in on our goal of investigating the role for β noradrenergic signaling in target cortical processing, rather than LC explicitly. In both NHP and rodents NE signaling (and evoked LC) promotes sustained cortical representations between cue onset and actions across cortical regions (dlPFC, S1) (Ramos & Arnsten, 2007; Vazey et al., 2018; Wang et al., 2007). In the current study we aligned neural data to either cue presentation (Figure 3) or action (lever press; Figure 4). Both presentations support a critical role for β adrenoreceptor signaling in suppressing irrelevant information, resolving and maintaining action plans. A unique feature of aligning the data to cue onset is that it allows us to see how the neural activity changes not only on completed trials (that end with a lever press) but also on omitted trials (which strongly increase after propranolol). We propose the reason we are seeing large increases in omitted trials is because β adrenoreceptor blockade either directly or indirectly prevents anterior M2 from resolving an action plan.

      Second, the analysis of neural activity around cue onset should be examined with spikes aligned on the action, since M2 is a motor region and raster plots suggest that activity is strongly related to action (I'll be more specific below).

      We agree that M2 shows important action plan activity which we highlight throughout the manuscript. In cued tasks, M2 neurons have been shown to represent action plans starting at cue onset that continues up to behavioral execution. Neural data was examined and results presented aligned to cue onset (illustrated in Figure 3) and aligned to action - lever press (illustrated in Figure 4). The impact of propranolol in diminishing action plan selection was similar in both action, and cue-aligned analyses.

      The distinction between neural activity and behavior or cognition is not always clear. I understand that spike count can be related to motor preparation or decision, but it should not be taken for granted that neuronal activity is action planning. The analysis should be clarified and the relation between neural activity, behavior, and potential hidden cognitive operations should be explicated more clearly.

      We have worked to clarify in our revised introduction, results and discussion the specifics of the known roles of neural activity in M2 in both action planning and decision making. We further expand that the neuronal activity in our study may reflect potential changes in cognitive processing and thus alter resultant behavioral outcomes.

      The sex difference is interesting, but at the moment it seems anecdotal. From a theoretical point of view, is there any ecological/ biological reason for a sex dependency of noradrenergic modulation of the cortex? Is there any background literature on sex differences in motor functions in rats, or in terms of NA action? If not, why does it matter (how does it change the way we should interpret the data?) From a practical point of view, is there a functional sex difference in absence of treatment, or is it that the drug has a distinct effect on males vs females? This has very distinct consequences, I think.

      We did not find overt differences in behavior in the absence of treatment. Only when noradrenergic function was challenged using propranolol did we identify functional sex differences. We agree that this has very distinct consequences – specifically it supports sex differences that can be revealed by perturbations of normal function. These functional sex differences may be a result of differences in the anatomy of central noradrenergic systems, a hypothesis further supported by our mRNA expression findings and existing literature on LC anatomy across species (Bangasser et al., 2011, 2016; Luque et al., 1992; Mulvey et al., 2018; Ohm et al., 1997; Pinos et al., 2001). Collectively these results have potential ramifications for understanding sex differences in disease prevalence and targeted treatments.

      Background literature supports some innate sex differences in motor function and executive function in rodents and humans. Of particular relevance to our investigation is an established difference in behavioral strategy with females being more risk averse than males (Grissom & Reyes, 2019). Ethologically, risk-averse strategies may support parental care roles, and increased inhibitory mechanisms may be selected for in females. Although this strategy was not directly tested in our study, the large increase in omissions after propranolol seen in females is in line with avoiding risk (incorrect choices) during uncertainty (disrupted neural signaling). As with other executive functions, the utilization of norepinephrine within the cortex along with other neuromodulators, and local microcircuit interactions would all contribute to promoting risk averse behavior.

      These issues could be clarified both in the introduction and in the discussion, but the authors might have a different view on what is theoretically relevant here. In the result section, however, I think that both the lack of specificity in the description of behavior and cognitive operation and the confusion between 'sensory' and 'motor' functions make it very difficult to figure out what is going on in these experiments, both at a behavioral and at a neurophysiological level. First, the description of the behavior in the task is clearly not sufficient, which makes the interpretation of the measures very difficult.

      We have made an effort to better specify the task and relevant behavioral operations in both the methods and results and have included a clearer task schematic (Figure 1A). We agree that the confusion between ‘sensory’ and ‘motor’ functions may make it more difficult to understand the findings in this study. Anterior M2 plays a unique role in representing motor/action plans that can be informed by sensory information. This integrative function creates difficulty in parsing the neural activity of anterior M2 as strictly motor, sensory or cognitive. In attempts to improve clarity we have expanded and highlighted relevant information on the known roles of M2 in the introduction and discussion.

      One possible interpretation of the effects of the drug is a decrease in motivation, for instance, due to a decrease in reward sensitivity or an increase in sensitivity to effort. But there are others. More importantly, none of these measures can be used to tease apart action preparation from action execution, even though the study is supposed to be about the former.

      Neural activity during action planning, prior to action execution is known to be an essential function of M2 (Barthas & Kwan, 2017; Gremel & Costa, 2013; Guo et al., 2017; Inagaki et al., 2018, 2022; Li et al., 2016; Siniscalchi et al., 2016; Sul et al., 2011; Wei et al., 2019) for optimal performance in 2AFC tasks. In all, we found that the representation/separation of opposing action plans (a well validated function of M2) prior to responses (lever press) is degraded after propranolol, especially in females. We have provided additional emphasis on these foundational studies throughout our revised manuscript.

      To minimize the impact of motivational factors, effort and reward size remain consistent within our task, and all trials require a random initiation hold prior to cue delivery. As described in our general response to the editor above (Figure 1, above), we investigated whether motivational changes may be reflected in our M2 recordings. PETHs from the first and last 10 trials within saline sessions did not identify potential motivation related differences in anterior M2 activity. Similarly, across propranolol sessions the neural activity was consistent between early and late trials. We used early and late trials as there was a mild decrease in trial rate during saline sessions in both males and females, potentially indicative of motivation/reward sensitivity changes during these sessions. M2 neural responses consistently separated action plans (after saline) or failed to separate action plans (propranolol sessions).

      Also, but this is less critical: In Figures 2C and D, it looks like there is a bimodal distribution for the effect of propranolol in females. Is there something similar in the neuronal effects of the drug? And in the distribution of receptors? Can it be accounted for by hormonal cycles/ anything else?

      Although there is some clustering in behavioral outcomes all data passed normality assumption as appropriate. Propranolol treatments were not synchronized to hormonal cycles, and the data likely include animals at various hormonal stages. Similar clustering was not apparent in neuronal effects of propranolol, although propranolol increased variability in many measures.

      In a pilot experiment we did not see any difference in baseline performance on our 2AFC task across the hormonal cycle (diestrous, proestrous, estrous or metestrous) of females in any measure including accuracy (F(3,33)=0.59, p=0.63, one-way ANOVA) and omissions (F(3,33)=0.51, p=0.68).

      The description of neural activity is also very superficial. In general, it is not clear how spike count measures have been extracted. For example, legend and figure C are not clear, is the (long) period of cue presentation included in the 'decision time'?? "Cues were presented at a variable interval 200-700ms after initiation and until animals left the well, 'Well Exit'. The time from cue onset to well exit was identified as the decision time (yellow)." Yet on the figure only the period after cue presentation is in yellow. This is critical because, given the duration of the cue, the animals are probably capable of deciding (to exit the well) before the cue turns off. Indeed, as shown in fig 2D, the animals can decide within about 500 ms. So to what extent is the 'cue response' actually a 'decision response'?

      We have clarified the task and spike count measurements in methods and added a revised task schematic. It is correct that the cues are available throughout the decision time (for up to 5 seconds or until well exit), and an action plan is generated before well exit/cues turn off as reflected by the separation of neural action plans (Fig 3, saline). Anterior M2 neurons maintain action plan representation from cue onset until the lever press under normal conditions (Fig 4, saline). These action plans encapsulate “cue responses” and “decision responses”. We have aligned neural data to discrete timestamps at either end of the window in which M2 processing is known to be critical, specifically between cues and actions (lever press) and focus on neural activity relative to those points. We refer to this activity throughout the manuscript as an ‘action plan’ as action planning functions of M2 activity have been well established in prior studies.

      When looking at figure 3A, there is clearly a pattern on the raster, a line going from top left to bottom right. If the trials are sorted chronologically, something is happening over time. If, as I suspect, trials are sorted by ascending response time, this raster is showing that what authors are calling a 'response to cues' is actually a response around action. Basically, if propranolol slows down reaction time, the spikes will be delayed from cue onset only because they remain locked to the action. Then the whole analysis and interpretation need to be reconsidered. But it might be for the best: as I mentioned earlier, recent work on LC activity has clearly emphasized its influence on motor rather than sensory processing (Poe et al, 2020).

      Figure 3A is a single neuron example, and data analyses focus on population-wide activity. Neural data is presented both aligned to cues, for all trials in which a cue was received, and aligned to lever press (action), for all trials on which a lever press occurred. In both cases, aligned to cue or aligned to action, the impact of propranolol is the same. β adrenoreceptor blockade reduces the separation of action plans in M2, severely so in females. However, a major finding is that females receive a cue but omit a large number of trials after propranolol, for this outcome the action does not occur. We propose this is due to the lack of action plan separation in anterior M2 (either directly or indirectly). When no behavioral response occurs, these trials cannot be aligned to action, yet we are still interested in the neural activity during the critical window between cue delivery and actions. We are not assigning this neural activity to sensory processing but using this discrete sensory event within our trials (cue) to align the data as there is substantial evidence that action plans in M2 arise after cue presentation in tasks such as ours where performance is guided by external cues.

      Fig 2D-F: it is hard to believe that the increase in firing rate induced by propranolol in females is not significant. Presumably, because the range of the median firing rate is so high in the first place, distribution (2E) really indicates an increase in firing. Maybe some other test? E.g., a paired t-test, or standardized values (z-scores) to get rid of variability in firing across neurons?

      We agree that the session wide firing rate appears rightward shifted in females after propranolol. As our recordings were taken on different days, several days apart, we cannot assume they are the same neurons for paired analyses. In our revised manuscript we evaluated these distributions using a Mann-Whitney test to increase power and decrease the impact of variability within the population. Previously we had used a Kolmogorov-Smirnov test. Using our new analysis, we can confirm that propranolol significantly increases session wide firing rates in anterior M2 of females (p=0.027) but not males. This finding increases evidence for direct actions of propranolol within M2 and supports our hypothesis that propranolol leads to local disinhibition by reducing β noradrenergic signaling in interneurons and that without this noradrenergic tone anterior M2 is less efficient at suppressing irrelevant action plans.
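
      For readers less familiar with the test named above, the following is a minimal sketch of an unpaired, two-sided Mann-Whitney U comparison of two independent samples. It is an illustration only and not the analysis code used in the study: the function names and the firing-rate values are invented for the example, and the p-value is assumed to come from the normal approximation with a tie correction (no continuity correction), which may differ from the exact procedure applied to the real data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U for x, U for y, approximate p-value). Assumes the pooled
    sample is not made up entirely of identical values.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    # Variance of U under the null hypothesis, corrected for ties.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))
    return u1, u2, p


# Hypothetical session-wide firing rates (Hz) from two unpaired sessions.
saline = [2.1, 3.4, 1.8, 4.0, 2.9, 3.1]
propranolol = [3.9, 5.2, 4.4, 6.1, 3.6, 5.0]
u_saline, u_prop, p_value = mann_whitney_u(saline, propranolol)
print(u_saline, u_prop, p_value)
```

      Because the test works on ranks of the pooled sample, it does not require that the same neurons be tracked across sessions, which is the constraint described above for recordings taken several days apart.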

      Along those lines, would it be worth looking for effects on specific populations (interneurons) which are sometimes characterized by thinner spikes and higher mean firing rates? Given the distribution of beta receptors RNA on interneurons, one would actually expect an effect of propranolol on the firing rate irrespective of task events. Or what is it that prevents the influence of propranolol on interneurons from changing the firing rate? In any case, one of the strengths of this study is the localization of beta receptors on specific neuronal populations in the cortex, so I think that the authors should really try to build on it and find something related to the neurophysiological effects. Otherwise, one cannot exclude the possibility that the behavioral effects are not related to the influence of the drug on these receptors in that region.

      Data were collected using stainless steel electrode arrays and our sample population of task related neurons is likely biased to pyramidal neurons, with a small number of fast spiking interneurons. We used validated spike waveform parameters of interneurons in premotor cortex (peak-to-trough ratio and duration; Giordano et al., 2023) in an attempt to isolate putative interneurons and found only a very small number of these cells in our recordings (n=5-7 per group). This population is too small to make any inferences about specific impacts. We have focused on the collective population activity of M2 as this is most strongly related to optimal action planning.

      You are correct that from the given findings we cannot conclusively show that the results found here are a result of propranolol acting solely within anterior M2. We have made sure to clarify throughout our revised manuscript that the behavioral and physiological changes we identified are a result of collective direct and indirect actions of propranolol.

      The conclusion that neuronal discrimination decreases because the proportion of neurons showing no effect increases is confusing (negative results, basically). It would be clearer if they were reporting the number of neurons that do show an effect, and presumably that this number shows a significant decrease.

      The reviewer is correct that the number of neurons that do show an effect (task-related activity) does significantly decrease with propranolol (from n=70 to 27 in females and n=71 to 48 in males). These n are now given adjacent to the proportions rather than at the end of the paragraph. Proportions were used for statistical analysis due to an overall decrease in the total number of units after propranolol. All PETH presented are from neurons that show some task-related activity; these PETH confirm that neural activity no longer effectively discriminates/separates action plans in M2.

      Figs 3F-I: a good proportion of neurons (at least 20%) show a significant encoding before cue onset. How is it possible? This raises the issue of noise level/ null hypothesis for this kind of repeated analysis. How did the author correct for multiple comparison issues?

      In response to reviews, we have altered the manner in which we identify the significantly modulated neurons to increase rigor and no longer include these figures or analyses. The proportion of neurons showing action plan encoding prior to cue onset was likely an artifact of how the data was analyzed and an insufficient correction for multiple comparisons, allowing inclusion of internally generated action plans in some neurons.

      The description of the action-related activity is globally confusing. Again, how can the authors discriminate between activity related to planning vs action itself? What is significant and what is not, in males vs females? What is being measured here? For example, a very unclear statement on line 238: "Propranolol primarily disrupted active inhibition of irrelevant action selection in M2 activity, reducing the ability to maintain action plan representation in M2, delaying lever press responses (Figure 4L, 4M)." What is 'active inhibition'? What is an irrelevant action plan? What is selection? All of that should be defined using objective behavioral criteria and tested formally.

      We have changed our wording to clarify what we are describing and why we have chosen the words we have, and to ensure consistency and objectivity throughout the manuscript. Much of the wording we have used (for example, action planning or action plan selection) follows the terms used in the literature to describe M2 neural activity. We call the activity in M2 action planning (either externally/cue guided or internally guided) because that is what has been previously demonstrated. In our task design and analysis we are tracking cue-guided actions, as opposed to internally guided actions.

      We also separate the electrophysiology data as preferred and nonpreferred because the literature has shown that individual M2 neurons show specific directional tuning, as noted in our results; using the term ‘preferred’ encapsulates that tuning regardless of left/right direction. An example M2 neuron that increases activity for left cues and responses (preferred direction) will show active inhibition (low/negative z scores) on trials with right cues and responses (nonpreferred); other neurons would show the inverse relationship with direction.

      A primary impact of propranolol was the loss of negative z-scores for nonpreferred trials, i.e., neurons with a left preference that are usually inhibited on right trials were still firing, and vice versa. After propranolol, neurons continue to fire for an irrelevant action plan (for the opposite direction), and the resulting population activity is not significantly different for opposing cues/responses. Behavioral responses normally occur after opposing action plans have significantly separated in M2; collapsing action plans by preventing relevant signaling (Guo et al., 2017; Inagaki et al., 2018; Li et al., 2016) or by facilitating irrelevant signaling, as we see here with propranolol, leads to impairments in 2AFC performance.

      Also, the description of the classifier analysis should be more thorough. Referencing the toolbox is not sufficient to understand what has been done.

      We have added additional explanation in both the methods and description of the results to clarify the functions of the neural decoding toolbox and how we are using it to evaluate information encoding within M2. We have provided detail on how the algorithm was trained, how shuffled data were generated and how we determined significance of decoding accuracy.
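
      As a generic illustration of how decoding accuracy can be tested against shuffled data, the sketch below uses a label-permutation test. It is not the neural decoding toolbox's API or the authors' pipeline: the one-dimensional nearest-centroid classifier, the function names, and the toy "activity" values are all assumptions made for the example.

```python
import random


def loo_nearest_centroid_accuracy(features, labels):
    """Leave-one-out accuracy of a 1-D nearest-centroid classifier."""
    correct = 0
    for i, (x, true_label) in enumerate(zip(features, labels)):
        # Compute each class centroid from all trials except trial i.
        centroids = {}
        for lab in set(labels):
            train = [f for j, (f, l) in enumerate(zip(features, labels))
                     if j != i and l == lab]
            if train:
                centroids[lab] = sum(train) / len(train)
        predicted = min(centroids, key=lambda lab: abs(x - centroids[lab]))
        correct += predicted == true_label
    return correct / len(features)


def permutation_p_value(features, labels, n_shuffles=200, seed=0):
    """Compare observed accuracy with accuracies obtained from shuffled labels."""
    rng = random.Random(seed)
    observed = loo_nearest_centroid_accuracy(features, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break the link between activity and trial type
        if loo_nearest_centroid_accuracy(features, shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least_as_good + 1) / (n_shuffles + 1)


# Toy data: hypothetical population activity on "left" and "right" trials.
activity = [0.1 * k for k in range(10)] + [5.0 + 0.1 * k for k in range(10)]
trial_type = ["left"] * 10 + ["right"] * 10
accuracy, p_value = permutation_p_value(activity, trial_type)
print(f"accuracy = {accuracy:.2f}, permutation p = {p_value:.4f}")
```

      The shuffled-label accuracies form an empirical null distribution, so the reported p-value is the fraction of shuffles that decode at least as well as the real labels.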

      Measuring Beta adrenoceptors is a great idea, and the results are interesting, especially the difference between neuron types. But again, how does that fit with neurophysiological results? Note, that since this is RNA measures, it should not be phrased as 'receptors' but 'receptors RNA' throughout. One possible interpretation of these anatomical results that cannot be reconciled with physiology is that protein expression at the membrane shows a distinct pattern.

      We have changed the references to β receptor expression to β receptor mRNA expression throughout the manuscript. Although mRNA provides a valuable proxy for adrenoreceptor production, as noted by the reviewer, protein expression at the membrane may differ. Reliable antibodies that allow quantitative analysis of membrane-bound adrenoreceptors in situ with co-labeling of specific cell types are limited. The goal of assessing mRNA expression within M2 was to determine if the functional sex differences we identified in M2 neurophysiology when manipulating β adrenoreceptor function could be mediated by basal differences in adrenoreceptors. The causal impact of differential mRNA expression in anterior M2 was not directly tested, but our findings provide preliminary evidence that adrenoreceptor regulation may differ across sexes. Our results provide a plausible avenue for differential sensitivity to β adrenoreceptor manipulation across sexes that may also be found in other brain regions.

      In conclusion, I think that this is a very interesting study and that the results are potentially relevant for a wide audience. But the paper would clearly benefit from revisions. If the authors could clearly identify a significant relationship between the action of NA on beta receptors on specific cortical neurons, at a physiological and behavioral level, that would be a seminal study. At the moment, the evidence is not convincing enough but the data suggest that it is the case.

      We thank the reviewer for the kind remarks. We have undertaken a number of new analyses, refined existing analysis and clarified our claims in the manuscript to improve rigor. Collectively our data reflect that the behavioral and neural deficits after systemic propranolol are likely due to both direct and indirect actions on M2. We believe this work is compelling and that it will inform future work investigating potential sex differences in central noradrenergic anatomy and functional sex differences after perturbations of noradrenergic signaling.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) What's the rationale for trypsinizing the tissue prior to mitochondrial isolation? This is not standard for subsequent proteomics analysis. This step will inevitably cause protein loss, especially for the post mitochondrial fractions (PMF). Treating samples with 0.01 µg/µL trypsin at 37 °C for 30 min is sufficient to partially digest a substantial portion of the proteome. If samples from different subjects were not of the same weight, then this partial digestion step may introduce artificial variability, as variable proportions of proteins from different subjects would be lost during this step. In addition, the mitochondrial protein enrichment in the mito fraction, despite being statistically significant, does not look striking (Figure 1E, ~30% mitochondrial proteins in the mito fraction). As a comparison, Williams et al., MCP 2018 seem to have obtained high mitochondrial protein content in the mito fraction without trypsinizing the frozen quadriceps using a similar SWATH-MS-based approach.

      Trypsinisation of the tissue prior to mitochondrial isolation is based on previous work and a Nature Protocol (1, 2) which isolated mitochondria from skeletal muscle. The rationale is that it aids mechanical homogenisation of highly fibrous tissues such as quadriceps muscle by digesting extracellular matrix proteins. The trypsin/protein ratio used to aid in this process is at least 400 times lower than the amount of trypsin used for formal proteomic tryptic digestion. Three pieces of evidence suggest this step has negligible effect on downstream proteomic analysis. First, because the trypsinisation buffer is detergent free, trypsin will only affect extracellular or exposed membrane proteins. Filtering our PMF dataset for proteins with ‘extracellular matrix’ gene ontology identifies at least 90 unique extracellular matrix proteins, indicating good retention of proteins susceptible to partial digestion. Second, the trypsin dose used is 50 times lower than the concentration used for passaging cultured cells, which retain viability after trypsinisation. Third, and contrary to the point raised by the reviewer, we observe less missingness in PMF samples compared to mitochondrial samples. We thank the reviewer for bringing the Williams et al. 2018 MCP paper to our attention. We note that mitochondrial enrichment between the two papers is comparable (~2-fold). To improve clarity line 408 now reads: “Whole quadriceps muscle samples were prepared as previously described with modification (99, 100). First, tissue was snap frozen with liquid nitrogen…” and line 95 reads: “Mitochondrial proteins were defined based on their presence in MitoCarta 3.0 (24) and consistent with previous work (25) were approximately two-fold enriched in the mitochondrial fraction relative to the PMF (Fig 1E).”

      (2) The authors mentioned that the proteomics data were Log2 transformed and median-normalized. Would it be possible to provide a bit more detail on this? Were the subjects randomized?

      Samples were randomised prior to sample processing and mass spectrometry analysis. Because of possible variation in total protein content, it is critical to normalise protein intensities between samples. Median normalisation adjusts the samples so that they have the same median, thereby accounting for technical variation. Log2 transformation helps to achieve normal distributions, critical for many downstream statistical tests. Line 471 now reads: “…to achieve normal distributions and account for technical variation in total protein.”
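
      To make the two steps concrete, here is a minimal sketch of a log2 transform followed by median normalisation. It assumes, without confirmation from the manuscript, that each sample's median is subtracted in log space and the cross-sample median is added back; the function name and intensity values are hypothetical.

```python
import math
from statistics import median


def log2_median_normalise(samples):
    """Log2-transform intensities, then align every sample to a common median.

    `samples` maps sample name -> {protein: raw intensity (> 0)}.
    """
    logged = {
        name: {protein: math.log2(value) for protein, value in proteins.items()}
        for name, proteins in samples.items()
    }
    sample_medians = {name: median(vals.values()) for name, vals in logged.items()}
    # Shift each sample so that its median equals the median of sample medians.
    target = median(sample_medians.values())
    return {
        name: {protein: value - sample_medians[name] + target
               for protein, value in vals.items()}
        for name, vals in logged.items()
    }


# Made-up intensities: sample "b" has twice the signal of "a" for every protein,
# as might happen if twice as much total protein had been loaded.
raw = {
    "a": {"P1": 2.0, "P2": 4.0, "P3": 8.0},
    "b": {"P1": 4.0, "P2": 8.0, "P3": 16.0},
}
normalised = log2_median_normalise(raw)
print(normalised)
```

      In this toy case the two samples become identical after normalisation, because their only difference was a global scaling factor, which is the kind of technical variation the step is meant to remove.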

      (3) In Figure 1D, what were the numbers of mice the authors used for the CV comparisons in each group? Were they of similar age and sex? Were the differences in CV values statistically significant?

      The mitochondrial and PMF proteomes originated from the same quadriceps sample from the same mouse, and thus the age and sex are the same across both proteomes. After quality control, we had mitochondrial proteomes for 194 mice and PMF proteomes for 215 mice. The overall CV in the mitochondrial fraction was significantly greater than in the PMF, however whether the source of this variation is biological, or the result of mitochondrial isolation is unclear and as such we have avoided making a statement within the body of the manuscript. We have now more clearly described the nature of the samples in the revised manuscript and added sample sizes to figure 1F.
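
      For readers unfamiliar with the metric, the coefficient of variation (CV) of a protein is its standard deviation across mice divided by its mean. The sketch below is illustrative only: the response does not state whether CVs were computed on raw or log-transformed intensities, and the function names and abundance values are hypothetical.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def median_cv(protein_table):
    """Median CV over proteins; `protein_table` maps protein -> abundances across mice."""
    return median(coefficient_of_variation(v) for v in protein_table.values())


# Made-up abundances for two proteins measured in three mice per fraction.
mito_fraction = {"ProtA": [8.0, 10.0, 12.0], "ProtB": [5.0, 10.0, 15.0]}
pmf_fraction = {"ProtA": [9.0, 10.0, 11.0], "ProtB": [9.5, 10.0, 10.5]}
print("median CV, mito:", median_cv(mito_fraction))
print("median CV, PMF:", median_cv(pmf_fraction))
```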

      (4) The authors stated in lines 155-157 that proteins negatively associated with the Matsuda index were further filtered by presence of their cis-pQTLs. Perhaps more explanations would be needed to justify this filtering criterion? Having a cis-pQTL would mean the protein abundance variation is explained by the variation in its coding gene, this however conceptually would not be relevant to its association with the Matsuda index. With the data that the authors have in hand, would it not be natural to align the Matsuda index QTL with the pQTLs (cis and trans if available), and/or to perform mediation analysis to examine causal relationships with statistical significance?

      The rationale for filtering by cis-pQTL was not to study the genetics of either Matsuda or associated proteins but rather to identify proteins that were more likely to be causally associated with Matsuda Index as opposed to adaptively associated. To clarify this line 165 now reads: “Filtering based on cis-pQTL presence was based on the rationale that if genetic variation can explain protein abundance differences between mice, then we can be confident that phenotype (Matsuda Index) is not driving the observed differences and therefore the protein-phenotype associations are likely causal. Importantly, this assumption can only be made for cis-acting pQTLs.” Previous work by Matthew et al. (see https://qtlviewer.jax.org/) has demonstrated that cis-pQTL have markedly higher LOD scores than trans-pQTLs, and our own unpublished work suggests that trans-pQTLs do not reproduce well between datasets. The reviewer rightfully suggests aligning protein QTL with those for Matsuda. This is our long-term goal but to identify genome wide significant peaks associated with altered Matsuda will require many more mice than studied here.

      (5) It seems a bit odd that the first half of the paper focused extensively on the authors' discoveries in the mitochondrial proteome, and how proteins involved in mitochondrial processes (such as complex I) were associated with Matsuda Index, but the final fingerprint list of insulin resistance, which contained 76 proteins, only had 7 mitochondrial proteins. Was this because many mitochondrial proteins were filtered out due to no cis-pQTL presenting?

      There are three reasons our fingerprint is lacking mitochondrial proteins: 1) there are more non-mitochondrial than mitochondrial proteins in the muscle proteome; 2) we focussed on negatively associated proteins, and as demonstrated in figure 2c, the mitochondrial proteome is enriched for positively associated proteins; 3) as implied by the reviewer, we filtered for pQTL presence, further reducing the number of mitochondrial proteins in our fingerprint. To improve clarity, line 170 now reads: “Low mitochondrial representation in the fingerprint is the result of selecting negatively associating proteins, and as seen (Figure 2C) previously, the mitochondrial proteome is enriched for positive contributors to insulin resistance.”

      (6) The authors found that thiostrepton-induced insulin resistance reversal effects were not through insulin signalling. It activated glycolysis but the mechanism of action was not clear. What are the proteins in the fingerprint list that led to identification of thiostrepton on CMAP?

      Is thiostrepton able to bind or change the expression of these proteins? Since thiostrepton was identified by searching the insulin resistance fingerprint protein list against CMAP, it would be rational to think that it exerts the biological effects by directly or indirectly acting on these protein targets.

      This is indeed the implication of our data. Because of the timescales involved it is unlikely that thiostrepton is changing fingerprint protein levels, but it could be binding to and inhibiting them. Searching the CMAP thiostrepton signature reveals ARHGDIB and NAGK as the fingerprint proteins with the most positive and negative fold-changes respectively, perhaps suggesting they play a role in thiostrepton’s mechanism of action. Experiments are underway to test this hypothesis; however, these are beyond the scope of the current paper.

      Reviewer #2 (Public Review):

      Line 105: The observation that variance in respiratory proteins is stable while lipid pathways is variable is quite interesting. Is this due to lower overall levels of lipid metabolism enzymes (ex. do these differ substantially from similar pathways ranked from high-low abundance?).

      The relationship between coefficient of variation (CV) and relative abundance of proteins is important to consider. To address this, we have now also performed GSEA on proteins ranked from high to low relative abundance. These comparisons have been added to supplementary figure 1 and line 110 now reads: “As a control experiment, we also performed enrichment analysis on proteins ranked by LFQ relative abundance. High CV pathways (enriched for high CV proteins) tended to be lower in relative abundance (enriched for low relative abundance proteins) (Supplementary Fig 1a, b). However, many high variability pathways, lipid metabolism for example, were not enriched in either direction based on relative abundance suggesting differences in relative abundance do not fully explain pathway variability differences.”

      Line 154: the 664 associations are impressive and potentially informative. It would be valuable to know which of these co-map to the same locus, either to distinguish linkage in a 2 Mb window or to identify any cis-proteins which directly exert effects in trans.

      To assess this, we have analysed pQTL position relative to gene position to generate a ‘hotspot’ plot. We have also generated a histogram of this pQTL density (in a 2 Mbp window) and added these figures to figure 3. We did not detect any obvious pQTL hotspots, and the distribution of pQTLs across the genome appears fairly uniform. Line 159 now reads: “These were distributed across the genome and were predominately cis acting (Figure 3A)...”
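
      The two summaries mentioned here (cis versus trans assignment and pQTL density per genomic window) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the ±2 Mbp cis threshold is assumed for the example (the response only specifies a 2 Mbp window for the density histogram), and the function names and coordinates are hypothetical.

```python
from collections import Counter

WINDOW_BP = 2_000_000  # 2 Mbp, the window size mentioned in the response


def classify_pqtl(gene_chrom, gene_pos, qtl_chrom, qtl_pos, window_bp=WINDOW_BP):
    """Label a pQTL 'cis' if its peak lies near the encoding gene, else 'trans'.

    The distance threshold is an assumption made for this sketch.
    """
    if gene_chrom == qtl_chrom and abs(gene_pos - qtl_pos) <= window_bp:
        return "cis"
    return "trans"


def pqtl_density(pqtl_peaks, window_bp=WINDOW_BP):
    """Count pQTL peaks per (chromosome, window index) bin."""
    return Counter((chrom, pos // window_bp) for chrom, pos in pqtl_peaks)


# Made-up peaks given as (chromosome, base-pair position).
peaks = [("1", 500_000), ("1", 1_900_000), ("1", 2_100_000), ("2", 100)]
density = pqtl_density(peaks)
print(density)
print(classify_pqtl("1", 10_000_000, "1", 11_500_000))
```

      A hotspot would appear as a bin whose count is far above the others; a roughly flat set of counts corresponds to the uniform distribution described above.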

      Line 194: Cross-platform validation of the CMAP fingerprint results is an admirable set of validations. It might be good to know general parameters like how many compounds were shared/unique for each platform. Also the concordance between ranking scores for significant and shared compounds.

      The Connectivity Map (CMap) query included 5163 compounds, the Prestwick library included 1120, and the overlap was 420. We have added these comparisons to supplementary figure 2. Supplementary figure 2 now also contains a comparison of CMap scores between overlapping compounds (found in CMap and the Prestwick library) against all significant compounds identified by CMap (supplementary figure 2b). Interestingly, compounds present in both platforms scored higher on average, suggesting the Prestwick library captures a significant proportion of highly scoring CMap candidates. Line 206 now reads: “In total, 420 compounds were found across both platforms, and these consensus compounds captured a significant proportion of highly scoring CMap compounds (Supplementary Figure 2A, B).”

      Line 319: Another consideration in the molecular fingerprint is how unique these are for muscle. While studies evaluating gene expression have shown that many cis-eQTLs are shared across tissues, to my knowledge, this hasn't been performed systematically for pQTLs. Therefore, consider adding a point to the discussion pointing out that some of the proteins might be conserved pQTLs whereas others which would be more relevant here present unique druggable targets in muscle.

      To examine tissue specificity, we determined whether our skeletal muscle fingerprint proteins were detected and contained a pQTL in two metabolically important tissues, liver and adipose. Despite detecting almost all the fingerprint proteins in both adipose and liver tissue, they were depleted for pQTL compared to skeletal muscle. These data have now been added to figure 3c. Line 172 now reads: “To assess the tissue specificity of our fingerprint we searched for the same proteins in metabolically important adipose and liver tissues. Despite detecting 94% and 82% of muscle fingerprint proteins across each tissue respectively, both adipose and liver were depleted for pQTL presence (Figure 3C) suggesting that regulation of our fingerprint protein abundance is specific to skeletal muscle.”

      Line 332: These are fascinating observations: first, that in general insulin signaling and AMPK were not themselves shown as top-ranked enrichments with Matsuda, and second, that this was sufficient to alter glucose metabolism without changes in these pathways. While further characterization of this signaling mechanism is beyond the scope of this study, it would be good to speculate as to additional signaling pathways that are relevant beyond ROS (e.g., CNYP2 and others).

      We have now added further discussion to the manuscript to address this point. Line 347 now reads: “Aside from glycolysis, other pathways may be involved in enhancing insulin sensitivity. For example, the negatively associated protein ARHGDIA (Figure 2F) is a potent negative regulator of insulin sensitivity, and our fingerprint of insulin resistance contained its homologue ARHGDIB. Both ARHGDIA and ARHGDIB have been reported to inhibit the insulin action regulator RAC1 thus lowering GLUT4 translocation and glucose uptake. Further investigations may uncover a role for thiostrepton in modulating the RAC1 signalling pathway via ARHGDIB.”

      Line 314: Remove the statement: "While this approach is less powerful than QTL co-localisation for identifying causal drivers,", as I don't believe that this has been demonstrated. Clearly, the authors provide a sufficient framework to pinpoint causality and produce an actionable set of proteins.

      We have edited line 314, which now reads: “Moreover, our approach has the major advantage that it requires far fewer mice to obtain meaningful outcomes (222 mice in this study) compared to that required for genetic mapping of complex traits like Matsuda Index.”

      Line 346: I would highlight one more appeal of the approach adopted by the authors. Given that these compound libraries were prioritized from patterns of diverse genetics, these observations are inherently more likely to operate robustly across target backgrounds.

      This point is further supported by our thiostrepton results in both C57BL/6J and BXH9 mice. Line 317 now reads: “Furthermore, because we have used genetically diverse datasets (DOz mice and multiple cell lines in Connectivity Map) our findings are likely robust across diverse target backgrounds.”

      Line 434: I might have missed but can't seem to find where the muscle data are available to researchers. Given the importance and novelty of these studies, it will be important to provide some way to access the proteomic data.

      These data are now available via the ProteomeXchange Consortium. Line 465 now reads: “The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (104) partner repository with the dataset identifier PXD042277.”

      1. Frezza C, Cipolat S, Scorrano L. Organelle isolation: functional mitochondria from mouse liver, muscle and cultured fibroblasts. Nat Protoc. 2007;2(2):287-95.

      2. Acin-Perez R, Benador IY, Petcherski A, Veliova M, Benavides GA, Lagarrigue S, et al. A novel approach to measure mitochondrial respiration in frozen biological samples. The EMBO Journal. 2020;39(13):e104073.

      3. Chick JM, Munger SC, Simecek P, Huttlin EL, Choi K, Gatti DM, et al. Defining the consequences of genetic variation on a proteome-wide scale. Nature. 2016;534(7608):500- 5.

      4. Gatti DM, Svenson KL, Shabalin A, Wu L-Y, Valdar W, Simecek P, et al. Quantitative Trait Locus Mapping Methods for Diversity Outbred Mice. G3 Genes|Genomes|Genetics. 2014;4(9):1623-33.

    1. Reviewer #2 (Public Review):

      Accumulating data suggests that the presence of immune cell infiltrates in the meninges of the multiple sclerosis brain contributes to the tissue damage in the underlying cortical grey matter by the release of inflammatory and cytotoxic factors that diffuse into the brain parenchyma. However, little is known about the identity and direct and indirect effects of these mediators at a molecular level. This study addresses the vital link between an adaptive immune response in the CSF space and the molecular mechanisms of tissue damage that drive clinical progression. In this short report the authors use a spatial transcriptomics approach using Visium Gene Expression technology from 10x Genomics, to identify gene expression signatures in the meninges and the underlying brain parenchyma, and their interrelationship, in the PLP-induced EAE model of MS in the SJL mouse. MRI imaging using a high field strength (11.7T) scanner was used to identify areas of meningeal infiltration for further study. They report, as might be expected, the upregulation of genes associated with the complement cascade, immune cell infiltration, antigen presentation, and astrocyte activation. Pathway analysis revealed the presence of TNF, JAK-STAT and NFkB signaling, amongst others, close to sites of meningeal inflammation in the EAE animals, although the spatial resolution is insufficient to indicate whether this is in the meninges, grey matter, or both.

      UMAP clustering illuminated a major distinct cluster of upregulated genes in the meninges and smaller clusters associated with the grey matter parenchyma underlying the infiltrates. The meningeal cluster contained genes associated with immune cell functions and interactions, cytokine production, and action. The parenchymal clusters included genes and pathways related to glial activation, but also adaptive/B-cell mediated immunity and antigen presentation. This again suggests a technical inability to resolve fully between the compartments as immune cells do not penetrate the pial surface in this model or in MS. Finally, a trajectory analysis based on distance from the meningeal gene cluster successfully demonstrated descending and ascending gradients of gene expression, in particular a decline in pathway enrichment for immune processes with distance from the meninges.

      Although these results confirm what we already know about processes involved in the meninges in MS and its models and gradients of pathology in sub-pial regions, this is the first to use spatial transcriptomics to demonstrate such gradients at a molecular level in an animal model that demonstrates lymphoid like tissue development in the meninges and associated grey matter pathology. The mouse EAE model being used here does reproduce many, although not all, of the pathological features of MS and the ability to look at longer time points has been exploited well. However, this particular spatial transcriptomics technique cannot resolve at a cellular level and therefore there is a lot of overlap between gene expression signatures in the meninges and the underlying grey matter parenchyma.

      The short nature of this report means that the results are presented and discussed in a vague way, without enough molecular detail to reveal much information about molecular pathogenetic mechanisms.

      The trajectory analysis is a good way to explore gradients within the tissues and the authors are to be applauded for using this approach. However, the trajectory analysis does not tell us much if you only choose 2 genes that you think might be involved in the pathogenetic processes going on in the grey matter. It might be more useful to choose some genes involved in pathogenetic processes that we already know are involved in the tissue damage in the underlying grey matter in MS, for which there is already a lot of literature, or genes that respond to molecules we know are increased in MS CSF, although the animal models may be very different. Why were C3 and B2m chosen here?

      Strengths:

      - The mouse model does exhibit many of the features of the compartmentalized immune response seen in MS, including the presence of meningeal immune cell infiltrates in the central sulcus and over the surface of the cortex, with the presence of FDCs, HEVs, PNAd+ vessels and CXCL13 expression, indicating the formation of lymphoid-like cell aggregates. In addition, disruption of the glia limitans is seen, as in MS. Increased microglial reactivity is also present at the pial surface.
      - Spatial transcriptomics is the best approach to studying gradients in gene expression in both white matter and grey matter and their relationship between compartments.
      - It would be useful to have more discussion of how the upregulated pathways in the two compartments fit with what we know about the cellular changes occurring in both, for which presumably there is prior information from the group's previous publications.

      Limitations:

      - EAE in the mouse is not MS and may be far removed when one considers molecular mechanisms, especially as MS is not a simple anti-myelin protein autoimmune condition. Therefore, this study could be following gene trajectories that do not exist in MS. This needs a significant amount of discussion in the manuscript if the authors suggest that it is mimicking MS.
      - The model does not have the cortical subpial demyelination typical of MS and it is unknown whether neuronal loss occurs in this model, which is the main feature of cytokine-mediated neurodegeneration in MS. If it does not then a whole set of genes will be missing that are involved in the neuronal response to inflammatory stimuli that may be cytotoxic.
      - Visium technology does not get down to single cell level and does not appear to allow resolution of the border between the meninges and the underlying grey matter.
      - Neuronal loss in the MS cortex is independent of demyelination and therefore not related to remyelination failure. There does not appear to be any cortical grey matter demyelination in these animals, so it is difficult to relate any of the gene changes seen here to demyelination.
      - No mention of how the ascending and descending patterns of gene expression may be due to the gradient of microglial activation that underlies meningeal inflammation, which is a big omission.

    2. Author Response:

      We thank Reviewer #1 for their positive assessment of our work.

      Reviewer #2 (Public Review):

      […] Although these results confirm what we already know about processes involved in the meninges in MS and its models and gradients of pathology in sub-pial regions, this is the first to use spatial transcriptomics to demonstrate such gradients at a molecular level in an animal model that demonstrates lymphoid like tissue development in the meninges and associated grey matter pathology. The mouse EAE model being used here does reproduce many, although not all, of the pathological features of MS and the ability to look at longer time points has been exploited well. However, this particular spatial transcriptomics technique cannot resolve at a cellular level and therefore there is a lot of overlap between gene expression signatures in the meninges and the underlying grey matter parenchyma.

      We appreciate the reviewer’s concise summary and comments on our manuscript. We agree that the Visium spatial sequencing technology we applied is limited in its resolution and cannot precisely distinguish individual cells or anatomic regions. For that reason, there is undoubtedly some overlap between gene expression signatures in the meninges and underlying parenchyma, particularly in spots on the borders of the meningeal inflammation clusters. However, we believe that the majority of meningeal inflammation (“cluster 11”) spots are indeed in the meninges and represent the spatial transcriptome of that niche. To support this, in the revised manuscript we will provide H&E images with the UMAP clusters overlaid to demonstrate the anatomic borders that correlate with the clusters.

      The short nature of this report means that the results are presented and discussed in a vague way, without enough molecular detail to reveal much information about molecular pathogenetic mechanisms.

      We thank the reviewer for this comment. The goal of this work is to transcriptomically characterize the spatial relationship between areas of meningeal inflammation and the underlying parenchyma. While we agree that mechanistic studies are needed to further evaluate the role of presented signaling pathways, those experiments are beyond the scope of this brief report.

      The trajectory analysis is a good way to explore gradients within the tissues and the authors are to be applauded for using this approach. However, the trajectory analysis does not tell us much if you only choose 2 genes that you think might be involved in the pathogenetic processes going on in the grey matter. It might be more useful to choose some genes involved in pathogenetic processes that we already know are involved in the tissue damage in the underlying grey matter in MS, for which there is already a lot of literature, or genes that respond to molecules we know are increased in MS CSF, although the animal models may be very different. Why were C3 and B2m chosen here?

      We appreciate the reviewer’s points here. C3 and B2m were chosen as examples of genes that have differential fit to the gradient descending pattern to assist the reader in interpreting subsequent gene set trajectory analysis. However, we agree that there are many other genes of interest and will expand the number of genes displayed in our revised manuscript. 

      Strengths:
      - The mouse model does exhibit many of the features of the compartmentalized immune response seen in MS, including the presence of meningeal immune cell infiltrates in the central sulcus and over the surface of the cortex, with the presence of FDC's HEVs PNAd+ vessels and CXCL13 expression, indicating the formation of lymphoid like cell aggregates. In addition, disruption of the glia limitans is seen, as in MS. Increased microglial reactivity is also present at the pial surface.
      - Spatial transcriptomics is the best approach to studying gradients in gene expression in both white matter and grey matter and their relationship between compartments.
      - It would be useful to have more discussion of how the upregulated pathways in the two compartments fit with what we know about the cellular changes occurring in both, for which presumably there is prior information from the group's previous publications.

      Limitations:
      - EAE in the mouse is not MS and may be far removed when one considers molecular mechanisms, especially as MS is not a simple anti-myelin protein autoimmune condition. Therefore, this study could be following gene trajectories that do not exist in MS. This needs a significant amount of discussion in the manuscript if the authors suggest that it is mimicking MS.
      - The model does not have the cortical subpial demyelination typical of MS and it is unknown whether neuronal loss occurs in this model, which is the main feature of cytokine-mediated neurodegeneration in MS. If it does not then a whole set of genes will be missing that are involved in the neuronal response to inflammatory stimuli that may be cytotoxic.
      - Visium technology does not get down to single cell level and does not appear to allow resolution of the border between the meninges and the underlying grey matter.
      - Neuronal loss in the MS cortex is independent of demyelination and therefore not related to remyelination failure. There does not appear to be any cortical grey matter demyelination in these animals, so it is difficult to relate any of the gene changes seen here to demyelination.
      - No mention of how the ascending and descending patterns of gene expression may be due to the gradient of microglial activation that underlies meningeal inflammation, which is a big omission.

      We thank the reviewer for their insightful comments on the strengths and limitations of our study. The SJL EAE model we use in this paper is certainly not a perfect model of meningeal inflammation in MS (indeed, we believe that no such animal model exists), but it does recapitulate several key features of human disease, as described by the reviewer. Spatial transcriptomics of cortical grey matter lesions and overlying meninges of samples derived from patients with MS would be ideal, though access to this tissue is highly limited. In the revised manuscript we will include a more detailed discussion of the limitations in applying these findings to MS. However, in addition to potential implications for MS research, our data contribute more generally to the understanding of meningeal inflammation and the penetrance of inflammation into brain tissue.

      We acknowledge that sub-pial neuronal loss has not been assessed in SJL EAE, and if present it would increase the relevance of this model to neurodegeneration. We are currently working to assess this.

      We agree with the reviewer that Visium technology is limited in its ability to discriminate individual cells, as discussed above (2.2).

      We agree that gene expression by activated microglia is likely a major driver of the transcriptomic changes observed in the parenchyma, and thank the reviewer for highlighting this. We will add discussion of this to our revised manuscript, and intend to generate additional data regarding the contribution of subpial microglial activation to the measured transcriptomic changes.

      Finally, we thank Reviewer #3 for their assessment of our work.

    1. Author Response

      eLife assessment:

      Trypanosoma brucei evades mammalian humoral immunity through the expression of different variant surface glycoprotein genes. In this fundamental paper, the authors extend previous observations that TbRAP1 both interacts with PIP5pase and binds PI(3,4,5)P3, indicating a role for PI(3,4,5)P3 binding and suggesting that antigen switching is signal dependent. While much of the evidence is compelling, one reviewer suggested that the work would benefit from further controls.

      We appreciate the evaluation of the work and agree that the findings substantially advance our understanding of antigenic variation. A detailed response to the public review is included below, which addresses and clarifies the issues raised by the reviewers, including those concerning controls. We also want to highlight the comment by Reviewer #3: “The methods used in the study are rigorous and well-controlled…. their results support the conclusions made in the manuscript.” We hope this and our comments will help address the issue of controls in this eLife statement.

      Reviewer #1 (Public Review):

      Trypanosoma brucei undergoes antigenic variation to evade the mammalian host’s immune response. To achieve this, T. brucei regularly expresses different VSGs as its major surface antigen. VSG expression sites are exclusively subtelomeric, and VSG transcription by RNA polymerase I is strictly monoallelic. It has been shown that T. brucei RAP1, a telomeric protein, and the phosphoinositol pathway are essential for VSG monoallelic expression. In previous studies, Cestari et al. (ref. 24) have shown that PIP5pase interacts with RAP1 and that RAP1 binds PI(3,4,5)P3. RNAseq and ChIPseq analyses have been performed previously in PIP5pase conditional knockout cells, too (ref. 24). In the current study, Touray et al. did similar analyses except that catalytic dead PIP5pase mutant was used and the DNA and PI(3,4,5)P3 binding activities of RAP1 fragments were examined. Specifically, the authors examined the transcriptome profile and did RAP1 ChIPseq in PIP5pase catalytic dead mutant. The authors also expressed several C-terminal His6-tagged RAP1 recombinant proteins (full-length, aa1-300, aa301-560, and aa 561-855). These fragments’ DNA binding activities were examined by EMSA analysis and their phosphoinositides binding activities were examined by affinity pulldown of biotin-conjugated phosphoinositides. As a result, the authors confirmed that VSG silencing (both BES-linked and MES-linked VSGs) depends on PIP5pase catalytic activity, but the overall knowledge improvement is incremental. The most convincing data come from the phosphoinositide binding assay as it clearly shows that N-terminus of RAP1 binds PI(3,4,5)P3 but not PI(4,5)P2, although this is only assayed in vitro, while the in vivo binding of full-length RAP1 to PI(3,4,5)P3 has been previously published by Cestari et al (ref. 24) already. 
Considering that many phosphoinositides exert their regulatory role by modulating the subcellular localization of their bound proteins, it is reasonable to hypothesize that binding to PI(3,4,5)P3 can remove RAP1 from the chromatin. However, no convincing data have been shown to support the author’s hypothesis that this regulation is through an “allosteric switch”. Therefore, the title should be revised.

      We appreciate the reviewer’s detailed evaluation of our work. There are a few general comments that we would like to clarify. We will break them into three points. All data included here are new and were not previously published.

      i) “RNAseq and ChIPseq analyses have been performed previously …(ref. 24).” Reference 24 is Cestari et al. 2019, Mol Cell Biol. Neither we nor others have published ChIP-seq of RAP1 in T. brucei. Previous work showed ChIP-qPCR, which analyses specific loci. The ChIP-seq shows genome-wide binding sites of RAP1, and new findings are shown here, including binding sites in the BES, MESs, and other genomic loci such as centromeres. We also identified DNA sequence bias defining RAP1 binding sites (Fig 2A). We also show by ChIP-seq how RAP1 binding to these loci changes upon expression of catalytically inactive PIP5Pase. As for the RNA-seq, this is also the first time we show RNA-seq of T. brucei expressing catalytically inactive PIP5Pase, which establishes that the regulation of VSG silencing and switching is dependent on PIP5Pase enzyme catalysis, i.e., PI(3,4,5)P3 dephosphorylation. To improve clarity in the manuscript, we edited page 4, line 122, as follows: “We showed that RAP1 binds telomeric or 70 bp repeats (24), but it is unknown if it binds to other ES sequences or genomic loci.”

      ii) “The in vivo binding of full-length RAP1 to PI(3,4,5)P3 has been previously published by Cestari et al. (ref. 24) already.” We published in reference 24 that RAP1-HA can bind synthetic PI(3,4,5)P3 conjugated to agarose beads. Here, we were able to measure T. brucei endogenous PI(3,4,5)P3 associated with RAP1-HA (Fig 4F). Moreover, we showed that the binding of RAP1-HA to endogenous PI(3,4,5)P3 is about 100-fold higher in cells expressing catalytically inactive PIP5Pase than in cells expressing WT PIP5Pase. The data establish that endogenous PI(3,4,5)P3 binds to RAP1-HA in vivo and show how the binding changes in cells expressing mutant PIP5Pase; these data are new and relevant to our conclusions.

      iii) “no convincing data have been shown to support the author’s hypothesis that this regulation is through an “allosteric switch””. We show here in vitro and in vivo data supporting the conclusion. We show that PI(3,4,5)P3 binds to the N-terminus of rRAP1-His with a calculated Kd of about 20 µM (Fig 4B-E, Table 1). In contrast, we show by EMSA and binding kinetics by microscale thermophoresis that rRAP1-His binds to 70 bp and telomeric repeats via protein regions encompassing the Myb (central) or Myb-L (C-terminal) domains but not the N-terminus containing the VHP domain (Fig 3C-G, and Fig S5). Using microscale thermophoresis, we also show that rRAP1-His binds to 70 bp and telomeric repeats with Kd of 10 and 24 nM, respectively (Fig 3 and Table 1). Notably, we show that 30 µM of PI(3,4,5)P3, but not PI(4,5)P2 (used as a control), disrupts rRAP1-His binding to 70 bp and telomeric repeats, changing the Kds to about 188 and 155 nM, respectively (Fig 5A-C). We also show that PI(3,4,5)P3 does not disrupt the DNA binding of rRAP1-His fragments (Myb or Myb-L) that lack the N-terminus domain (Fig S5), implying that binding of PI(3,4,5)P3 to the RAP1 N-terminus is required for displacement of the RAP1 DNA binding domains (Myb and Myb-L) from telomeric and 70 bp repeats, and that PI(3,4,5)P3 is not competing with Myb or Myb-L for binding to DNA. Moreover, we show that RAP1-HA binding to 70 bp and telomeric repeats in vivo is displaced in T. brucei cells expressing catalytically inactive PIP5Pase (Fig 5D-G), which we show results in RAP1-HA binding about 100-fold more endogenous PI(3,4,5)P3 than in T. brucei expressing WT PIP5Pase (Fig 4F). The in vivo data agree with the in vitro data. The data show a typical allosteric regulatory system, in which binding of a ligand to one site of the protein, here PI(3,4,5)P3 binding to the RAP1 N-terminus, affects the binding of other domains (RAP1 Myb and Myb-L domains) to DNA. 
To improve the clarity of the title, we will change it in the revised version to imply a direct role of PI(3,4,5)P3 regulation of RAP1 in the process. This will provide more specific information to the readers and address the concern of the reviewer related to the “allosteric switch”. The new title will be: PI(3,4,5)P3 allosteric regulation of RAP1 controls antigenic switching in trypanosomes
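      To put the Kd values quoted in point iii) in perspective, the following is a minimal sketch that computes the fold-change in Kd and the expected fraction of DNA bound under a simple 1:1 binding model with protein in excess. The Kd values are those stated in the response; the binding model, the function name `fraction_bound`, and the 50 nM protein concentration are assumptions chosen for illustration only and are not taken from the manuscript.

```python
# Kd values (nM) quoted in the response: (without, with 30 µM PI(3,4,5)P3).
KD_NM = {
    "70 bp repeats": (10.0, 188.0),
    "telomeric repeats": (24.0, 155.0),
}


def fraction_bound(protein_nm: float, kd_nm: float) -> float:
    """Fraction of DNA bound in a simple 1:1 model with protein in excess."""
    return protein_nm / (protein_nm + kd_nm)


# Hypothetical free rRAP1 concentration, for illustration only.
PROTEIN_NM = 50.0

for probe, (kd_free, kd_pip3) in KD_NM.items():
    fold = kd_pip3 / kd_free
    before = fraction_bound(PROTEIN_NM, kd_free)
    after = fraction_bound(PROTEIN_NM, kd_pip3)
    print(f"{probe}: Kd up {fold:.1f}-fold, fraction bound {before:.2f} -> {after:.2f}")
```

      Under these assumptions, the shift in Kd corresponds to a substantial drop in DNA occupancy at a fixed protein concentration, which is the direction of effect the response describes; the exact fractions depend entirely on the assumed concentration.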

      There are serious concerns about many conclusions made by Touray et al., according to their experimental approaches:

      1) The authors have been studying RAP1’s chromatin association pattern by ChIPseq in cells expressing a C-terminal HA tagged RAP1. According to data from tryptag.org, RAP1 with an N-terminal or a C-terminal tag does not seem to have identical subcellular localization patterns, suggesting that adding tags at different positions of RAP1 may affect its function. It is therefore essential to validate that the C-terminally HA-tagged RAP1 still has its essential functions. However, this data is not available in the current study. RAP1 is essential. If RAP1-HA still retains its essential functions, cells carrying one RAP1-HA allele and one deleted allele are expected to grow the same as WT cells. In addition, these cells should have the WT VSG expression pattern, and RAP1-HA should still interact with TRF. Without these validations, it is impossible to judge whether the ChIPseq data obtained on RAP1-HA reflect the true chromatin association profile of RAP1.

      Tryptag data show both N- and C-terminally tagged RAP1 with nuclear localization in procyclic forms, although there are differences in signal intensities in the images (http://tryptag.org/?id=Tb927.11.370). It is important to note that Tryptag data are from procyclic forms, and DNA constructs are not validated for their integration in the correct locus. As for the RAP1-HA localization in bloodstream forms, we demonstrated that C-terminally HA-tagged RAP1 co-localizes with telomeres by a combination of immunofluorescence and fluorescence in situ hybridization (Cestari and Stuart, 2015, PNAS), and RAP1-HA co-immunoprecipitates telomeric and 70 bp repeats (Cestari et al. 2019 Mol Cell Biol). We also showed by immunoprecipitation and mass spectrometry that HA-tagged RAP1 interacts with nuclear and telomeric proteins, including PIP5Pase (Cestari et al. 2019). Others have also tagged T. brucei RAP1 in bloodstream forms with HA without disrupting its nuclear localization (Yang et al. 2009, Cell; Afrin et al. 2020, Science Advances). As for the experiment suggested by the reviewer, there is no guarantee that cells lacking one allele of RAP1 will behave as wildtype, i.e., normal growth and repression of VSG genes. Also, less than 90% of T. brucei TRF was reported to interact with RAP1 (Yang et al. 2009, Cell), which might be indirect via their binding to telomeric DNA repeats rather than direct protein-protein interactions.

      2) Touray et al. expressed and purified His6-tagged recombinant RAP1 fragments from E. coli and used these recombinant proteins for EMSA analysis: The His6 tag has been used for purifying various recombinant proteins. It is most likely that the His6 tag itself does not convey any DNA binding activities. However, using His6-tagged RAP1 fragments for EMSA analysis has a serious concern. It has been shown that His6-tagged human RAP1 protein can bind dsDNA, but hRAP1 without the His6 tag does not. It is possible that RAP1 proteins in combination with the His6 tag can exhibit certain unnatural DNA binding activities. To be rigorous, the authors need to remove the His6 tag from their recombinant proteins before the in vitro DNA binding analyses are performed. This is a standard procedure for many in vitro assays using recombinant proteins.

      We show in Fig 3C-G that His-tagged full-length rRAP1 does not bind to scrambled telomeric dsDNA sequences, which indicates that His-tagged rRAP1 does not bind unspecifically to DNA. Moreover, in Fig 3G, we show that His-tagged rRAP1(aa 1-300) also does not bind to 70 bp or telomeric repeats. In contrast, full-length His-tagged rRAP1, rRAP1(aa 301-560), or rRAP1(aa 561-855) bind to 70 bp or telomeric repeats (Fig 3C-G). Since all proteins were His-tagged, the His tag cannot be responsible for the DNA binding.

      As for the statement that human rRAP1-His has unspecific DNA binding properties, we could not find a reference for this statement, and we cannot make a comparison without knowing the details of the experiment. Biochemical assays can result in unspecific binding depending on binding/buffer conditions. Also, human and T. brucei RAP1 share only 15% amino acid identity; unspecific binding to DNA could be specific to human RAP1.

      3) It is unclear why Nanopore sequencing was used for RNAseq and ChIPseq experiments. The greatest benefit of Nanopore sequencing is that it can sequence long reads, which usually helps with mapping, particularly at genome loci with repetitive sequences. This seems beneficial for RAP1 ChIPseq analysis as RAP1 is expected to bind telomere repeats. However, for ChIPseq, the chromatin needs to be fragmented. Larger DNA fragments from ChIPseq experiments will decrease the accuracy of the final calculated binding sites. Therefore, ChIPseq experiments are not supposed to have long reads to start with, so Nanopore sequencing does not seem to bring any advantage. In addition, compared to Illumina sequencing, Nanopore sequencing usually yields smaller numbers of reads, and the sequencing accuracy rate is lower. The Nanopore sequencing accuracy may be a serious concern in the current study. All telomeres have the perfect TTAGGG repeats, all VSG genes have a very similar 3’ UTR, and all 70 bp repeats have very similar sequences. In fact, the active and silent ESs have 90% sequence identity. Are sequence reads accurately mapped to different ESs? How is the sequencing and mapping quality controlled? Furthermore, it is unclear whether the read depth for RNAseq is deep enough.

      The mean sequence length for the ChIP-seq was about 500 bp (see Table S3), which helps to align reads to ESs and distinguish the different ESs, and it is a reasonable size range to define RAP1 binding sites. Although sequencing depths are usually higher in Illumina than in nanopore (all depending on the amount of sequencing), most Illumina short reads map to multiple genomic sequences, making it difficult to distinguish ESs. This is particularly important for RAP1 because it binds to repeats such as 70 bp and telomeric repeats. Mapping short reads to those regions would be virtually impossible; hence, our choice of nanopore sequencing. For RNA-seq, the ~500 bp read length helps sequence alignment to the subtelomeric regions containing many VSG genes. The nanopore reads obtained here had an average Q-score of 12 (i.e., base call accuracy of 94%). Filtering reads with MAPQ ≥ 20 (99% probability of correct alignment) helped us to distinguish RAP1 binding to specific ESs, including silent vs active ES (ChIP-seq) or VSG sequences (RNA-seq). The details of the analysis and sequencing metrics (i.e., sequencing depth and read length) were described in the Methods section “Computational analysis of RNA-seq and ChIP-seq” and Table S3, respectively.
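      The two percentages quoted above follow from the standard Phred scaling, in which a score Q corresponds to an error probability of 10^(-Q/10). A minimal sketch of that conversion is shown below; the function name `phred_to_accuracy` is ours, and treating MAPQ as a Phred-scaled probability follows the SAM convention (individual aligners estimate it in their own ways, so the 99% figure is nominal).

```python
def phred_to_accuracy(q: float) -> float:
    """Convert a Phred-scaled score Q into the probability of being correct.

    Error probability is 10 ** (-Q / 10), so accuracy is one minus that.
    """
    return 1.0 - 10.0 ** (-q / 10.0)


# Mean base-call Q-score reported for the nanopore reads.
print(f"Q12    -> {phred_to_accuracy(12):.3f}")  # prints "Q12    -> 0.937"

# MAPQ threshold used to filter alignments.
print(f"MAPQ20 -> {phred_to_accuracy(20):.3f}")  # prints "MAPQ20 -> 0.990"
```

      A Q-score of 12 therefore gives about 93.7% per-base accuracy, which rounds to the 94% stated, and MAPQ 20 corresponds to a nominal 1% chance of a wrong alignment.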

      4) Many statements in the discussion section are speculations without any solid evidence. For example, lines 218 - 219 “likely due to RAP1 conformational changes”, no data have been shown to support this at all. In lines 224-226, the authors acknowledged that more experiments are necessary to validate their observations, so it is important for the authors to first validate their findings before they draw any solid conclusions. Importantly, RAP1 has been shown to help compact telomeric and subtelomeric chromatin a long time ago by Pandya et al. (2013. NAR 41:7673), who actually examined the chromatin structure by MNase digestion and FAIRE. The authors should acknowledge previous findings. In addition, the authors need to revise the discussion to clearly indicate what they “speculate” rather than make statements as if it is a solid conclusion.

      The statement “likely due to RAP1 conformational changes” in lines 218-219 (page 6) is part of the Discussion. We did not make a strong statement but discussed a possibility. We believe that it is beneficial to the reader to have the data discussed, and we do not feel this point is overly speculative.

      For lines 224-226 (page 6), the statement refers to the finding of RAP1 binding to centromeric regions by ChIP-seq, which is a new finding but not the focus of this work. Hence, future studies are necessary for this finding, and we believe it is appropriate in the Discussion to be upfront and highlight this point to the readers. However, for the RAP1 binding to telomeric ES sites, e.g., 70 bp repeats and telomeric repeats (the focus of this work), we validated the binding by EMSA and by performing binding kinetics using microscale thermophoresis.

      We did not include Pandya et al. 2013 NAR because the authors demonstrated RAP1 compaction of chromatin to occur in procyclic forms only. Pandya et al. stated in their abstract: “no significant chromatin structure changes were detected on depletion of TbRAP1 in BF cells”. Hence, the suggested reference is not relevant to the context of our conclusions in bloodstream forms. Nevertheless, we have reviewed the Discussion to avoid broad speculations in the revised version of the manuscript.

      There are also minor concerns:

      1) In the PIP5Pase conditional knockout system, the WT or mutant PIP5Pase with a V5 tag is constitutively expressed from the tubulin array. What’s the relative expression level of this allele and the endogenous PIP5Pase? Without a clear knowledge of the mutant expression level, it is hard to conclude whether the mutant has any dominant negative effects or whether the mutant phenotype is simply due to a lower than WT PIP5pase expression level.

      The relative mRNA levels of the exclusively expressed PIP5Pase Mut compared to the WT are available in Data S1 (RNA-seq). The Mut allele’s relative expression level is 0.85-fold that of the WT allele (both from tubulin loci). We also showed the WT and Mut PIP5Pase protein expression by Western blot (Cestari et al. 2019, Mol Cell Biol). Concerning PIP5Pase endogenous alleles, we compared RNA-seq read counts per million from the conditional null PIP5Pase cells exclusively expressing WT or the Mut PIP5Pase alleles (Data S1, this work) to our previous RNA-seq of the single-marker 427 strain (Cestari et al. 2019, Mol Cell Biol). We used the single-marker 427 strain because the conditional null cells were generated in this strain background. The PIP5Pase WT and Mut mRNAs expressed from tubulin loci are 1.6 and 1.3-fold the endogenous PIP5Pase levels in single-marker 427, respectively. We include a statement in the Methods, page 7, lines 265-268: “The WT or Mut PIP5Pase mRNAs exclusively expressed from tubulin loci are 1.6 and 1.3-fold the WT PIP5Pase mRNA levels expressed from endogenous alleles in the single marker 427 strain. The fold-changes were calculated from RNA-seq reads counts per million from this work (WT and Mut PIP5Pase, Data S1) and our previous RNA-seq from single marker 427 strain (24).”

      2) In EMSA analysis, what are the concentrations of the protein and the probe used in each reaction? The amount of protein used in the binding assay appears to be very high, and this can contribute to the observation that many complexes are stuck in the well. Better quality EMSA data need to be shown to support the authors’ claims.

      All concentrations were provided in the Methods section. See page 9, Electrophoretic mobility shift assays: “100 nM of annealed DNA were mixed with 1 μg of recombinant protein…”. For microscale thermophoresis, also see page 9, Microscale thermophoresis binding kinetics: “1 μM rRAP1 was diluted in 16 two-fold serial dilutions in 250 mM HEPES pH 7.4, 25 mM MgCl2, 500 mM NaCl, and 0.25% (v/v) NP-40 and incubated with 20 nM telomeric or 70 bp repeats…”. Note that two different biochemical approaches, EMSA and microscale thermophoresis, were used to assess rRAP1-His binding to DNA. Both show similar results (Fig 3 and 5, and Fig S5; microscale thermophoresis shows the binding kinetics, data available in Table 1). The EMSA images clearly show the binding of RAP1 to 70 bp or telomeric repeats but not to scrambled telomeric repeat DNA.

      Reviewer #2 (Public Review):

      This manuscript by Touray, et al. provides a significant new twist to our understanding of how antigenic variation may be regulated in T. brucei. Key aspects of antigenic variation are the mutually exclusive expression of a single antigen per cell and the periodic switching from expression of one antigen isoform to another. In this manuscript, the authors show, as they have previously shown, that depletion of the nuclear phosphatidylinositol 5-phosphatase (PIP5Pase) results in a loss of mutually exclusive VSG expression. Furthermore, using ChIP-seq, the authors show that the repressor/activator protein 1 (RAP1) binds to regions upstream and downstream of VSG genes located in transcriptionally repressed expression sites and that this binding is lost in the absence of a functional PIP5Pase. Importantly, the authors decided to further investigate this link between PIP5Pase and RAP1, a protein that has previously been implicated in antigenic variation in T. brucei, and found that inactivation of PIP5Pase results in the accumulation of PI(3,4,5)P3 bound to the RAP1 N-terminus and that this binding impairs the ability of RAP1 to bind DNA. Based on these observations, the authors suggest that the levels of PI(3,4,5)P3 may determine the cellular function of RAP1, either by binding upstream of VSG genes and repressing their function, or by not binding DNA and allowing the simultaneous expression of multiple VSG genes in a single parasite.

      While I find most of the data presented in this manuscript compelling, there are aspects of Figure 1 that are not clear to me. Based on Figure 1F, the authors claim that transient inactivation of PIP5Pase results in a switch from the expression of one VSG isoform to another. However, I am not exactly sure what the authors are showing in this panel, nor do the data in Figure 1F seem to be consistent with those shown in Figure 1C. Based on Figure 1F, a transient inactivation of PIP5Pase appears to result in an almost exclusive switch to a VSG located in BES12. However, based on Figure 1E, the VSG transcripts most commonly found after a transient inactivation of PIP5Pase are those from the previously active VSG (BES1) and VSGs located on chr 1 and 6 (I believe). The small font and the low resolution make it impossible to infer the location of the expressed VSG genes, nor to confirm that ALL VSG genes located in expression sites are activated, as the authors claim. Also, I was not able to access the raw ChIP-seq and RNA-seq reads. Thus, could not evaluate the quality of the sequencing data.

      We appreciate the reviewer’s comments and evaluation of our work. Fig 1E shows VSG-seq of a population after transient (24h) exclusive expression of the PIP5Pase mutant, followed by re-expression of the WT PIP5Pase allele for 60 hours (multiple VSGs are detected). As a control, it also shows VSG-seq in cells continuously expressing WT PIP5Pase (mostly VSG2, BES1 is detected). Fig 1F and Fig S1 show the sequencing of VSGs expressed by clones isolated (5-6 days of growth) after a temporary knockdown (24h) of PIP5Pase (tet -), followed by its re-expression. For comparison, no knockdown (tet +) was included. Fig 1E shows potential switchers in the population; Fig 1F confirms VSG switching in clones.

      To clarify the difference between Fig 1E and 1F, we edited the manuscript on page 3, lines 103-110: “To verify PIP5Pase role in VSG switching, we knocked down PIP5Pase for 24h (Tet -), then restored its expression (Tet +) and isolated clones by limiting dilution and growth for 5-6 days. Analysis of isolated clones after temporary PIP5Pase knockdown (Tet -/+) confirmed VSG switching in 93 out of 94 (99%) of the analyzed clones (Fig 1F, Fig S1). The cells switched to express VSGs from silent ESs or subtelomeric regions, indicating switching by transcription or recombination mechanisms. Moreover, no switching was detected in 118 isolated clones from cells continuously expressing WT PIP5Pase (Tet +, Fig 1F).” We also edited Fig 1F to indicate temporary knockdown (Tet -/+) vs no knockdown (Tet +). The modifications will be available in the resubmitted version of the manuscript.

      We agree that the heat map is difficult to read due to the amount of information. We will include in the revised version of the manuscript a table with the data in the supplementary information; the reader will be able to evaluate the data in detail.

      A preference for switching to specific ESs has been observed in T. brucei (Morrison et al. 2005, Int J Parasitol; Cestari and Stuart, 2015, PNAS), which may explain several clones switching to BES12. Many potential switchers were detected in the VSG-seq (Fig 1E; the whole cell population is over 10^7 parasites), but not all potential switchers were detected in the clonal analysis because we analyzed 212 clones in total, a small fraction of the over 10^7 cells analyzed by VSG-seq (Fig 1E). Also, it is possible that not all potential switchers are viable. However, the point of the clonal analysis is to validate the VSG switching after genetic perturbation of PIP5Pase.
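      The sampling argument above can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in this response (93 of 94 switched clones, 118 control clones, and a population of over 10^7 cells); the variable names are ours, and 10^7 is treated as a lower bound on the population, so the sampled fraction is an upper bound.

```python
# Clone counts quoted in the response.
switched = 93            # clones that switched VSG after temporary knockdown
knockdown_clones = 94    # clones analyzed after temporary knockdown (Tet -/+)
control_clones = 118     # clones continuously expressing WT PIP5Pase (Tet +)

# Lower bound on the number of cells in the VSG-seq population sample.
population = 1e7

total_clones = knockdown_clones + control_clones
switch_rate = switched / knockdown_clones
sampled_fraction = total_clones / population  # upper bound, since population > 1e7

print(total_clones)               # prints 212
print(f"{switch_rate:.1%}")       # prints 98.9%
print(f"{sampled_fraction:.2e}")  # prints 2.12e-05
```

      The clonal analysis thus samples at most about two cells in every 100,000 of the population profiled by VSG-seq, which is consistent with the statement that rare switchers seen in the population may be absent from the clones.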

      Fig 1C shows examples of ES derepression by RNA-seq after 24h exclusive expression of the mutant compared to WT PIP5Pase. The RNA-seq shows that all ESs are derepressed (Fig 1B). This can be visualized in the volcano plot (Fig 1B, BES and MES VSGs are labelled) and on the spreadsheet Data S1. Although all ESs are derepressed after PIP5Pase mutant expression, not all ESs are selected during switching, as observed in Fig 1E-F. This agrees with our previous observations in switching assays with proteins that control VSG switching (Cestari and Stuart, 2015, PNAS).

      As for metrics of sequencing and raw sequencing data, see the Methods section, page 13, lines 483-485: “Sequencing information is available in Table S3 and fastq data is available in the Sequence Read Archive (SRA) with the BioProject identification PRJNA934938.” Table S3 has a summary of sequencing data. Metrics information such as sequencing quality and analysis can be found in the Methods section “Computational analysis of RNA-seq and ChIP-seq”. The latter includes information about nanopore reads, i.e., mean Q-score of 12.

      Reviewer #3 (Public Review):

      In this manuscript, Touray et al investigate the mechanisms by which PIP5Pase and RAP1 control VSG expression in T. brucei and demonstrate an important role for this enzyme in a signalling pathway that likely plays a role in antigenic variation in T. brucei.

      The methods used in the study are rigorous and well-controlled. The authors convincingly demonstrate that RAP1 binds to PI(3,4,5)P3 through its N-terminus and that this binding regulates RAP1 binding to VSG expression sites, which in turn regulates VSG silencing. Overall their results support the conclusions made in the manuscript.

      There are a few small caveats that are worth noting. First, the analysis of VSG derepression and switching in Figure 1 relies on a genome that does not contain minichromosomal (MC) VSG sequences. This means that MC VSGs could theoretically be misassigned as coming from another genomic location in the absence of an MC reference. As the origin of the VSGs in these clones isn’t a major point in the paper, I do not think this is a major concern, but I would not over-interpret the particular details of switching outcomes in these experiments.

      The authors state that “our data imply that antigenic variation is not exclusively stochastic.” I am not sure this is true. While I also favor the idea that switching is not exclusively stochastic, evidence for a signaling pathway does not necessarily imply that antigenic variation is not stochastic. This pathway could be important solely for lifecycle-related control of VSG expression, rather than antigenic variation during infection. Nevertheless, these data are critical for establishing a potential pathway that could control antigenic variation and thus represent a fundamental discovery.

      Another aspect of this work that is perhaps important, but not discussed much by the authors, is the fact that signalling is extremely poorly understood in T. brucei. In Figure 1B, the RNA-seq data show many genes upregulated after expression of the Mut PIP5Pase (not just VSGs). The authors rightly avoid claiming that this pathway is exclusive to VSGs, but I wonder if these data could provide insight into the other biological processes that might be controlled by this signaling pathway in T. brucei.

      Overall, this is an excellent study that represents an important step forward in understanding how antigenic variation is controlled in T. brucei. The possibility that this process could be controlled via a signalling pathway has been speculated for a long time, and this study provides the first mechanistic evidence for that possibility.

      We thank the reviewer for the evaluation of our work. We agree that it is difficult to establish the origin of all VSG genes when the reference lacks minichromosome sequences; hence we did not emphasize this point in the manuscript. We used the 427-2018 reference genome assembled by PacBio and Hi-C (Muller et al. 2018, Nature), which we believe is the best assembly for the 427 strain, especially with respect to the VSG genes.

      We also agree that signaling controlling switching in vitro does not mean that switching necessarily occurs by signaling in vivo. Nevertheless, although stochastic switching is an accepted model, it has not been proved, whereas we provide molecular evidence that signaling can cause switching. To reflect the reviewer’s suggestion, we edited the Discussion, page 7, line 250: from “our data imply that antigenic variation is not exclusively stochastic” to “our data suggest that antigenic variation is not exclusively stochastic”.

      Most of the upregulated genes in the RNA-seq data were VSG genes/pseudogenes. Other genes upregulated included retrotransposons and DNA/RNA processing enzymes such as endonucleases and polymerases. We included in the Results, page 3, line 100: “Other genes upregulated include primarily retrotransposons, endonucleases, and polymerase proteins.”.

    1. Reviewer #3 (Public Review):

      It is well known that as seasonal day length increases, molecular cascades in the brain are triggered to ready an individual for reproduction. Some of these changes, however, can begin to occur before the day length threshold is reached, suggesting that short days similarly have the capacity to alter aspects of phenotype. This study seeks to understand the mechanisms by which short days can accomplish this task, which is an interesting and important question in the field of organismal biology and endocrinology.

      The set of studies that this manuscript presents is comprehensive and well-controlled. Many of the effects are also strong and thus offer tantalizing hints about the endo-molecular basis by which short days might stimulate major changes in body condition. Another strength is that the authors put together a compelling model for how different facets of an animal's reproductive state come "on line" as day length increases and spring approaches. In this way, I think the authors broadly fulfill their aims.

      I do, however, also think that there are a few weaknesses that the authors should consider, or that readers should consider when evaluating this manuscript. First, some of the molecular genetic analyses should be interpreted with greater caution. By bioinformatically showing that certain DNA motifs exist within a gene promoter (e.g., FSHbeta), one is not generating robust evidence that corresponding transcription factors actually regulate the expression of the gene in question. In fact, some may argue that this line of evidence only offers weak support for such a conclusion. I appreciate that actually running the laboratory experiments necessary to generate strong support for these types of conclusions is not trivial, and doing so may even be impossible. I would therefore suggest a clear admission of these limitations in the paper.

      Second, I have another issue with the interpretation of data presented in Figure 3. The data show that FSHbeta increases in expression in the 8Lext group, suggesting that endogenous drivers likely act to increase the expression of this gene despite no change in day length. However, more robust effects are reported for FSHbeta expression in the 10v and 12v groups, even compared to the 8Lext group. Doesn't this suggest that both endogenous mechanisms and changes in day length work together to ramp up FSHbeta? The rest of the paper seemed to emphasize endogenous mechanisms and gloss over the fact that such mechanisms likely work additively with other factors. I felt like there was more nuance to these findings than the authors were getting into.

      Third, studies 1 - 3 are well controlled; however, I'm left wondering how much of an effect the transitions in day length might have on the underlying molecular processes that mediate changes in body condition. While the changes in day length are themselves ecologically relevant, the transitions between day length states are not. How do we know, for example, that more gradual changes in day length that occur over long timespans do not produce different effects at the levels of the brain and body? This seemed especially relevant for study 3, where animals experience a rather sudden change in day length. I recognize that these experimental methods are well described in the literature, and they have been used by endocrinologists for a long time; nonetheless, I think questions remain.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      We thank the reviewers for their insights and comments on this manuscript. Specific responses to reviewer concerns are detailed below. We made a couple of significant changes based on the feedback. First, we performed more experiments to increase biological replicates and then quantified image data for multiple figures. The new quantitative information added to Figure 3 fully supports our original conclusions about changes to the ONH in Hes-TKO mutants. The quantification of Atoh7, Otx2, Rbpms and Crx expressing cells among the different genotypes revealed interesting differences in Notch intracellular gene requirements for both RGC and cone development. The most startling outcome is that changes in both cell types correlate with significant changes in Otx2, but not Atoh7. This singular finding suggests that interesting future work on the molecular mechanisms underlying these cell fates is needed, well beyond the scope of this paper. Second, our data presentation was reorganized with new information added to Fig 1 that clarifies the relationships between Hes1, Hes5, Foxg1 and Pax2; old Figs 6 & 7 about neurogenesis were merged; and some data moved to new Suppl Figs 2 and 5. The numbering for multiple figures changed and a new summary model (now Fig 8) is provided. In addition, the manuscript was completely rewritten to improve clarity. We hope this revised manuscript is acceptable for publication.

      Reviewer #1 Summary:

      In this study, the authors employed an impressive set of mouse mutant or Cre lines to investigate the complexity of Notch signaling across different stages of retinal development. These comprehensive analyses led to two main findings: 1. Sustained Hes1 in the ONH/OS is Notch-independent; 2. Rbpj and Hes1 exhibited opposing roles in cone photoreceptor development. Although the study is potentially interesting, the current manuscript lacks essential research background and quantification, which significantly reduces the clarity of the manuscript and the credibility of the major conclusions. Also, the organization of the results is quite confusing, making the manuscript very difficult to follow.

      Response: We agree with all reviewers concerning incomplete quantification of the data. We directly addressed this shortcoming in revised Figs 3 and 6 (the latter combines old Figs 6 +7). To do this, we repeated some IHC experiments to add more replicates and reorganized all of the neurogenesis phenotypic data figures. Our quantifications uncovered several surprising outcomes that clarify our model. For these reasons, the manuscript was exhaustively rewritten. We merged E13 neurogenesis data into revised Figure 6 and moved the most relevant E16 analyses to new supplemental data Fig 5. All changes made should make the paper easier to understand for retinal development, neurogenesis, and Notch pathway aficionados, in addition to readers lacking such expertise.

      Major comments: 1. The authors need to quantify many analyses to strengthen the conclusions, such as Fig. 1F, 1G, etc.

      Response: We quantified optic nerve head (ONH) immunohistochemistry data in the revised Fig 3. We also quantified neurogenesis markers Atoh7, Otx2, Rbpms (RGCs), and Crx at E13 in revised Fig 6 (former Figs 6 and 7). Older stages were moved to a new Suppl Fig 5.

      Respectfully, Hes5 mRNA expression in old Fig 1F and 1G shows that Hes5, like other retinal progenitor cell (RPC) markers, expanded in Rax-Cre deletion but not Chx10-Cre deletion conditions. This is analogous to Pax6 and Rax expansion in Rax-Cre;Hes1 CKO eyes and Pax2 mutants (doi: 10.1523/JNEUROSCI.2327-19.2020) (1). In revised Fig 1, we now show analogous expansion of Hes5 mRNA in Pax2 mutant retinas (compare Figs 1F-1I). Because Hes5 RNA in situ hybridization experiments are nonquantitative, we do not discuss the possibility of Hes5 mRNA level changes in labeled cells.

      The authors reported many exciting results. However, further mechanistic insights are largely missing. They may focus on one of these exciting findings and give some mechanistic insights. For example, hes1 suppresses hes5 expression as the ONH boundary forms; hes1 expression in the ONH is Notch independent; differential influences of Rbpj and Hes1 on cone development. It is better for the authors to select one of these exciting findings and provide a deeper mechanistic study.

      Response: This revision brings fresh focus to Notch regulation of RGC and photoreceptor development, particularly differential influences for Rbpj versus Hes1. We also better support our interpretation of image data in Fig 1. We include new data about the spatial relationships between Hes5-GFP/Pax2 and Hes5-GFP/Foxg1. In summary, we find that as Pax2 becomes restricted to the nasal optic cup prior to the onset of RGC genesis, it becomes mutually exclusive with Hes5-GFP, at the same time that Hes5-GFP+ cells coexpress Hes1. This is consistent with Hes1 indirectly regulating Hes5-GFP as a marker of neurogenic RPCs at the forming ONH. Furthermore, it emphasizes the importance of genetically teasing apart the separate and potentially compensatory roles for Hes1 versus Hes5 undertaken here. These relationships remain poorly resolved during vertebrate CNS development.

      Some analyses lack an explanation of the rationale. For example, "To understand if the loss of multiple Hes genes is more catastrophic than Hes1 alone..."(PAGE 7). Please explain its significance.

      Response: We assume the reviewer is referring to the first sentence of the last paragraph on this page. We analyzed Hes triple mutant mice (TKO) to understand if removing multiple Hes genes reveals redundant functions. This is an open question, given that Hes1 is expressed in the ONH/OS, which is normally devoid of Hes5 by the time retinal neurogenesis begins. These questions have only been explored in a handful of tissues throughout the body. Also see response to point 2 above. In general, we have expanded the rationale for all of the experiments throughout the revised manuscript.

      Significance: In general, many results are quite interesting. However, the significance of these findings is largely hampered in the following aspects: 1. The authors did not provide sufficient research context, which is essential for understanding many results. 2. Many conclusions were based solely on descriptive images and lacked statistical quantification, which significantly weakened them. 3. Many interesting findings are quite descriptive, and some mechanistic understanding of one of these exciting findings would improve the focus and significance of the study. The current format of the manuscript fits a more specialized audience.

      Response: During in vivo development, we wished to understand which particular Notch pathway genes can interact in a Notch-dependent versus a Notch-independent manner. Genetic (phenotypic) studies produce extremely rigorous datasets, in our opinion. This revision now extensively quantifies key findings. Here we dissected the "receipt" of a Notch signal by identically testing the functional requirements of particular pathway members. For Mastermind (Maml), there are 3 paralogues, double mutants for Maml1 and Maml3 are early lethal, and no floxed alleles exist, so it was logical to employ the ROSA-dnMaml mouse strain, particularly since it has been discussed throughout the Notch literature as "analogous" to removing either a Notch receptor or Rbpj. Our finding that the dnMAML allele does not function like a Rbpj null in the retina is important for researchers in the broad Notch field to consider when designing and interpreting experiments.

      Reviewer #2: Hes genes are effectors of the Notch signaling pathway but can also act down-stream of other signaling cascades. In this manuscript the authors attempt to address the complexity of Hes effectors during optic cup development and retinal neurogenesis. To do so, they compared optic cup patterning and retinal neurogenesis in seven germline or conditional mutant mouse embryos generated with two spatio-temporally distinct Cre drivers. These lines allowed for the analysis of the consequences of perturbing the Notch ternary complex and multiple Hes genes alone or in combination. The authors show that the optic disc/nerve head is regulated by Notch independent Hes1 function. They also confirm that perturbation of Notch signaling interferes with cell proliferation enhancing the production of differentiated ganglion cells, whereas photoreceptor genesis requires both Rbpj and Hes1 with Notch dependent and independent mechanisms. This is a rather complex study that dissects further the role of the Notch pathway and Hes proteins during eye development, a topic that has been addressed in many previous studies but perhaps not with the details that the authors have used here. In this respect, this study adds to current literature but will likely be of interest to retina aficionados. The manuscript reads well and the figures are of very good quality. However, many of the statements are based on qualitative rather than on quantitative analysis. This should be, at least in some cases, remediated, despite the effort that this may require given the number of mouse lines used in the study.

      Response: As described in the response to Reviewer 1, we agree and present considerably more quantification data. We extensively reorganized and rewrote this manuscript to emphasize that Hes1 in the ONH/OS is fully Notch-independent and highlight branchpoints in Notch-dependent signaling, for Rbpj versus Hes1, during early retinal neurogenesis. It is too simplistic to say that the ternary complex (Rbpj-NICD-Maml) simply activates Hes1 (and/or multiple Hes genes) to regulate downstream signaling targets. This paradigm has been portrayed in the literature numerous times for many processes throughout vertebrate development, homeostasis or relative to particular diseases. By focusing on one tissue and a narrow window of development, our phenotypic studies delved more deeply to show the greater complexity and molecular cross-talk that we think underlie the modulation of signaling levels with in vivo context. Thus, our results are of broad interest and impact to the greater Notch field.

      1. The title is somewhat misleading. The authors have explored mostly the role of Hes1, 3 and 5. Although these are Notch effectors, there is already evidence that they participate in other pathways. This is confirmed by the data presented here. I would suggest eliminating Notch from the title and using "Hes" instead to better reflect the findings. Furthermore, it is unclear why there is a reference to "mutations" or what the Notch branchpoints are to which the authors refer at the beginning of the discussion.

      Response: We appreciate the reviewer’s viewpoint but disagree that this paper is mostly about Hes genes, as there is a critical direct, comparable evaluation with Rbpj and dn-Maml. Direct comparison of 7 genotypes highlights where each pathway member exhibits idiosyncratic phenotypes. We are striving for a clear, simple title about a very complex topic, involving the in vivo genetic dissection of a signaling pathway. We modified the title to: "Notch pathway mutations do not equivalently perturb mouse embryonic retinal development"

      1. "Although the Pax6-Pax2 boundary is intact in Rax-Cre;RbpjCKO/CKO eyes, ONH shape was attenuated compared to controls (Fig 3I)". This statement is arguable as the difference seems subtle. Perhaps some kind of quantification would help.

      Response: We quantified Pax2+ cells (ONH domain) using the adjacent proximal terminus of the retinal pigmented epithelium (RPE) to indicate a transition from ONH to optic stalk (OS). We also quantified the number of Pax2+Pax6+ double positive cells where the 2 domains abut (boundary cells). Some higher magnification examples are now provided in Fig 3H';3K';3N'. Grossly, the imaging data support that the Pax2+ ONH is expanded in Chx10-Cre;TKO eyes, while boundary cells are most affected in Rax-Cre;HesTKO eyes, due to an expansion of retinal tissue. This is supported by our quantitative data (Fig 3O,3P). We observed even in controls that Pax2-expressing cells show some numerical variability. We attributed this to the position of the section through the ONH, which is a 3-dimensional ring (torus). Therefore, we quantified additional wild-type controls and mutant samples in the new Fig 3O,3P graphs, improving statistical power, and allowing us to detect quantitative differences.

      Page 12 first paragraph. "....but all other genotypes were unaffected". This statement is unclear. All lines in which the Rax-Cre has been used seem to have an increased number of apoptotic cells. This should be better explained

      Response: Respectfully, only one genotype, Rax-Cre;Rbpj mutants, contains a statistically significant increase in apoptotic cells (Fig 5P). This is demonstrated by one-way ANOVA analyses that included all pairwise comparisons. To ensure that the quantification was not misleading due to changes in tissue morphology, data in Figs 5, 6, and 7 were normalized to optic cup area. The area was traced in FIJI, creating a polygon whose area was determined in square microns. For every section image, the number of marker+ cells was divided by the square micron area of the retina (excluding the opening for the optic nerve). Such a method is critical for comparison across this allelic series, given the morphologic changes, differences in cell clustering where rosettes form, and reduced proliferation whenever Notch signaling is lost or reduced.
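
      The area normalization described above can be sketched as follows. This is a minimal illustration, not the FIJI workflow itself: FIJI reports the traced polygon's area directly, and here the shoelace formula stands in for that measurement. The function names and the example outline are hypothetical, and the vertices are assumed to be in microns and to describe a simple (non-self-intersecting) polygon.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: Sequence[Point]) -> float:
    """Area of a simple polygon via the shoelace formula.

    Vertices are (x, y) pairs in microns, listed in order around the outline.
    """
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the outline
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def cells_per_square_micron(marker_positive_cells: int,
                            vertices: Sequence[Point]) -> float:
    """Marker+ cell count divided by the traced tissue area (cells per µm²)."""
    area = polygon_area(vertices)
    if area == 0.0:
        raise ValueError("traced outline has zero area")
    return marker_positive_cells / area


if __name__ == "__main__":
    # Hypothetical traced outline: a 100 µm x 100 µm square, 50 marker+ cells.
    outline = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
    print(cells_per_square_micron(50, outline))
```

      Dividing by area in this way lets sections of different size or shape be compared on the same scale, which is the purpose stated in the response.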

      Page 12, end of second paragraph: "E13.5 Chx10-Cre;HesTKO eyes had a milder RGC phenotype (Figs 6G, 6N, 6U), but all other mutants were unaffected (Figs 6E, 6F, 6L, 6M, 6S, 6T)." This statement is also rather subjective. The phenotype of Chx10-Cre;HesTKO is quite strong and the other mutants seem to have a phenotype. Some quantifications here will help.

      Response: We agree and provide quantification for both Atoh7 and Rbpms positive cells in the revised Figure 6. This is now in the same figure with quantification of Otx2+, Otx2+Atoh7+ and Crx+ cells. The reviewer is correct that both ROSA-dnMaml and both HesTKO mutants have a statistically significant increase in RGCs. Surprisingly, neither of the Rbpj CKO mutants has this outcome (Fig 6Y).

      1. Page 13, toward the bottom..."...but noted that Chx10-Cre RbpjCKO/CKO eyes were not different from controls (Figs 7E, 7AA)". Again, this statement is questionable, as staining for both CRX and Rbpms seems reduced compared to controls, as the quantifications in 7AA also seem to indicate (about half?). Did the authors calculate whether there is a statistical difference between controls and Chx10-Cre RbpjCKO/CKO?

      Response: Rbpms+ RGCs and Crx+ photoreceptor precursors were colabeled and quantified on sections for all genotypes. All counts were normalized to area as described above. Upon quantification and ANOVA with pairwise comparisons, there was no statistical difference in Crx+ or Rbpms+ cells between control and Chx10-Cre;Rbpj mutants (new Fig 6Y and Z).
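
      As a rough illustration of the test named above, the one-way ANOVA F statistic can be computed from area-normalized counts as follows. This is only a sketch: it computes the F statistic and degrees of freedom, not the p-value or the pairwise post hoc comparisons, and it is not the authors' analysis pipeline. The function name and the example values are hypothetical and are not data from the manuscript.

```python
from typing import List, Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA.

    Each inner sequence holds the measurements for one genotype,
    e.g. marker+ cells per unit area for each biological replicate.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and n_total > k")

    means: List[float] = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the replicates around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0.0:
        raise ValueError("no within-group variation; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical normalized counts for three genotypes, three replicates each.
    control = [1.0, 2.0, 3.0]
    mutant_a = [2.0, 3.0, 4.0]
    mutant_b = [3.0, 4.0, 5.0]
    print(one_way_anova_f([control, mutant_a, mutant_b]))
```

      In practice the F statistic is compared against the F distribution with the returned degrees of freedom, and pairwise comparisons are then run with an appropriate multiple-testing correction.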

      In Fig 7CC the authors should make the effort of including at least one additional sample, 2 biological replicates seem insufficient to draw a conclusion.

      Response: The Rax-Cre;Hes1CKO/+ X Hes1CKO/CKO matings stopped producing litters in late 2022. While this manuscript was out for review, we obtained younger mice, from which new control and Rax-Cre; Hes1 mutant littermates were collected, stained, imaged and quantified. Upon adding samples, we found that the outcome was unchanged, but the data better support the lack of a statistical difference in rods between genotypes at E17. These data were moved to revised Suppl Fig 5.

      Significance: This is a rather complex study that dissects further the role of the Notch pathway and Hes proteins during eye development, a topic that has been addressed in many previous studies but perhaps not with the details that the authors have used here. In this respect, this study adds to current literature but will likely be of interest to retina aficionados. The manuscript reads well and the figures are of very good quality. However, many of the statements are based on qualitative rather than on quantitative analysis. This should be, at least in some cases, remediated, despite the effort that this may require given the number of mouse lines used in the study.

      Response: To increase the impact of our manuscript, we quantified all markers except Tubb3, since its localization in cell bodies and axons makes it impossible to assign to individual cells. We feel that this additional quantification strongly improves the quality of our findings and allowed us to draw well-supported and novel conclusions. While we certainly believe that the retinal development community will find this paper of interest, it will also be of value to the broader Notch pathway scientific community. In this manuscript, we simultaneously compared phenotypes for Notch pathway genes in signal receiving cells. We could find essentially no studies like this for the mouse CNS and only a few from the Kopan lab about the kidney and immune system. Interestingly, one of us (NLB) is a coauthor on a recent paper about Notch signaling in the cortex, in which ROSA-dnMaml behaves analogously to Notch1CKO or RbpjCKO. This emphasizes that findings in one organ may not recapitulate the "rules" for this pathway for other cell types or tissues (doi: 10.1242/dev.201408)(2). Deeper understanding of how the Notch pathway in the retina functions, analogously or differently, is important. We feel our revised study advances understanding of when and where there are "branchpoints" in canonical signaling that may be overlooked in other developing tissues and organs.

      Reviewer #3: I have reviewed a manuscript submitted by Bosze et al., which is entitled "Not all Notch pathway mutations are equal in the embryonic mouse retina". The authors focused on the Notch signaling pathway. Notch signaling is deeply conserved across vertebrate and invertebrate animal species: in general, two transmembrane proteins, Delta and Notch, interact as a ligand and a receptor, respectively, which induces proteolytic cleavage of Notch receptors to generate the Notch intracellular domain (NICD). NICD is translocated into the nucleus, then forms the transcription factor complex including Rbpj (also referred to as CBF1) and Mastermind-like (Maml), and activates the transcription of Hes family transcription factors. Three Hes proteins, Hes1, 3, and 5, are important for nervous system development. In the vertebrate developing retina, these Hes proteins inhibit neurogenesis to maintain a pool of neural progenitor cells. In addition to their primary role in neurogenesis, the authors recently reported that Hes1 promotes cone photoreceptor differentiation. In the later stages of development, Hes proteins also promote Müller glial differentiation. In addition, Hes1 is highly expressed in the boundary between the neural retina and optic stalk and required for this boundary maintenance. To understand precise regulation of the Notch component-mediated signaling network for retinal neurogenesis and cell differentiation, the authors compared retinal phenotypes in the knockdown of three Notch pathway components, that is (1) Hes1/3/5 cTKO, (2) Rbpj KO, and (3) dominant-negative Maml (dnMaml) overexpression, under the control of two Cre drivers: Rax-Cre and Chx10-Cre.
First, the authors found that Hes1 expression in the boundary between optic stalk and neural retina is lost in Rax-Cre; Hes1/3/5 cTKO, but still retained in Rax-Cre; Rbpj KO and Rax-Cre; dnMaml overexpression, suggesting that Delta-Notch interaction is not required for Hes1 expression in the boundary between optic stalk and neural retina. Furthermore, the Hes1-expressing boundary region expands distally at the expense of the neural retina in Chx10-Cre; Hes1/3/5 cTKO. Maintenance of ccd2 expression in this expanded boundary area suggests that Hes1 normally maintains a proliferative state in the optic stalk, which may allow these cells to differentiate into astrocytes in later stages. Second, in addition to precocious RGC differentiation in all the Notch component KO, the authors found that, as compared with wild-type, cone and rod photoreceptor genesis is highly enhanced in Rax-Cre; Rbpj KO and Rax-Cre; dnMaml overexpression and mildly enhanced in Chx10-Cre; dnMaml overexpression. On the other hand, in Rax-Cre; Hes1/3/5 cTKO, cone and rod photoreceptor genesis is not enhanced but similar to wild-type level. Since the authors previously reported that cone genesis is reduced in Rax-Cre; Hes1 cKO and Chx10-Cre; Hes1 cKO, Rax-Cre; Hes1/3/5 cTKO may rescue the decrease in cone genesis seen in the single Hes1 cKO. The authors raise the possibility that elevated Hes5 expression in single Hes1 cKO may suppress cone photoreceptor genesis. The authors also found that amacrine cell genesis is significantly suppressed in Rax-Cre; Rbpj KO but not changed in Rax-Cre; dnMaml overexpression and Rax-Cre; Hes1/3/5 cTKO, suggesting that Rbpj is specifically required for amacrine cell genesis. From these observations, the authors propose that there are at least two branchpoints for photoreceptor and amacrine cell genesis in the Notch component-mediated signaling network.
Their findings are very interesting and provide new insight into how Notch signaling components are integrated with other signaling pathways to generate diverse but well-balanced retinal cell types during retinal neurogenesis and cell differentiation, beyond the conventional view of the Notch signaling pathway. However, one weak point is that, although the authors identified the phenotypic differences that appear in the KO retinas between these Notch components, the results are descriptive rather than analytical. Most of their conclusions may be supported by their previous work or that of others, but they remain hypothetical. So, it is important to show more analytical data to support their interpretation and to show more clearly what the new conceptual advance for Notch signaling pathways is.

      For example, sustained Hes1 expression in the boundary region between optic stalk and neural retina may be reminiscent of the situation at the brain isthmus. I would like to request that the authors show more direct evidence that Hes1 regulation in the optic stalk/retina boundary is independent of Delta-Notch interaction. One possible experiment is to test whether DAPT treatment phenocopies Rax-Cre; Rbpj KO and Rax-Cre; dnMaml overexpression (Hes1 in optic stalk boundary is normal?).

      Response: Usage of the gamma secretase inhibitor DAPT is an interesting experiment as it can phenocopy the loss of Notch signaling in developing tissues. However, the reviewer's proposed DAPT experiment is problematic for two major reasons. First, DAPT blocks the gamma secretase complex, which has more than 90 protein targets in the cell membrane (3). Therefore, DAPT may not be informative for Hes1 regulation given the myriad of expected off-target effects. Second, it would be difficult to treat embryos at the relevant stages with DAPT. Injections into pregnant mice are lethal and we cannot localize drug to the relevant area during in vivo development. Our direct phenotypic comparisons with two Cre drivers strongly indicate that Hes1 is independent of canonical Notch signaling in the developing optic stalk.

      We include an extra related data figure (Reviewer Fig 1) showing anti-Hes1 immunolabeling of E13.5 Rax-Cre;Notch1CKO/CKO (n=2) and E13.5 Rax-Cre;Notch2CKO/CKO eyes (n=3). The Notch1 mutant lost oscillating Hes1 expression in retinal progenitors, but the uniform Hes1 ONH domain remains. Interestingly, the Notch2 mutant had essentially no effect on Hes1 (oscillating or sustained), or Hes5 mRNA expression. A Notch2 RNA in situ hybridization demonstrates that Notch2 mRNA was lost in the E13 optic cup and RPE (Rax-Cre expressing tissues). These data emphasize: A) the Notch1-specific dependency of oscillating Hes1 expression in retinal progenitors is absent from the ONH; B) although coexpressed in the same tissue, Notch receptors have unequal activities.

      Does Rax-Cre; Rbpj KO; Hes1-cKO phenocopy Rax-Cre; Hes1-cKO (or Rax-Cre; Hes1/3/5 cTKO)?

      Response: This is a good question! The first author tried very hard to produce Rax-Cre; Rbpj CKO;Hes1 CKO double mutant embryos. However, these progeny could not be recovered from E10-E13 embryos, despite collecting more than 10 litters. Thus, it is likely that this genotype is lethal before eye formation.

      Could the authors identify an enhancer element that drives Hes1 transcription in optic stalk/retina boundary, which should be not overlapped with that of NICD/ Rbpj binding motif? Such additional evidence will make their conclusion more convincing.

      Response: Another interesting question. We have been working for >3 years on Hes1 cis regulatory enhancers, but the pandemic greatly delayed progress. The proximal Hes1 600bp upstream region is a generic enhancer that contains Hes1 binding sites for repressing its own expression (4) and has a pair of Rbpj consensus sites for Notch ternary complex activation of Hes1 expression (5,6). Nearby is a binding site occupied by Gli2 in the E16 mouse retina (7). Recently, it was shown that Ikzf4 binds slightly farther away (8). The upstream 1.8 kb region (including the 600bp just described) can drive destabilized GFP or dsRed reporters in early postnatal retinal explants (9). However, this sequence was used to make and analyze a classic Hes1-GFP transgenic reporter mouse, in which GFP was not expressed in the early embryonic mouse optic vesicle or cup (10). Therefore, any early eye-specific enhancer(s) are located farther upstream, in an intron, or downstream (or combination thereof). Public domain epigenetic and chromatin accessibility datasets support this idea. Identifying the gene regulatory logic for Hes1 expression in the eye will be an exciting future story, well beyond this manuscript. We are excited to use live imaging of enhancer reporters to discern oscillating versus sustained activity patterns during early ocular development.

      Regarding the conclusion on new branchpoints on photoreceptor and amacrine cell genesis, a model shown in Figure 9 is still hypothetical. Figure 9B indicate a model in which the increase of Otx2+ cells and Crx+ cells in Rax-Cre; Rbpj KO is mediated by Hes1, which is presumed to be activated in Notch-independent signaling. However, Hes1 expression in the neural retina is markedly reduced in Rax-Cre; Rbpj KO (Fig. 2I), which does not fit in with the model.

      Response: We removed Fig 9B and now present new models about the Notch-dependent versus -independent roles for both Rbpj and Hes1. The new summary is Fig 8.

      So, I would like to request the authors to examine whether the increase of Otx2+ cells and Crx+ cells in Rax-Cre; Rbpj KO, (or Rax-Cre; dnMaml overexpression and Chx10-Cre; dnMaml overexpression) is inhibited by Hes1 KO.

      Response: If we understand this correctly, it would mean generating double mutants, some of which we determined are not viable (see the response above, and Suppl Table 2). Given there is only a partial knockdown of Hes1 or Hes5 in either dnMaml mutant we do not believe repeating this in the Hes1 CKO genetic background to be informative and it would take 3 generations to perform.

      Second, the authors concluded that both cone and rod genesis are enhanced in Rax-Cre; Rbpj KO by showing the data on Crx/Nr2e3 labeling in Rax-Cre; Hes1 cKO in Fig. 7BB. However, as the authors mentioned in the manuscript, Hes5 expression is elevated in Rax-Cre; Hes1 cKO (Fig. 1G). So, since Rax-Cre; Hes1 cKO has residual Hes activity in the retina, Fig. 7BB should be replaced with labeling of Crx/Nr2e3 in Rax-Cre; Hes1/3/5 cTKO.

      Response: Unfortunately, Rax-Cre;HesTKO embryos do not live past E13 (Suppl Table 2). Thus, we cannot evaluate rods, whose genesis starts around E13.5. Revised Fig 1G shows the Hes5 domain is shifted with the expansion of retinal tissue in E13.5 Hes1 single mutants, but importantly, also analogously shifted in Pax2 mutants (Fig 1H). We do not conclude that mRNA levels are "elevated" since mRNA in situ hybridization is not a quantitative technique. Our initial examination of rods in E17 Rax-Cre;Hes1 CKO mutants tested the idea of a fate shift from cones to rods. However, deeper quantification (Suppl Fig 5) does not support such a fate change.

      Furthermore, it would be best to label the retinas of Rax-Cre; Rbpj KO with rod- and cone-specific markers and confirm that the numbers of both rods and cones are significantly increased. Third, as for defects in amacrine cell genesis in Rax-Cre; Rbpj KO, I would like to request the authors to show the data on Chx10-Cre; Rbpj KO. Although Rbpj KO is mosaic in Chx10-Cre; Rbpj KO, we can distinguish Rbpj KO cells by GFP expression (Fig. S2C, C', C'). So, the authors can confirm that amacrine cell genesis is inhibited in a cell-autonomous manner in Chx10-Cre; Rbpj KO retinas but not in Chx10-Cre; dnMaml overexpression. Addition of such data will make the authors' conclusion more convincing.

      Response: Suppl Table 1 lists multiple references (two from the NLB lab) that demonstrated both a rod and cone increase in Rbpj loss-of-function conditions. Chx10;Rbpj CKO animals were evaluated by Zheng et al., who showed an amacrine loss phenotype in these mutants (11). This is equivalent to what we see in our Rax-Cre;Rbpj CKO data, but without the complications of Chx10 mosaic Cre expression upon Rbpj deletion.

      Other comments: 1) The title of this manuscript is "Not all Notch pathway mutations are equal in the embryonic mouse retina". However, this title is quite obscure as to what research advance the findings represent. I suggest the authors include a more concrete and conclusive statement in the title, for example "Hes and Rbpj differentially promote retina/optic stalk boundary maintenance and photoreceptor genesis, in parallel with neurogenic inhibition by Notch signaling pathway".

      Response: We appreciate the reviewer's perspective. We are striving for a relatively simple title about a very complex topic, involving the in vivo genetic dissection of a signaling pathway. We modified the title to "Notch pathway mutations do not equivalently perturb mouse embryonic retinal development".

      2) The logic of the "Results" section is a bit difficult to follow without detailed knowledge of the roles of Notch signaling in mouse retinal development. I suggest the authors improve the writing style of the "Results" section so that readers without such detailed knowledge of mouse Notch mutant phenotypes can follow the logical flow more easily. There are many additional descriptions of research background before the results are mentioned. Such introductory sentences should be moved to the "Introduction" section, which would simplify the logical flow of the Results section. In addition, the authors should state a concrete question at the beginning of each result subsection. Furthermore, the authors sometimes jump from one result subsection to cite a figure panel from a much later subsection whose data have not yet been explained. Such back-and-forth citation of figure data generally makes it difficult to follow the logical flow.

      Response: We now present a considerable amount of new quantified data, reorganized multiple figures, and extensively rewrote the paper. We significantly revised the summary figure to improve clarity. In addition, Suppl Table 1 provides a wealth of background information to orient the reader on this topic. We feel that this extensive revision has greatly improved the quality, logical flow, and readability of the manuscript.

      3) In addition, the figure configuration is not well organized. Each figure compares the expression of particular markers in wild-type, Rax-Cre; HesTKO, Rax-Cre; Rbpj cKO, Rax-Cre; dn-Maml-GFP, Chx10-Cre; HesTKO, Chx10-Cre; Rbpj cKO, Chx10-Cre; dn-Maml-GFP. For example, Fig. 2 shows Hes1 for inhibition of neurogenesis, Fig. 3 shows Vsx2; Mitf and Pax2; Pax6 for retinal pigmented epithelium and optic stalk, Fig. 6 shows Atoh7, Rbpms, and Tubb3 for retinal ganglion cells. Fig. 7 shows Crx, Otx2, and Thrb2 for photoreceptor differentiation. Fig. 8 shows Prdm1, and Ptf1a for photoreceptors and amacrine cells. Although this figure configuration is convenient for showing phenotypic differences between different genetic mutations, it is difficult to know how the differentiation steps are spatially and temporally coordinated during development. At least, I recommend that the authors show one summary figure presenting the spatio-temporal expression profile of retinal markers in wild-type mouse retinas.

      Response: We recognize this point and completely reorganized and combined Figs 6 and 7 to improve clarity. New Figure 6 presents E13 quantification for Atoh7, Otx2, Atoh7/Otx2, Rbpms and Crx expressing retinal populations. E16-E17 data were condensed and moved to a new Suppl Fig 5.

      4a) Page 7, line 7-10 "With earlier deletion using Rax-Cre, hes5 mRNA abnormally extended into the optic stalk": I wonder how the authors define the optic stalk. It is likely that the optic stalk area (Pax2+, Vax1+ area) is shifted more proximally (away from the optic cup and toward the brain), and the neural retina is expanded accordingly (Fig. 4B, 4F), resulting in expansion of hes5 expression. Thus, it may be better to state that the optic stalk/neural retina boundary is abnormally shifted toward the brain.

      Response: The retina, including the optic nerve head, ends where the adjacent RPE terminates. This is conspicuous morphologically in our sections. We also defined this by colabeling for Pax2 and Pax6, which is now quantified in revised Fig 3. To clarify this further, we added the words "in all panels the brain is to the right" in the Fig 4 legend.

      4b) Page 8, line 14-15, "ONH/OS cells still express it (Hes1), demonstrating that sustained Hes1 is independent of Notch": I presume that Rax-Cre drives Cre in the neural retina as well as the optic stalk and pigmented epithelium. However, it is likely that Rbpj is not expressed in the optic stalk/neural retina boundary area in wild type (Fig. S2A). The absence of Rbpj expression in the optic stalk/neural retina boundary may support that Hes1 expression in this boundary area is Notch-independent. However, Rbpj expression is retained in some vitreal cells near the optic nerve head in Rax-Cre; Rbpj-CKO retinas (Fig. S2B). What are these Rbpj+ cells? I would like to request the authors to confirm that Rbpj expression is completely absent in both neural retina and optic stalk in Rax-Cre; Rbpj-CKO mice. Otherwise, this conclusion is still not fully supported.

      Response: We show the Rax-Cre lineage in Suppl Fig 2 via the Ai9 (tomato) reporter. The results are striking, with all of the optic cup derivatives (retina, RPE, ONH, optic stalk, and presumptive ciliary tissue and iris) being tomato positive, while the well-described population of vascular cells in the hyaloid space lacks tomato expression. Furthermore, our figure shows that Rbpj expression is only absent from the optic cup derivatives, rather than the vascular structures in the vitreous. Vascular cells also depend on the Notch pathway and express Rbpj. Based on considerable evidence from the literature and our lineage experiments, the population of cells the reviewer highlights represents the hyaloid vasculature and associated cell types. It does not represent any population that derives from neuroectoderm.

      4c) Page 9, line 16-18, "Foxg1 had spread into the nasal optic stalk": Is the nasal area into which Foxg1 expanded really "OS" rather than expanded retina? I suggest the authors confirm that expression of the molecular marker Pax2 overlaps with Foxg1. Otherwise, it is difficult to conclude that Foxg1 is expanded into the optic stalk territory, because Foxg1 is normally a marker of retina. Indeed, Fig. 3K shows that Pax2 expression is shifted further inward toward the brain, suggesting that the neural retina is expanded. Please explain the situation.

      Response: Foxg1 (BF-1) mRNA and protein are found in the nasal retina and are expressed in other brain tissues. Multiple studies show Foxg1 in the nasal side of the E10 optic cup/retina/optic stalk and developing hypothalamus (see extra data figure Reviewer Fig 2; the top row is data from Smith et al., 2017 (12) with Foxg1 mRNA in purple). Also see our new manuscript panel Fig 1C. We include here for reviewers extra data (Reviewer Fig 2) showing E13 ocular cryosections colabeled for Foxg1 and Pax2, highlighting their relationship in the retina, optic stalk and adjacent forming hypothalamus. On page 9 the text now reads "At E13.5 Rax-Cre;HesTKO eyes, the Foxg1 nasal retinal domain was contiguous with the nasal optic stalk (Suppl Fig 4D). This is reminiscent of younger stages (Fig 1C), since normally at E13.5, Foxg1 in the nasal optic cup/retina is separated from expression in the ONH/OS (Suppl Fig 4A). Based on the expansion of Pax6, Vsx2 and Hes5 RPC domains into the optic stalk, we conclude that the change in Foxg1 similarly reflects an extension of retinal tissue."

      4d) Page 10, line 4-5, "In Rax-Cre; Hes1/3/5 cTKO eye, this tissue (RPE) extended into the optic stalk": This description seems to be incorrect. A part of the Pax2 area, which is adjacent to the neural retina, contacts the RPE in wild type (Fig. 3AH), so most of the RPE covers the neural retina even in Fig. 3DK.

      Response: We disagree with the reviewer’s interpretation. Fig 3D shows Mitf labeling of RPE nuclei. Figure 3K shows the adjacent section labeled with Pax2 and Pax6 (labels both retina and RPE). As the retina extended "towards the brain", the RPE analogously extends and surrounds the retinal domain. We also added higher magnification data panels 3H, 3K and 3N, showing merged and single channels.

      4e) Page 10, line 22-23, "For Chx10-Cre; Hes1/3/5 cTKO, there was a unique presence of ectopic Pax2 within the retinal territories": I wonder if this description is correct. I suspect that, in Chx10-Cre; HesTKO, proliferative Pax2+ cells expand into the regressing territory of Hes KO retinal cells, which undergo precocious neurogenesis and lose proliferative activity. In this case, it is possible that the Pax2/Pax6 interface is maintained. Please show red and green channel panels for Fig. 3N to confirm that there are ectopic Pax2 and Pax6 double-positive cells.

      Response: New quantification in revised Fig 3 (see panels O,P) fully supports our original conclusion. Only Chx10-Cre;HesTKO mutants have a statistically significant increase in Pax2+ cells. There are not more Pax2+Pax6+ double labeled cells. Only this particular genotype has an increase in Pax2+ single labeled cells.

      5a) Page 11, line 20-25. There seems to be an inconsistency between the description of the results, the image data of Fig. 5A-G, and the histogram in Fig. 5O. The authors mention a modest loss of the pH3+ cell fraction in Chx10-Cre; Hes1/3/5 cTKO but not in Rax-Cre; Hes1/3/5 cTKO. However, Fig. 5D indicates a severe reduction of the pH3+ cell fraction in Rax-Cre; Hes1/3/5 cTKO, which is similar to the reduction of the pH3+ cell fraction in Rax-Cre; Rbpj (Fig. 5B), but the histogram data are different (Fig. 5O). Furthermore, the pH3+ cell fraction is severely reduced in Chx10-Cre; ROSA(dn-Maml-GFP) (Fig. 5F) and modestly reduced in Chx10-Cre; Hes1/3/5 cTKO (Fig. 5G). However, the pH3+ cell fraction seems to be normal in Chx10-Cre; Rbpj (Fig. 5E). These Chx10-Cre image data do not match the histogram of Fig. 5O. Please check this.

      Response: Images in old Figs 5-8 were normalized using area measurements, see methods and above comments (note: old Figs 6&7 were combined into new Fig 6). One-way ANOVA with pairwise comparisons for each mutant genotype compared to control were calculated using Prism. All genotypes except two have a statistically significant loss of M phase cells and we discuss possibilities for this outcome (Fig 5O). A normalization method for the sampled area is an essential component of these studies since morphologic differences are apparent for particular genotypes. The quantitative data are consistent with our original conclusions.
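      The normalize-then-compare procedure described in this response can be sketched as follows. This is only an illustration, not the authors' analysis: they used Prism, and the function names and all numbers below are hypothetical. The sketch divides each cell count by the sampled area and then computes the one-way ANOVA F statistic across genotypes; it stops short of p-values and of the pairwise comparisons against control.

```python
def density(cell_count, area):
    """Normalize a raw cell count by the sampled area (same units for all sections)."""
    return cell_count / area


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual measurements around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical (count, area) pairs for a control and two mutant genotypes
control = [density(c, a) for c, a in [(50, 10.0), (60, 10.0), (70, 10.0)]]
mutant_a = [density(c, a) for c, a in [(20, 8.0), (24, 8.0), (28, 8.0)]]
mutant_b = [density(c, a) for c, a in [(10, 5.0), (15, 5.0), (20, 5.0)]]
f_stat, df_b, df_w = one_way_anova_f([control, mutant_a, mutant_b])
```

      Because each count is divided by its own area, a genotype with a smaller optic cup is not penalized simply for offering less tissue to sample, which is the reason given above for normalizing.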

      5b) Fig. 5H-N, P: I wonder if stage E13 is appropriate for evaluating cell death and survival, because the optic cup is already smaller in Rax-Cre; Rbpj, Hes1/3/5 cTKO, or ROSA(dn-MAML-GFP) than in wild-type control. I suggest the authors examine an earlier stage.

      Response: While an earlier effect is possible, we only observed size differences in a subset of the genotypes. Thus, E13 serves as a critical timepoint to examine early developmental phenotypes across the totality of our mutant conditions. It is also the first age when the ONH is fully formed.

      5c) Page 12, line 19-20, "all other mutants (Chx10-Cre; Rbpj, and Chx10-Cre; ROSA(dn-MAML-GFP)) were unaffected (Fig. 6EF, LM, ST)": It is likely that Atoh7-expressing cells are mildly decreased and that cells expressing the neuronal markers Tubb3 and Rbpms are increased in Chx10-Cre; Rbpj, and Chx10-Cre; ROSA(dn-MAML-GFP). I request the authors to statistically evaluate the fraction of these markers in the retinal area in all cases.

      Response: As described above, we quantified Atoh7 and Rbpms nuclear expression by immunohistochemistry. We do not believe that Tubb3+ cells can be reliably quantified. Nonetheless, it is useful to qualitatively show the extent of excess neuron formation. Importantly, we observed that it is not the Atoh7 status that matters for RGC formation, rather it is the Otx2 expression status. This is in good agreement with single cell-RNA transcriptomics data from Wu et al 2021 showing that Atoh7 mRNA in all early transitional RPCs remains fairly constant and its loss does not block the formation of early RGC cell states (13). By contrast Otx2 fluctuates but remains expressed in transitional RPCs that progress to photoreceptor lineages.

      6a) Page 7, line 19 "Ectopic blood vessels protruded from the ONH (Fig. 1K, 1L)": It is difficult to see blood vessel structures in these panels (Fig. 1I-L). Please show some molecular marker of blood vessels to confirm how blood vessel is organized in Hes1/3/5 cTKO.

      Response: These vascular structures are highly conspicuous by morphology in the H&E insets. Nonetheless, we used adjacent P21 sections to immunolabel with Endomucin (14) and Tubb3 antibodies. This colabeling confirms the morphology and position of ectopic blood vessels in the abnormal tissue masses in Chx10-Cre;HesTKO mutant eyes. Ectopic tissue contains only rare Tubb3+ cells or cell processes, suggesting it is overwhelmingly nonneural. All P21 data were moved to a new Suppl Fig 2. A full detailing of vascular phenotypes is beyond the scope of this manuscript and, interestingly, would be potentially attributable to non-autonomous effects of perturbing the Hes genes in the adjacent retina.

      6b) Fig. 5: An increase in the pH3 fraction could indicate several possibilities, for example (1) an increased fraction of mitotic cells due to precocious neurogenesis, (2) an increased fraction of mitotic cells due to activated proliferation of retinal progenitor cells, (3) increased cell-cycle arrest in M phase due to some stress response of progenitor cells. So, I suggest the authors examine (1) the BrdU percentage of the retinal section area, (2) the percentage of pH3+ cells among PCNA+ retinal cells.

      Response: The data listed in Suppl Table 1 present a unified picture that disrupting Notch signaling reduces proliferation. This paradigm extends to other model organisms (e.g., Drosophila, chick, frog, zebrafish and even to nonneural tissues). We included the phospho-histone H3 staining so readers would see how the six mutants evaluated in this study align with this paradigm, providing confidence for the novel findings in other figures. A full evaluation of cell cycle kinetics is interesting, but beyond the scope and focus of this manuscript.

      6c) Fig. 5: It would be better to evaluate the cell death fraction by TUNEL and by labeling with anti-activated caspase 3 antibody.

      Response: We disagree. The DNA repair enzyme PARP is inactivated upon cleavage by activated caspase 3. There are currently ~3,600 citations that use it as a marker of apoptosis. PARP also has a separate and very specific role in maintaining the integrity of sperm DNA. This antibody works on all metazoans and is amenable to many tissue preparations and fixatives, making it easy to use, robust and quantifiable.

      7a) Please show red channel (Hes1) image in Fig1BC.

      Response: This was added to Revised Fig 1 (Fig 1A).

      7b) Fig. 1D and 1H should be shown side by side. Fig. 1H should be assigned as Fig. 1E.

      Response: The new Fig 1 layout addresses this point.

      7c) Fig. S2D, F, H, J: Please show GFP green channel as well. Otherwise, it is difficult to see non-overlapping expression in optic stalk area.

      Response: In the revision, this is Suppl Fig 3. Chx10-Cre is not expressed by ONH-OS cells (1). The green and fuchsia overlap (coexpression) in RPCs is white; we feel this is fairly clear. If needed, all readers can turn on and off the green channel in the final PDF version of this figure to compare GFP with Hes1 expression for those panels.

      7d) Fig. 9B: It is better to show Rax-Cre: Hes1/3/5 TKO rather than Rax-Cre: Hes1 cKO. 7e) Fig. 9B: Lettering "Rbpj mutant" should be revised as "Rax-Cre: Rbpj KO".

      Response: Fig 9B was removed so these terms are now irrelevant. Our models are presented in new Fig 8.

      Significance: The senior author of this manuscript, Dr. Nadean Brown, is an expert scientist who has investigated the role of the Notch signaling pathway in vertebrate ocular tissue, including the neural retina and lens. In general, the Notch signaling pathway consists of a signaling stream running from the interaction of Delta and Notch, through Notch receptor activation by proteolytic cleavage, translocation of the Notch intracellular domain (NICD) into the nucleus, and formation of a transcription factor complex consisting of NICD/Rbpj/Maml, to the transcriptional activation of Notch target genes, the Hes family transcription factors. Finally, Hes suppresses the neurogenic program and maintains a pool of neural progenitor cells. Therefore, Notch is a key factor in regulating the balance between neurogenesis and progenitor proliferation. In this manuscript, the authors investigated retinal phenotypes in knockout mice of different Notch signaling components, including Rbpj, Maml, and Hes. They found that the functions of these three factors are not always equal in retinal cell differentiation; rather, they specifically regulate a particular step of retinal development. The authors propose the possibility that each of the Notch signaling components may be modified by other signaling pathways and achieve some new roles beyond the conventional frame of the classic Notch signaling pathway. On this point, this work has the potential to provide a new conceptual advance in the field of developmental and cell biology.

      We fully agree this work is a significant advance for the fields of developmental and cell biology. Our findings provide new information and stimulate fresh ideas for anyone working on signal transduction and signal integration.

      References cited:

      1. Bosze et al., 2020 Journal of Neuroscience Vol 40:1501-13; Bosze et al. 2021 Dev Biol Vol 472:18-29.
      2. Han et al., 2023 Development Vol 150 dev201408.
      3. Kopan and Ilagan, 2004 Nat Rev Cell Biol. Vol 5:499-504
      4. Hirata et al., 2002 Science Vol 298:840-3
      5. Friedmann and Kovall, 2010 Protein Sci. Vol 19:34-46
      6. Ong et al., 2006 JBC Voll24:5106-19
      7. Wall et al., 2009 J Cell Biol. Vol 184:101-12.
      8. Javed et al., 2023 Development Vol 150:dev200436
      9. Matsuda and Cepko 2007 PNAS Vol 104:1027-1032
      10. Ohtsuka et al., 2006 Mol. Cell Neurosci. Vol 31:109-22
      11. Zheng et al., 2009 Molecular Brain Vol 2:38
      12. Smith et al., 2017 Journal of Neuroscience Vol 37:7975-93.
      13. Wu et al., 2021 Nature Communications Vol 12:1465: doi 10.1038/s41467-021-21704-4
      14. Saint-Geniez et al., 2009 IOVS Vol 50: 311-21.
    1. Author Response

      Reviewer #2 (Public Review):

      Associative learning assigns valence to sensory cues paired with reward or punishment. Brain regions such as the amygdala in mammals and the mushroom body in insects have been identified as primary sites where valence assignment takes place. However, little is known about the neural mechanisms that translate valence-specific activity in these brain regions into appropriate behavioral actions. This study identifies a small set of upwind neurons (UpWiNs) in the Drosophila brain that receive direct inputs from two mushroom body output neurons (MBONs) representing opposite valences. Through a series of behavioral, imaging, and electrophysiological experiments, the authors show that UpWiNs are differentially regulated by the two MBONs, i.e., inhibited by the glutamatergic MBON-α1 (encoding negative valence) while activated by the cholinergic MBON-α3 (encoding positive valence). They also show that UpWiNs control the wind-directed behavior of flies. Activation of UpWiNs is sufficient to drive flies to orient and move upwind, and inhibition of UpWiNs reduces flies' upwind movement toward the source of reward-predicting odors (CS+). These results, together with existing knowledge about the function of the mushroom body in memory processing, suggest an appealing model in which reward learning decreases and increases the responses of MBON-α1 and MBON-α3 to the CS+ odor, respectively, and these changes cause UpWiNs to respond more strongly to the CS+ odor and drive upwind locomotion. Interestingly, in the final part of the results, the authors reveal a wind-independent function of UpWiNs: increasing the probability that flies will revisit the site where UpWiNs were activated. Thus, UpWiNs guide learned reward-seeking behavior with and without airflow.
Although the mushroom body has been extensively studied for its role in learning and memory, the downstream neural circuits that read the information from the mushroom body to guide memory-driven behaviors remain poorly characterized. This study provides an important piece of the puzzle for this knowledge gap.

      Strength

      1) Memory studies have predominantly relied on binary choice (go or no-go) assays as measures of memory performance. While these assays are convenient and efficient, they fall short of providing a comprehensive understanding of underlying behavioral structures. In an effort to overcome this limitation, the current study used video recording and tracking software to delve deeper into memory-guided behavior. This innovative approach allowed the authors to uncover novel neurons and examine their contribution to behavior with a level of detail not possible with binary choice assays.

      2) This study used electron microscopy-based Drosophila hemibrain connectome data to reveal the synaptic connection between UpWiNs and MBON-α1 and MBON-α3. Using this method, the study shows that a single UpWiN receives direct input from both MBON-α1 and MBON-α3, which is confirmed by a functional imaging experiment. The connectome dataset also reveals several neurons downstream of UpWiNs, opening avenues for further research into the neural mechanisms linking memory and behavior.

      Weakness

      1) The authors repeatedly state in the manuscript that MBON-α1 and MBON-α3 convey appetitive or aversive memories, respectively. This assertion may not be entirely accurate. Evidence from sugar reward conditioning experiments suggests that MBON-α3 is potentiated and required for sugar reward memory retrieval. Therefore, the compartmentalization for appetitive and aversive memories appears not as obvious at the level of MBONs.

      What we intended was that activation of DANs in these compartments can induce aversive and appetitive memories, respectively, when paired with odors, and that these are the sole output pathway from these compartments to read out the memories in these compartments. As we previously proposed (Aso et al., 2014a eLife), these MBONs can integrate inputs from MBONs of other compartments and their activity can reflect appetitive memory stored as synaptic plasticity in other compartments. Since DANs in the α3 compartment respond to heat, bitter and electric shock but not sugar, the observation that MBON-α3 acquires an enhanced CS+ odor response after appetitive conditioning is presumably due to these intercompartmental connections rather than plasticity of KC-MBON synapses in the α3 compartment. In any case, the fact that excitatory activity of MBON-α1 and MBON-α3 conveys opposite valence of memory still holds true since appetitive conditioning induces depression and potentiation of odor responses, respectively.

      To clarify this point, we now cite related literature in the following sentence in the final paragraph of the Introduction: “UpWiNs receive inputs from several types of lateral horn neurons and integrate inhibitory and excitatory inputs from MBON-α1 and MBON-α3, which are the output neurons of MB compartments that store long-lasting appetitive or aversive memories, respectively (Aso and Rubin, 2016; Ichinose et al., 2015; Jacob and Waddell, 2022a; Pai et al., 2013; Yamagata et al., 2015).”

      2) This study did not conclusively establish the importance of the MBON-α1/α3 to UpWiN pathways in memory-driven behavior. In the experiments shown in Figure 5, flies were trained to associate the activation of reward-related DANs with a specific odor (CS+). After conditioning, UpWiNs were observed to show enhanced responses to the CS+ odor. However, the results should be interpreted with caution because the driver line used to activate DANs (R58E02-LexAp65) labels not only DANs projecting to the MBON-α1 compartment, but all DANs in the protocerebral anterior medial (PAM) cluster. Thus, it remains unclear to what extent the observed enhanced responses are influenced by changes in inhibitory inputs from MBON-α1. While UpWiNs have been shown to play a critical role in the expression of sugar reward memory (Figure 7), it should be noted that UpWiNs receive inputs from multiple upstream neurons, making it difficult to accurately assess the contribution of MBON-α1/α3 to UpWiN pathways in UpWiN recruitment. Further research is needed to fully address this issue.

      We totally agree with this point and added a sentence to explain an alternative mechanism. “This enhancement of CS+ response can be most easily explained as an outcome of disinhibition from MBON-α1 whose output had been decreased by memory formation; MBON-α1 is inhibitory to UpWiNs (Figure 4B) and MBON-α1 response to the CS+ is reduced following the same training protocol (Yamada et al. 2023). In addition to such a mechanism, plasticity in the β1 compartment may contribute to the enhanced CS+ response in UpWiNs because the driver R58E02 contains DANs in the β1 and glutamatergic MBON from the β1 directly synapses on the dendrites of MBON-α1 and MBON-α3.”

      3) UpWind neurons (UpWiNs) were so named because their activation promotes upwind locomotion. However, when activated in the absence of airflow, flies show increased locomotor speed and an increased probability of revisiting the same location (Figure 7 and Figure 7-figure supplement 1). The revisiting behavior can be observed during the activation of UpWiNs, which is distinct from the local search behavior that typically begins after a reward stimulus is turned off (e.g., Gr64f-GAL4 results in Figure 7-figure supplement 1).

      Return probability was calculated within a 15-s time window. A high return probability during the LED ON period (10-20 s) in Figure 7-figure supplement 1 does not necessarily mean that flies returned during the LED ON period. If a fly is at position A at t=10 s, it is counted as “returned” only if it moves more than 10 mm away from A and then comes back to within 3 mm of A by t=25 s. In the case of sugar sensory neuron activation with Gr64f-GAL4, the peak of return probability is shifted toward a later time point because flies stop and extend their proboscis during the activation period.
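      The return criterion described above can be sketched in a few lines of Python. This is only an illustrative reading of the stated rule (leave by more than 10 mm, come back to within 3 mm, all within 15 s), not the authors' analysis code; the function names, the track layout (parallel lists of times in seconds and x/y positions in mm), and the parameter names are assumptions made for the example.

```python
import math


def returned_within_window(times, xs, ys, start, window_s=15.0,
                           away_mm=10.0, near_mm=3.0):
    """Return True if the fly leaves its position at index `start` by more
    than `away_mm` and then comes back to within `near_mm` of it, all
    within `window_s` seconds (hypothetical helper, assumed track layout)."""
    x0, y0, t0 = xs[start], ys[start], times[start]
    has_left = False
    for j in range(start + 1, len(times)):
        if times[j] - t0 > window_s:
            break  # outside the 15-s window
        dist = math.hypot(xs[j] - x0, ys[j] - y0)
        if not has_left:
            has_left = dist > away_mm  # first excursion beyond 10 mm
        elif dist < near_mm:
            return True  # came back close to the reference position
    return False


def return_probability(tracks, t_ref, **kwargs):
    """Fraction of flies counted as 'returned' relative to their position
    at the first sample with time >= t_ref."""
    hits = 0
    for times, xs, ys in tracks:
        start = next(k for k, t in enumerate(times) if t >= t_ref)
        hits += returned_within_window(times, xs, ys, start, **kwargs)
    return hits / len(tracks)
```

      Under this reading, the value plotted at t=10 s depends on movement up to t=25 s, which is why a high value during the LED ON period does not by itself place the return inside that period.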

      Because revisiting a location can also be a consequence of repeated turns, it seems more accurate to describe UpWiNs as controlling the speed and likelihood of turns and promoting upwind movement by integrating with neurons that sense the direction of airflow.

      The return probability plotted in Figure 7E is the probability of returning, within 15 s after the LED period, to the position occupied at the end of the LED period. During this time, the angular speed of SS33917>CsChrimson and SS33918>CsChrimson flies is identical to that of empty-split-GAL4>CsChrimson control flies (Figure 7-figure supplement 1). Thus, revisiting behavior cannot be explained by a simple increase in turning probability.

      Although the functions of UpWiNs are not limited to promoting wind-directed walking, we still think that “UpWind Neurons” is a practical name for a broad readership and for oral communication at the current stage of investigation, because the EM neuron IDs and names (SMP348, SMP353, SMP354, SLP399 and SLP400) are too lengthy and do not contain any functional information. We initially defined a set of 11 neurons labeled by SS33917 split-GAL4 as “UpWind Neurons (UpWiNs)” based on initial optogenetic screening (Figure 2A). We found other driver lines for mushroom body interneuron cell types that can promote release of dopamine and a more robust returning phenotype (e.g. SS49755), but SS33917 remained the best driver line for the upwind locomotion phenotype.

      Reviewer #3 (Public Review):

      Aso et al. provide insight into how learned valences are transformed into concrete memory-driven actions, using a diverse set of proven techniques.

      Here the authors use a four-armed arena to evaluate flies' preference for a reward-predicting odor and measure upwind locomotion. This behavioral paradigm was combined with the photoactivation of different memory-eliciting neurons, revealing that appetitive memories stored in different compartments of the mushroom bodies (center of olfactory memory) induce different levels of upwind locomotion. The authors then proceed to a non-exhaustive optogenetic screen of the neurons located downstream of the output neurons of the mushroom bodies (MBONs) and identify a group of 8-11 Cholinergic neurons promoting significant changes in upwind locomotion, the UpWins. By combining confocal immunolabelling of these neurons with electron microscope images, they manage to establish the UpWins' connectome within themselves and with the MBONs. Then, using two in vivo cell recording techniques, electrophysiology, and calcium imaging, they define that UpWins integrate both inhibitory and excitatory synaptic inputs from the MBONs encoding appetitive and aversive memory, respectively. In addition, they show that the UpWins' response to a reward-predicting odor is increased after appetitive training. On a behavioral level, the authors establish that the UpWins respond to wind direction only and are not involved in lower-level motor parameters, such as turning direction and acceleration. Finally, they demonstrate that the UpWins' activity is necessary for long-term appetitive memory retrieval, and even suggest a broader role for the UpWins in olfactory navigation, as their photoactivation increases the probability of revisiting behavior. In the end, the authors state that they provide new insights into how memory is translated into concrete behavior, which is fully supported by their data. Altogether, the authors present a pretty complete study that provides very interesting and reliable data, and that opens a new field of investigation into memory-driven behaviors.

      Strengths of the study:

      • To support their conclusions, the authors provide detailed data from different levels of analysis (behavioral, cellular, and molecular), using multiple sophisticated techniques.

      • The measurement of multiple parameters in the behavioral analysis supports the strong changes in upwind locomotion. In addition, taken individually these parameters provide precise insights into how upwind locomotion changes, and allow the authors to more precisely define the role of the UpWins.

      • The authors use split-Gal4 drivers instead of Gal4, allowing them to better refine neuron labelling.

      • The authors discussed and investigated all possible biases, making their data very reliable. For example, they demonstrated that the phenotypes observed in the behavioral assay were wind-directed behaviors and could not be explained by bias avoidance of the arena's center area.

      Limitations of the study:

      • In the absence of more precise drivers, the UpWins' labelling lacks precision. For example, there is no way to know exactly which UpWin is responding in the electrophysiological experiment presented in Figure 4.

      We have ongoing efforts to generate split-GAL4 and split-LexA driver lines for specific subsets of UpWiNs, but the data using those lines are not ready for this manuscript. However, we would like to point out that, historically, the identification of a group of neurons with a striking phenotype has been foundational for follow-up studies. A good example is P1 neurons for courtship behavior.

      • The screening of neurons located downstream of the MBONs is not exhaustive, meaning that other groups of neurons might be involved in memory-driven upwind locomotion. This does not, however, diminish the authors' conclusions.

      UpWiNs are certainly not the only cell type mediating memory-driven upwind locomotion, since studies by our group and others (e.g. Matheson et al., 2022; PMCID: PMC9360402) identified a collection of cell types that can promote upwind locomotion upon optogenetic activation.

      In 2021, we released images and driver lines of a larger collection of split-GAL4 driver lines at https://splitgal4.janelia.org. We are preparing a manuscript to provide anatomical descriptions of these lines. This collection of new drivers will help elucidate more comprehensive views of circuits for memory-driven actions.

      • All data were obtained with walking flies. So far, there have been no experiments on flying flies.

      This is an intriguing question and we mentioned in Discussion that “Our study was limited to walking behaviors, and the role of UpWiNs in flight behaviors remains to be investigated.”

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer # 1

      Specific comments

      1) Figure 1: it is unclear how many mice were used for the described phenotypic analyses (panels D and E). Please clarify.

      We acknowledge that we failed to describe the phenotypic analyses clearly. In Figure 1D and E, we performed statistical analysis on the number of TEBs in mammary whole mounts. One mammary whole mount per mouse was stained with carmine alum. Thus, “n” represents the 10 mice we analyzed. We have modified the legend of Figure 1 to "D, E. Quantification of the average number of TEBs and bifurcated TEBs in littermate Crb3fl/fl (n=10) and Crb3fl/fl;MMTV-Cre (n=10) mice at 8 weeks old" in lines 909-911.

      2) Figure 2: in panels B and C it is unclear how the data was quantified; the legend states "n=10", does this mean the experiment in B was done 10 times? And that 10 acini per condition were measured in panel C? In panel D a difference in 0.3% between NC and shCRB3 seems miniscule; do the authors mean 30% instead? And how many acini were counted per condition per (how many) experiments? Same applies to panels G and H, it is unclear how many cells were analyzed per (how many) experiments.

      Thank you for your suggestions. We did not describe the details of the statistical analysis well in the experimental methods. In brief, we took 3-4 random bright-field micrographs of each well in the chamber slide system and repeated the experiment three times. We then counted the number of acini in all micrographs (Figure 2B) and measured the diameter of all acini in each photograph, using the average as one data value (Figure 2C). We also determined the percentage of aberrant acini in each photograph, which was used as one analysis value (Figure 2D). We confirmed that the vertical axis of Figure 2D was indeed mislabeled and should read 30%, and we have revised the original figure. For IF analysis of mitotic spindle orientation during lumen formation, we measured the division angle of one mitotically dividing cell per acinus; 3-4 acini were randomly examined in each well of the chamber slide system, and the experiment was repeated three times (Figure 2G and H). We have added a detailed description of these points to the Figure 2 legend. The revised parts are found in lines 922-924, lines 926-927, lines 929-930, and line 932.
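      As an illustration of the per-micrograph quantification described above, the following Python sketch reduces one micrograph to the three values mentioned (acinus count, mean diameter, percentage of aberrant acini). It is a hypothetical example, not the authors' pipeline: the function name, the input layout, and the numbers in the usage note are assumptions. It also reports the aberrant share both as a fraction and as a percentage, since confusing the two is the kind of axis-labeling error (0.3 versus 30%) discussed in this response.

```python
from statistics import mean


def summarize_micrograph(diameters_um, aberrant_flags):
    """Summarize one micrograph (hypothetical helper).

    diameters_um   -- diameter of each acinus in the image, in micrometers
    aberrant_flags -- one boolean per acinus, True if scored as aberrant
    """
    n_acini = len(diameters_um)
    fraction_aberrant = sum(aberrant_flags) / n_acini
    return {
        "n_acini": n_acini,
        "mean_diameter_um": mean(diameters_um),
        "fraction_aberrant": fraction_aberrant,     # e.g. 0.3
        "pct_aberrant": 100.0 * fraction_aberrant,  # e.g. 30.0
    }
```

      For example, `summarize_micrograph([50.0, 70.0], [True, False])` yields two acini, a mean diameter of 60.0, and 50.0% aberrant acini; each such summary would then serve as one data point per condition.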

      3) Figure 2: it would be desirable if authors were able to quantify the data in panels E and I.

      Thank you for your comments. According to your suggestions, we performed the quantitative analysis of Figure 2E and I, which is now presented in the new Figure 2D and H.

      4) For all cell-based assays using shRNA to knock down CRB3 (Fig. 2A-H; Fig. 3A-F; Fig. 4C-E; Fig. 5G-J; Fig. 6C; Fig. 7C, D; Fig. 8E-G), it would be desirable to perform rescue experiments to ensure that the observed phenotype of CRB3 depleted cells is specific and not due to off-target effects of the shRNA.

      Yes, rescue experiments involving overexpression of CRB3 in CRB3 depleted cells can accurately account for the specific phenotype as well as eliminate the off-target effects of shRNA. However, our group has long focused on the role of the cell polarity protein CRB3 in contact inhibition and tumorigenesis. Our previous studies have ruled out the off-target effects of shRNA and reported that CRB3 regulates contact inhibition and tumorigenesis through Hippo or Wnt signaling pathways (Cell Death Dis 2017;8(1):e2546, Oncogenesis 2017;6(4):e322, J Cell Mol Med 2018;22(7):3423-33). Therefore, we will pay close attention to rescue experiments to ensure experimental integrity and phenotypic specificity in our subsequent studies.

      5) Figure 3: how many cells were counted/measured per condition (in how many experiments) in panels B, D, H, F, G and H? In panels C and D, what is the CRB3 protein level in these cells? This is of relevance as protein overexpression per se could impinge on ciliation frequency. This question could be addressed by performing a western blot analysis with CRB3 antibody.

      We did not clearly describe the measurement and statistical analysis methods in the previous manuscript. Similarly, we took 3-4 random IF and SEM micrographs of each sample in one experiment, and the experiment was repeated three times. The numbers of ciliated cells and total cells were then counted, and the proportion of ciliated cells was calculated (Figure 3B, D and F). In these figures, the cilium length of representative ciliated cells was measured in each photograph. In the knockout mouse model, we identified intact mammary ductal lumens and renal tubules in IF-stained mouse mammary and renal tissue sections, took micrographs of 5-6 random fields per section, and determined the proportion of ciliated cells by counting and averaging. A total of ten mice were analyzed in these experiments (Figure 3G and H). We have therefore partially modified the legend of Figure 3G and H and added a detailed description to the Figure 3 legend. The revised parts are in lines 945-946, lines 950-951, and line 953.

      Thank you for suggesting that we perform a western blot analysis with CRB3 antibody for Figure 3C and D. We have added the CRB3 western blot analysis as the new Supplementary Figure 3A.

      6) Figure 3G: it is very difficult to see that the red stained structures are primary cilia.

      Yes, the stained primary cilia in the mammary ductal lumen are less clear than those in individual cells and in the renal tubule in Figure 3G. We used the established markers acetylated tubulin and γ-tubulin to stain the primary cilia, which were clearly labeled in individual cells. However, the labeled primary cilia in the renal tubule were longer and showed a more pronounced structure than those in the mammary ductal lumen. In the mammary ductal lumen of the 10 mice we analyzed, the primary cilia were shorter and stained less distinctly than the others shown in Figure 3G. This difference may be due to the distinct characteristics of primary cilia in different tissues.

      7) Figure 4B: how many cells were analyzed in how many experiments?

      Our statistical methods for analyzing cellular experiments using IF were essentially the same. We randomly selected 3-4 IF micrographs of each sample in one experiment, and this experiment was repeated three times. Subsequently, the number of colocalization cells and total cells were counted, and the proportion of cells with pericentrin and CRB3 colocalization was calculated (Figure 4B). The detailed description has been added to the Figure 4 legend. The revised part is in lines 962-963.

      8) Lines 217-219: since the cells were not stained with a cilia marker, only a centrosome marker, the claim that CRB3 localizes to the base of cilia is unsubstantiated.

      Thank you for your comments. The base of the cilium is the basal body, which develops from the mother centriole of the centrosome (Cancer Res. 2006;66(13): 6463-7). First, we found colocalization of CRB3 and pericentrin, a centrosome marker, in MCF10A cells (Figure 4A and B). Second, we verified the colocalization of CRB3 with γ-tubulin, a marker of the basal body of primary cilia, in confluent quiescent cells (Figure 4C and D). In addition, we found that CRB3 was localized at the base of primary cilia labeled with acetylated tubulin (Figure 4E and F). Because of the host species of the commercially available CRB3 antibody, we could only support the claim that CRB3 localizes to the base of cilia indirectly, through these experiments.

      9) Figure 3 and Figure 4: is it problematic to use gamma tubulin as centrosome marker if CRB3 depletion causes reduced centrosomal recruitment of gamma tubulin ring complex components? Also, in Figure S3A no gamma tubulin staining can be seen in the lower panel, why?

      Thank you for your positive comments. As is well known, γ-tubulin is a marker of the centrosome, and we found that CRB3 depletion causes reduced centrosomal recruitment of gamma tubulin ring complex components. However, Figure 3 illustrates the effect of CRB3 on ciliary assembly, and Figure 4 analyzes the localization of CRB3 in primary cilia. In some reports on ciliary assembly, fluorescent double staining of acetylated tubulin and γ-tubulin has been used to label primary cilia, and the effects of target genes on ciliary number and assembly were analyzed with these markers (Nature. 2013;502(7470): 254-7, Cell. 2007;130(4): 678-90 and so on). Although CRB3 affects the recruitment of gamma tubulin ring complex components, this does not affect the analysis of ciliary number and localization in Figures 3 and 4.

      In Figure S3A, green staining labeled with γ-tubulin could be clearly found in the lower left panel. The representative area from the left amplification may have been poorly selected, resulting in no γ-tubulin staining on the right side. We have updated the lower right panel in the new Supplementary Figure 3B.

      10) Figure S4A: the grouping of indicated proteins is factually wrong. For example, FBF1, SCLT1 and ODF2 are not IFT-B components, and several of the proteins indicated as localizing to the basal body also localize to (unciliated) centrioles. In contrast, CP110 is usually only found on unciliated centrioles and not mature basal bodies. Authors should consult the relevant literature and correct the figure accordingly. Alternatively, this misleading text/grouping could be removed from the figure. Furthermore, in the legend to Figure S4 there is no information provided about this quantitative analysis (how many independent experiments, which cells were analyzed etc.).

      Thank you for your helpful suggestions. We have taken your advice and removed this misleading information from the manuscript, Supplementary Figure 4A and its corresponding legend. We have also added detailed information on this quantitative analysis to the legend of Supplementary Figure 4A. The revised legend is shown in lines 1098-1100.

      11) Figure S4B: how do authors know which of the bands correspond to CRB3 fusion protein?

      Based on the construction strategy of the CRB3-GFP fusion protein (Figure 6D) and its base sequence, we were able to calculate its molecular weight. The molecular weight of the CRB3-GFP fusion protein was then verified by western blotting (Figure 6F and 7A). In addition, exogenous overexpression produced the CRB3-GFP fusion protein in large quantities. Based on these features, we infer that the band indicated by the black arrow is most likely the CRB3-GFP fusion protein. To allow the molecular weight to be checked, we have labeled the key molecular weight markers in the new Supplementary Figure 4B.

      12) Lines 251-253: this seems like data overinterpretation.

      Thank you for your comments. We have revised this sentence in lines 252-254.

      13) Lines 260-261: the data showing perturbed gamma tubulin localization is not convincing as data was not quantified.

      According to your suggestions, we performed the quantitative analysis of Figure 4C, which is now presented in the new Figure 4E.

      14) Figure 5H and Figure 6C: to show that the GCP6 IP actually worked, these blots should be probed also for GCP6.

      Thank you for your good suggestions. We have added these blots probed for GCP6 in new Figure 5H and 6C.

      15) Figure 5I: how many cells were analyzed in how many experiments?

      Our statistical methods for analyzing cellular experiments using IF were essentially the same. We took 3-4 random IF micrographs of each sample in one experiment, and this experiment was repeated three times. The detailed description has been added to the Figure 5 legend. The revised part is in lines 992-994.

      16) Figure S5: it looks like GPC6 and Rab11 are localizing all over the cell, are the antibodies used for the IFMs specific for these proteins?

      After checking the specificity of these antibodies used for the IFMs, we have decided to delete the corresponding results in the Supplementary Figure 5 and their description in the original manuscript.

      17) Lines 43, 89, and 314-315: the claim that CRB3 directly binds Rab11 is not supported by the data. The data provided only shows that these proteins interact indirectly. To show direct interaction, yeast-2-hybrid analysis or pull-down assays with purified proteins would be required.

      Thank you for your positive comments. Since we were unable to complete the experiments needed to demonstrate a direct interaction between the two proteins, we have revised our conclusions. We replaced "CRB3 directly binds Rab11" with "CRB3 binds Rab11" in the manuscript.

      18) Figure 6G and lines 314-315: this result is surprising as it indicates GTP- and GDP-locked versions of Rab11 have the same inhibitory effect on CRB3 binding? Please comment, and also indicate how data in Figure 6G was quantified (and how many independent experiments were used for the quantification).

      We were also puzzled by the results shown in Figure 6G. Based on the western blotting bands, we suspected that there may have been some issues with the experiment. Specifically, we believed that the inefficient transfection of Flag-Rab11aWT, Flag-Rab11a[Q70L], Flag-Rab11a[S20V], and Flag-Rab11a[S25N] plasmids, as well as the insufficient amount of GFP antibody used in the co-IP experiment, led to the corresponding bands being too weak and masking the true differences.

      To address this, we optimized the experimental conditions, tightened the experimental controls, and repeated the experiment in triplicate. The new results are shown in the revised Figure 6G. The statistics from the three independent experiments revealed that CRB3b had a stronger interaction with Rab11a[Q70L] and Rab11a[S20V], and a weaker interaction with Rab11a[S25N], compared to Rab11aWT. Accordingly, we revised the original manuscript in lines 308-310 and added a detailed description to the Figure 6 legend in lines 1012-1013.

      19) Figure 8G: data needs to be quantified.

      Thank you for your comments. We replaced the poor-quality bands in the western blot of Figure 8G with better-quality ones. The statistical analysis of the Figure 8G data is shown in Supplementary Figure 6.

      Further minor comments

      1) Abstract should indicate that this study describes conditional knockout of Crb3 in mouse mammary gland epithelial cells.

      This is good writing advice. We have added the relevant description in lines 40-42.

      2) Line 87: specify which gland (mammary?).

      We have changed this to "mammary gland" in line 87.

      3) Line 140: sentence states that knockout of Crb3 is essential for branching morphogenesis in mammary gland development, I do not think this is correct.

      We have removed the inappropriate finding.

      4) Line 152: "formed more number" should be "formed more" or "formed higher number of".

      We modified "formed more number" to "formed more" in line 154.

      5) Lines 157-163: text and logic are difficult to follow for a non-expert.

      We have modified the logic of this paragraph, as detailed in lines 158-165.

      6) Figure 4A, C: figure resolution could be improved. It is difficult to see what the authors claim these figures are showing.

      The clarity of the original images in Figure 4A and C is acceptable, while the images on the right are electronically enlarged. Although there is a decrease in pixels, it can still display our findings.

      7) Figure 7D, E: images look pixelated.

      The clarity of the original images in Figure 7D and E is acceptable using a laser confocal microscope, while the images on the right are electronically enlarged.

      8) Line 222: unclear what authors mean by "detected a series".

      We modified "detected a series" to "some important" in line 226.

      9) Lines 221-225: which cells were used for the analysis in Fig. S4?

      We used MCF10A cells for the analysis in Supplementary Figure 4, and modified its legend in line 1098.

      10) Line 245: what is "cytomembrane"?

      We modified "cytomembrane" to "cell membrane" in lines 246-247.

      11) Lines 246-250: wording is unclear/difficult to understand.

      We have modified this paragraph, as detailed in lines 248-251.

      12) Line 273: should "regimented" be "sedimented"?

      We modified "regimented" to "sedimented" in line 274.

      13) Line 287-288: sentence does not make sense.

      We have removed this sentence.

      14) Figure 5A: it would be desirable to show the original dataset (Excel file) used for generating this figure.

      To maintain data integrity, we should provide the original dataset (Excel file). However, there are some unpublished data in this file that we must withhold for the time being. If needed, the corresponding author can be requested to provide the file.

      15) Lines 298-299: wording is unclear.

      We have modified this sentence, as detailed in lines 296-298.

      16) Lines 285-287: replace "instead of" with "but not".

      We modified "instead of" to "but not" in line 286.

      17) For all IFMs showing merged images of the green and red channel, please also show the red and green channel separately.

      Most of our fluorescence images are presented separately for each channel in this manuscript, with only a few merged images due to space limitations. This type of presentation is commonly used in published papers.

      18) Lines 326 and 327: replace "bonded" with "bound".

      We have modified in lines 322-323.

      19) Lines 327-328 and 361-364: wording is unclear/grammatically incorrect.

      We have modified these paragraphs, as detailed in line 323 and lines 357-360.

      20) Line 342: what is meant by "the combination of"?

      We modified "the combination of" to "the binding of" in line 338.

      21) Line 365: localization of what?

      This means "subcellular localization" in lines 360-361.

      Reviewer # 2

      Major points

      1) CRB3 is present in mammals as 2 isoforms, A and B, originating from alternative splicing. In this study, the authors never mention this fact and when using approaches to KO or KD CRB3A/B they are likely to deplete both isoforms which have been shown to have different C-terminal domains and functions (Fan et al., 2007). This is also important for the CRB3 antibodies used in the study since according to the material and methods section they are either against the extracellular domain common to both isoforms or the intracellular domain which is only similar in the domain close to transmembrane between the 2 isoforms. Since the antibodies used in each figure are not detailed it is impossible to know if the authors are detecting CRB3A or B or both. Please provide the information and correct for the actual isoform detected in the data and conclusions.

      Thanks for your positive comments. In mammals, CRB3 has two isoforms, CRB3a and CRB3b, distinguished by alternative splicing within the fourth exon of the CRB3 gene, which in turn produces a protein with 23 amino acid differences at the C terminus. Both CRB3a and CRB3b have mostly identical amino acid sequences, and have indistinguishable molecular weight sizes. As a result, the knockout mouse construction strategy and the design principles of RNAi sequences target both CRB3a and CRB3b. This is described in lines 100-104 and lines 149-150. Additionally, commercially available antibodies detect both CRB3a and CRB3b, as mentioned in line 123 and lines 636-637 in revised manuscript.

      However, it should be noted that our CRB3 overexpression, as shown in the CRB3 structural domain in Figure 6D, refers specifically to the sequence of CRB3b. As a result, we have updated the original manuscript as well as the legends of Figures 3C, 3E, 4A, 5A, 5B, 6D-G, 7A, 7B and Supplementary Figure 2F-H, 3A, 4B, 6B to reflect this change. All instances of overexpressed CRB3 have been changed to CRB3b.

      2) CRB3A and B have been localized in the cilium itself (Fan et al., 2004; 2007) but in the study CRB3A/B does not enter the cilium but is localized in the basal body (figure 4). How the authors reconcile these different localizations?

      Indeed, we found that CRB3 is mainly localized at the basal body of the primary cilium, which differs from previous reports in the literature (Curr Biol. 2004;14(16):1451-61 and J Cell Biol. 2007;178(3):387-98). However, upon closer examination of one of these reports (Curr Biol. 2004;14(16):1451-61), it appears that CRB3 was actually scattered on the primary cilia, with a strong focus at the basal body. Additionally, in rat kidney collecting ducts, the localization of CRB3 on primary cilia was significantly reduced, with obvious localization at the basal body. Another study (J Cell Biol. 2007;178(3):387-98) also reported the co-localization of CRB3b and γ-tubulin in MDCK cells, which is consistent with our conclusion. We further verified the co-localization of CRB3 with the centrosome by overexpressing CRB3b in mammary epithelial cells, indicating that CRB3 mainly localizes to the basal body of the primary cilium. This information is discussed in the Discussion section of the manuscript (lines 400-410).

      3) The authors use GFP-CRB3A/B, it is not stated which isoform, over-expression to localize CRB3A/B in MCF10A cells (figure 4A). The levels of expression appear to be very high in the GFP panel and it is likely that the secretory pathway of the cells is clogged with GFP-CRB3A/B in transit from the ER to the plasma membrane. Thus, the colocalization with pericentrin might be due to the accumulation of ER and Golgi around the centrosome. This colocalization should be done with the endogenous CRB3A/B and with a better resolution.

      Thank you for your comments. We were also interested in the co-localization of endogenous CRB3 and centrosome proteins. However, the only commercial CRB3 antibody available is raised in rabbit, and the most useful pericentrin antibody (Abcam, ab4448) is also raised in rabbit. We had difficulty finding commercial centrosome-associated antibodies from other species. Therefore, we examined the co-localization of endogenous CRB3 with γ-tubulin in Figure 4C and combined the results with those of exogenous CRB3 to illustrate the co-localization of CRB3 with centrosomes.

      4) The staining for CRB3A/B in figure 4C (red) is striking with a very strong accumulation in an undefined intracellular structure and the authors do not provide any explanation for such a difference with the GFP-CRB3A/B just above.

      Thank you for your good suggestions. The immunofluorescence images of GFP-CRB3 in Figure 4A were obtained using a fluorescence microscope, while the images of endogenous CRB3 were obtained using a laser confocal microscope. The fluorescence microscope excites a fluorescent dye to emit a signal, which is amplified into a visible light signal and presents the full fluorescent signal. In Figure 4A, we can clearly see the full distribution of exogenous CRB3 in MCF10A cells, including its tight junctional localization, consistent with previous reports in the literature, and its co-localization with centrosomal proteins. Laser confocal microscopy, on the other hand, uses a laser as the light source to excite the fluorescence within the sample point by point. It employs a precision pinhole filtering technique with strong laminar imaging capabilities. In the specific analysis of endogenous CRB3 co-localization with centrosomes and the primary cilium, signals at tight junctions must be excluded. Therefore, Figure 4C represents the fluorescence signal at the level of intracellular CRB3 co-localization with γ-tubulin. The two methods use different detection means and techniques and are not directly comparable.

      5) The staining in figure 4E is also different from those shown in figure 4F in which the CRB3A/B staining is right at the base of the axoneme while it is not the case in figure 4E where we can see a red dot close to but not right at the base of the axoneme.

      Thank you for your comments. The new Figure 4F displays the localization relationship between CRB3 and primary cilium, analyzed using laser confocal microscopy. With the unique single-level detection function of this microscope, the problem of level selection may cause the red dots to appear close to, rather than right at the basal body of the primary cilium. However, the new Figure 4G, based on the use of 3D reconstruction scanning technique, clearly demonstrates the localization of CRB3 at the basal body of the primary cilium under the same cells and conditions.

      6) The authors claim that CRB3A/B interacts directly with Rab11 but they only show co-immunoprecipitation experiments from cell lysates which do not support direct interactions. The only way to show a direct interaction is to produce both proteins in vitro. Thus, the term direct interaction should be removed.

      Thank you for your positive comments. Since we were unable to complete the relevant experiments to demonstrate a direct interaction between the two proteins, we have revised our conclusions and replaced "CRB3 directly binds Rab11" with "CRB3 binds Rab11" in the manuscript.

      7) In addition, the authors claim (Line 251/252) that Rab11 is necessary for the transport of CRB3A/B but they should KD Rab11 to show this.

      Thank you for your good suggestions. It is essential to observe CRB3 trafficking after Rab11 knockdown. However, in Figure 5C, we used the endocytosis inhibitor dynasore, which also inhibits Rab11-positive endosomes. This result shows that dynasore can significantly inhibit CRB3 trafficking in MCF10A cells. We believe that this experiment partially demonstrates that inhibiting Rab11 function can affect CRB3 trafficking.

      8) The domain of CRB3A/B that is necessary for the interaction with Rab11 is the N-terminal part of the extracellular domain. This domain is thus inside the transport vesicles and not accessible from the cytoplasm. Given that Rab11 is a cytoplasmic protein, how the 2 proteins could interact across the membrane? The authors do not even discuss this essential point for their hypothesis.

      Thank you for your positive comments. As shown in the schematic model in Figure 9, we believe that when cells form tight junctions, CRB3 is primarily located on the cell membrane. Subsequently, endosomes are involved in the intracellular degradation process of CRB3 on the cell membrane. Intracellular CRB3 can bind to Rab11 through the extracellular domain, which in turn participates in primary cilia assembly. We have made detailed modifications to lines 418-421.

      9) Figures are not numbered.

      Thank you for your comments. We have updated the numbers in the original manuscript as well as the legends of Figures 1D, 1E, 2B, 2D, 2F, 2G, 3B, 3D, 3F-H, 4B, 4E, 5I, 6, 8G and Supplementary Figure 1E, 2, 3C, 4A, 5B, 6.

      Minor points

      1) The authors cite several studies showing that a down regulation of CRB3A/B in human cells promotes cancer but other studies show the contrary: Lin et al., 2015 for example. Please discuss these discrepancies.

      Thanks for your good suggestion. We have included additional studies with contrasting results in the discussion section, specifically in lines 378-380.

      2) Line 98: "exhibit smaller" smaller than what?

      We modified "exhibit smaller" to "exhibit smaller size" in line 97.

      3) Line 152: "form more number, ..." ???

      We modified "formed more number" to "formed more" in line 154.

      4) Line 180: "Compared with the control, the number of cells with primary cilium was significantly increased". To me it is the contrary! This part is not clear at all. Please rewrite.

      We have revised the sentence in lines 183-185.

      5) Authors should check and review extensively for improvements to the use of English.

      Thanks for your good writing advice. We have carefully reviewed and revised the entire manuscript to improve its readability.

    1. Reviewer #1 (Public Review):

      In principle a very interesting story, in which the authors attempt to shed light on an intriguing and poorly understood phenomenon; the link between damage repair and cell cycle re-entry once a cell has suffered from DNA damage. The issue is highly relevant to our understanding of how genome stability is maintained or compromised when our genome is damaged. The authors present the intriguing conclusion that this is based on a timer, implying that the outcome of a damaging insult is somewhat of a lottery; if a cell can fix the damage within the allocated time provided by the "timer" it will maintain stability, if not then stability is compromised. If this conclusion can be supported by solid data, the paper would make a very important contribution to the field.

      However, the story in its present form suffers from a number of major gaps that will need to be addressed before we can conclude that MASTL is the "timer" that is proposed here. The primary concern is that altered MASTL regulation seems to be doing much more than simply acting as a timer in control of recovery after DNA damage. There are data presented to suggest that MASTL directly controls checkpoint activation, which is very different from acting as a timer. The authors conclude on page 8 "E6AP promoted DNA damage checkpoint signaling by counteracting MASTL", but in the abstract the conclusion is "E6AP depletion promoted cell cycle recovery from the DNA damage checkpoint, in a MASTL-dependent manner". These 2 conclusions are definitely not in alignment. Do E6AP/MASTL control checkpoint signaling or do they control recovery? Which is it?

      Also, there are data presented that suggest that MASTL does more than just control mitotic entry after DNA damage, while the conclusions of the paper are entirely based on the assumption that MASTL merely acts as a driver of mitotic entry, with E6AP in control of its levels. This issue will need to be resolved.

      Finally, the authors have shown some very compelling data on the phosphorylation of E6AP by ATM/ATR, and its role in the DNA damage response. But the time resolution of these effects in relation to arrest and recovery has not been addressed.

      Revised manuscript:

      I think the authors did a good job in revising the paper, and provide compelling support for a timer function in the checkpoint. I do think they have still missed one important point about how MASTL could act as a timer to control recovery. The data clearly show that MASTL somehow controls ATM/ATR activity, whilst their final model (fig. 9) places MASTL upstream of CDK activity, without mentioning its feedback on ATM/ATR. I think there are 2 possible explanations for the timer function of MASTL they have discovered here, and both may be relevant. The first is enhanced CDK activation by direct control of CDK phosphorylation through MASTL/B55/PP2A. The second is MASTL-mediated shut-down of ATM/ATR activation (mechanism to be determined), which is also reported here. Their final model and discussion do not display sufficient appreciation for this latter option, and I would argue that the HU-recovery experiment shown in Fig. 5B is actually in strong support of the second explanation, rather than the first.

    1. Public Review:

      In countries endemic for P vivax the need to administer a primaquine (PQ) course adequate to prevent relapse in G6PD deficient persons poses a real dilemma. On one hand PQ will cause haemolysis; on the other hand, without PQ the chance of relapse is very high. As a result, out of fear of severe haemolysis, PQ has been under-used.

      In view of the above, the Authors have investigated in well-informed volunteers, who were kept under close medical supervision in hospital throughout the study, two different schedules of PQ administration: (1) escalating doses (to a total of 5-7 mg/kg); (2) single 45 mg dose (0.75 mg/kg).

      It is shown convincingly that regimen (1) can be used successfully to deliver within 3 weeks, under hospital conditions, the dose of PQ required to prevent P vivax relapse.

      As expected, with both regimens acute haemolytic anaemia (AHA) developed in all cases. With regimen (2), not surprisingly, the fall in Hb was less, although it was abrupt. With regimen (1) the average fall in Hb was about 4 g/dl. In only one subject did the fall in Hb mandate termination of the study.

      Since the data from the Chicago group some sixty years ago, there has been no paper reporting a systematic daily analysis of AHA in so many closely monitored subjects with G6PD deficiency. The individual patient data in the Supplementary material are most informative and more than precious.

      Having said this, I do have some general comments.

      1. Through their remarkable Part 1 study, the Authors clearly wish to set the stage for a revision of the currently recommended PQ regimen for G6PD deficient patients. They have shown that 5-7 mg/kg can be administered within 3 weeks, whereas the currently recommended regimen provides 6 mg/kg over no less than 8 weeks.
      2. Part 2 aims to show that, as was known already, even a single PQ dose of 0.75 mg/kg causes a significant degree of haemolysis: G6PD deficiency-related haemolysis is characteristically markedly dose-dependent. Although they do not state it explicitly in these words (I think they should), the Authors want to make it clear that the currently recommended regimen does cause AHA.
      3. Regulatory agencies like to classify a drug regimen as either SAFE or NOT-SAFE; they also like to decide who is 'at risk' and who is 'not at risk'. A wealth of data, including those in this manuscript, show that it is not correct to say that a G6PD deficient person when taking PQ is at risk of haemolysis: he or she will definitely have haemolysis. As for SAFETY, it will depend on the clinical situation when PQ is started and on the severity of the AHA that will develop.

      The above three issues are all present in the discussion, but I think they ought to be stated more clearly.

      Finally, by the Authors' own statement on page 15, the main limitation is the complexity of this approach. The authors suggest that blister packed PQ may help; but to me the real complexity is managing patients in the field versus the painstaking hospital care in the hands of experts, of which volunteers in this study have had the benefit. It is not surprising that a fall in Hb of 4 g/dl is well tolerated by most non-anaemic men; but patients with P vivax in the field may often have mild to moderate to severe anaemia; and certainly they will not have their Hb, retics and bilirubin checked every day. In crude approximation, we are talking of a fall in Hb of 4 g/dl with regimen (1), as against a fall in Hb of 2 g/dl with regimen (2), which is part of the currently recommended regimen: it stands to reason that, in terms of safety, the latter is generally preferable (even though some degree of fall in Hb will recur with each weekly dose). In my view, these difficult points should be discussed deliberately.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements [optional]

      Reply to general assessment of referee #2:

      1. General assessments: The current study adds some to these observations…some of these observations are incremental…biological significance is limited. While this reviewer does not suggest additional experimentation, this manuscript would be suitable as a resource paper.

      Reply: It appears we were not clear enough in explaining the novel aspects of our study.

      The starting points are two published studies from our lab demonstrating a global increase of ISGF3 association with ISG promoters in IFNγ-treated cells and a remarkable similarity of IFNγ and type I IFN-induced early transcriptome changes. These findings challenge the notion in the field (as mentioned by the referee) that IFNγ specificity is produced by the predominant deployment of STAT1 homodimers. We thus tested the hypothesis that the specificity of the IFNγ-induced transcriptome is generated over time, rather than during the early response, and relies on secondary responses to transcription factors such as IRF1. In contrast, IRF1 plays no or only a small role in the type I IFN response that utilises ISGF3 and/or unknown secondary factors in the delayed response. We tested this hypothesis with PRO-seq technology to rule out confounding effects of mRNA processing over a 48 h period. The data are clear in showing that many genes associated with the antibacterial or antiparasitic profile of activated macrophages are indeed much more abundant in late-stage rather than briefly IFNγ-treated macrophages and these delayed changes are to a large extent dependent on IRF1. Our findings are based on the best available technologies, a combination of nascent transcript analysis with genetics and protein interaction studies. In addition, our findings rule out alternative models of sustained or secondary ISG transcription, such as the employment of alternative ISGF3 complexes (such as STAT2-IRF9) or of ISGF3 complexes formed with unphosphorylated STAT1 and STAT2. We provide evidence for higher order waves of transcription caused by unknown transcription factors that are produced by transcriptional activation of ISGF3 or IRF1 target genes and identify candidates among the AP1 and Ets transcription factor families. We agree that some of the data are confirmatory rather than novel (i.e. some of the genes we describe were known from previous literature to be IRF1 targets), but it is the systems approach of our study, and particularly the delineation of conditions under which the largely neglected delayed response diverts the IFNβ and IFNγ-induced transcriptomes, that generates a comprehensive and conclusive view of IFNγ acting predominantly as a macrophage activating factor, and IFNβ being an essential antiviral cytokine. We do think this main outcome is immunologically meaningful and not incremental. For this reason, we would prefer to publish the paper as a relevant contribution to innate immunology rather than a resource. Emphasizing our point, a paper appeared in ‘Cell’ while our study was under review, showing that human IRF1 mutations cause Mendelian susceptibility to mycobacterial disease (MSMD), a term coined by JL Casanova and colleagues for immunological defects that reduce the ability of macrophages to cope with intracellular bacteria (new ref. 65). This important study emphasizes the main conclusions of our study about the relevance of IRF1 for macrophage activation. We discuss this paper on p. 14 lines 9-14.

      Revision: We tried to better explain the scientific motivation for this study and the significance of the results (p. 4, lines 12-25).

      Revision plan: n. a.

      2. Description of the planned revisions

      Referee #3; major comment 1:

      Fig. 1d is difficult to interpret and misleading for many reasons. First, the cluster numbering is disconnected from the cluster order; why not number the clusters based on the hierarchical clustering and write the cluster number beside the cluster itself? Second, having a 2-color gradient is misleading; negative values shouldn't be in the same color tone as the positive values. Third, the authors did not provide an adequate rationale for using only the top 1,000 most expressed genes. Why not use all the genes differentially expressed in at least one of the conditions to provide a comprehensive analysis? Could this potentially lead to bias in the data, and is there any information lost by not using the fraction of lower expressed genes? Fourth, it is not clear what the color scale is representing and how the data was transformed. Was a mean centering of the expression values of the log2FC applied to the RNA-seq data to facilitate clustering? Mean centering and z-scoring are common techniques used to adjust expression data, but they can potentially exaggerate differences between samples. More information about the data and analysis should be provided, as it is difficult to determine whether this was a valid approach or not.

      Reply:

      • To create the heatmap, we used the pheatmap package from R and the cutree_rows option to separate 11 clusters with strikingly different patterns of gene expression based on visual exploration. The numbering was autogenerated by the program.
      • The data is now shown in red-blue.
      • We restricted our list to only 1000 genes from each comparison as we aimed to analyze the prominent patterns of gene expression across timepoints. Considering all differentially expressed genes based on a padj value would also include genes expressed at very low levels as evident from the low baseMean values obtained from DESeq2. Hence, we applied a selection of 1000 genes which effectively represented the major patterns of gene expression across timepoints.
      • Variance stabilizing transformation was applied to read counts obtained from PRO-seq using the DESeq2 package. The transformed reads were z-score normalized and used for hierarchical clustering by the “Ward.D2” method using the pheatmap package in R. A total of 3126 genes were used for this analysis. 11 distinct clusters were defined using the cutree_rows option. The color scale represents z-score normalized counts. The genes represented in the heatmap were selected based on the following criteria: each timepoint of interferon treatment was compared to the homeostatic condition (untreated sample) in wildtype BMDMs. The differentially expressed genes from each comparison were selected based on the filtering criteria: absolute log2FoldChange >=1 and adjusted p value <0.01 by Wald test. Following the differential analysis, the first 1000 differentially expressed genes in each treatment condition (ordered based on adjusted p values) were selected for both IFN types and combined into a list of 3126 unique genes. The scale in the heatmap represents z-scores of variance-stabilized reads, calculated across all genotype and treatment conditions, separately for each IFN type.
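
      The selection and scaling steps described in this reply can be illustrated with a minimal sketch. It is not the authors' R code: the function names are hypothetical, the numbers in the usage example are made up, and the variance stabilizing transformation and Ward.D2 clustering (done with DESeq2 and pheatmap in the reply) are not reproduced here. Only the thresholds (absolute log2FoldChange >= 1, adjusted p < 0.01, first 1000 genes per comparison) are taken from the text.

```python
from statistics import mean, stdev


def select_genes(results, lfc_cutoff=1.0, padj_cutoff=0.01, top_n=1000):
    """Pick genes from one treated-vs-untreated comparison.

    results: list of (gene, log2_fold_change, adjusted_p) tuples
    (hypothetical input layout, not a DESeq2 object).
    """
    passed = [r for r in results
              if abs(r[1]) >= lfc_cutoff and r[2] < padj_cutoff]
    passed.sort(key=lambda r: r[2])  # order by adjusted p value
    return [gene for gene, _, _ in passed[:top_n]]


def union_of_selections(comparisons, **kwargs):
    """Combine the per-comparison selections into one set of unique genes."""
    selected = set()
    for results in comparisons.values():
        selected.update(select_genes(results, **kwargs))
    return selected


def zscore_row(values):
    """Z-score one gene's values across samples (sample standard deviation)."""
    m, s = mean(values), stdev(values)
    if s == 0:
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


# Made-up toy input: two comparisons, top_n reduced to 2 for illustration.
toy = {
    "IFNb_1.5h": [("Mx2", 4.2, 1e-30), ("Gbp2", 3.1, 1e-12), ("Actb", 0.1, 0.9)],
    "IFNg_48h": [("Gbp2", 5.0, 1e-40), ("Ptgs2", 1.5, 0.004), ("Lrp11", 0.8, 1e-5)],
}
genes = union_of_selections(toy, top_n=2)
scaled = zscore_row([1.0, 2.0, 3.0])
```

      Because genes selected in several comparisons are counted once, the combined list is shorter than the sum of the per-comparison selections, which is consistent with the reply reporting 3126 unique genes.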

      Revision plan: We will label the clusters with the cluster number next to it in addition to the color codes.

      Referee #3; major comment 3:

      The large standard deviation bars in the ChIP data said to confirm the binding of ISGF3 components to the promoter of Mx2 cast doubt on the validity of the results and conclusions. The authors should consider additional experiments or complementary analyses to validate their findings or, alternatively, adjust their claims accordingly.

      Reply: To demonstrate sufficient quality of the data, the ratio of Stat1/Stat2 was calculated separately for the early (1.5 h) and late (48 h) timepoints. An unpaired two-tailed t test comparing this ratio between 1.5 h and 48 h shows that the two are not significantly different. This indicates that all ISGF3 components are associated with ISG during both early and delayed responses, i.e., that STAT2/IRF9 complexes are unlikely to contribute to delayed ISG control. However, we agree with the referee that the standard deviations of the kinetic ChIP experiment are high and that it would be good to generate additional data.
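
      The comparison described in this reply can be sketched as follows. This is a minimal illustration that assumes Student's unpaired t test with pooled variance; the reply does not state whether equal variances were assumed. The ratio values are hypothetical and are not the authors' ChIP data.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(a, b):
    """Student's unpaired t statistic with pooled variance.

    Returns (t, degrees_of_freedom). The two-tailed p value would then be
    read from the t distribution with that many degrees of freedom.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical Stat1/Stat2 ChIP signal ratios, one value per replicate.
ratio_early = [1.0, 1.2, 0.8]   # 1.5 h
ratio_late = [1.1, 0.9, 1.3]    # 48 h
t_stat, dof = unpaired_t(ratio_early, ratio_late)
```

      With only a few replicates per timepoint such a test has little power, so a non-significant difference is weak evidence of equality, which fits the reply's plan to generate additional ChIP data.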

      Revision plan: We will perform additional ChIP experiments to improve the statistical power of the results in fig. S2c.

      Referee #3, major comment 6:

      The authors interpret their ATAC-seq and ChIP-seq results based on a 2kb window to the TSS of genes, not considering relatively close enhancers or longer range cis-regulatory interactions in their interpretation. For example, they mention on p.7 "Contrasting the strong binding of IRF9 and IRF1 to the Mx2 (cluster 2) and Gbp2 (cluster 9) promoters, respectively, we saw no evidence for direct binding to Lrp11 (cluster 3) and Ptgs2 (cluster 10)", but on Fig 3d they show only the proximal regions. No scale bars are shown either. Moreover, exploring the same published IRF1 ChIP-seq dataset, there is a clear IRF1 binding site at the promoter of Ptgs2, while the authors report none.

      Reply:

      • According to the literature (e. g. refs. 11, 27), most IFN-induced accessibility changes occur in the vicinity of the TSS of ISG. This is further strengthened by the data shown in this manuscript. In addition, most functionally validated GAS and ISRE sequences are in the DNA interval chosen for our analysis. While distal ISG enhancers have been reported (e. g. DOI: 10.26508/lsa.202201823), an analysis beyond the placement of most control regions increases the risk of wrong assignments between ISG and their regulatory elements, hence the causality between transcription factor binding and accessibility changes.
      • We extended the regions for the analysis of the Lrp11 and Ptgs2 regulatory regions and found no evidence for the binding of ISGF3 or IRF1. We find no evidence for a clear peak in the Ptgs2 promoter. There is a peak called by the Macs2 algorithm, but visual inspection of the track (bigwig file) shows it consists of a minor increase in reads above background that does not suggest a bona fide IRF1 binding site (see below). This view is supported by our inability to find an IRF binding site in the vicinity of the peak.

      IRF1 binding indicated by bigWig browser tracks and corresponding peak files detected at the locus. We retrieved the peak file from Langlais et al., 2016 and also called peaks using MACS2, but against the mm10 genome, as the analysis in the original paper was done with the mm9 genome. The peak identified here appears to be an artefact of the MACS2 program, as there is no evident enrichment at the gene promoter region upon inspection of the bigWig files.

      Revision plan: Scales will be added to the browser tracks as requested.

      Referee #3, major comment 7:

      Lack of statistical analysis on chromatin accessibility claims: The authors claim that ATAC-seq data in BMDMs stimulated with IFNβ or IFNγ for a short (1.5 hours) or long (48 hours) period reveals a striking similarity between transcription and the general trends of chromatin accessibility at regions up to 1000 bp upstream of the TSS (Fig. 2a), suggesting continuous chromatin remodeling during the transcriptional response. However, I would like to know if this conclusion is well-supported by the correlation between the chromatin accessibility from ATAC-seq data from only one sample and the PRO-seq data.

      Reply: See revision plan.

      Revision plan: We will analyze whether the single experiments support the conclusions derived from the z-score of the triplicate samples.

      Referee #3, major comment 8:

      The need for additional experiments to verify claims such as the dependence of Ifi44 on IRF1 for gaining ATAC signal, as stated in the claim, "Expression required IRF1 for both, but accessibility of the Ifi44 regulatory region depended upon IRF1 whereas that of Gbp2 acquired an open structure independently of IRF1 (Fig. 5c)."

      Reply: We think the lack of clarity might be related to the size of figures 5a and 5b and the density of the dots in some areas of the plot. We agree it is very difficult to assign our gene labels unambiguously to a single dot.

      Fig. 5a combines ATACseq data in wt and IRF1 knockout cells with the expression data from the Pro-seq experiment, Fig. 5b is the same set-up, but IRF9-deficient macrophages are analyzed.

      Blue dots show ATACseq signals induced by IFN treatment. Violet dots represent genes that require IRF1 (Fig. 5a) or IRF9 (Fig. 5b) for transcriptional induction. Yellow dots mark genes such as IFI44 requiring IRF1 (Fig. 5a) or IRF9 (Fig. 5b) for both expression and the accessibility change in the promoter region. Fig. 5c visualizes representative examples of genes whose accessibility is coupled to the transcription factor dependence of the transcriptional induction (IFI44), or not (Gbp2). Thus Fig. 5c must be interpreted based on the dot color code in fig. 5a and we admit this has been difficult with the figure in its present form.

      Revision plan: We will improve the clarity of figs 5a and 5b in several ways:

      • We will label the panels to better indicate the intersected data sets.
      • We will increase the size of the panels and figure legends and make sure that the correspondence between gene names and dots are unambiguous.
      • We will include trend lines of the Ifi44 and Gbp2 genes to visualize their induction and IRF1 dependence.

      Referee #3, major comment 13 (see also section 3):

      The authors have not adequately addressed the methodological limitations in their discussion, which extends beyond the aforementioned comments. It is suggested they include a comprehensive discussion of the claims made pertaining to the necessity of IRF1 for accessibility and the potential biases in the interactomes, along with their associated consequences.

      Reply: The contribution of IRF1 to the accessibility of ISG promoters emerges from the data in figure 5a, whose clarity will be improved (see reply to point 8). We do not interpret the impact of IRF1 beyond the data; in fact we state a relatively minor effect of IRF1 in the control of promoter accessibility (p. 10, lines 20-22) and we have added a reference in agreement with an impact of IRF1 on basal expression of antiviral genes (ref. 39, as suggested by the referee).

      We have added discussion on potential limitations of the TurboID approach (p. 11, lines 22-24 and p. 15, lines 3-11).

      Revision plan: Improvement of fig 5a (see ref. #3, point 8).

      Referee #3, minor comment 2

      Fig 1e. The color scales on the GO enrichment graphs are misleading since they use the same blue-to-red gradient for adj p-values ranging from 10^-25 to 10^-49 and 0.008 to 0.016, which could be considered non significant.

      Reply: We agree that this is confusing. It results from automated assignments of the color gradients by the software.

      Revision plan: We will investigate possibilities to change color codes for different ranges of p values.

      Referee #3, minor comment 4

      The incomplete schema in Figure 1a, which only focuses on PRO-seq and does not include the ATAC-seq element.

      Reply: We will add a new figure to visualize the set-up of the ATAC seq experiments and their intersection with the Pro-seq data.

      Revision plan: We will add a new figure in accordance with the referee’s request.

      Referee #3, minor comment 6

      The clearer labeling of Figure 5a and 5b.

      Reply: Please refer to our reply to major point 8.

      Referee #3, minor comment 10

      Fig S1b, S3b. The PRO-seq was generated in triplicates, hence these graphs should include the Log2FC for the individual data points.

      Reply: The Log2FC from DESeq2 were calculated from the triplicates, the software does not compute Log2FC from individual replicates.

      Revision plan: We mention the p-values for the Log2FC to show the degree of consistency (figure legends). We will provide a table with log2FC and corresponding padj values of the genes represented at each timepoint (table_showing_padj_values_and_log2fc).

      Referee #3, minor comment 12

      In the genomic snapshot shown, only bars or fading triangles are shown in place of the gene body. The authors should provide an accurate gene structure; i.e., exons and introns.

      Reply: We will try to include the exon-intron structure wherever the size of the figure allows this.

      Revision: n. a.

      Revision plan: If figure size permits, we will add the exon-intron structure of the genes in browser tracks as requested.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Referee #1, major comment 1

      Figure 2. Difficult to interpret data as it is presented. Consider quantifying figure 2C in order to make "changes in Pol II pausing were more pronounced during IFNb signaling" statement more apparent.

      Reply: We presented the pausing data in two different graphic representations (figures 2c and S2) to make the understanding of the information content easier. In hindsight we may have generated more confusion than clarity.

      Revision: We removed the original figure 2c and replaced it with original figure S2. This representation is quite intuitive, as the graphs provide a direct quantitative, logarithmic display of whether and by how much the relative amount of paused polymerase changes when comparing IFN-treated and untreated cells. The calculation of these ratios is now explained better in the legend to figure 2.
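
      The kind of log ratio described in this revision can be illustrated with a minimal sketch. It assumes a commonly used definition of the pausing index (promoter-proximal read density divided by gene-body read density); the authors' exact calculation is the one given in their legend to figure 2 and may differ. Function names, window lengths and read counts are hypothetical.

```python
from math import log2


def pausing_index(promoter_reads, promoter_len, body_reads, body_len):
    """Promoter-proximal read density divided by gene-body read density
    (an assumed, commonly used definition)."""
    return (promoter_reads / promoter_len) / (body_reads / body_len)


def log2_pausing_change(treated, untreated):
    """log2 of the pausing index in treated cells relative to untreated cells.

    Negative values mean relatively less paused polymerase after treatment
    (pause release); positive values mean relatively more.
    """
    return log2(pausing_index(**treated) / pausing_index(**untreated))


# Hypothetical read counts for one gene.
untreated = dict(promoter_reads=200, promoter_len=100,
                 body_reads=1000, body_len=5000)
treated = dict(promoter_reads=200, promoter_len=100,
               body_reads=4000, body_len=5000)
change = log2_pausing_change(treated, untreated)
```

      In this toy case the promoter-proximal signal is unchanged while the gene-body signal rises fourfold, so the pausing index drops to a quarter and the log2 change is negative.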

      Referee #1, major comment 2

      How are you distinguishing autocrine signaling in the BMDMs driven by IFN treatment from late transcripts (for example, at 48 hours are differential genes due to autocrine cytokine signaling or are they truly late transcripts)?

      Reply: We do not exclude autocrine effects. In the case of ISG, the most likely autocrine factor would be secreted interferon. According to our PRO-seq data, the differentially expressed genes do not include any interferon genes. That being said, it is possible that the transcription factors from the AP1 family we hypothesize as drivers of secondary or tertiary waves of transcription are activated by non-IFN cytokines secreted from IFN-treated cells (see also reply to comment 3).

      Revision: We now mention that enhanced IFN production is not sustaining ISG responses (p.5 lines 18/20). We mention the possibility that secreted factors may drive secondary or tertiary waves of ISG transcription (p. 8, lines 21/23).

      Referee #1, major comment 3

      Figure 3D. Authors choose Gbp2 (as positive control for IFNg driven gene), but don't show that Gbp2 is an IFNb-independent gene. Consider using IRF1 KO BMDMs in this data as well.

      Reply: This is a misunderstanding. Gbp2 is not shown as an IFNγ-specific gene (its induction by both IFN types has been shown previously and emerges from our PRO-seq analysis, see also response to minor issue no. 2). It represents the cluster of genes that are sustained specifically after IFNγ treatment in an IRF1-dependent manner. The purpose of fig. 3D is to show that not all ISGF3/IRF9-dependent genes have promoter binding sites for ISGF3 and not all IRF1-dependent genes have binding sites for IRF1. This suggests indirect effects of both transcription factors in sustaining IFN-induced transcription (in line with the referee’s comment 1).

      Previous figure S3e (now S2f) confirms binding of IRF1 to the GBP2 promoter by ChIP with kinetics correlating to its transcriptional effect. This experiment is normalized with an IgG control. IRF1 knockout cells did not produce a ChIP signal with IRF1 antibody, as expected (data not shown).

      Revision: We better explain the rationale behind the experiments shown in figure 3D (text on p8, lines 12-16). In addition, we show the trend line of Gbp2 expression in WT vs IRF1KO as well as that of additional genes showing delayed/sustained responses in the new Figure S3.

      Referee #1, minor comment 2

      Define known IFNg and IFNb driven genes when they are introduced in figure 2 rather than in discussion.

      Reply: Following the referee’s suggestion we provide the examples of IFNβ and IFNγ-controlled genes and the characteristics of their regulation in the context of our description of the results displayed by fig. 2 (p.6 lines 15-21). This includes Gbp2 (see major issue no. 3).

      Revision: The text on p. 6 lines 15-21 has been modified in accordance with the request.

      Referee #1, minor comment 4

      Unclear whether IRF1 expression in figure 3A is from whole cell lysate or nuclear fraction.

      Reply: We indicate in the figure legend that whole cell lysates were used.

      Revision: We added a sentence with the relevant information in the legend of figure 3.

      Referee #1, minor comment 5

      Authors suggest IFNb treatment induces less IRF1 at later time points, however loading control also seems slightly lower than in other conditions. Is it possible that IFNb treated cells are dying at later time points, given that type I IFN signaling can be pro-apoptotic?

      Reply: The graph below the blot represents quantified IRF1 signals, normalized to the loading control. It shows that the differences are not generated by unequal loading of the blotted gel. We and others have shown that IFNβ may indeed enhance macrophage death, however only when the cells are simultaneously infected with an intracellular pathogen (e.g. new ref. 25). These studies also show that treatment with IFNβ alone over periods used in the present study does not affect macrophage viability.

      Revision: We added a sentence about the viability of IFN-treated macrophages (p. 4, lines 31-32).

      Revision plan: n. a.

      Referee #2, major comment 3

      The sequencing and BioID data are not submitted to public databases.

      Reply: An accession number has been added.

      Revision: The accession number was added on p.29, line 25.

      Referee #3, major comment 1 (see also revision plan, section 2):

      Revision: The rationale for using the top 1,000 genes is explained (p. 5, lines 7-9). The description of the PRO-seq read count processing has been extended in accordance with our reply to the referee in the legend of figure 1d and in the methods section (p. 33, lines following line 10).

      Referee #3, major comment 2

      Fig 2c. The authors claim that RNA Pol II pausing is a major factor in controlling the dynamics of ISG transcription. However, they did not provide sufficient explanation of the results, and in all fairness there is not much variation between the clusters to sustain the claim that this is a major factor in ISG transcriptional control.

      Reply: We agree with the referee that we cannot posit RNA pol II pausing as a major factor for the differences of transcriptional control of ISG in individual clusters. We have made sure to remove any statements suggesting this possibility. We also try to better integrate our findings with RNA pol II pausing into the existing literature.

      Revision: We added relevant literature on p. 6 lines 28-30 and p. 7, lines 4-6.

      Referee #3, major comment 4

      On p.5, the authors mention "Representative browser tracks from the Gbp2 and Slfn1 genes further validate this observation" but they are simply referring to genome browser snapshot, i.e., specific genomic examples, extracting from the same single dataset. Without using an independent dataset, this can not "further validate" the initial findings.

      Reply: We agree the wording is incorrect.

      Revision: We changed the paragraph describing this experiment (p. 6, lines 15-21).

      Referee #3, major comment 5

      IRF1 was successfully pulled down with STAT1 bait but not in the reciprocal experiment. The author should discuss this point as it is important for the conclusions. Could it potentially indicate issues with the technique used, and if this could introduce any bias into the results. The statement, "In contrast, interactors of the IRF1 bait did not include STAT1. This discrepancy could result from steric constraints of the tagged proteins due to the limitation of the 10nm distance reached by the biotin ligase," does not seem to be sufficient to explain this discrepancy.

      Reply: STAT1 was present in the IRF1 pull-down and the interaction increased significantly after IFN treatment, but after normalization to the NLS control it did not conform to our criterion of a 95% confidence interval for the FDR. To be consistent we did not include it in the list of IRF1 interactors. We have observed on several occasions that the significance of proximity is not reciprocal, even for well-documented physical interactions. A prime example for this is the interaction between STAT1 and IRF9 in IFN-treated cells which is recorded in the STAT1 pull-down, but not that with IRF9 (ref. 10). Apart from steric reasons the lack of reciprocity may result from different signal/noise ratios in pull-downs with different baits.

      Revision: We mention that IRF1 was a STAT1 interactor below the statistical cut-off (p. 11, lines 26-28) as well as the possibility of different signal/noise ratios in the IRF1 and STAT1 pull-downs on p.11, lines 22-24.

      Referee #3, major comment 9

      In the figure legends, there is missing information about the number of times experiments were replicated, suggesting that some were done a single time. Moreover, some graphs are missing statistical analysis, e.g., in Fig S3cS3e, S3f, the ChIP-qPCR experiments were done on biological triplicates, there is no mention of statistical test performed, it is not mentioned what the error bars represents (SD, SEM, etc.) and the variance is large, but the authors still interpret these results as significant enrichment of the transcription factors to the Mx2 promoter.

      Reply: Where missing the relevant information has been added to figure legends. In brief, all experiments represent at least three biological replicates. The only exception is the western blot shown in figure S3a, (no S2a) which represents two independent replicates. Here, the clarity of the difference of IRF1 expression and the fact that the only purpose is to show that Raw264.7 macrophages behave like bone marrow-derived macrophages in fig. 3a justifies the omission of another replicate (please see also answer to point 3).

      Revision: The relevant information has been added to figure legends where necessary (figs. 1, a, 3a, 6a-f, S1, S4, S5).

      Referee #3, major comment 10

      Another example are the RNA Pol II pausing index ratios, which show minor variations and are not supported by statistics to support a possible significance. Proper description, replication and statistical analyses of the results are critical.

      Reply: We agree.

      Revision: Statistics underlying the RNA Pol II pausing data are included in supplementary data 2.

      Referee #3, major comment 11

      The authors used CRISPR-Cas9 genome editing to generate knockout cell lines. However, they did not verify the knockouts at the protein level. Further experiments could confirm that the targeted proteins are not expressed in the knockout cell lines.

      Reply: We included a western blot showing the lack of IRF1 and STAT1 expression in the respective cell lines.

      Revision: New figure S6.

      Referee #3, major comment 12

      On p.9, it is mentioned "IRF1 affects chromatin structure ...". Here chromatin structure is related to minor changes in chromatin accessibility, this can not be qualified as changes in chromatin structure.

      Reply: ‘structure’ has been changed in accordance with the request.

      Revision: ‘structure’ has been replaced with ‘accessibility’ (p. 10, lines 19 and 21).

      Referee #3, major comment 13 (see also section 2, revision plan, major comment 8)

      The authors have not adequately addressed the methodological limitations in their discussion, which extends beyond the aforementioned comments. It is suggested they include a comprehensive discussion of the claims made pertaining to the necessity of IRF1 for accessibility and the potential biases in the interactomes, along with their associated consequences.

      Reply: The contribution of IRF1 to the accessibility of ISG promoters emerges from the data in figure 5a, whose clarity will be improved (see reply to point 8). We do not interpret the impact of IRF1 beyond the data; in fact, we state a relatively minor effect of IRF1 in the control of promoter accessibility (p. 10, lines 20-22) and we have added a reference in agreement with an impact of IRF1 on basal expression of antiviral genes (ref. 39, as suggested by the referee).

      We have added discussion on potential limitations of the TurboID approach (p. 11, lines 22-24 and p. 15, lines 3-11).

      Revision: Change of the discussion section (p. 11, lines 22-24 and p. 15, lines 3-11).

      Revision plan: Improvement of fig 5a (see ref. #3, point 8).

      Referee #3, major comment 15

      The work should be discussed in the context of the demonstrated physiopathological evidence of the IRF1 and IRF9 functions. IRF9 (Hernandez et al., JEM 2018) and more recently IRF1 (Rosain et al Cell, 2023) were identified as causing non overlapping phenotypes in human patients carrying loss-of-function mutations for these genes. The authors must interpret their results in this context.

      Reply: We thank the referee for reminding us about the importance of these papers for our work.

      Revision: The papers have been mentioned and discussed (p. 13 lines 19-28 and p.14, lines 9-14).

      Referee #3, minor comment 3

      The inconsistency in the title referring to IFNb as Type 1 but using IFNg instead of Type 2 nomenclature, perhaps consistency is best.

      Reply: We agree about the importance of consistency but find ourselves in yet another quandary. While the use of ‘type I IFN’ is clearly indicated and widely used as a collective name for this group of cytokines, the use of ‘type II IFN’ for IFNγ is rare because it is the only member of this type. Hence, we decided to stick with convention at the expense of a bit of consistency. We agree about the title, though, and have changed type I IFN to IFNβ.

      Revision: We adapted the title in agreement with the referee’s comment.

      Referee #3, minor comment 5

      Figure 6d includes a color scale of -1 to +3, but it is unclear what these values represent and how they were calculated per interactor. The figure legend should be revised to clarify this information.

      Reply: We agree. The relevant information has been added to the figure legend.

      Revision: We added information (log2FC with regard to the NLS control) to the legend of fig. 6d.

      Referee #3, minor comment 9

      Fig 1e, S1c. Graphs having circles of varying sizes in function of a value are named "bubble plots" and not "dot plots".

      Reply: Thank you for pointing this out, we corrected our mistake.

      Revision: We changed dot plot to bubble plot in legend to figure S1c.

      Referee #3, minor comment 11

      Fig S3c legend. It is mentioned "Graph represents RT-qPCR of genomic Mx2". RT-qPCR usually stands for reverse transcription quantitative PCR, hence we suggest to change to "ChIP-qPCR" or qPCR. Confusingly, in the literature the term "RT-PCR" is used for real-time PCR and "qPCR" for quantitative PCR. Also, the authors should be specific about the "genomic" region targeted; the graphs mention "promoter", hence it would be appropriate to use the same designation in the legend.

      Reply: We agree and thank the referee for correction of the terminology.

      Revision: We changed RT-PCR to qPCR throughout the manuscript. Moreover, we specifically refer to ‘promoter region’ as the amplified DNA.

      Referee #3, minor comment 12

      Fig S3e. The y-axis names are missing.

      Reply: Thanks for spotting this.

      Revision: The y axis in the figure received its proper label.

      Referee #3, minor comment 14

      Raw cells are sometimes spelled as "Raw" and other times as "RAW". Please choose one for consistency.

      Revision: This inconsistency has been corrected.

      Referee #3, minor comment 15

      In p.10 l.20, the figure number is missing.

      Revision: We corrected this mistake.

      4. Description of analyses that authors prefer not to carry out

      Referee #1, minor comment 1

      Simplify figure 4B- consider focusing on most differentially expressed genes between clusters

      Reply: The purpose of fig. 4B is to provide a visual overview of the kinetics of eRNA transcription in response to both IFN types and of the effects of IRF9 and IRF1 knockouts. This information needs to be given to demonstrate the similarities and differences between the control of eRNA and the corresponding ISG transcripts in the different regulatory clusters (as shown in figs. 1d and 2a).

      Simplifying the figure would mean separating it according to time point, IFN type, or knockout effect. We think this would require the reader to mentally reassemble the figure to understand the interrelationships between these parameters. In our opinion, the visual display of these interrelationships in fig. 4B makes the information content easier to grasp.

      Revision: n. a. - we hope our reasoning has become sufficiently clear.

      Revision plan: n. a.

      Referee #1, minor comment 3

      Clarify which cell types (IRF1 KO vs IRF9 KO) are used in figure 5 A/B.

      Reply: The cell type (bone marrow-derived macrophages) is mentioned in the first sentence of the figure legend. Since all experiments except the Bio-ID experiment were performed with this cell type we decided not to label each figure.

      Revision: n. a.

      Revision plan: n. a.

      Referee #2, major comment 2 and referee #3, major comment 14

      Ref #2: Biological significance is limited as this study is largely descriptive and they do not test the hits obtained from BioID.

      Ref #3: Although the TurboID experiments identify known STAT1 and IRF1 interactors, the proposed new interactors are numerous, and none are validated through independent co-IP experiments. Moreover, the results are very noisy, with little differences between untreated BMDMs (where IRF1 is barely expressed) and IFN-treated conditions.

      Reply: The big advantage of BioID or TurboID is the ability to score proximity and very transient interactions. Validating BioID hits with technologies such as coIP is not particularly useful as the two technologies will obviously produce different interactomes. In fact, we show in this manuscript that IRF1 and STAT1 show proximity, but they do not form a stable complex under co-IP conditions. This leaves genetic approaches (LOF or GOF) as alternatives. However, apart from the workload (> 100 genes would have to be knocked out or their products overexpressed), most of our hits are expected to produce very broad effects in such experiments, hard to interpret regarding ISGF3 and IRF1 activities.

      In view of this situation, we publish exclusively the high confidence nuclear interactors identified in our screen: biological replicates were performed in triplicate, a stringent internal control (TurboID-NLS) was used, and a stringent statistical cut-off for high-confidence interactors (95% FDR between groups) was applied. We further account for the experimental situation by limiting interpretation of the data to confirmed molecular events. For example, STAT1 dimers and the ISGF3 complex are required for histone acetylation in response to IFN, and ISGF3 is known to contribute to the exchange of the H2AZ histone variant (refs 11, 14, 71, 72). Our data show that IRF1 contributes to promoter accessibility changes and this is in line with its proximity to a remodelling complex. Thus, the BioID data indeed validate previous findings. However, in agreement with the referee’s comment, some of the data remain descriptive (such as the intriguing proximity of both STAT1 and IRF1 to nuclear products of ISG). To determine the importance of this molecular proximity is a major undertaking and beyond the scope of this study.

      Revision: We added discussion to state the difficulty of validating TurboID-based interactions and the limitations of the TurboID experiments (p.15 lines 3-11).

      Referee #3, minor comment 1

      In most graphs the expression values or log2FC are shown separately for IFNb and IFNg, however in the heatmaps (Fig 1d, S1d) the IFNb and IFNg results are intercalated keeping them side-by-side for each time point, which makes them more difficult to interpret.

      Reply: We are in a quandary about the design of the figure. On the one hand our goal is to visualize gene clusters with distinct behaviors for each IFN type. For this purpose, it would be advantageous to separate the IFN types. On the other hand, we aim at showing similarities and differences between genes induced by each IFN type, for this purpose it is better to maintain the current sample order. While understanding the referee’s point, we prefer to keep the figure as it is, because the suggested change will not increase its overall clarity.

      Revision: n. a.

      Revision plan: n. a.

      Referee #3, minor comment 7

      The statement that "IFN-I are the more important mediators of antiviral immunity" is not entirely accurate and may be an oversimplification, as there are certainly articles which suggest a larger role for type II IFN elements than type I (ref: Yamane D et al., 2019 Nature microbiology). While yes, IFN-I plays a critical role in the innate immune response to viral infections, IFNγ also has antiviral activity and is involved in the adaptive immune response to viral infections, and in some instances to a larger extent than IFN-I.

      Reply: The Yamane et al. study (now mentioned on p. 10, lines 22-25 and referenced) agrees with our findings because it shows that IRF1 contributes to the basal expression of an ISRE-driven ISG subset. Our statement about the predominant role of type I IFN versus IFNγ refers to genetic data in both humans (mainly Casanova’s work including effects of autoantibodies against type I IFN, see also the paper about human STAT2 deficiency in the June 15th issue of the JCI, https://doi.org/10.1172/JCI168321) and mice (hundreds of papers) showing that disruption of type I IFN synthesis or response causes profound effects on antiviral immunity (i.e. resulting susceptibilities are first and foremost to viral pathogens), whereas susceptibilities as a consequence of disrupting the IFNγ pathway are first and foremost to intracellular nonviral pathogens such as mycobacteria. In fact, the term mendelian susceptibility to mycobacterial disease (MSMD) was coined by Casanova and colleagues to describe a variety of human mutations that include those of the IFNγ, but not the type I IFN pathway.

      Maybe more importantly, the Rosain et al. paper mentioned by the referee which appeared in ‘Cell’ while our study was under review, shows that human IRF1 mutations also fall into the MSMD category (new ref. 65). In contrast, the authors did not observe diminished antiviral immunity. This emphasizes the main conclusions of our study about the relevance of IRF1 for macrophage activation. We discuss this paper on p 14. lines 9-14.

      Obviously, this does not exclude a role of type I IFN in nonviral infection or of IFNγ in viral infection, in fact much of our own work has been dedicated to a role of type I IFN in infections with L. monocytogenes. Nevertheless, we think that in a generic statement about the difference between type I IFN and IFNγ it is correct to label the former as predominantly antiviral and the latter predominantly as a macrophage activating factor against nonviral, intracellular pathogens.

      Revision: We added discussion of Rosain et al. (ref. 65) on p 14. lines 9-14.

      Referee #3, minor comment 8

      The authors claim that a significant portion of ISG promoters is associated with ISGF3 upon IFNγ receptor engagement and that the transcriptomes of macrophages treated briefly with IFNβ or IFNγ exhibit remarkable similarity and sensitivity to Irf9 deletion. However, I am uncertain about the extent of consensus on this claim.

      Reply: The data were surprising but supported by ChIP-seq and RNA-seq in wt and IRF9 ko macrophages (ref 10). Data in a follow-up study (ref. 11) and in this manuscript support our original conclusion by demonstrating the impact of the IRF9 ko on IFNγ responses. Importantly, we don’t claim this is true in all cell types, it may well depend on STAT/IRF9 expression levels and tonic IFN signaling.

      Revision: n. a.

      Revision plan: n. a.

    1. Background: Polygenic risk score (PRS) analyses are now routinely applied in biomedical research, with great hope that they will aid in our understanding of disease aetiology and contribute to personalized medicine. The continued growth of multi-cohort genome-wide association studies (GWASs) and large-scale biobank projects has provided researchers with a wealth of GWAS summary statistics and individual-level data suitable for performing PRS analyses. However, as the size of these studies increases, the risk of inter-cohort sample overlap and close relatedness increases. Ideally sample overlap would be identified and removed directly, but this is typically not possible due to privacy laws or consent agreements. This sample overlap, whether known or not, is a major problem in PRS analyses because it can lead to inflation of type 1 error and, thus, erroneous conclusions in published work.

      Results: Here, for the first time, we report the scale of the sample overlap problem for PRS analyses by generating known sample overlap across sub-samples of the UK Biobank data, which we then use to produce GWAS and target data to mimic the effects of inter-cohort sample overlap. We demonstrate that inter-cohort overlap results in a significant and often substantial inflation in the observed PRS-trait association, coefficient of determination (R^2) and false-positive rate. This inflation can be high even when the absolute number of overlapping individuals is small if this makes up a notable fraction of the target sample. We develop and introduce EraSOR (Erase Sample Overlap and Relatedness), a software for adjusting inflation in PRS prediction and association statistics in the presence of sample overlap or close relatedness between the GWAS and target samples. A key component of the EraSOR approach is inference of the degree of sample overlap from the intercept of a bivariate LD score regression applied to the GWAS and target data, making it powered in settings where both have sample sizes over 1,000 individuals. Through extensive benchmarking using UK Biobank and HapGen2 simulated genotype-phenotype data, we demonstrate that PRSs calculated using EraSOR-adjusted GWAS summary statistics are robust to inter-cohort overlap in a wide range of realistic scenarios and are even robust to high levels of residual genetic and environmental stratification.

      Conclusion: The results of all PRS analyses for which sample overlap cannot be definitively ruled out should be considered with caution given high type 1 error observed in the presence of even low overlap between base and target cohorts. Given the strong performance of EraSOR in eliminating inflation caused by sample overlap in PRS studies with large (>5k) target samples, we recommend that EraSOR be used in all future such PRS studies to mitigate the potential effects of inter-cohort overlap and close relatedness.

      This work has been peer reviewed in GigaScience (see Description), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      **Jack Pattee**

      Overall, I think that this manuscript is strong and describes a well-formulated method to address a relevant problem. There are a few outstanding questions about the performance of the EraSOR method from my perspective, which I'll detail as follows.

      My understanding of reference [16] indicates that equation (3) of this manuscript only holds for null SNPs, i.e. if SNP g is not associated with the outcome Y. If this is the case, then this should be discussed in the manuscript. I wonder if this can partially explain the 'under-estimation' behavior we see in the application to real data in Supplementary Figure 3. In particular, I am referencing the behavior where the EraSOR correction will under-estimate the predictive accuracy of the PRS in the target data, i.e. where delta-R^2 is negative. This behavior is not seen in the simulation and warrants further investigation and discussion. While the bias appears small, for some cases delta-R^2 approaches -.025, which corresponds to an under-estimation of Pearson's r by roughly .15; this is substantial. Could it be the case that, for highly polygenic traits such as height and BMI, the null-SNP assumption is unreliable and the performance of EraSOR is degraded? Does a fundamental assumption of sparse genetic association underlie EraSOR?

      I recommend that the real data application play a larger role in the manuscript narrative and be moved out of the supplementary. The simulations are appreciated and helpful, but there is nuance in the analysis of real data that cannot be replicated in simulation.

      I believe the reference to "Supplementary Figure 2" on line 346 should actually be "Supplementary Figure 3". I believe that the axis labels in Supp Figure 3 are flipped.

      Lines 82 and 83 reference genetic stratification and subpopulations; I think the relevance of these concepts should be introduced more clearly and they should be defined in this context. EraSOR concerns the overestimation of predictive accuracy and association incurred by sample overlap between the base and target GWASs; to this reader, it's not clear what this central issue has to do with population stratification. I realize that the derivation of the LD score method is motivated heavily by correcting for stratification; however, these concepts should be introduced more clearly in this manuscript.

      Line 88: consider defining LD score l_j.

      Lines 94-96: consider outlining the mathematical consequence of the assumption that "the two outcomes and cohorts are identical." It's the case that N_1 = N_2 = N_c = N, correct?

      Line 109 / equation (11): My understanding is that the relevant quantity of this derivation is N_c / sqrt(N_1 N_2), which allows us to define the correct matrix C in expression (4). If this is the case, perhaps the quantity of interest should be moved to the LHS of the equation in the final line of the expression, for clarity.

      As discussed in the manuscript, the estimated heritability is in the denominator of the expression for N_c / sqrt(N_1 N_2). The authors correctly discuss that the method should not be applied when there is doubt as to whether the heritability is different from zero. I would take this a step further; in cases where the heritability is zero, we cannot meaningfully apply the EraSOR correction, and thus I am not sure of the utility of the 'type I error' simulations in the manuscript. Perhaps an explicit test for h^2 > 0 should be worked into the EraSOR workflow?

      Line 148 / expression (12): If beta has a normal distribution here, it is the case that all SNPs in the simulation are associated with the outcome Y. This is a somewhat unusual choice for the distribution of SNP effects in a simulation; other applications such as LDPred (Vilhjalmsson et al, AJHG 2015) and LassoSum (TSH Mak et al, Genetic Epi 2017) use a point-normal distribution for simulated SNP effects, which effectively simulates the sparsity frequently observed in nature. Is there a reference or justification for the non-sparse simulation structure here?

      Line 215: there may be a typo in the expression for the variance of the residual term. Is it the case that the variance of the residual depends on the variance of a covariance term? If so, I am confused as to the derivation.

      Line 241: 'triat' should be 'trait'.

      The simulation results in this paper are based on clumping and thresholding for PRS, which does not estimate joint SNP effects i.e. account for LD. Methods such as LDPred and LassoSum do so. Is there any reason to believe the results would be different for a method such as LassoSum?

      I am confused by the very low Fst between the simulated Finnish and Yoruban samples in simulation. As detailed on line 385: the reported Fst is > .1, but the simulated Fst is essentially zero. This seems likely to be an undesirable simulation artefact, and potentially invalidates the simulation study (or, at least, doesn't provide evidence that EraSOR functions correctly when Fst is large, which was the ostensible motivation for this simulation). Is there no way to effectively simulate populations with a larger Fst?
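
      To make the comment on expression (12) concrete, the difference between a dense normal effect-size model and a point-normal model can be sketched as below. This is only an illustration under stated assumptions: the function name `simulate_effects`, the parameter `causal_fraction`, and the scaling of the effect variance to h2 / (n_snps * causal_fraction) (which presumes standardized, independent SNPs) are hypothetical and are not taken from the EraSOR manuscript, LDPred, or LassoSum.

```python
import math
import random


def simulate_effects(n_snps, h2, causal_fraction=1.0, seed=0):
    """Draw per-SNP effects from a point-normal prior (illustrative only).

    Each SNP is causal with probability `causal_fraction`; causal effects are
    N(0, h2 / (n_snps * causal_fraction)) and all other effects are exactly 0.
    With causal_fraction=1.0 this reduces to the dense normal model, in which
    every SNP is associated with the outcome.
    """
    if not 0.0 < causal_fraction <= 1.0:
        raise ValueError("causal_fraction must be in (0, 1]")
    rng = random.Random(seed)
    # Assumed scaling: expected sum of squared effects equals h2.
    sd = math.sqrt(h2 / (n_snps * causal_fraction))
    effects = []
    for _ in range(n_snps):
        if rng.random() < causal_fraction:
            effects.append(rng.gauss(0.0, sd))  # causal SNP
        else:
            effects.append(0.0)  # null SNP
    return effects


# Dense model: every SNP carries a (small) non-zero effect.
dense = simulate_effects(1000, h2=0.5, causal_fraction=1.0)
# Sparse model: roughly 1% of SNPs are causal, the rest are null.
sparse = simulate_effects(1000, h2=0.5, causal_fraction=0.01)

n_causal_dense = sum(1 for b in dense if b != 0.0)
n_causal_sparse = sum(1 for b in sparse if b != 0.0)
```

      Under the dense setting there are no null SNPs at all, which is the situation in which the null-SNP question raised about equation (3) would matter most; the sparse setting leaves most SNPs null.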

    1. And I bid you all do likewise. In an ordinary crime, how does one defend the accused? One calls up witnesses to prove his innocence. But witchcraft is ipso facto, on its face and by its nature, an invisible crime, is it not? Therefore, who may possibly be witness to it? The witch and the victim. None other. Now we cannot hope the witch will accuse herself; granted? Therefore, we must rely upon her victims – and they do testify, the children certainly do testify. As for the witches, none will deny that we are most eager for all their confessions. Therefore, what is left for a lawyer to bring out? I think I have made my point. Have I not?

      The logic Danforth uses to justify and explain to Hale why a lawyer is not necessary in this instance is flawed.

      He states that witchcraft is "an invisible crime" at which only the witch and the victim are present, and that, since the witch will hardly accuse herself, the court must rely upon the victim to testify by identifying the witch in question.

      BUT what he fails to take into account here is that he is assuming that the "victims" are actually victims in the first place and that their accusations are true. He has no real evidence of this other than the girls' confessions. Danforth thus makes a big mistake in assuming that their accusations are valid and to be believed.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Srinivasan et al. present a comprehensive study on systematizing the structure-dynamics-function relation of lipid transfer proteins (LTPs), combining extensive molecular simulations and complementary experiments. Indeed, the current state-of-the-art in the field is quite chaotic and fractional, and such systematic studies are necessary to advance our general and conceptual understanding of the mechanisms of action of LTPs. The selected techniques and research strategies are all suitable, their description is sufficient and enables reproducibility; the obtained results are carefully presented and discussed; the conclusions are adequately supported by the data.

      Given my primarily computational background, I evaluated mainly the simulation part of the manuscript. Considering experiments, I do not see any significant flaws or deficiencies that could diminish the value of the data and following conclusions given in the manuscript. I would even suggest improving the abstract by more explicitly saying that this work includes experimental measurements because it currently reads like purely computational work was performed.

      We thank Reviewer #1 for the positive evaluation of our work. The abstract has now been updated to include that our work allows us to interpret existing data but also to design and perform new experimental measurements.

      Major comments:

      1) Although I like the central message of the paper and have no objections, I am curious whether the conclusion "a more "dynamic" or/and "mobile" part of the protein interacts with the membrane or any other (macro)(bio)molecule" makes sense globally and is not limited to LTPs. For example, it is a reasonable assumption that a more flexible part of the protein, i.e., capable of adopting necessary binding configurations, would be a more likely interacting spot. Locking in a less flexible and more specific configuration upon binding with a target molecule is also anticipated and quite typical, e.g., when ligands interact with target proteins, thereby blocking their function. The authors themselves recognize this paradigm as referring to the enzymes' dynamics. It would be great if authors could comment more on dynamics-function relation, referring to the existing literature, where such observations were/were not observed for different protein families. Performing simulations on proteins that do not exhibit such a feature and do not belong to LTPs, but, e.g., structurally similar to some of the studied LTPs, would be an excellent addition too, highlighting this signature characteristic of LTPs.

      We have now added a discussion comparing the mechanism we observe with those described for other proteins such as membrane transporters and receptors. Since those proteins are very different and have already been thoroughly characterized (including with molecular simulations), we don’t think that additional simulations are required. Also, concerning protein binding dynamics, we refer to the excellent review of Wade and coworkers: "Acc. Chem. Res. 2016, 49, 5, 809–815".

      "Notably, the conformational plasticity we observe for LTPs is reminiscent of other, previously described, functional protein mechanisms, including enzyme dynamics during catalysis (DOI: 10.1126/science.1066176), the alternating-access model of membrane transporters (https://doi.org/10.1038/nsmb.3179) or GPCR dynamics (https://doi.org/10.1021/acs.chemrev.6b00177). In all these cases, protein dynamics is strongly coupled to ligand binding (https://doi.org/10.1021/acs.accounts.5b00516) and protein function, be it for signaling, transport or enzymatic activity. Unlike in these fields, however, the contribution of structural and spectroscopic studies to uncover LTP dynamics remains quite limited, and our simulations provide an important contribution to fill this gap. We hope that our results will motivate researchers to increase efforts to experimentally quantify LTP conformational plasticity, e.g. by structural determination of LTPs in different states (or bound to different lipids) or by single-molecule spectroscopy studies."

      *Minor comments: *

      *

      1) Fig 1d. What is so special in Lysine compared to Arginine? Is there any disbalance in their presence in studied proteins? Any correlations between the binding affinity of certain amino acids and their overall presence on the protein surface? *

      Indeed, there is an imbalance between lysine and arginine residues in our proteins: the ratio in our dataset is Lys:Arg = 1.6:1. In addition, as described in Tubiana et al. (PLoS Comput Biol. 2022; 18(12): e1010346), lysine is preferred over arginine in peripheral membrane proteins, likely because it induces fewer perturbations in the lipid bilayer. Our data also agree with Tubiana et al. concerning the correlation between the abundance of specific residues on the protein surface and membrane binding.

      * 2) Fig S1. GM2A and TTPA seem to be irreversibly adsorbed to the membrane on the microsecond timescale in most replicas. Is anything special in these proteins? Did this affect the sampling of a claimed membrane-binding interface?*

      Our interpretation of the different adsorption profile of GM2A and TTPA is that these two proteins appear to have higher membrane affinity in our computational assay in comparison with the other proteins in our dataset. However, this has no effect on the membrane-binding interface as the proteins are still able to undergo significant tumbling before binding to the lipid bilayer, as demonstrated by the angle between the two main protein axes and the bilayer normal before membrane binding (Fig. S8 in Supplementary Information).
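The tumbling analysis mentioned above (the angle between a principal protein axis and the bilayer normal) can be illustrated with a minimal sketch. This is not the analysis code used for Fig. S8: the function names are hypothetical, the bilayer normal is assumed to lie along z, and the principal axis is obtained by simple power iteration on the coordinate covariance matrix.

```python
import math

def principal_axis(coords, iters=200):
    """Unit vector along the longest axis of a point cloud (power iteration).

    Assumes the cloud has one clearly dominant axis and that the start
    vector is not orthogonal to it.
    """
    n = len(coords)
    center = [sum(c[i] for c in coords) / n for i in range(3)]
    centered = [[c[i] - center[i] for i in range(3)] for c in coords]
    # 3x3 covariance matrix of the centered coordinates
    cov = [[sum(p[i] * p[j] for p in centered) / n for j in range(3)]
           for i in range(3)]
    v = [1.0, 0.7, 0.4]  # arbitrary start vector
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def tilt_angle(axis, normal=(0.0, 0.0, 1.0)):
    """Angle in degrees between an axis and the membrane normal.

    The sign of a principal axis is arbitrary, so the angle is folded
    into the range [0, 90].
    """
    dot = abs(sum(a * b for a, b in zip(axis, normal)))
    len_a = math.sqrt(sum(a * a for a in axis))
    len_n = math.sqrt(sum(b * b for b in normal))
    cos_theta = min(1.0, dot / (len_a * len_n))
    return math.degrees(math.acos(cos_theta))

# Toy example: a rod of points lying in the membrane plane (along x)
rod = [(float(i), 0.0, 0.0) for i in range(10)]
print(tilt_angle(principal_axis(rod)))
```

Applied frame by frame to a trajectory, a broad spread of this angle before the first membrane contact would be consistent with the extensive tumbling described in the reply.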

      * 3) A related follow-up question. Multiple replicas were performed to identify the membrane-binding interface. However, if I understand well, the initial orientation of the protein with respect to the membrane was always the same. I found it a pity since performing multiple replicas starting from different initial geometries (e.g., rotating the protein in a somewhat systematic way) would likely result in a more efficient exploration of the conformation space. Can the authors comment on whether this predefined initial configuration could negatively affect the results? Performing a few additional simulations for the most problematic proteins I mentioned earlier (GM2A and TTPA) could be a nice opportunity to apply this strategy. *

      In our protocol, all proteins start from the same initial orientation but undergo significant tumbling in solution before interacting with the lipid bilayer, including the two most extreme cases, GM2A and TTPA (Fig. R1). Hence, we think that there is no bias with respect to the final membrane-interacting region. We have added Fig. R1 to the Supplementary Information (Fig. S8) and added the following text in the Methods section:

      "Despite starting from a single orientation, all proteins undergo extensive tumbling before binding to the bilayer, as illustrated by the angle between the two principal protein axes and the membrane normal for the two proteins that display the highest binding propensity, GM2A and TTPA (Fig. S8)."

      * 4) How was the volume of the cavity affected by mutations in STARD11 and Mdm12? Do these data somehow correlate with the experimentally observed reduced efficiency of the lipid transfer? *

      Our data on the volume of the cavity in STARD11 and Mdm12 are inconclusive. However, we caution against such a simplistic interpretation, since it completely neglects the lipid-bound conformation, which normally has a much larger cavity than the apo form (Fig. 3).

      *5) I would appreciate it if the authors considered playing with the templates of the main Figures at later stages because in the current version, and when printed on A4 paper, the readability of certain graphs and pictures is uncomfortable and sometimes even impossible. Obviously, the final schematics would depend on the journal and its formatting. *

      We will modify the templates of the main Figures to improve readability according to journal formatting.

      * **Referees cross-commenting** *

      * I would like to acknowledge the thoughtful and detailed reviews provided by other reviewers. I do like their reports, and I believe that by addressing the reviewers' comments and incorporating their revisions, the article will significantly improve in terms of scientific rigor and contribution to the field. *

      *Reviewer #1 (Significance (Required)):

      This manuscript is a solid scientific work addressing gaps in our knowledge about Lipid Transfer Proteins by employing state-of-the-art methods. It advances the field on conceptual and fundamental levels. This study is of interest to both computational biophysicists and physical chemists (to whom I belong myself) as well as experimentalists, who seek a rational explanation of the experimental observations. *

      We thank the reviewer again for the positive evaluation of the significance of our work.

      Reviewer #2* (Evidence, reproducibility and clarity (Required)): *

      * Summary:

      In a combined computational and experimental study, the authors provide insights into general features of lipid transfer proteins (LTPs), which play key roles in lipid trafficking: Through molecular dynamics simulations of a diverse set of 12 shuttle-like LTPs, they demonstrate that LTPs consistently exist in an equilibrium between two or more conformations, whose populations are modulated by a bound lipid, and that residues significantly involved in these collective conformational changes typically interact with a membrane. Their simulations indicate that conformational plasticity is a general feature of LTPs, leading them to suggest that the ability to change conformations is essential for LTP function. They test the generality of this hypothesis through in cellulo assays of two LTPs (STARD11 and Mdm12) that were not originally simulated. While experiments of STARD11 support their hypothesis, those presented for Mdm12 provide ambiguous results. *

      *

      Major comments: *

      * Throughout the manuscript, it's stated that common 'dynamical features' correlate with LTP function. The accuracy of this statement is unclear since 'dynamical features' are never precisely defined and, while equilibrium conformational ensembles are characterized, dynamics (ie kinetics or time-dependent observables) are not. Please clarify.*

      We plan to improve the scholarly presentation of our article to clarify this issue. In short, two distinct properties modulate protein function: 1. Conformational plasticity, i.e. the (thermodynamic) ability of the protein to adopt different conformations (and with different populations depending on the bound substrate). 2. Conformational “dynamics”, i.e. the propensity to exchange between these different thermodynamic states. This ability depends on the free energy barriers between different states and it is intrinsically a kinetic (rather than thermodynamic) property.

      *More importantly, further evidence is needed to determine a correlation with *function*. LTPs are suggested to have faster transfer rates (a measure of function) if the apo form adopts a substantial population of holo-like conformations, akin to enzyme preorganization. This is further tested by rationally mutating STARD11 and Mdm12. However, the support for this conclusion and if these mutations alter the LTPs conformational ensembles as desired is unclear: *

      In our opinion, the interpretation suggested by Reviewer #2 that there is a “correlation” between transfer rates and the overlap of apo-like and holo-like conformations, though fascinating, cannot be derived from the available data at this stage, and we did not mean to imply this. Rather, lipid transport is a complex phenomenon that involves several steps (membrane binding/unbinding, lipid uptake/release, …). Our simulations indicate that protein conformational plasticity, including potentially the overlap between apo-like and holo-like conformations, also influences lipid transfer rates. We will clarify this aspect in the text.

      * Is there a quantitative correlation between the overlap of apo and holo conformational distributions (as could be quantified by KL divergence or Wasserstein distance, for example) and difference in transfer rates as suggested by Fig S6?*

      We plan to compute a quantitative correlation between the apo and holo conformational distributions for Fig. S6 and for the mutant simulations (see answer below) but, as discussed above, we are skeptical that we will observe a clear correlation.
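As an illustration of the two overlap measures named by the reviewer (KL divergence and Wasserstein distance), the following is a minimal sketch for one-dimensional samples of a collective variable. The function names, the binning, the pseudocount and the toy values are all assumptions made for this example and do not come from the manuscript.

```python
import math

def histogram(samples, lo, hi, bins, pseudocount=1e-9):
    """Normalized histogram with a small pseudocount so that no bin is empty."""
    counts = [0.0] * bins
    width = (hi - lo) / bins
    for x in samples:
        idx = min(int((x - lo) / width), bins - 1)
        counts[max(idx, 0)] += 1.0
    total = sum(counts) + pseudocount * bins
    return [(c + pseudocount) / total for c in counts]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions defined on the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def wasserstein_1d(a, b):
    """1-Wasserstein distance between two equal-size 1D samples.

    For equal sample sizes this is the mean absolute difference
    between the sorted values.
    """
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-size samples")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

# Synthetic values standing in for a collective variable (illustrative only)
apo = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5]
holo = [0.4, 0.5, 0.6, 0.6, 0.7, 0.8]
p = histogram(apo, 0.0, 1.0, bins=5)
q = histogram(holo, 0.0, 1.0, bins=5)
print(kl_divergence(p, q), wasserstein_1d(apo, holo))
```

KL divergence is asymmetric and sensitive to the choice of bins and pseudocount, whereas the Wasserstein distance works directly on the samples; reporting both would make any comparison with transfer rates less dependent on these choices.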

      * The conclusion and the generality of the findings would be greatly strengthened if a correlation can be shown for other LTPs through additional simulations of mutants whose transfer rates have been previously characterized experimentally in the literature. (For example: Ryan 2007 PMID 17344474, Grabon 2017 PMID 28718450, Iaea 2015 PMID 26168008, among many others)*

      We are currently running simulations of several mutants to address this point and provide additional data/context.

      * While differences in the apo conformational ensembles of the WT and mutants are observed in Fig S7b and d, if these mutations reduce overlap with holo-like conformations is not determined. Simulations of the WT holo forms are needed to properly test this hypothesis. *

      We are currently performing these simulations.

      • For Mdm12, mutations are specifically made to "lock the protein in the apo-like state;" however, the mutant adopts conformations distinct from the apo form as shown in Fig S7d. How do the authors interpret the results of the cellular assays considering this and could it help explain why the mutant has similar kinetics to WT? What may explain the puzzling results of similar transfer kinetics but differing mitochondrial morphology? *

      As discussed above, interpretation of lipid transport rates based exclusively on apo and holo conformational populations is premature, as this is a complex mechanism that depends on many variables. Regarding the experimental results, we think three explanations are possible: 1. Mitochondrial morphology could be more sensitive to small variations in lipid composition than our METALIC assay. 2. Our assay only quantifies transport of unsaturated PC and PE species, and we can’t quantify variations in transport of other lipid species that are likely to also be transported by ERMES, such as PS and PA. 3. According to a recent structural model (Wozny et al, Nature 618, 188–192, 2023), Mdm12 might be part of a tunnel-like LTP complex in which it doesn't establish direct interactions with nearby organellar membranes. As such, its mechanism might be different from the one described here for other shuttle-like lipid transport domains. We will discuss these possibilities in the main text.

      • Confounding factors potentially complicate the interpretation of the in cellulo experiments. Simpler in vitro experiments may be better suited to determine if altering LTP's biophysical properties, namely rationally altering the population of apo- vs holo-like configurations, quantitatively affects transport rates as suggested.*

      We agree with Reviewer #2 that this information could be useful. However, this is beyond our technical abilities, and it would require lengthy and expensive experiments that are unlikely to be completed within a reasonable time frame for a revision (3 months). We have instead opted to better discuss our model in the context of published in vitro lipid transport experiments.

      • The abstract, intro, and title highlight that the manuscript's findings are indicative of and correlated with *function* but on p. 12 it's foreseen "that future studies will focus on the functional consequence of such observation." Please reconcile these conflicting statements and ensure connections to function are accurately described. The current title is rather bold. *

      We will rewrite and clarify the extent of our hypotheses and validations.

      * All mentions of "correlation" throughout the manuscript need to be quantitatively evaluated or properly qualified. In addition to that mentioned above regarding Fig S6, what is the correlation coefficient between residues' contribution to PC1 and membrane interaction frequency (Fig 2)? *

      To address this point, we will quantify the correlation between residues' contribution to PC1 and membrane interaction frequency. However, we expect a low correlation between residues' contribution to PC1 and membrane interaction frequency for at least two main reasons. First, not all residues contributing to PC1 interact with membranes, but only a subset, as discussed above. Second, our methodology to compute membrane binding, based on the geometric distance between residues and bilayer, is intrinsically quite noisy (since residues in proximity of bona fide membrane binding regions will also appear as involved in membrane binding), thus making quantification of correlations somewhat inaccurate. Rather, we will try to explain in the text that our observations are not of "correlation" but rather of dependence/association, and we will use quantitative measures to quantify these properties (such as rank correlation coefficients or multivariate analyses).
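One of the rank-based measures mentioned above, the Spearman rank correlation coefficient, can be sketched as follows. This is a generic illustration with hypothetical function names and made-up per-residue values, not the analysis that will appear in the revised manuscript; ties receive average ranks, and the coefficient is the Pearson correlation of the ranks.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up per-residue values (illustrative only)
pc1_contribution = [0.02, 0.10, 0.35, 0.05, 0.40]
contact_frequency = [0.00, 0.20, 0.60, 0.00, 0.90]
print(spearman(pc1_contribution, contact_frequency))
```

Because it depends only on ranks, this coefficient is less affected by the noisy, distance-based definition of membrane contacts than a linear correlation coefficient would be.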

      * Residue's contributions to collective conformational changes are found to be indicative of membrane binding. Yet, membrane interacting residues are identified from CG simulations that cannot capture such collective conformational changes due to the use of an elastic network. Given that the CG simulations agree with previous experimental findings, this suggests that collective conformational changes are not important for membrane binding. *

      We disagree with this interpretation by Reviewer #2 of our data: we do not claim that residues' contributions to collective conformational changes are indicative of membrane binding. Rather, membrane binding happens at protein regions displaying high contribution to collective conformational changes. This distinction is subtle but important: protein motion does not determine membrane binding regions. Rather, it appears that, for LTPs, membrane binding regions are also characterized by collective motions (suggesting function). We will clarify this in the main text.

      *Are similar conclusions drawn from residues' RMSFs? In other words, are local conformational fluctuations just as indicative of membrane binding? *

      We will compute protein residues’ RMSFs and compare them with the membrane binding data. However, given that RMSF is representative of thermal fluctuations, we again expect a poor correlation between RMSF and membrane binding. On the other hand, we indeed observe that most membrane binding regions are protein loops, but this is not unexpected (e.g. Tubiana et al, PLoS Comput Biol. 2022; 18(12): e1010346). However, such an observation does not provide any information on lipid transport, but only on the mechanism of membrane binding. Rather, the observation of a relationship between membrane binding and global motion is more interesting, since the latter is often indicative of protein function.
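For reference, the per-residue RMSF discussed above is the square root of the mean squared deviation of each residue's position from its time-averaged position. A minimal sketch, assuming the frames have already been aligned to a common reference and using one representative point per residue (the function name and the toy trajectory are hypothetical):

```python
import math

def rmsf(frames):
    """Per-residue RMSF from pre-aligned frames.

    frames: list of frames, each a list of (x, y, z) tuples, one per residue.
    """
    n_frames = len(frames)
    n_res = len(frames[0])
    result = []
    for r in range(n_res):
        # time-averaged position of residue r
        mean = [sum(f[r][d] for f in frames) / n_frames for d in range(3)]
        # mean squared deviation from that average
        msd = sum(
            sum((f[r][d] - mean[d]) ** 2 for d in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result

# Toy trajectory: residue 0 is static, residue 1 moves along x
toy = [
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
]
print(rmsf(toy))
```

RMSF captures the amplitude of local fluctuations only, which is why it is treated here as distinct from a residue's contribution to a collective motion such as PC1.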

      *The stated correlation may in fact be spurious and instead arise because residues at the entrance to LTP's hydrophobic cavities need to be positioned at the membrane surface for productive lipid uptake and these same residues must undergo significant conformational changes to allow lipid entry. *

      This is exactly what we think is happening and what our data suggest. However, one must remember that our simulations allow us to predict the membrane binding interface, which is often difficult to determine experimentally (and is often inferred from indirect evidence). Hence our data provide novel evidence in this direction.

      *Is proximity to cavity entrance more or less correlated with membrane binding than 'dynamics'? *

      If we consider that, as discussed before, dynamics does not correlate with membrane binding (there are many dynamical regions that are not at the membrane interface), it is safe to assume that proximity to cavity entrance would correlate more with membrane binding. However, we have to consider that often we do not know where the cavity entrance in LTPs is located simply based on structure alone, and hence our approach provides important clues into this process.

      p.12 speculatively suggests "the high degree of protein dynamics we observed in membrane proximal regions could potentially facilitate the energetically unfavorable reaction that involves the extraction of a lipid from a membrane." Yet, the logic behind this idea does not make sense since a free energy barrier, an equilibrium thermodynamic quantity, cannot be lowered by changes in dynamics. Please explain.*

      Our current understanding of the mechanism of lipid extraction is quite poor. However, both using chemical intuition and following a recent MD study on one LTP (Rogers et al, 2023, Plos Comp Biol), it is safe to assume that the hydrophobic environment around the lipid is important for its stabilization in the lipid bilayer. Hence, reducing the number of hydrophobic contacts between the lipid and its environment could facilitate transport. A highly dynamic protein, by cycling between different conformations, could “stir” the bilayer, and hence decrease the number of contacts between the lipid and its environment favoring transport. We will clarify this point in the text.

      *Examining how the LTPs impact membrane properties would offer insight into the functional relevance of such residues for lipid extraction. *

      Indeed, our point above is connected to this one. We are performing simulations to compute hydrophobic contacts in the bilayer as proposed in (Rogers et al, 2023, Plos Comp Biol).

      The authors highlight that a bound lipid alters LTPs' conformational ensembles akin to "conformational selection" or "induced fit." How sensitive are these findings to the bound lipid species? Do LTPs with multiple known substrates exhibit an increasing diversity of holo conformations and are different conformations stabilized by different substrates? Would similar observations (Fig 3) be made with a lipid that is not known to be transferred by a given LTP? An interesting future direction would be to examine if lipid substrate specificity could be assessed by comparing conformational ensembles to that of a known substrate and/or by overlap with the apo ensemble.

      We deem that the role of lipid specificity on LTP conformational plasticity is beyond the scope of the current work. While this topic is certainly worth future investigations, we must point out that (i) not all proteins bind/transport multiple lipids (at least according to current knowledge) and (ii) only few LTPs have been structurally characterized bound to different lipids (Osh4, Osh6, …). This limitation prevents a wide generalization, and we prefer not to speculate on this topic. So far, we have tested our approach for Osh4 bound to cholesterol or PI(4)P and found that indeed the protein exhibits different holo conformations (in agreement with the experimental data) when bound to different substrates. We have added a short comment on this topic in the Discussion section.

      "We foresee that future studies will focus on the functional consequence of such observation, and most notably to the characterization of the extent to which such conformational changes affect multiple steps of protein function, including membrane binding or lipid extraction and release, and whether these are further modulated when different lipids are being transported."

      For LTPs to transfer lipids between membranes, transitions between apo and holo forms ought to occur when LTPs are membrane bound. How does membrane binding influence the conformational ensembles observed in solution? Does it promote conformational changes between apo- and holo-like structures, as suggested to regulate lipid uptake and release by previous studies of Osh/ORP, Ups/PRELI, and START family members? (For example: Miliara 2019 PMID 30850607, Watanabe 2015 PMID 26235513, Grabon 2017 PMID 28718450, Iaea 2015 PMID 26168008, Kudo 2008 PMID 18184806, Dong 2019 PMID 30783101) While answering these questions would require further computational effort, doing so will allow more accurate assessment of the role of conformational changes in LTP function.

      Unfortunately, we cannot currently quantify how membrane binding influences the conformational ensembles observed in solution, as the slowdown in diffusion at the water-membrane interface makes this task computationally challenging (and certainly not feasible within the time frame of a revision). We have so far tested two different proteins and have not succeeded in converging their conformational distribution when membrane-bound despite long MD simulations that lasted several months (even though the non-converged data indicate sampling of both “open” and “closed” conformations). Interestingly, our observations are in qualitative agreement with a recent study on CPTP (Rogers et al, PLOS Comp Biol, 2023), where membrane-bound CPTP is able to sample different conformations (“open” and “closed”) but not to transition between the two states in 300 ns-long MD simulations.

      * The authors motivate the study with the *assumption* that a common molecular mechanism of LTP function exists. Yet LTPs have evolved diverse sequences, structures, and substrate preferences; thus there seems to be no a priori requirement (or even necessarily a benefit) for a single molecular mechanism. What evidence then supports this premise? While previous studies are limited to individual LTPs, when viewed altogether retrospectively, they suggest features that could be shared among LTPs. Synthesizing previous studies and more thoroughly referencing them (only 5 are cited in the intro on p. 3) would strengthen both the premise and findings of the manuscript. *

      Indeed, despite LTPs having different structures, substrates and the ability to target distinct organelles, previous evidence seems to suggest a potential role for protein conformational plasticity in their function, e.g. for Osh/ORP (Jun Im et al, Nature 2005; Canagarajah et al, JMB 2008; Moser von Filseck et al, Nat Comm, 2015; Lipp et al, Nat Comm. 2019,...), StART (Arakane et al, PNAS, 1996; Feng et al, Biochemistry, 2000; Grabon et al, JBC, 2017; Khelashvili et al, eLife, 2019;...) and PITP domains (Tremblay et al, Archives of Biochemistry and Biophysics, 2005; Ryan et al, MBOC, 2007; …). Our simulations provide additional evidence in this direction and allow us to generalize these observations and to draw parallels with “enzyme-like” or “transporter-like” features that could be exploited for further design of testable hypotheses. We will rewrite our text to better contextualize/acknowledge previous findings and to clarify these points.

      *The LTPs investigated are known to target distinct membranes. Should they then be expected to share structural or sequence-based features predictive of membrane binding interfaces, as motivates the analysis in Fig 1d, 1e, and S3? Or is it beneficial for LTPs to recognize membranes in different ways? *

      Since membrane binding is membrane/organelle-specific, it is possible that residue diversity at membrane binding interfaces is indeed beneficial for this specificity. We will add this comment as a potential explanation of our finding of a lack of conserved sequence-based features for membrane binding interfaces.

      *

      Minor comments:*

      * 2 "making lipid transfer across the cytoplasm a potentially energetically favorable process": Is it meant that it is less energetically costly than transfer without a LTP? Why it would be energetically favorable is unclear (and would indicate that the LTP sequesters lipids away from membranes instead of transferring them between membranes). *

      Yes, this is what we meant. We will rewrite this appropriately.

      * 3 "The excellent agreement between the membrane interface determined from the simulations and the experimentally-proposed one available for... Osh6" is missing a citation. *

      We have now added the relevant citation.

      * The plots in Fig 1d and S3 are difficult to interpret. Bar plots, for example, would allow easier comparison and evaluation. Currently, it seems that most proteins individually exhibit some of the same trends observed among the whole set, counter to the conclusion on p 5. *

      We will improve the presentation of our Figures.

      * Negatively charged residues engage in a number of membrane interactions (Fig 1d and S3). What is a potential explanation for this unconventional observation? *

      One possible interpretation is that negatively charged residues could interact with positively charged moieties (ethanolamine, choline) of PC and PE lipids.

      * How much variance is captured by PC1, and how many PCs are needed to capture most of the variance in the conformations? *

      PC1 explains 38 % of the total variance on average, whereas PC2 accounts for 17 % of it. Therefore, PC1 and PC2 together capture most of the variance in almost all cases.

      We have also added this to the text:

      "We specifically focused on PC1 as it explains most of the variance in the dynamics (38% on average for all the proteins in our dataset, see Supplementary Table 2)."

      We have computed this variance and we have added this analysis in Supplementary Information.
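The fraction of variance attributed to each principal component follows directly from the eigenvalues of the covariance matrix: each eigenvalue divided by the sum of all eigenvalues. A minimal sketch, with hypothetical function names and placeholder eigenvalues rather than values from Supplementary Table 2:

```python
def explained_variance(eigenvalues):
    """Per-component and cumulative fractions of the total variance.

    Components are ordered from largest to smallest eigenvalue.
    """
    total = sum(eigenvalues)
    ratios = [ev / total for ev in sorted(eigenvalues, reverse=True)]
    cumulative = []
    running = 0.0
    for ratio in ratios:
        running += ratio
        cumulative.append(running)
    return ratios, cumulative

def components_needed(eigenvalues, threshold=0.8):
    """Smallest number of components whose cumulative fraction reaches threshold."""
    _, cumulative = explained_variance(eigenvalues)
    for k, fraction in enumerate(cumulative, start=1):
        if fraction >= threshold - 1e-12:
            return k
    return len(cumulative)

# Placeholder eigenvalues (illustrative only)
ratios, cumulative = explained_variance([4.0, 2.0, 1.0, 1.0])
print(ratios, cumulative, components_needed([4.0, 2.0, 1.0, 1.0], 0.75))
```

Reporting the cumulative fraction alongside the per-component values answers the second part of the reviewer's question, namely how many components are needed to capture most of the variance.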

      * Plots in Fig 3, especially panels c and d are difficult to see. Please make the panels larger (perhaps a 3 x 4 layout instead of 2 x 6 would work better). *

      We will improve the presentation of our Figures.

      * 8 "these conformational changes are localized in protein regions that interact with the lipid bilayer" is contradicted by the results in Fig 2b showing that all residues with large contributions to PC1 do not interact with the membrane and discussed on p 5. *

      As discussed above, we don’t observe “correlation” between membrane binding and conformational plasticity, but we rather observe that membrane binding regions display high conformational plasticity (the opposite is not true). We will clarify this further in the text.

      *

      8 "in the absence of bound lipids, it is able to sample multiple conformations" is not supported by the orange distributions in Fig 3d that appear unimodal. Is it instead meant that the apo form exhibits larger variance in cavity volume? *

      Yes, this is what we meant. We’ll clarify.

      *

      Please clarify if the elastic network was constructed to maintain the holo or apo structures of each protein and if a bound lipid was used in the CG simulations. *

      For membrane binding CG simulations, we used the apo structure and no bound lipid was used in the simulations. However, analogous simulations in the holo form (not shown) have essentially identical membrane binding interfaces.

      *

      Was *CHARMM* TIP3P used? *

      Yes.

      * Please clarify how membrane interacting residues were defined and how interaction frequency was calculated from the longest duration of interaction. *

      We will add this explanation in the Methods. The method is identical to that of Srinivasan et al, Faraday Discussions, 2021.

      * Refs 16 and 45 refer to the same paper. *

      Thanks, it is now corrected!

      * Reviewer #2 (Significance (Required)): *

      * General assessment: *

      * The work aims to tackle a grand question regarding membrane homeostasis mechanisms-what are universal principles underlying LTP function-and offers initial insights; however, further evidence is needed to support the conclusions as written, and some key results require further investigation and explanation. *

      *Advance and audience: *

      *

      By concurrently investigating the largest number of lipid transfer proteins to-date, the authors provide data invaluable for uncovering general mechanisms of non-vesicular lipid transport and advancing our understanding of membrane homeostasis mechanisms. By illuminating the wide-spread importance of conformational plasticity among lipid transfer proteins, the work presents a conceptual advance in our understanding of lipid transfer mechanisms and unifies previous studies. Because the manuscript emphasizes common biophysical principles and draws connections to enzyme biophysics, it ought to be of interest not only to membrane biologists but biochemists and molecular biologists more broadly.*

      We thank Reviewer #2 for the very positive evaluation of the significance of our work and for the in-depth analysis provided that will certainly help improve the quality of our work.

      Reviewer #3* (Evidence, reproducibility and clarity (Required)): *

      *The article "Conformational dynamics of lipid transfer domains provide a general framework to decode their functional mechanism." by Sriraksha Srinivasan, Andrea DiLuca, Arun Peter, Charlotte Gehin, Museer Lone, Thorsten Hornemann, Giovanni D'Angelo and Stefano Vanni studies the interaction of lipid transport domains with membranes. This is done mainly by molecular modelling but also with selected experimental validations. *

      * Major comments: *

      * - The key conclusions are generally well supported by the analysis. - The authors could however analyze in more detail some aspects in which specific cases appear. For example, p3 "multiple binding and unbinding events, as shown by the minimum distance curves" does not give an entire description of the variability seen in Fig S1, e.g. LCN1 versus GM2A.*

      We now discuss in more detail the variability seen in Fig. S1 and attribute it to different membrane binding affinities of the proteins in our dataset. We also discuss how this variability could reflect the diversity of organellar membranes to which these proteins bind in vivo.

      "Notably, the proteins in our dataset display distinct binding affinities, with some proteins showing very transient binding while others remain membrane-bound for most of the simulation trajectory (Fig. S1). This behavior could be, in part, attributed to the wide diversity of organellar membranes to which the LTDs in our dataset bind in vivo, and to the comparative simplicity of our in silico model DOPC lipid bilayers."

      • Later the "excellent agreement" for the data in Fig S2 is not quantified, which does not allow the reader to know whether it is better than it would have been with other methods (SASA, OPM, DREAMM). *

      We have explicitly quantified this agreement by providing a direct comparison between the experimental results and our in silico assay, and we further compared it against two alternative methods: OPM and DREAMM. In detail, we have identified 12 experimentally-characterized spots suggested to be involved in membrane binding in our protein dataset (see shaded blue regions in Fig. S2). Of those 12, our method identifies all of them (100%), while DREAMM identifies 7 of them (58 %) and OPM 4 out of 8 (50 %), since of the 12 proteins we tested, only 7 are available in the OPM database. Overall, even if our approach is much noisier than the others, and thus suggests multiple binding regions that are not currently supported by experimental observations, using physics-based methodologies appears to remain a preferable strategy to characterize the binding of peripheral proteins to lipid bilayers. Given the limited size of our dataset, we prefer not to make a direct comparison between our assay and OPM/DREAMM in the main text, as this would not be representative of the various methodologies.

      *p5 commenting on Fig2b the case of Osh6 that appears to disagree should probably be mentioned. *

      We now discuss this case, and attribute this disagreement to insufficient sampling for the peculiar case of Osh6:

      "One interesting exception in our database appears to be Osh6, where the experimentally determined membrane-binding region at the N-terminus (https://doi.org/10.1038/s41467-019-11780-y) is only marginally binding to the lipid bilayer in silico and it also appears to have limited contribution to PC1. However, our simulations are unable to sample the large conformational changes that the N-terminal lid of Osh6 has been proposed to undergo from its lipid-bound to its apo state, indicating that insufficient sampling could be the reason for this apparent discrepancy."

      *

      -The data and the methods are generally well presented, allowing them to be reproduced.

      • The experiments are adequately replicated, with adequate statistical analysis. *

      * Minor comments: *

      * - When presenting the dataset, the authors could probably detail a bit more the protocol undertaken to choose the cases. In particular, it is unclear whether the chosen proteins have any membrane selectivity, which in principle could be affected by the choice of lipid used here.*

      We have now added in Table 1 a column listing the organelles to which the different LTPs have been shown to localize (source: UniProt). As a model membrane bilayer, we opted to use a pure DOPC bilayer, both for simplicity and to compare membrane binding in a uniform setting. We foresee that future studies investigating the membrane specificity of the various proteins will shed further light on the molecular mechanism of LTPs. Finally, we also indicate that our choice of proteins was mainly driven by the availability of lipid-bound structures in the Protein Data Bank. We have added the following sentences in the main text:

      "Specifically, we selected all LTPs for which a crystallographic structure in complex with a lipid was available at the start of our project, plus two additional proteins (GM2A and LCN1) to increase the structural diversity of our dataset (Fig. 1a)."

      and

      "Notably, the proteins in our dataset display distinct binding affinities, with some proteins showing very transient binding while others remain membrane-bound for most of the simulation trajectory (Fig. S1). This behavior could be, in part, attributed to the wide diversity of organellar membranes to which the LTDs in our dataset bind in vivo, and to the comparative simplicity of our in silico model DOPC lipid bilayers."

      *- The authors could probably give some indication of how much of the variance is explained by PC1 and comment briefly on the choice to ignore other PCs. *

      PC1 explains 38% of the total variance, on average. PC1 therefore makes a large contribution to the variance, especially in comparison to the other PCs; for instance, PC2 accounts for only 17% of the total variance. This is the reason we limited our discussion to PC1. We have added a table in the Supplementary Information quantifying the variance explained by PC1 and PC2, and added the following sentence in the main text:

      "We specifically focused on PC1 as it explains the largest share of the variance in the dynamics (38% on average for all the proteins in our dataset)."

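      For readers unfamiliar with how a figure such as "PC1 explains 38% of the variance" is obtained, the following is a minimal sketch using NumPy on synthetic data. It is not the authors' analysis pipeline: the array `coords` stands in for a trajectory of flattened atomic coordinates (one row per frame), and the sizes, noise levels and function name are assumptions chosen for illustration.

```python
import numpy as np

def explained_variance_ratio(coords):
    """Fraction of total variance carried by each principal component.

    coords: array of shape (n_frames, n_features), one row per frame.
    """
    centered = coords - coords.mean(axis=0)
    # Squared singular values of the centered data are proportional to
    # the variance along each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

# Synthetic "trajectory": one dominant collective motion plus isotropic noise.
rng = np.random.default_rng(0)
n_frames, n_features = 500, 30
direction = rng.normal(size=n_features)
direction /= np.linalg.norm(direction)
amplitude = rng.normal(scale=3.0, size=(n_frames, 1))
coords = amplitude * direction + rng.normal(size=(n_frames, n_features))

ratios = explained_variance_ratio(coords)
print(f"PC1: {100 * ratios[0]:.1f}%  PC2: {100 * ratios[1]:.1f}%")
```

      Comparing the first two entries of `ratios` is the kind of check that motivates restricting a discussion to PC1 when it clearly dominates the remaining components.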
      * - When analyzing the residues involved in the interaction with the membrane the results could probably be compared with that of the systematic analysis performed recently: Tubiana, T., Sillitoe, I., Orengo, C., & Reuter, N. (2022). Dissecting peripheral protein-membrane interfaces. PLOS Computational Biology, 18(12), e1010346. *

      We have added in the text a reference to the work by Tubiana et al., and we have further stressed that our results agree with previous observations (including theirs). This includes the preference for Lys over Arg and the importance of protruding hydrophobes:

      "Concomitant analysis of all LTDs (Fig. 1d) indicates that the membrane binding interface of LTDs is enriched in the positively charged amino acid Lysine, as this amino acid is less membrane-disruptive than Arginine (22), and aromatic/hydrophobic ones (Phe, Leu, Val, Ile). This confirms previous observations, as (i) binding of negatively charged lipids via positively charged residues and (ii) hydrophobic insertions are two of the main mechanisms involved in membrane binding by peripheral proteins (22-27)."

      * - The discussion on allostery/conformational selection might not be centered so much on enzymes. *

      We thank the reviewer for this important observation. We have now included in the Discussion the following paragraph, which provides additional references and discussion of membrane transporters and receptors.

      "Notably, the conformational plasticity we observe for LTPs is reminiscent of other, previously described, functional protein mechanisms, including enzyme dynamics during catalysis (DOI: 10.1126/science.1066176), the alternating-access model of membrane transporters (https://doi.org/10.1038/nsmb.3179) or GPCR dynamics (https://doi.org/10.1021/acs.chemrev.6b00177). In all these cases, protein dynamics is strongly coupled to ligand binding and protein function, be it for signaling, transport or enzymatic activity. Unlike in these fields, however, the contribution of structural and spectroscopic studies to uncover LTP dynamics remains quite limited, and our simulations provide an important contribution to fill this gap. We hope that our results will motivate researchers to increase efforts to experimentally quantify LTP conformational plasticity, e.g. by structural determination of LTPs in different states (or bound to different lipids) or by single-molecule spectroscopy studies."

      * Reviewer #3 (Significance (Required)): *

      *

      The article shows convincing results on the debated issue of the mechanism of lipid transport by lipid transfer proteins. *

      First, the study employs molecular modelling to allow a rather large test on 12 cases. The molecular dynamics experiments allow the authors to draw clear hypotheses on the role of protein dynamics in the interaction with membranes and on the effect of bound lipids on this dynamics.

      *Then the authors use this knowledge to design experiments that largely confirm those hypotheses. The results should therefore be interesting for a large audience of biochemists and cell biologists interested in lipid transport in the cell. *

      We thank Reviewer #3 for their very positive evaluation and contextualization of our work.

    1. Since the family is the site where biology, society and psychology converge most evidently, Freud's rooting of sexuality in a determinate way in the family makes perfect sense. Sexual desire may indeed be deeply structured by infantile experience, internal conflicts not fully resolved, and repressions of instincts in early life. But the drive model also has blind spots: it obscures the importance of later development and adult experience, understates the impact of the social milieu that shapes those experiences, and retains a telos of normal sexual development, even as it expands the meaning of the word "sexual". In the final analysis, it can be argued that Freud rendered nature partly social, moving beyond the biological determinism of sexology to begin to understand how desire is constituted intersubjectively. But lacking a theory of social structure beyond the family, the drive model of sexuality tended to downplay the actual links between social structure and sexual behavior.

      Some people believe that our feelings about love and our bodies are shaped when we are very young and that can affect us as we grow up. But this idea only focuses on the family and doesn't think about how other things and experiences in life can also shape our feelings. So, it's important to remember that there are many things that can make us feel different and that how we feel is not only because of our family.


  2. Jun 2023
    1. Author Response:

      Reviewer #1 (Public Review):

      […] The major strength of the study is the elegant and well-powered data set. Longitudinal data on this scale is very difficult to collect, especially with patient cohorts, so this approach represents an exciting breakthrough. Analysis is straightforward and clearly presented. However, no multiple comparison correction is applied despite many different tests. While in general I am not convinced of the argument in the citation provided to justify this, I think in this case the key results are not borderline (p<0.001) and many of the key effects are replications, so there are not so many novel/exploratory hypotheses, and in my opinion the results are convincing and robust as they are. The supplemental material is a comprehensive description of the data set, which is a useful resource.

      The authors achieved their aims, and the results clearly support the conclusion that the AD and mean confidence in a perceptual task covary longitudinally. I think this study provides an important impact to the project of computational psychiatry. Specifically, it shows that the relationship between transdiagnostic symptom dimensions and behaviour is meaningful within as well as across individuals.

      Response: We thank the reviewer for their appraisal of our paper and positive feedback on the main manuscript and supplementary information. We agree with the reviewer that the lack of multiple comparison corrections can also be justified by the key findings being replications and not of borderline significance. We have added this additional justification to the manuscript (Methods, Statistical Analyses, page 15, line 568: “Adjustments for multiple comparisons were not conducted for analyses of replicated effects”).

      Reviewer #2 (Public Review):

      […] The major strength and contribution of this study is the use of a longitudinal intervention design, allowing the investigation of how the well-established link between underconfidence and anxious-depressive symptoms changes after treatment. Furthermore, the large sample size of the iCBT group is commendable. The authors employed well-established measures of metacognition and clinical symptoms, used appropriate analyses, and thoroughly examined the specificity of the observed effects.

      However, due to the small effect sizes, the antidepressant and control groups were underpowered, reducing comparability between interventions and the generalizability of the results. The lack of interaction effect with treatment makes it harder to interpret the observed differences in confidence, and practice effects could conceivably account for part of the difference. Finally, it was not completely clear to me why, in the exploratory analyses, the authors looked at the interaction of time and symptom change (and group), since time is already included in the symptom change index.

      Response: We thank the reviewer for their succinct summary of the main results and strengths of our study. We apologise for the confusion in how we described that analysis. We examine state-dependence, i.e., the relationship between symptom change and metacognition change, in two ways in the paper, perhaps somewhat redundantly: (1) by correlating change indices for both measures (e.g., as plotted in Figure 3D) and (2) by running a very similar regression-based repeated-measures analysis, i.e., mean confidence ~ time*anxious-depression score change, where mean confidence is entered with two datapoints, one for pre- and one for post-treatment (i.e., within-person), and anxious-depression change is a single value per person (between-person change score). This allowed us to test if those with the biggest change in depression had a larger effect of time on confidence. This has been added to the paper for clarification (Methods, Statistical Analysis, page 14, line 553-559: “To determine the association between change in confidence and change in anxious-depression, we used (1) Pearson correlation analysis to correlate change indices for both measures and, (2) regression-based repeated-measures analysis: mean confidence ~ time*anxious-depression score change, where mean confidence is entered with two datapoints (one for pre- and one for post-treatment i.e., within-person) and anxious-depression change is a single value per person (between-person change score)”).

      The analyses have also been reported as regression in the Results for consistency (Treatment Findings: iCBT, page 5, line 197-204: “To test if changes in confidence from baseline to follow-up scaled with changes in anxious-depression, we ran a repeated-measures regression analysis with per-person changes in anxious-depression as an additional independent variable. We found this was the case, evidenced by a significant interaction effect of time and change in anxious-depression on confidence (b=-0.12, SE=0.04, p=0.002)… This was similarly evident in a simple correlation between change in confidence and change in anxious-depression (r(647)=-0.12, p=0.002)”).
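      To illustrate why the two analyses described in this response are closely related, here is a minimal sketch on simulated data. It assumes plain ordinary least squares with time coded 0/1 and no random effects, which may differ from the model the authors actually fitted; all variable names and simulated effect sizes are hypothetical. Under these assumptions the time × change interaction coefficient equals the slope obtained by regressing the confidence change score on the symptom change score.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Hypothetical per-person data: one symptom change score, two confidence values.
ad_change = rng.normal(size=n)
conf_pre = rng.normal(size=n)
conf_post = conf_pre - 0.3 * ad_change + rng.normal(scale=0.5, size=n)

# Long format: two rows per person (time = 0 for pre, 1 for post).
time = np.r_[np.zeros(n), np.ones(n)]
change = np.r_[ad_change, ad_change]
confidence = np.r_[conf_pre, conf_post]

# Design matrix for: confidence ~ time * change
X = np.column_stack([np.ones(2 * n), time, change, time * change])
beta, *_ = np.linalg.lstsq(X, confidence, rcond=None)
interaction = beta[3]

# Change-on-change regression: (post - pre) confidence ~ symptom change
slope = np.polyfit(ad_change, conf_post - conf_pre, 1)[0]

print(f"interaction = {interaction:.4f}, change-score slope = {slope:.4f}")
```

      The equivalence holds because the fully interacted model amounts to fitting separate lines at each timepoint over the same predictor values, so the difference between the two slopes is the slope of the difference.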

      This longitudinal study informs the field of metacognition in mental health about the changeability of biases in confidence. It advances our understanding of the link between anxiety-depression and underconfidence consistently found in cross-sectional studies. The small effects, however, call the clinical relevance of the findings into question. I would have found it useful to read more in the discussion about the implications of the findings (e.g., why is it important to know that the confidence bias is state-dependent; given the effect size of the association between changes in confidence and symptoms, is the state-trait dichotomy the right framework for interpreting these results; suggestions for follow-up studies to better understand the association).

      Response: Thank you for this comment. We have elaborated on the implications of our findings in the Discussion, including the relevance of the state-trait dichotomy to future research and how more intensive, repeated testing may inform our understanding of the state-like nature of metacognition (Discussion, Limitations and Future Directions, page 10, line 378-380: “More intensive, repeated testing in future studies may also reveal the temporal window at which metacognition has the propensity to change, which could be more momentary in nature.”).

      Reviewer #3 (Public Review):

      […] I think these findings are exciting because they directly relate to one of the big assumptions when relating cognition to mental health - are we measuring something that changes with treatment (is malleable), so might be mechanistically relevant, or even useful as a biomarker?

      This work is also useful in that it replicates a finding of heightened confidence in those with compulsivity, and lowered confidence in those with elevated anxious-depression.

      One caveat to the interest of this work is that it doesn't allow any causal conclusions to be drawn, and only measures two timepoints, so it's hard to tell if changes in confidence might drive treatment effects (but this would be another study). The authors do mention this in the limitations section of the paper.

      Another caveat is the small sample in the antidepressant group.

      Some thoughts I had whilst reading this paper: to what extent should we be confident that the changes are not purely due to practice? I appreciate there is a relationship between improvement in symptoms and confidence in the iCBT group, but this doesn't completely rule out a practice effect (for instance, you can imagine a scenario in which those whose symptoms have improved are more likely to benefit from previously having practiced the task).

      Response: We thank the reviewer for commenting on the implications of our findings and we agree with the caveats listed. We also thank the reviewer for raising this point about practice effects. A key thing to note is that this task does not have a learning element with respect to the core perceptual judgement (i.e., accuracy), which is the target of the confidence judgment itself. While there is a possibility of increased familiarity with the task instructions and procedures with repeated testing, the task is designed to adjust the difficulty to account for any improvements, so accuracy is stable. We see that we may not have made this clear in some of our language around accuracy vs. perceptual difficulty and have edited the Results to make this distinction clearer (Treatment Findings: iCBT, pages 4-5, lines 184-189: “Although overall accuracy remained stable due to the staircasing procedure, participants’ ability to detect differences between the visual stimuli improved. This was reflected as the overall increase in task difficulty to maintain the accuracy rates from baseline (dot difference: M=41.82, SD=11.61) to follow-up (dot difference: M=39.80, SD=12.62), (b=-2.02, SE=0.44, p<0.001, r²=0.01)”).

      However, it is true that there can be a ‘practice’ effect in the sense that one may feel more confident (despite the same accuracy level) due to familiarity with a task. One reason we do not subscribe to the proposed explanation for the link between anxious-depression change and confidence change is that the other major aspect of behaviour that improved with practice did so in a manner unrelated to clinical change. As noted above in the quoted text, participants’ discrimination improved from baseline to follow-up, reflected in the need for a higher difficulty level to maintain accuracy around 70%. Crucially, this was not associated with symptom change. This speaks against a general mechanism where symptom improvement leads to increased practice effects in general. Only changes in confidence specifically are associated with improved symptoms. We have provided more detail on this in the Discussion (page 9, lines 324-326: “This association with clinical improvements was specific to metacognitive changes, and not changes in task performance, suggesting that changes in confidence do not merely reflect greater task familiarity at follow-up.”).

      Relatedly, to what extent is there a role for general task engagement in these findings? The paper might be strengthened by some kind of control analysis, perhaps using (as a proxy for engagement) the data collected about those who missed catch questions in the questionnaires.

      Response: Thank you for your comment. We included the details of data quality checks in the Supplement. Given the small number of participants that failed more than one attention check (1% of the iCBT arm) and that all those participants passed the task exclusion criteria, we made the decision to retain these individuals for analyses. We have since examined if excluding this small number of individuals impacts our findings. Excluding those that failed more than one catch item did not affect the significance of results, which has now been added to the Supplementary Information (Data Quality Checks: Task and Clinical Scales, page 5, lines 181-185: “Additionally, excluding those that failed more than one catch item in the iCBT arm did not affect the significance of results, including the change in confidence (b=0.16, SE=0.02, p<0.001), change in anxious-depression (b=-0.32, SE=0.03, p<0.001), and the association between change in confidence and change in anxious-depression (r(638)=-0.10, p=0.011)”).

      I was also unclear what the findings about task difficulty might mean. Are confidence changes purely secondary to improvements in task performance generally - so confidence might not actually be 'interesting' as a construct in itself? The authors could have commented more on this issue in the discussion.

      Response: Thank you for this comment, and sorry it was not clear in the original paper. As we discussed in a prior reply, accuracy, i.e., the proportion of correct selections (the target of confidence judgements), is different from the difficulty of the dot discrimination task that each person receives on a given trial. We had provided more details on task difficulty in the Supplement. Accuracy was tightly controlled in this task using a ‘two-down one-up’ staircase procedure, in which equally sized changes in dot difference occurred after each incorrect response and after two consecutive correct responses. The task is more difficult when the dot difference between stimuli is lower, and less difficult when the dot difference between stimuli is greater. Therefore, task difficulty refers to the average dot difference between stimuli across trials. Crucially, task accuracy did not change from baseline to follow-up, only task difficulty. Moreover, changes in task difficulty were not associated with changes in anxious-depression, while changes in confidence were, indicating confidence is the clinically relevant construct for change in symptoms.

      We appreciate that this may not have been clear from the description in the main manuscript, and have added more detail on task difficulty to the Methods (Metacognition Task, page 14, lines 540-542: “Task difficulty was measured as the mean dot difference across trials, where more difficult trials had a lower dot difference between stimuli.”) and Results (Treatment Findings: iCBT, pages 4-5, lines 184-186: “Although overall accuracy remained stable due to the staircasing procedure, participants’ ability to detect differences between the visual stimuli improved.”). We have also elaborated more on how improvements in symptoms are associated with change in confidence, not task performance in the Discussion (page 9, lines 324-326: “This association with clinical improvements was specific to metacognitive changes, and not changes in task performance, suggesting that changes in confidence do not merely reflect greater task familiarity at follow-up”).
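      The ‘two-down one-up’ rule described in this response can be sketched as follows. This is an illustration only, not the task code used in the study: the step size, floor, starting value, and the simulated observer's psychometric function (`p_correct`) are all assumptions. The sketch shows how the rule holds accuracy roughly constant (near 71% for this rule) while the dot difference, i.e. the difficulty, adapts to the observer.

```python
import math
import random

def update_staircase(dot_diff, streak, correct, step=1, min_diff=1):
    """Apply the two-down one-up rule; returns (new_dot_diff, new_streak)."""
    if not correct:
        return dot_diff + step, 0          # one error: make the task easier
    streak += 1
    if streak == 2:
        return max(min_diff, dot_diff - step), 0  # two in a row: make it harder
    return dot_diff, streak                # single correct response: no change

def p_correct(dot_diff, scale=40.0):
    """Hypothetical observer: accuracy rises from 50% towards 100% with dot difference."""
    return 1.0 - 0.5 * math.exp(-dot_diff / scale)

def simulate(n_trials=4000, start=40, seed=0):
    """Run the staircase on the simulated observer and return overall accuracy."""
    rng = random.Random(seed)
    dot_diff, streak, n_correct = start, 0, 0
    for _ in range(n_trials):
        correct = rng.random() < p_correct(dot_diff)
        n_correct += correct
        dot_diff, streak = update_staircase(dot_diff, streak, correct)
    return n_correct / n_trials

accuracy = simulate()
print(f"overall accuracy: {accuracy:.2f}")
```

      An observer who improves with practice would, in this scheme, settle at a lower mean dot difference while overall accuracy stays about the same, which is the distinction between difficulty and accuracy drawn in the response.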

      To make code more reproducible, the authors could have produced an R notebook that could be opened in the browser without someone downloading the data, so they could get a sense of the analyses without fully reproducing them.

      Response: Thank you for your comment. We appreciate that an R notebook would be even better than how we currently share the data and code. While we will consider using Notebooks in future, we checked and converting our existing R script library into R Notebooks would require a considerable amount of reconfiguration that we cannot devote the time to right now. We hope that nonetheless the commitment to open science is clear in the extensive code base, commenting and data access we are making available to readers.

      Rather than reporting full study details in another publication I would have found it useful if all relevant information was included in a supplement (though it seems much of it is). This avoids situations where the other publication is inaccessible (due to different access regimes) and minimises barriers for people to fully understand the reported data.

      Response: We agree this is good practice – the Precision in Psychiatry study is very large, with many irrelevant components with respect to the present study (Lee et al., BMC Psychiatry, 2023). For this reason, we tried to provide all that was necessary and only refer to the Precision in Psychiatry study methods for fine-grained detail. Upon review, the only thing we think we omitted that is relevant is information on ethical approval in the manuscript, which we have now added (Methods, Participants, page 11, lines 412-417: “Further details of the PIP study procedures that are not specific to this study can be found in a prior publication (21). Ethical approval for the PIP study was obtained from the Research Ethics Committee of School of Psychology, Trinity College Dublin and the Northwest-Greater Manchester West Research Ethics Committee of the National Health Service, Health Research Authority and Health and Care Research Wales”). If any further information is lacking, we are happy to include it here also.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer 1:

      1-Localization of ESYT1 and SYNJ2BP

      The claim of a localization at ER-mitochondria contacts relies on two type of assays. Light microscopy and subcellular fractionation. Concerning microscopy, while the staining pattern is obviously colocalizing with the ER (a control of specificity of staining using KO cells would nevertheless be desirable)

      the idea that ESYT1 foci "partially colocalized with mitochondria" is either trivial or unfounded

      Every cellular structure is "partially colocalized with mitochondria" simply by chance at the resolution of light microscopy

      If the meaning of the experiment is to show that ESYT1 'specifically' colocalizes with mitochondria, then this isn't shown by the data

      There is no quantification that the level of colocalization is more than expected by chance

      nor that it is higher than that of any other ER protein

      Moreover, the author's model implies that ESYT1 partial colocalization with mitochondria is, at least partially, due to its interaction with SYNJ2BP. This is not tested.

      • To analyze and measure MERCs parameters and functions, we used a set of validated methods described in the following specialized review articles (Eisenberg-Bord, Shai et al. 2016, Scorrano, De Matteis et al. 2019).
      • To support and confirm the localization of the ESYT1-SYNJ2BP complex at MERCs, we performed supplementary BioID analysis using ER-targeted BirA*, OMM-targeted BirA* and ER-mitochondria tether BirA* (Table S1, Figure S1 and Figure 1 A and B). These results confirmed the specificity of the interaction of the 2 partners. ESYT1 is not identified as a prey in the OMM BioID and SYNJ2BP is not identified in the ER BioID; on the other hand, both partners are identified in the ER-mitochondria tether BioID.
      • To improve our description of the partial localization of ESYT1 at mitochondria, we performed a quantitative analysis using confocal microscopy on control human fibroblasts stably overexpressing SEC61B-mCherry as an ER marker which were labelled with ESYT1 and TOMM40 for mitochondria. We measured the % of ESYT1 signal colocalizing with mitochondria and the % of mitochondria positive for ESYT1 (Figure 1E).
      • To demonstrate that the partial colocalization of ESYT1 with mitochondria is, at least partially, due to its interaction with SYNJ2BP, we performed a quantitative analysis using confocal microscopy. Human control fibroblasts, SYNJ2BP KO fibroblasts and SYNJ2BP-overexpressing fibroblasts were labelled with ESYT1, TOMM40 for mitochondria and CANX for ER. We measured the % of ESYT1 signal colocalizing with mitochondria in each condition (Figure 3C).

      Membranes (MAM) can be purified and are enriched for proteins that localize at ER-mitochondria contacts. This idea originated in the early 90's and since then, a myriad of papers have used MAM purification, and whole MAM proteomes have been determined. Yet the evidence that MAM-enriched proteins represent bona fide ER-mitochondria-contact-enriched proteins (as can nowadays be determined by microscopy techniques) remains scarce. Here, anyway, ESYT1 fractionation pattern is identical to that of PDI, a marker of general ER, with no indication of specific MAM accumulation.

      • To highlight the enrichment of ESYT1 in the MAM fraction, we quantified the ESYT1 signal in each fraction. Those results show a fractionation pattern similar to that of the MAM-resident protein SIGMAR1 (Figure 1F).

      For SYNJ2BP, it is different as it is more enriched in the MAM than the general mitochondrial marker PRDX3. However, PRDX3 is a matrix protein, making it a poor comparison point, since SYNJ2BP is an OMM protein.

      • To confirm the partial enrichment of SYNJ2BP in the MAM fraction compared to another outer mitochondrial membrane protein, we added the signal of the well-characterized OMM protein CARD19 (Rios, Zhou et al. 2022).

      Again, the model implies that ESYT1 and SYNJ2BP accumulation in the MAM should be dependent on each other. This is not tested.

      • As described above, we demonstrated in Figure 3C that the accumulation of ESYT1 at mitochondria is, at least partially, dependent on the quantity of SYNJ2BP.

      • We moreover showed a reciprocal effect in Figure 3E. A quantitative analysis using confocal microscopy demonstrated that the effect of SYNJ2BP overexpression on MERC formation is partially dependent on the presence of ESYT1.

      2-ESYT1-SYNJ2BP interaction

      The starting point of the paper is a BioID signal for SYNJ2BP when BioID is fused to ESYT1. One confirmation of the interaction comes in figure 4, using blue native gel electrophoresis and assessing comigration. Because BioID is promiscuous and comigration can be spurious, better evidence is needed to make this claim. This is exemplified by the fact that, although SYNJ2BP is found in a complex comigrating with RRBP1, according to the BN gel, this slow migrating complex isn't disturbed by RRBP1 knockdown, but is somewhat disturbed by ESYT1 knockdown. More than a change in abundance, a change in migration velocity when either protein is absent would be evidence that these comigrating bands represent the same complex.

      • We showed in Figure 4C that the presence of SYNJ2BP in a complex of similar molecular weight to that of ESYT1 (410 kDa) is totally dependent on the presence of ESYT1, suggesting an interaction of the 2 proteins.
      • To confirm this interaction, in Figure 4A we analyzed by BN gel electrophoresis cells overexpressing SYNJ2BP together with a 3xFlag-tagged version of ESYT1. As a result of the addition of the Flag tag, the complex positive for ESYT1 shifted to a higher molecular weight. The complex positive for SYNJ2BP shifted to a similar molecular weight, demonstrating the interaction and dependence of the 2 partners.

      ESYT1-SYNJ2BP interaction needs to be tested by coimmunoprecipitation of endogenous proteins, yeast-2-hybrid, in vitro reconstitution or any other confirmatory methods.

      • To confirm the interaction of the 2 partners, we performed co-immunoprecipitation of the ESYT1-3xFlag protein, which we showed in Figure 1H to form complexes similar to those of the endogenous protein. SYNJ2BP is found as the strongest prey, followed by ESYT2 and SEC22B, two described interactors of ESYT1, confirming the quality of the analysis (Table S2) (Giordano, Saheki et al. 2013, Gallo, Danglot et al. 2020).

      3-Tethering by ESYT1-SYNJ2BP

      This is assessed by light and electron microscopy. Absence of ESYT1 decreases several metrics for ER-mitochondria contacts (whether absence of SYNJ2BP has the same effect isn't tested).

      • Using PLA (proximity ligation assay), we demonstrated that the loss of SYNJ2BP leads to a decrease in MERCs (Figure 7 H and I), confirming previous studies (Ilacqua, Anastasia et al. 2022, Pourshafie, Masati et al. 2022).

      This interesting phenomenon could be due to many things, including but not limited to the possibility that "ESYT1 tethers ER to mitochondria".

      This statement and the respective subheading title are therefore clearly overreaching and should be either supported by evidence or removed.

      Indeed, absence of ESYT1 ER-PM tethering and lipid exchange could have knock-on effects on ER-mito contacts, therefore strong statements aren't supported.

      Moreover, the effect on ER-mitochondria contact metrics could be due to changes in ER-mitochondria contact indeed but may also reflect changes in ER and/or mitochondria abundance and/or distribution, which favour or disfavour their encounter. Abundance and distribution of both organelles are not controlled for.

      • The mitochondrial phenotypes caused by the loss of ESYT1 are all rescued by the introduction of an artificial mitochondrial-ER tether, demonstrating that they are due to loss of the tethering function of ESYT1. Finally, the authors repeat a finding that SYNJ2BP overexpression induces artificial ER-mitochondria tethering. Again, according to the model, this should be, at least in part, due to interaction with ESYT1. Whether ESYT1 is required for this tethering enhancement isn't tested.

      • As described above, we demonstrated in Figure 3C that the accumulation of ESYT1 at mitochondria is, at least partially, dependent on the quantity of SYNJ2BP.

      • We moreover showed a reciprocal effect in Figure 3F. A quantitative analysis using confocal microscopy demonstrated that the effect of SYNJ2BP overexpression on MERC formation is partially dependent on the presence of ESYT1.

      4-Phenotypes of ESYT1/SYNJ2BP KD or KO.

      The study goes into detail to show that downregulation of either protein yields physiological phenotypes consistent with decreased ER-mitochondria tethering. These phenotypes include calcium import into mitochondria and mitochondrial lipid composition.

      Figure 5 shows that histamine-evoked ER-calcium release causes an increase in mitochondrial calcium, and this increase is reduced in the absence of ESYT1, without detectable change in the abundance of the main known players of this calcium import. This is rescued by an artificial ER-mitochondria tether. However, Figure 5D shows that the increase in calcium concentration in the cytosol upon histamine-evoked ER calcium release is equally impaired by ESYT1 deletion, contrary to expectation. Indeed, if the impairment of mitochondrial calcium import was due to improper ER-mitochondria tethering in ESYT1 mutant cells, one would expect more calcium to leak into the cytosol, not less.

      The remaining explanation is that ESYT1 knockout desensitizes the cells to histamine, by affecting GPCR signalling at the PM, something unexplored here.

      In any case, a decreased calcium discharge by the ER upon histamine treatment, explains the decreased uptake by mitochondria.

      The authors argue that ER calcium release is unaffected by ESYT1 KO, but crucially use thapsigargin instead of histamine to show it. Thus, the most likely interpretation of the data is that ESYT1 KO affects histamine signalling and histamine-evoked calcium release upstream of ER-mitochondria contacts.

      • Silencing ESYT1 impairs SOCE efficiency in Jurkat cells (Woo, Sun et al. 2020), but not in HeLa cells (Giordano, Saheki et al. 2013, Woo, Sun et al. 2020). Analysis of the role of ESYT1 in HeLa cells prevents confounding effects due to the loss of ESYT1 at ER-PM. In this model, knock-down of ESYT1 led to a decrease of mitochondrial Ca2+ uptake from the ER upon histamine stimulation, as monitored by genetically encoded Ca2+ indicator targeted to mitochondrial matrix (Figure 5A and B). ESYT1 silencing in HeLa cells did not impact ER Ca2+ store measured by the ER-targeted R-GECO Ca2+ probe (Figure 5C and D). The expression of the artificial mitochondria-ER tether was able to rescue mitochondrial Ca2+ defects observed in ESYT1 silenced cells (Figure 5B), confirming that the observed anomalies are specifically due to MERC defects.
      • In contrast, loss of ESYT1 impaired SOCE efficiency in fibroblasts (Figure 6 A and B). This phenotype was fully rescued by re-expression of ESYT1-Myc but not the artificial tether. We therefore investigated the influence of ESYT1 loss on cytosolic Ca2+ concentration following ATP (Figure 6F to H) or histamine stimulation (Figure S3 D to F), both of which showed a reduced cytosolic Ca2+ concentration and uptake in ESYT1 KO cells. This phenotype was fully rescued by the re-expression of ESYT1-Myc but not the artificial tether. Measurement of cytosolic Ca2+ in Ca2+-free media after treatment with thapsigargin, an inhibitor of the sarco/endoplasmic reticulum Ca2+ ATPase SERCA that blocks Ca2+ pumping into the ER, showed that ESYT1 KO does not influence the total ER Ca2+ pool (Figure 6K and L). However, ER-Ca2+ release capacity upon histamine stimulation (Figure 6I and J) is decreased in ESYT1 KO cells. This phenotype was fully rescued by the re-expression of ESYT1-Myc but not the artificial tether. Loss of ESYT1 decreased the Ca2+ uptake capacities of mitochondria after activation with histamine (Figure S3 A to C) or ATP (Figure 6 C to E). This phenotype was rescued by re-expression of ESYT1-Myc and also the engineered ER-mitochondria tether. Thus, despite the ER-Ca2+ release defect observed after ESYT1 loss, the artificial tether fully rescued the mitochondrial phenotype.
      • These results highlight the distinct and dual roles of ESYT1 in Ca2+ regulation at the ER-PM and at MERCs.

      The data with SYNJ2BP deletion are more compatible with decreased ER-mito contacts, as no decrease in cytosolic calcium is observed. This is compatible with the previously proposed role of SYNJ2BP in ER-mitochondria tethering, but the difference with ESYT1 rather argues that both proteins affect calcium signaling by different means, meaning they act in different pathways.

      • We explain the different results concerning cytosolic calcium by the fact that ESYT1 is a bi-localized protein with dual functions in cellular calcium regulation, implicated both in SOCE at ER-PM and in mitochondrial calcium uptake at MERCs. On the other hand, SYNJ2BP is only present at MERCs and its loss does not influence PM-ER signaling or ER-Ca2+ release.

      Finally, the study delves into mitochondrial lipids to "investigated the role of the SMP-domain containing protein ESYT1 in lipid transfer from ER to mitochondria". In reality, it is not ER-mitochondria lipid transport that is under scrutiny, but general lipid homeostasis, and changes in ER-PM lipids could have knock-on effects on mitochondrial lipids without the need to invoke disruptions in ER-mitochondria transfer activity.

      • The fact that the artificial tether, which specifically rescues MERCs, fully rescues the lipid phenotype argues for a direct loss of MERC tethering function when ESYT1 is missing.

      The changes observed are interesting but could be due to anything. Surprisingly, PCA analysis shows that the rescue of the knockout by the ESYT1 gene clusters with the rescue by the artificial tether, and not with the wildtype. This indicates that overexpressing either ESYT1 or a tether causes similar lipidomic changes. These could be due, for instance, to ER stress caused by protein overexpression, and not to a rescue.

      • In order to verify if the overexpression of ESYT1 or the artificial tether induces ER stress, we performed a WB analysis to compare markers of ER stress in control fibroblasts, KO ESYT1 fibroblasts, KO ESYT1 fibroblasts overexpressing ESYT1-Myc or the tether (Figure S4C). This showed no changes in the levels of several different markers of ER stress or cell death.

      Reviewer 2:

      1) the interaction between those proteins is direct,

      2) if SYNJ2BP is necessary and sufficient to localize E-Syt1 at MERC, and

      3) if MERCs extension induced by SYNJ2BP is dependent on E-Syt1.

      Those points are important to investigate because SYNJ2BP has already been shown to induce MERCs by interacting with the ER protein RRBP1. In addition, some experiments need to be better quantified.

      Major comments: E-syt1/SYNJ2BP in MERCs formation: the authors provide several convincing lines of evidence that both proteins are in the same complex (proximity labelling, localization in the same complex in BN-PAGE, localization in MAM) but it is not clear in which extent the direct interaction between both proteins regulates ER-mitochondria tethering. 1- Pull down experiments or BiFC strategy could be performed to show the direct interaction between both proteins.

      • We showed in Figure 4C that the presence of SYNJ2BP in a complex of a similar molecular weight to that of ESYT1 (410 kDa) is totally dependent on the presence of ESYT1, suggesting an interaction of the 2 proteins.
      • To confirm this interaction, in Figure 4A we analyzed by BN-PAGE cells overexpressing SYNJ2BP together with a 3xFlag-tagged version of ESYT1. As a result of the addition of the Flag tag, the complex positive for ESYT1 shifted to a higher molecular weight. Significantly, the complex positive for SYNJ2BP shifted to a similar molecular weight, demonstrating the interaction and dependence of the 2 protein partners.
      • To confirm the interaction of the 2 partners, we performed co-immunoprecipitation of the ESYT1-3xFlag protein (Table S2). SYNJ2BP was found as the strongest prey, followed by ESYT2 and SEC22B two described interactors of ESYT1, confirming the quality of the analysis (Giordano, Saheki et al. 2013, Gallo, Danglot et al. 2020). 2- SYNJ2BP OE has already been demonstrated to increase MERCs and this being dependent on the ER binding partners RRBP1 (10.7554/eLife.24463). Therefore, it would be of interest to perform OE of SYNJ2BP in KO Esyt1 to address the question of whether ESyt1 is also required to increase MERCs.

      • A quantitative analysis using confocal microscopy demonstrated that the effect of SYNJ2BP overexpression on MERCs formation is partially dependent on the presence of ESYT1 (Figure 3F).

      3- The authors show that Esyt1 puncta size increases when SYNJ2BP is OE (Fig3C), but this can be indirectly linked to the increase of MERCs in the OE line. Thus, it could be interesting to test if the number/shape of E-syt1 puncta located close to mitochondria decreases in KO SYNJ2BP. This could really show the dependence of SYNJ2BP for E-syt1 function at MERCs.

      • To improve our description of the partial localization of ESYT1 at mitochondria, we performed a quantitative analysis using confocal microscopy on control human fibroblasts stably overexpressing SEC61B-mCherry as an ER marker which were labelled with ESYT1 and TOMM40 for mitochondria. We measured the % of ESYT1 signal colocalizing with mitochondria and the % of mitochondria colocalizing with ESYT1 (Figure 1E).

      • To demonstrate that ESYT1 partial colocalization with mitochondria is, at least partially, due to its interaction with SYNJ2BP, we performed a quantitative analysis using confocal microscopy. Human control fibroblasts, KO SYNJ2BP fibroblasts and SYNJ2BP overexpressing fibroblasts were labelled with ESYT1, TOMM40 for mitochondria and CANX for ER. We measured the % of ESYT1 signal colocalizing with mitochondria in each condition (Figure 3C).

      Lipid analyses: the results of MS on isolated mitochondria clearly show that mitochondrial lipid homeostasis is affected in KO-Syt1 and rescued by expression of Syt1-Myc and artificial mitochondria-ER tether. However, p.15, the authors wrote "The loss of ESYT1 resulted in a decrease of the three main mitochondrial lipid categories CL, PE and PI, which was accompanied by an increase in PC". As the results are expressed in mol%, this interpretation can be distorted by the fact that mathematically, if the content of one lipid decreases, the content of others will increase. I would suggest expressing the results in lipid quantity (nmol)/mg of mitochondria proteins instead of mol%. This will clarify the role of E-Syt1 on mitochondrial lipid homeostasis and which lipids increase and decrease.
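
      The mol% concern raised here can be illustrated with a small worked example. This is only a sketch: the lipid amounts below are hypothetical values chosen for the arithmetic, not measurements from the manuscript, and `mol_percent` is an illustrative helper name.

```python
def mol_percent(amounts):
    """Convert absolute lipid amounts to mol% of the summed total."""
    total = sum(amounts.values())
    return {lipid: 100.0 * value / total for lipid, value in amounts.items()}


# Hypothetical amounts in nmol per mg of mitochondrial protein
# (illustrative only, not data from the study).
control = {"PC": 40.0, "PE": 30.0, "PI": 10.0, "CL": 20.0}
# PE, PI and CL each drop by 20%; the absolute amount of PC is unchanged.
knockout = {"PC": 40.0, "PE": 24.0, "PI": 8.0, "CL": 16.0}

for name, sample in (("control", control), ("knockout", knockout)):
    shares = mol_percent(sample)
    print(name, {lipid: round(pct, 1) for lipid, pct in shares.items()})
```

      In this toy case PC rises from 40.0 to about 45.5 mol% although its absolute amount is identical in both samples, which is why absolute quantities (nmol/mg protein) are needed to tell which lipids actually change.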

      • We changed the sentence in the text as suggested. Also it could be of high interest to have the lipid composition of the whole cells to reinforce the direct involvement of E-Syt1 in mitochondrial lipid homeostasis and verify that the disruption of mitochondrial lipid homeostasis is not linked to a general perturbation of lipid metabolism as this protein acts at different MCSs.

      • This is beyond the scope of the project and we would argue that the results of such an experiment would be difficult to interpret.

      To better understand the impact of Esyt1 on mitochondria morphology, the authors could analyze the mitochondria morphology (size, shape, cristae) on their EM images of crt, KO and OE lines. Indeed, on OE (Fig3A), the mitochondria look bigger and with a different shape compared to crt.

      • As we do not observe obvious differences in mitochondrial morphology between control, KO and OE fibroblasts we do not think that quantitative analysis would add to the understanding of the effect of ESYT1 on mitochondrial function. Also, they performed a lot of BN-PAGE. Is it possible to check whether the mitochondrial respiratory chain super-complexes are affected on Esyt1 KO line compared to crt?

      • We decided to remove the data on the metabolic consequences of ESYT1 loss since it was too preliminary and required deeper investigations, focusing instead on the effect of ESYT1 loss on calcium homeostasis. Quantifications: some western blots needs to be quantified (Fig 5K, 6J, S3E);

      • We did not observe obvious differences in the protein levels so we think that quantitation would not add significantly to the understanding of the differences in calcium dynamics that we report. Fig1A: Can the author provide a higher magnification of the triple labeling and perform quantification about the proportion of E-Syt1 punctate located close to mitochondria?

      • We added higher magnification of the same area in all channels and arrows that point to the foci of ESYT1 colocalizing with both ER and mitochondria (Figure 1D).

      • To improve our description of the partial localization of ESYT1 at mitochondria, we performed a quantitative analysis using confocal microscopy on control human fibroblasts stably overexpressing SEC61B-mCherry as an ER marker which were labelled with ESYT1 and TOMM40 for mitochondria. We measured the % of ESYT1 signal colocalizing with mitochondria and the % of mitochondria colocalizing with ESYT1 (Figure 1E). Minor comments:

      • Fig1E + text: according to the legend, the BN-PAGE has been performed on the heavy membrane fraction. Why do the authors speak about complexes at MAM in the text of the corresponding figure? Is it the MAM or the heavy fraction (MAM + mito + ER...)? If BN has been performed on heavy membranes, it is not a real proof that E-syt1 is in MAMs.

      • Heavy membranes have been used in this experiment. The text and conclusions have been changed accordingly.

      • On fig3C (panel crt): it seems like SYNJ2BP dots are not co-localized with mito. Is this protein targeted to another organelle beside mitochondria?

      • It is not described that SYNJ2BP would be targeted to another organelle beside mitochondria. It is possible that those dots outside of mitochondria could be non-specific signals from the antibody we used.

      • Fig4A: can the author provide a control of protein loading (membrane staining as example) to confirm the decrease of E-Syt1 in siSYNJ2BP?

      • As we performed this experiment only once we have removed the statement suggesting a decrease in ESYT1 protein in response to the siSYNJ2BP.

      • Fig5E/F: it is not clear to me why the expression of E-Syt1 in the KO is not able to complement the KO phenotype for cytosolic Ca++. Can the authors comment this?

      • We performed further analysis using ATP to trigger calcium release from the ER (Figure 6 F to H). In those conditions, expression of ESYT1 in the KO is able to complement the KO phenotype for cytosolic Ca2+.

      Reviewer 3:

      Main points 1. Confirming the MERC localization of ESYT1 should include some more of tethering factors as demonstrated interactors (some are mentioned above) and should not be limited to lipid homeostasis.

      • As shown in Figure 1B, VAPB, PDZD8 and BCAP31 are found as preys in the ESYT1 bioID analysis. Those proteins have been described as MERC tethers, their loss leading to mitochondrial calcium defects. To support and confirm the specificity of ESYT1-SYNJ2BP complex at MERCs, we performed a supplementary BioID analysis using ER targeted BirA* and OMM targeted BirA* (Table S1, Figure S1 and Figure 1 A and B). These results confirmed the specificity of the interaction of the 2 partners. ESYT1 is not identified as a prey in OMM BioID and SYNJ2BP is not identified in ER BioID. Additional ER-mitochondria tether BirA* analyses showed that tether-BirA* identified both ESYT1 and SYNJ2BP as a prey at MERCs, confirming the localisation of this interaction. Interestingly, a large majority of the known MERCs tethers VAPB-PTPIP51, MFN2, ITPRs, BCAP31 are also found as preys in the tether-BirA* (Figure 1B), confirming the quality of these data.
      • To confirm the interaction of the 2 partners, we performed co-immunoprecipitation of the ESYT1-3xFlag protein. SYNJ2BP is found as the strongest prey, followed by ESYT2 and SEC22B two described interactors of ESYT1, confirming the quality of the analysis (Table S2) (Giordano, Saheki et al. 2013, Gallo, Danglot et al. 2020).

      The fact that in ESYT1 KO cells both mitochondrial calcium transfer and cytosolic calcium accumulation are accompanied by decreased ER-cepia1ER signal decay upon histamine addition suggest that the main reason for ER-mitochondria calcium transfer defects are due to impaired SOCE. Calcium-free medium and histamine are used to show that ESYT1 does not affect ER calcium content. However, if it affects SOCE, then the absence of extracellular calcium would abolish such an effect; moreover, histamine does not test for leak effects. As additional information, the authors should investigate whether ER calcium content is affected by the presence of extracellular calcium in the ko scenario using thapsigargin. The authors should inhibit SOCE to test whether this mechanism is affected in ESYT1 KO and could account for observed signal differences. Excluding SOCE is critical, since any change in calcium entry from the outside would potentially negate a role of ESYT1 in mitochondrial calcium uptake.

      • Silencing ESYT1 impairs SOCE efficiency in Jurkat cells (Woo, Sun et al. 2020), but not in HeLa cells (Giordano, Saheki et al. 2013, Woo, Sun et al. 2020). Analysis of the role of ESYT1 in HeLa cells prevents confounding effects due to the loss of ESYT1 at ER-PM. In this model, knock-down of ESYT1 led to a decrease of mitochondrial Ca2+ uptake from the ER upon histamine stimulation, as monitored by genetically encoded Ca2+ indicator targeted to mitochondrial matrix (Figure 5A and B). ESYT1 silencing in HeLa cells did not impact ER Ca2+ store measured by the ER-targeted R-GECO Ca2+ probe (Figure 5C and D). The expression of the artificial mitochondria-ER tether was able to rescue mitochondrial Ca2+ defects observed in ESYT1 silenced cells (Figure 5B), confirming that the observed anomalies are specifically due to MERC defects.
      • In contrast, loss of ESYT1 impaired SOCE efficiency in fibroblasts (Figure 6 A and B). This phenotype was fully rescued by re-expression of ESYT1-Myc but not the artificial tether. We therefore investigated the influence of ESYT1 loss on cytosolic Ca2+ concentration following ATP (Figure 6F to H) or histamine stimulation (Figure S3 D to F), both of which showed a reduced cytosolic Ca2+ concentration and uptake in ESYT1 KO cells. This phenotype was fully rescued by the re-expression of ESYT1-Myc but not the artificial tether. Measurement of cytosolic Ca2+ in Ca2+-free media after treatment with thapsigargin, an inhibitor of the sarco/endoplasmic reticulum Ca2+ ATPase SERCA that blocks Ca2+ pumping into the ER, showed that ESYT1 KO does not influence the total ER Ca2+ pool (Figure 6K and L). However, ER-Ca2+ release capacity upon histamine stimulation (Figure 6I and J) is decreased in ESYT1 KO cells. This phenotype was fully rescued by the re-expression of ESYT1-Myc but not the artificial tether. Loss of ESYT1 decreased the Ca2+ uptake capacities of mitochondria after activation with histamine (Figure S3 A to C) or ATP (Figure 6 C to E). This phenotype was rescued by re-expression of ESYT1-Myc and also the engineered ER-mitochondria tether. Thus, despite the ER-Ca2+ release defect observed after ESYT1 loss, the artificial tether fully rescued the mitochondrial phenotype.
      • These results highlight the distinct and dual roles of ESYT1 in Ca2+ regulation at the ER-PM and at MERCs.

      The authors claim that ER-Geco measurements show that no change of ER calcium was observed. However, they use thapsigargin treatment and then get a peak, when the signal should show a decrease due to leak. This suggests they did not use ER-Geco in Figure S3C. What was measured and what does it mean?

      • We used R-GECO (not ER-GECO) which measures the cytosolic calcium.
      • We measured total ER Ca2+ store using the cytosolic-targeted R-GECO Ca2+ probe upon treatment with thapsigargin, an inhibitor of the sarco/endoplasmic reticulum Ca2+ ATPase SERCA that blocks Ca2+ pumping into the ER (Figure 5C and D) and observed no difference in our different conditions.

      The findings on growth in galactose medium are intriguing but are not accompanied by respirometry to confirm mitochondria are compromised upon ESYT1 KO.

      • We decided to remove the data on the metabolic consequences of ESYT1 loss since it was too preliminary and required deeper investigations, focusing instead on the effect of ESYT1 loss on calcium homeostasis.

      Minor points: 1. The authors mention they measure mitochondrial uptake of "exogenous" calcium by applying histamine. They should specify that these measures transferred calcium from the ER rather than uptake of calcium from the exterior (directly at the plasma membrane).

      • The text was clarified as suggested.

      • Expression levels of IP3Rs are not very indicative of any change of their activity. The authors should discuss how ESYT1 could affect their PTMs.

      • A large number of post-translational modifications are known to regulate IP3R activity (Hamada and Mikoshiba 2020), and it is possible that the loss of ESYT1 could interfere with these modifications, but an exploration of this issue is beyond the scope of this study. The text was clarified as suggested.

      Eisenberg-Bord, M., N. Shai, M. Schuldiner and M. Bohnert (2016). "A Tether Is a Tether Is a Tether: Tethering at Membrane Contact Sites." Dev Cell 39(4): 395-409.

      Gallo, A., L. Danglot, F. Giordano, B. Hewlett, T. Binz, C. Vannier and T. Galli (2020). "Role of the Sec22b-E-Syt complex in neurite growth and ramification." J Cell Sci 133(18).

      Giordano, F., Y. Saheki, O. Idevall-Hagren, S. F. Colombo, M. Pirruccello, I. Milosevic, E. O. Gracheva, S. N. Bagriantsev, N. Borgese and P. De Camilli (2013). "PI(4,5)P(2)-dependent and Ca(2+)-regulated ER-PM interactions mediated by the extended synaptotagmins." Cell 153(7): 1494-1509.

      Hamada, K. and K. Mikoshiba (2020). "IP(3) Receptor Plasticity Underlying Diverse Functions." Annu Rev Physiol 82: 151-176.

      Ilacqua, N., I. Anastasia, D. Aloshyn, R. Ghandehari-Alavijeh, E. A. Peluso, M. C. Brearley-Sholto, L. V. Pellegrini, A. Raimondi, T. Q. de Aguiar Vallim and L. Pellegrini (2022). "Expression of Synj2bp in mouse liver regulates the extent of wrappER-mitochondria contact to maintain hepatic lipid homeostasis." Biol Direct 17(1): 37.

      Pourshafie, N., E. Masati, A. Lopez, E. Bunker, A. Snyder, N. A. Edwards, A. M. Winkelsas, K. H. Fischbeck and C. Grunseich (2022). "Altered SYNJ2BP-mediated mitochondrial-ER contacts in motor neuron disease." Neurobiol Dis: 105832.

      Rios, K. E., M. Zhou, N. M. Lott, C. R. Beauregard, D. P. McDaniel, T. P. Conrads and B. C. Schaefer (2022). "CARD19 Interacts with Mitochondrial Contact Site and Cristae Organizing System Constituent Proteins and Regulates Cristae Morphology." Cells 11(7).

      Scorrano, L., M. A. De Matteis, S. Emr, F. Giordano, G. Hajnoczky, B. Kornmann, L. L. Lackner, T. P. Levine, L. Pellegrini, K. Reinisch, R. Rizzuto, T. Simmen, H. Stenmark, C. Ungermann and M. Schuldiner (2019). "Coming together to define membrane contact sites." Nat Commun 10(1): 1287.

      Woo, J. S., Z. Sun, S. Srikanth and Y. Gwack (2020). "The short isoform of extended synaptotagmin-2 controls Ca(2+) dynamics in T cells via interaction with STIM1." Sci Rep 10(1): 14433.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      We are grateful to both reviewers for reviewing our manuscript, and for providing very helpful feedback as to how we can improve this work. We have now implemented nearly all of the changes as recommended, and provide responses to these points below.

      In terms of novelty, while recent pre-prints and publications have suggested that the application of multi-omics analysis improves GRN inference, there has yet to be a systematic comparison of linear and non-linear machine learning methods for GRN prediction from single cell multi-omic data. There are many computational and statistical challenges to such a study, and we therefore believe that others in the field will be especially interested in our systematic comparison of network inference methods, especially given the increased interest and utility of multi-omic data.

      In addition, we report the first comprehensive inference of GRNs in early human embryo development. This developmental context is particularly challenging to study given genetic variation, limitations of sample size due to the precious nature of the material, and regulatory constraints. We anticipate that the methodology we developed and datasets we generated will be informative for computational, developmental and stem cell biologists.

      We have uploaded all the network predictions on FigShare and these can be accessed using the following link: https://doi.org/10.6084/m9.figshare.21968813. In addition, we anticipate that the computational and statistical codes and pipelines we developed (available on https://github.com/galanisl/early_hs_embryo_GRNs) will be applied to other cellular and developmental contexts, especially in challenging contexts such as human development, non-typical model organisms and in clinically relevant samples.

      Reviewer 1

      Major comments

      - The proposed strategy (i.e. combining gene expression-based regulatory inference with cis-regulatory evidence) has been well developed (and implemented) by multiple published works like SCENIC and CellOracle, which is also properly acknowledged by the authors in the discussion section. This leads to a serious concern on the major methodological contribution of this work.

      We would like to note that our study is the first to comprehensively evaluate linear and non-linear machine learning strategies for gene regulatory network prediction from single-cell transcriptional datasets combined with available multi-omic data. We also apply these methods to human early embryogenesis, a context that is challenging to study. There are specific methodological challenges arising in this context that other published work has not yet addressed. In particular, the precious nature of the source material means that sample sizes are limited, unlike the contexts where SCENIC and CellOracle were applied. Notably, the number of cells available for downstream analysis is typically several orders of magnitude lower than when scRNA-seq data are collected from adult human tissue or from cell culture. This restriction on sample sizes places corresponding restrictions on statistical power, and is therefore likely to mean that different statistical network inference methodologies are optimal in specific contexts. Furthermore, the inclusion of multi-omic data from complementary platforms (such as ATAC-seq data) becomes even more important in this context to mitigate the effect of reduced sample sizes. These issues are very important for choice of gene regulatory network inference methodology in relation to studies of human embryo development, and ours is the first study to address these issues directly in any context. We have further clarified the novelty of our work in the manuscript in the abstract, introduction and discussion sections.

      - Most of the compared network reconstruction methods involve hyper-parameters setup (e.g., sparsity regularization weights of the regression methods). The authors did not discuss how these hyper-parameters were chosen.

      For sparse regression, the hyperparameter controlling sparsity was set by cross-validation (CV), using the internal CV function of the R package. All default settings for GENIE3 were used. This information has now been added to the manuscript (in the Methods section), along with a description of the implementation of the mutual information method we use.
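
      As a rough illustration of choosing a sparsity penalty by cross-validation, the sketch below fits a lasso by coordinate descent and picks the penalty with the lowest held-out error. It is not the authors' code: the response does not name the R package used, so everything here (the function names `lasso_fit` and `cv_select`, the penalty grid, the synthetic data, and the absence of an intercept) is an assumption made for the example.

```python
import random


def soft_threshold(a, lam):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0


def lasso_fit(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by coordinate descent.

    No intercept is fitted; the toy data below are centred around zero.
    """
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b initialised to zero
    for _ in range(n_iter):
        for j in range(p):
            z = sum(v * v for v in cols[j]) / n
            if z == 0.0:
                continue
            # Correlation of feature j with the partial residual.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(cols[j], resid)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(cols[j], resid)]
                beta[j] = new
    return beta


def cv_select(X, y, lams, k=5):
    """Return the penalty in `lams` with the lowest k-fold held-out squared error."""
    n = len(X)
    best_lam, best_err = None, float("inf")
    for lam in lams:
        err = 0.0
        for fold in range(k):
            held_out = set(range(fold, n, k))
            X_train = [X[i] for i in range(n) if i not in held_out]
            y_train = [y[i] for i in range(n) if i not in held_out]
            beta = lasso_fit(X_train, y_train, lam)
            for i in held_out:
                pred = sum(b * x for b, x in zip(beta, X[i]))
                err += (y[i] - pred) ** 2
        if err / n < best_err:
            best_lam, best_err = lam, err / n
    return best_lam


def make_data(n=50, p=4, seed=1):
    """Synthetic data: only the first feature truly regulates the target."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[0] + rng.gauss(0, 0.5) for row in X]
    return X, y


X, y = make_data()
grid = [0.01, 0.05, 0.1, 0.5, 1.0]  # hypothetical penalty grid
chosen = cv_select(X, y, grid)
print("selected penalty:", chosen)
print("coefficients:", [round(b, 2) for b in lasso_fit(X, y, chosen)])
```

      The same selection logic is what a packaged cross-validation routine automates; the grid and fold count above are arbitrary choices for the sketch.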

      - For the real-world blastocyst data, the network prediction methods were compared in terms of their reproducibility across validation folds (Fig. 3, Fig. S4-6). However, reproducibility does not necessarily imply accuracy. In fact, statistical learning methods are generally subject to the bias-variance tradeoff, where lower variance (i.e., higher reproducibility) could imply higher bias in model prediction. While there is a lack of gold-standard ground truth to evaluate network accuracy in real biological systems, silver-standards like the ranking of known regulatory interactions in the predictions could be employed as an indirect estimate.

      We thank the reviewer for the opportunity to clarify this point. We would like to avoid any misunderstanding of the reproducibility statistic R, as follows. A higher value of R indicates that the fitted model would generalise well to new data; i.e., R=1 indicates that the model is robust (stable) to perturbations of the data-set. We note that this is not the same as analysing the residual variance of the data after model fitting and related over-fitting (i.e., bias-variance trade-off). The variance that is referred to when discussing bias-variance trade-off is the mean-squared error (of data compared to model), which is not the same as what is assessed by reproducibility statistic R . Specifically, R is a Bayesian estimate of the posterior probability of observing a gene regulation given the data. R is calculated by taking a random sample of the data, doing the network inference again, checking if each gene regulation still appears in the GRN, and then recording (as the R statistic) the average fraction of inclusions over many repetitions. So when we have R close to 1, this indicates that our model predictions generalise well to new data, which is the opposite of what is suggested in this comment. In summary, the accuracy quantified by the reproducibility statistic R relates to the stability of the model predictions to perturbation of the data. We thank the reviewer for the helpful comment to draw our attention to this point, and have now clarified this point in the manuscript on page 6 line 252.
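
      The resampling procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: network inference is replaced by a toy rule (an edge is called when the absolute Pearson correlation exceeds a threshold), and the 80% subsample fraction, the threshold, the number of repetitions and the gene names are all hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def infer_edges(data, threshold):
    """Toy stand-in for a GRN method: call an edge when |r| > threshold."""
    genes = sorted(data)
    edges = set()
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if abs(pearson(data[a], data[b])) > threshold:
                edges.add((a, b))
    return edges


def reproducibility(data, threshold=0.5, frac=0.8, n_rep=200, seed=0):
    """For each edge found on the full data, the fraction of random
    subsamples of the cells in which the same edge is inferred again."""
    rng = random.Random(seed)
    n_cells = len(next(iter(data.values())))
    full = infer_edges(data, threshold)
    counts = {edge: 0 for edge in full}
    size = int(frac * n_cells)
    for _ in range(n_rep):
        idx = rng.sample(range(n_cells), size)
        sub = {g: [vals[i] for i in idx] for g, vals in data.items()}
        found = infer_edges(sub, threshold)
        for edge in full:
            if edge in found:
                counts[edge] += 1
    return {edge: c / n_rep for edge, c in counts.items()}


def make_expression(n_cells=60, seed=0):
    """Synthetic expression: geneB tracks geneA, geneC is independent."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n_cells)]
    b = [v + rng.gauss(0, 0.1) for v in a]
    c = [rng.gauss(0, 1) for _ in range(n_cells)]
    return {"geneA": a, "geneB": b, "geneC": c}


for edge, r_stat in sorted(reproducibility(make_expression()).items()):
    print(edge, round(r_stat, 3))
```

      An edge with a value close to 1 is recovered in almost every subsample, which is the sense of stability to perturbation of the data described in the response.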

      - The gene set enrichment results were reported only on EPI and TE cell types (Fig. 4C and Fig. S12), due to the reason that CA data is only available for TE and ICM. However, many of the other results presented in Fig. 3-6 did include the PE cell type albeit using the same CA data. It is not particularly convincing why the cell type inclusion standard for gene set enrichment is different from the other results.

      We thank the reviewer for noting this and would like to clarify that we restricted the analysis to the EPI and TE, because similar lists of gene-sets were not available for primitive endoderm, where it is currently unclear which pathways are most relevant to this cell type. This has now been clarified in the manuscript on page 8, line 337.

      - The authors cited TF binding in cis-regulatory regions as supporting evidence of several MICA-inferred regulatory interactions (e.g., NANOG -> ZNF343). However, the same cis-regulatory evidence has already been used in the CA filtering step. All interactions passing CA filtering should in principle have TF-binding support. It would be more convincing if the authors provided other types of evidence as independent support, such as genetic associations like eQTL, experimental perturbations like gene knockdown/knockout, etc.

      We appreciate the reviewer’s point. We address this by describing published ChIP-seq validation in human pluripotent stem cells, which are widely used as a proxy for the study of the epiblast. We feel that the ChIP-seq validation in this context is an appropriate independent validation to support the MICA-inferred cis-regulatory interactions predicted from the human embryo datasets we analysed. Our inferences from ATAC-seq data cannot identify TF-DNA binding directly. ChIP-seq is a widely accepted independent method to support the interactions inferred from ATAC-seq data.

      We agree that knockdown/knockout would provide further evidence suggesting gene regulation, and indeed these are experiments we would like to conduct systematically in the future, but such perturbations are difficult to achieve at genome-wide scale, especially with very restricted quantities of human embryo material. Notably, these studies would not be evidence of direct regulation and the gold-standard in our opinion is to perturb the cis regulatory region to demonstrate its functional importance in gene regulation. These are important experiments to conduct systematically in the future. We also note that assessing quantitative trait loci in the context of human pre-implantation embryos is extremely challenging due to the restricted sample sizes and genetic variance in the samples collected.

      - Many of the MICA-inferred regulatory interactions do not exhibit Spearman correlation (Fig. 5, Fig. S17), which could probably be explained by the ability of mutual information to capture complex non-monotonic dependencies. It would be interesting to provide further investigation on these "uncorrelated" edges, which may help demonstrate the superiority of mutual information over Spearman correlation.

      This has been added as a new Fig. S18.
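
      The reviewer's point about non-monotonic dependencies can be illustrated with a small synthetic example. This is only a sketch under stated assumptions: the quadratic relationship, the noise level and the simple equal-frequency histogram estimator of mutual information are chosen for illustration and are not necessarily the estimator or the data used in the manuscript.

```python
import math
import random
from collections import Counter

def ranks(values):
    """Rank of each value (0 = smallest); assumes no ties (continuous data)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def binned_mutual_information(x, y, n_bins=8):
    """Histogram estimate of mutual information (in nats) using
    equal-frequency bins derived from the ranks."""
    n = len(x)
    bx = [r * n_bins // n for r in ranks(x)]
    by = [r * n_bins // n for r in ranks(y)]
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())

# Synthetic "regulator" x and "target" y with a U-shaped dependence.
rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(2000)]
y = [v * v + rng.gauss(0, 0.02) for v in x]

rho = spearman(x, y)
mi = binned_mutual_information(x, y)
print(f"Spearman rho = {rho:.3f}, binned MI = {mi:.3f} nats")
```

      For a symmetric U-shaped relationship of this kind the rank correlation stays near zero while the mutual information estimate is clearly positive, which is the type of edge the reviewer refers to as "uncorrelated".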

      - The authors conducted immunostaining experiments to validate the MICA-inferred regulatory interaction between TFAP2C and JUND. While the identified protein co-localization is a step further than RNA co-expression, it is still correlation rather than causality. Additional evidence like the effect of knockout/knockdown perturbations would be more convincing.

      We agree with Reviewer 1 that experimental perturbations of TFAP2C and JUND to determine what consequence this has for interactions between these proteins would be informative. However, due to the complexity of such an investigation in human embryos, we feel that this is beyond the scope of the current study. One option is to conduct the perturbations in human pluripotent stem cells; however, it is unclear whether the GRN in this context reflects the same interactions as in human embryos, and this is a distinct question to address in the future. Moreover, while knockdown/knockout studies would be suggestive of upstream regulation, they would not address the question of whether this is a direct or indirect effect without systematic further analysis, including transcription factor-DNA binding analysis (such as CUT&RUN, CUT&Tag or ChIP-seq) as well as perturbations of the putative cis-regulatory regions. These are all exciting future experiments, and our study provides us and others with hypotheses to test functionally. We have clarified these future directions in the discussion section on page 16, line 576.

      Minor comments

      • The γ symbols in AP-2γ are not correctly rendered.

      We note that this applies only to the way AP-2γ appears on the Review Commons website, and we are trying to fix this issue. We hope this transformation after the manuscript upload will not apply to a subsequent transfer to a journal.

      • The UMAP figures (Fig. 4A, Fig. S7) are of low resolution compared to other figures.

      We thank the reviewer for noting this. These figures have now been added as vector graphics files to overcome this issue.

      • As the authors are focused on studying the blastocyst regulatory network, the inferred regulatory interactions should be provided as supplementary data.

      We have included all of the inferred gene regulatory interactions as a supplementary folder for the MICA predictions using FigShare: doi.org/10.6084/m9.figshare.21968813. We have included code to reproduce the inferred gene regulatory interactions for the other methods which we compared to MICA. Because this includes 100,000 regulatory interactions per method, we feel that it would be impractical to include the alternative inferred interactions as supplementary data.

      Reviewer 2

      Minor comments

      - In the abstract, it would be adequate to already mention which normalisation method works the best.

      This has now been added to the abstract and we appreciate this suggestion.

      - In Fig. 1:

      * Describe what are squares and circles

      This information has been included in the figure 1 legend.

      * In the GRNs refined by keeping CA-predicted regulations only, mention that these are cis interactions

      We have modified the figure 1 legend and the text on page 5, line 224 to clarify that these are putative cis-regulatory interactions.

      * The ATAC seq shows KRT8, GATA3, RELB motifs, while the rest of the figure is very general. Maybe make the ATAC-seq peaks panel also as a sketch and relate it to the square/circles graphs on the right hand side to showcase how the filtering of the network is performed.

      We appreciate this suggestion and modified figure 1 accordingly.

      * The caption says five GRN inference approaches, while the abstract and text say 4. It is clear after reading that the 5th is a random approach. However, it was a surprise at first.

      We have modified the figure 1 legend to clarify that we also compared random prediction in addition to the 4 GRN inference approaches.

      - How the simulation study was performed is not understandable for non-experts as it is described in the Methods section. This is an important approach in general, and I think the audience would benefit if the authors add a full section about it in their supplementary data.

      Further details have now been added to the subsection ‘simulation study’ in the Methods section.

      - Fig. 2:

      * As it is, it is hard to tell the difference between GRN inference methods for a given sample size and number of regulators. Could the authors add a comparative panel for this (maybe some scatter plots would be enough)? MI by itself looks worse here?

      We thank the reviewer for this helpful suggestion. This comparative plot has now been included in figure 2 and indicates that MI is on par with the other GRN inference methods using simulated RNA-seq data.

      - When mentioning "samples" (e.g. last paragraph of section 1 in results), do the authors refer to "cells"?

      We appreciate the reviewer pointing this out and have amended the text throughout to state that these are cells.

      - What about normalisation effects in the simulated data?

      With regards to the simulated data, normalisation effects are not relevant as we are generating data that are idealised and therefore not subject to unwanted sources of variation such as read depth. However, in future work, this could be investigated with an expanded simulation study and we appreciate the reviewer’s suggestion.

      - Figure S7 should be cited in the first paragraph of section 2 in results.

      This has now been cited.

      Could the authors add a panel to indicate whether the data is SMART-seq2 or 10X?

      We thank the reviewer for the suggestion to clarify this, which we think is an important point. We have included a statement that all data used was generated using the SMART-seq2 sequencing technique in the figure legend. The choice of sequencing method/depth of sequencing will likely impact on the choice of GRN inference method and we have also clarified this in the discussion section on page 13, line 516.

      - In the association of inferred GRNs to human blastocyst cell lineages, the authors find the GRN edges predicted that overlap between the 4 inference methods in each cell type. Do they, therefore, recommend to always use more than one GRN inference method?

      Identifying overlapping inferences by comparing more than one GRN inference method may be a strategy to identify network edges with more confidence due to the agreement between several inference methodologies. However, this strategy may also miss some edges which can only be detected by one method and not another. We have included a statement in the discussion section to clarify this point on page 15, line 571.

      - If the CA data used was generated for the TE and ICM only, how do the authors use it to perform MICA on PE?

      We appreciate that this is confusing and have since revised the manuscript on page 5, line 223 to state that the inner cell mass (ICM) comprises EPI (epiblast) and PE (primitive endoderm) cells. It may be that we miss putative cis-regulatory interactions if the ICM CA data does not reflect developmentally progressed PE and EPI cells and we have noted this caveat in the discussion section on page 15, line 561.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      She et al studied the evolution of gene expression reaction norms when individuals colonise a new environment that exposes them to physiologically challenging conditions. Their objective was to test the "plasticity first" hypothesis, which suggests that traits that are already plastic (their value changes when facing a new environment compared to the original environment) facilitate the colonisation of novel environments, which, if true, would be predicted to result in the evolution of gene expression values that are similar in the population that colonised the new environment and evolved under these particular selection pressures. To test this prediction, they studied gene expression in cardiac and muscle tissues in individuals originating from three conditions: lowland individuals in their natural environment (ancestral state), lowland individuals exposed to hypoxia (the plastic response state), and a highland population facing hypoxia for several generations (the coloniser state). They classified gene expression patterns as maladaptive or adaptive in lowland individuals responding to short term hypoxia by classifying gene expression patterns using genes that differed between the ancestral state (lowland) and colonised state (highland). Genes expressed in the same direction in lowland individuals facing hypoxia (the plastic state) as what is found in the colonised state are defined as adaptive, while genes with the opposite expression pattern were labelled as maladaptive, using the assumption that the colonised state must represent the result of natural selection. Furthermore, genes could be classified as representing reversion plasticity when the expression pattern differed between the plasticity and colonised states and as reinforcement when they were in the same direction (for example more expressed in the plastic state and the colonised state than in the ancestral state). They found that more genes had a plastic expression pattern that was labelled as maladaptive than adaptive. Therefore, some of the genes have an expression pattern in accordance with what would be predicted based on the plasticity-first hypothesis, while others do not.

      Thank you for a precise summary of our work. We appreciate the very encouraging comments recognizing the value of our work. We have addressed concerns from the reviewer in greater detail below.

      Q1. As pointed out by the authors themselves, the fact that temperature was not included as a variable, which would make the experimental design much more complex, misses the opportunity to more accurately reflect the environmental conditions that the colonizer individuals face at high altitude. Also pointed out by the authors, the acclimation experiment in hypoxia lasted 4 weeks. It is possible that longer term effects would be identifiable in gene expression in the lowland individuals facing hypoxia on a longer time scale. Furthermore, a sample size of 3 or 4 individuals per group depending on the tissue for wild individuals may miss some of the natural variation present in these populations. Stating that they have a n=7 for the plastic stage and n= 14 for the ancestral and colonized stages refers to the total number of tissue samples and not the number of individuals, according to supplementary table 1.

      We shared the same concerns as the reviewer. This is partly because it is quite challenging to bring wild birds into captivity to conduct the hypoxia acclimation experiments. We had to work hard to perform acclimation experiments by keeping lowland sparrows in a hypoxic condition for a month. We have indeed recognized the same set of limitations that the reviewer pointed out and have discussed them in the study, i.e., considering the hypoxic condition alone, the short acclimation period, etc. Regarding sample sizes, we have collected cardiac muscle from nine individuals (three individuals for each stage) and flight muscle from 12 individuals (four individuals for each stage). We have clarified this in Supplementary Table 1.

      Q2. Finally, I could not find a statement indicating that the lowland individuals placed in hypoxia (plastic stage) were from the same population as the lowland individuals for which transcriptomic data was already available, used as the "ancestral state" group (which themselves seem to come from 3 populations Qinghuangdao, Beijing, and Tianjin, according to supplementary table 2) nor if they were sampled in the same time of year (pre reproduction, during breeding, after, or if they were juveniles, proportion of males or females, etc). These two aspects could affect both gene expression (through neutral or adaptive genetic variation among lowland populations that can affect gene expression, or environmental effects other than hypoxia that differ in these populations' environments or because of their sexes or age). This could potentially also affect the FST analysis done by the authors, which they use to claim that strong selective pressure acted on the expression level of some of the genes in the colonised group.

      The reviewer asked how the individual tree sparrows used in the transcriptomic analyses were collected. The individuals used for the hypoxia acclimation experiment and those representing the ancestral lowland population were collected from the same locality (Beijing) and in the same season (i.e., pre-breeding) of the year. They are all adults and weigh approximately 18 g. We have clarified this in Supplementary Table S1 and the Methods. We did not distinguish males from females (both sexes look similar) under the assumption that both sexes respond similarly to hypoxia acclimation in their cardiac and flight muscle gene expression.

      Supplementary Table 2 lists the individuals that were used for sequence analyses. These individuals were only used for sequence comparisons but not for the transcriptomic analyses. The population genetic structure analyzed in a previously published study showed that there is no clear genetic divergence within the lowland population (i.e., individuals collected from Beijing, Tianjin and Qinhuangdao) or the highland population (i.e., Gangcha and Qinghai Lake). In addition, there was no clear genetic divergence between the highland and lowland populations (Qu et al. 2020).

      Q4. Impact of the work

      There has been work showing that populations adapted to high altitude environments show changes in their hypoxia response that differ from the short-term acclimation response of lowland populations of the same species. For example, in humans, see Erzurum et al. 2007 and Peng et al. 2017, where they show that the hypoxia response cascade, which starts with the gene HIF (Hypoxia-Inducible Factor) and includes the EPO gene, which codes for erythropoietin, which in turn activates the production of red blood cells, is LESS activated in high altitude individuals compared to the activation level in lowland individuals (which gives it its name). The present work adds to this body of knowledge showing that the short-term response to hypoxia and the long term one can affect different pathways and that acclimation/plasticity does not always predict what physiological traits will evolve in populations that colonize these environments over many generations and under additional selection pressures (UV exposure, temperature, nutrient availability). Altogether, this work provides new information on the evolution of reaction norms of genes associated with the physiological response to one of the main environmental variables that affects almost all animals, oxygen availability. It also provides an interesting model system to study this type of question further in a natural population of homeotherms.

      Erzurum, S. C., S. Ghosh, A. J. Janocha, W. Xu, S. Bauer, N. S. Bryan, J. Tejero et al. "Higher blood flow and circulating NO products offset high-altitude hypoxia among Tibetans." Proceedings of the National Academy of Sciences 104, no. 45 (2007): 17593-17598.

      Peng, Y., C. Cui, Y. He, Ouzhuluobu, H. Zhang, D. Yang, Q. Zhang, Bianbazhuoma, L. Yang, Y. He, et al. 2017. Down-regulation of EPAS1 transcription and genetic adaptation of Tibetans to high-altitude hypoxia. Molecular biology and evolution 34:818-830.

      Thank you for highlighting the potential novelty of our work in light of the broader field. We found it very interesting to discuss our results (from a bird species) together with similar findings from humans. In the revised version of the manuscript, we have discussed the short-term acclimation response and long-term adaptive evolution to a high-elevation environment, as well as how our work provides understanding of the relative roles of short-term plasticity and long-term adaptation. We appreciate the two important works pointed out by the reviewer and we have also cited them in the revised version of the manuscript.

      Reviewer #2 (Public Review):

      This is a well-written paper using gene expression in tree sparrow as model traits to distinguish between genetic effects that either reinforce or reverse initial plastic response to environmental changes. Tree sparrow tissues (cardiac and flight muscle) collected in lowland populations subject to hypoxia treatment were profiled for gene expression and compared with previously collected data in 1) highland birds; 2) lowland birds under normal condition to test for differences in directions of changes between initial plastic response and subsequent colonized response. The question is an important and interesting one but I have several major concerns on experimental design and interpretations.

      Thank you for a precise summary of our work and constructive comments to improve this study. We have addressed your concerns in greater detail below.

      Q1. The datasets consist of two sources of data. The hypoxia treated birds collected from the current study and highland and lowland birds in their respective native environment from a previous study. This creates a complete confounding between the hypoxia treatment and experimental batches that it is impossible to draw any conclusions. The sample size is relatively small. Basically correlation among tens of thousands of genes was computed based on merely 12 or 9 samples.

      We appreciate the critical comments from the reviewer. The reviewer raised the concerns about the batch effect from birds collected from the previous study and this study. There is an important detail we didn’t describe in the previous version. All tissues from hypoxia acclimated birds and highland and lowland birds have been collected at the same time (i.e., Qu et al. 2020). RNA library construction and sequencing of these samples were also conducted at the same time, although only the transcriptomic data of lowland and highland tree sparrows were included in Qu et al. (2020). The data from acclimated birds have not been published before.

      In the revised version of manuscript, we also compared log-transformed transcript per million (TPM) across all genes and determined the most conserved genes (i.e., coefficient of variance ≤  0.3 and average TPM ≥ 1 for each sample) for the flight and cardiac muscles, respectively (Hao et al. 2023). We compared the median expression levels of these conserved genes and found no difference among the lowland, hypoxia-exposed lowland, and highland tree sparrows (Wilcoxon signed-rank test, P<0.05). As these results suggested little batch effect on the transcriptomic data, we used TPM values to calculate gene expression level and intensity. This methodological detail has been further clarified in the Methods and we also provided a new supplementary Figure (Figure S5) to show the comparative results.
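
      The filtering step described above can be sketched as follows. This is an illustrative reading only: the response does not specify whether the coefficient of variation is computed on raw or log-transformed TPM, so the sketch uses raw TPM, interprets "average TPM ≥ 1 for each sample" as TPM ≥ 1 in every sample, and uses made-up gene names, values and group labels. The statistical comparison of the group medians is omitted.

```python
from statistics import mean, median, pstdev

def conserved_genes(tpm, max_cv=0.3, min_tpm=1.0):
    """Return genes whose expression is stable across all samples.

    tpm maps gene name -> list of TPM values (one per sample).
    A gene is kept if its coefficient of variation is <= max_cv
    and its TPM is >= min_tpm in every sample (assumed reading).
    """
    kept = []
    for gene, values in tpm.items():
        average = mean(values)
        if average == 0:
            continue
        cv = pstdev(values) / average
        if cv <= max_cv and min(values) >= min_tpm:
            kept.append(gene)
    return kept

# Hypothetical toy data: six samples, two per group.
tpm = {
    "geneA": [10, 11, 9, 10, 12, 10],        # stable and expressed
    "geneB": [1, 50, 3, 40, 2, 60],          # highly variable
    "geneC": [0.2, 0.3, 0.2, 0.25, 0.3, 0.2],  # stable but barely expressed
}
groups = {"lowland": [0, 1], "hypoxia": [2, 3], "highland": [4, 5]}

kept = conserved_genes(tpm)
group_medians = {
    name: median(tpm[g][i] for g in kept for i in indices)
    for name, indices in groups.items()
}
print(kept, group_medians)
```

      Similar medians of the conserved genes across groups would be consistent with little batch effect, which is the argument made in the response.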

      The reviewer also raised the issue of sample size. We certainly would have liked to have more individuals in the study, but this was not possible due to the logistical problem of keeping wild birds in a common garden experiment for a long time. We have acknowledged this in the manuscript. To mitigate this, we have tested the hypothesis of plasticity followed by genetic change using two different tissues (cardiac and flight muscles) and two different datasets (co-expressed gene-set and muscle-associated gene-set). As all these analyses show similar results, they indicate that the main conclusion drawn from this study is robust.

      Q2. Genes are classified into two classes (reversion and reinforcement) based on arbitrarily chosen thresholds. More "reversion" genes are found and this was taken as evidence reversal is more prominent. However, a trivial explanation is that genes must be expressed within a certain range and those plastic changes simply have more space to reverse direction rather than having any biological reason to do so.

      Thank you for the critical comments. Two questions were raised, and we would like to address them separately. The first concern centered on the issue of arbitrarily chosen thresholds. In our manuscript, we used a range of thresholds, i.e., 50%, 100%, 150% and 200% of change in the gene expression levels of the ancestral lowland tree sparrow to detect genes with reinforcement and reversion plasticity. By this design we wanted to explore the magnitudes of gene expression plasticity (i.e., Ho & Zhang 2018), and whether the strength of selection (i.e., genetic variation) changes with the magnitude of gene expression plasticity (i.e., Campbell-Staton et al. 2021).
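
      A threshold-based classification of this kind can be sketched as below. The definitions are assumptions made for illustration: plastic change is taken as the plastic-stage level minus the ancestral level, genetic change as the colonised level minus the plastic-stage level, and both must exceed the chosen fraction of the ancestral level. The function name and the example expression values are hypothetical, and the manuscript's exact procedure may differ.

```python
def classify_plasticity(ancestral, plastic, colonised, threshold=0.5):
    """Classify one gene as 'reinforcement', 'reversion' or 'unclassified'.

    ancestral: expression in lowland birds under normoxia (must be > 0)
    plastic:   expression in lowland birds after hypoxia acclimation
    colonised: expression in highland birds
    threshold: minimum change, as a fraction of the ancestral level
    """
    cutoff = threshold * ancestral
    plastic_change = plastic - ancestral   # assumed definition
    genetic_change = colonised - plastic   # assumed definition
    if abs(plastic_change) < cutoff or abs(genetic_change) < cutoff:
        return "unclassified"
    same_direction = (plastic_change > 0) == (genetic_change > 0)
    return "reinforcement" if same_direction else "reversion"

# Hypothetical genes: (ancestral, plastic, colonised) expression levels.
genes = {
    "gene1": (10.0, 20.0, 8.0),   # up under hypoxia, back down in highland
    "gene2": (10.0, 20.0, 30.0),  # up under hypoxia, further up in highland
    "gene3": (10.0, 11.0, 30.0),  # plastic change too small to count
}

for threshold in (0.5, 1.0, 1.5, 2.0):
    labels = {name: classify_plasticity(*levels, threshold=threshold)
              for name, levels in genes.items()}
    print(threshold, labels)
```

      Running the same classification over several thresholds, as in the loop above, shows how the number of genes in each category depends on the cutoff, which is the sensitivity the reviewer's comment is concerned with.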

      As the reviewer pointed out, we have now realized that this threshold selection is arbitrary. We have thus implemented two other categorization schemes to test the robustness of the observation of unequal proportions of genes with reinforcement and reversion plasticity. Specifically, we used a parametric bootstrap procedure as described in Ho & Zhang (2019), which aimed to identify genes resulting from genuine differences rather than random sampling errors. Bootstrap results suggested that genes exhibiting reversing plasticity significantly outnumber those exhibiting reinforcing plasticity, suggesting that our inference of an excess of genes with reversion plasticity is robust to random sampling errors. We have added these analyses to the revised version of the manuscript, and provided results in Figure 2d and Figure 3d.

      In addition, we adapted a bin scheme (i.e., 20%, 40% and 60% bin settings along the spectrum of the reinforcement/reversion plasticity). These analyses based on different categorization schemes revealed similar results, and suggested that our inference of an excess of genes with reversion plasticity is robust. We have provided these results in the Supplementary Figure S2 and S4.

      The second issue that the reviewer raised is that the plastic changes simply have more space to reverse direction rather than having any biological reason to do so. While the causal reason why more genes have their expression levels reversed than reinforced at the late stages is still contentious, an increasing number of studies show that gene expression plasticity at the early stage may be functionally maladapted to the novel environment that a species has recently colonized (i.e., lizard, Campbell-Staton et al. 2021; Escherichia coli, yeast, guppies, chickens and babblers, Ho and Zhang 2018; Ho et al. 2020; Kuo et al. 2023). Our comparisons based on the two gene sets that are associated with muscle phenotypes corroborated these previous studies and showed that initial gene expression plasticity may be nonadaptive to the novel environments (i.e., Ghalambor et al. 2015; Ho & Zhang 2018; Ho et al. 2020; Kuo et al. 2023; Campbell-Staton et al. 2021).

      Q3. The correlation between plastic change and evolved divergence is an artifact due to the definitions of adaptive versus maladaptive changes. For example, the definition of adaptive changes requires that plastic change and evolved divergence are in the same direction (Figure 3a), so the positive correlation was a result of this selection (Figure 3d).

      The reviewer raised the issue that the correlation between plastic change and evolved divergence is an artifact of the definition of adaptive versus maladaptive changes, for example, Figure 3d. We agree with the reviewer that the correlation analysis is circular because the definition of adaptive and maladaptive plasticity depends on whether the direction of plastic change matches or opposes that of the colonized tree sparrows. We have thus removed the previous Figure 3d-e and related text from the revised version of the manuscript. Meanwhile, we have changed Figure 3a to further clarify the schematic framework.
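
      The circularity can be demonstrated with a short simulation. It is a sketch on purely synthetic numbers: plastic change and evolved divergence are drawn independently from a standard normal distribution (an assumption made only for illustration), and genes are then split by whether the two changes share a sign, mimicking the adaptive/maladaptive definition.

```python
import random

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)
# Independent "plastic change" and "evolved divergence" for 5000 synthetic genes.
pairs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5000)]

# Selecting on matching or opposing direction, as the definitions do.
same_sign = [(p, d) for p, d in pairs if p * d > 0]      # "adaptive"
opposite_sign = [(p, d) for p, d in pairs if p * d < 0]  # "maladaptive"

r_all = pearson(*zip(*pairs))
r_same = pearson(*zip(*same_sign))
r_opposite = pearson(*zip(*opposite_sign))
print(f"all genes: {r_all:.2f}, same sign: {r_same:.2f}, "
      f"opposite sign: {r_opposite:.2f}")
```

      Even though the two quantities are unrelated by construction, the subset selected for matching direction shows a clearly positive correlation and the opposing subset a clearly negative one, so a correlation within either class cannot be read as biological signal.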

      Reviewer #1 (Recommendations For The Authors):

      Q1. Here are private recommendations that I think could help improve the manuscript. West-Eberhard was a pioneer back in 2003 in explicating the hypothesis of "plasticity first". I think it is important to cite their main work in the first paragraph of introduction and to use the term "plasticity-first", which is widely known among evolutionary biologists studying phenotypic plasticity, instead of "plasticity followed by genetic change", since the three papers cited in paragraph 1 call it « plasticity first ».

      West-Eberhard, M.J. (2003) Developmental Plasticity and Evolution, Oxford University Press.

      Thank you for suggesting West-Eberhard (2003) and we have cited this important work. We have also changed “plasticity followed by genetic change” to “plasticity first”.

      Q2. Introduction. Line 5, Change for « On the one hand, if plasticity changes ... »

      We have modified as suggested.

      Q3. Line 52, Change for « ...same direction as adaptive evolution does ...»

      We have modified as suggested.

      Q4. Line 66,When presenting papers that address the plasticity and evolution of gene expression in response to environmental variables, paper by Morris et al is another example that could be useful to include (but this is only a suggestion in case the authors missed it).

      Thank you for suggesting this nice work. We have cited Morris et al. (2014).

      Q5. Line 94, Change for "We acclimated"

      We have modified as suggested.

      Q6. In Figure 3, the figure in panel A and B is labelled "normaxia", but I think that "normoxia" is usually the term used.

      Thank you for spotting the typo. We have modified Figure 3a and we no longer use the term “normaxia”.

      Material and methods

      It would be important to merge supplementary table 1 and 2 and only present the individuals that were used with their respective cardiac and muscle libraries (if they come from the same individual?). Also, the origin of the individuals used in the hypoxia experiment should be explained at the beginning of the methods section and explicated in the supplementary table. Information on sex or stage of development (juvenile? Adult? Male? female?) and time of year (in breeding stage? Pre-migration (if any), etc) would allow the reader to see that individuals from lowland differed only in their exposure to hypoxia or not, or if other variables may affect gene expression patterns. Similarly, if all individuals from the highland are males and the lowland hypoxia exposed individuals are females (or juveniles versus breeders, or different time of year, etc) this should be stated in the methods. Gene expression is labile so the reader should know if other variables influence the results presented or not.

      Thank you for the suggestion. We have added detailed information (i.e., age, collecting time and season) to Supplementary Table 1. We have also added this information to the Methods. Because the birds used in the transcriptomic analysis (Supplementary Table 1) were different individuals from those used in the sequence analyses (Supplementary Table 2), these two tables cannot be merged.

      References:

      Campbell-Staton SC, Velotta JP, Winchell KM. 2021. Selection on adaptive and maladaptive gene expression plasticity during thermal adaptation to urban heat islands. Nat. Commun. 12: 6195.

      Ghalambor CK, Hoke KL, Ruell EW, Fischer EK, Reznick DN, Hughes KA. 2015. Non-adaptive plasticity potentiates rapid adaptive evolution of gene expression in nature. Nature 525:372–375.

      Hao et al. 2023. Divergent contributions of coding and noncoding sequences to initial high-altitude adaptation in passerine birds endemic to the Qinghai–Tibet Plateau. Mol. Ecol. doi: 10.1111/mec.16942.

      Ho WC, Zhang J. 2018. Evolutionary adaptations to new environments generally reverse plastic phenotypic changes. Nat. Commun. 9: 350.

      Ho WC, Zhang J. 2019. Genetic gene expression changes during environmental adaptations tend to reverse plastic changes even after correction for statistical nonindependence. Mol. Biol. Evol. 36: 604–612.

      Ho WC, Li D, Zhu Q, Zhang J. 2020. Phenotypic plasticity as a long-term memory easing readaptations to ancestral environments. Sci. Adv. 6: eaba3388.

      Kuo KC, Yao CT, Liao BY, Weng MP, Dong F, Hsu YC, Hung CM. 2023. Weak gene-gene interaction facilitates the evolution of gene expression plasticity. BMC Biol. 21: 57.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      1. Point-by-point description of the revisions

      Reviewer #1

      Evidence, reproducibility and clarity (Required):

      In this paper by Wideman et al, the authors seek to determine the role of cellular iron homeostasis in the pathogenesis of murine malaria.

      The authors attempt to disentangle the effects of anemia from that of cellular iron deficiency. The authors elegantly make use of a murine model of a rare human mutation in the transferrin receptor. This mutation leads to decreased receptor internalization and decreased cellular iron, but otherwise healthy mice. Using this model, the authors use a P. chabaudi infection model and show an increase in pathogen burden and a decrease in pathology. They show in some detail that the immune response to P. chabaudi infection is blunted, both T and B-cell responses are attenuated in the TfRY20H/Y20H model, and the block in proliferation can be rescued by exogenous iron supplementation. They also show that decreased cellular iron attenuates liver pathology through potentially multiple mechanisms.

      Minor comments:

      • The peak of parasitemia is relatively low (approx. 3%) compared to other published studies (e.g. PMID: 22100995, 16714546, 31110285) where the peak in C57BL/6 mice reached 25 - 40%. Can the authors account for this low parasitemia?

      Response: We thank the reviewer for their constructive comments and appreciate that they are highlighting this important point. It has previously been shown (PMID: 23217144, 23719378) that mosquito-transmission of P. chabaudi leads to significantly lower parasitaemia (“Recently mosquito-transmitted parasites were used to mimic a natural infection more closely, as vector transmission is known to regulate Plasmodium virulence and alter the host’s immune response (48-50). Consequently, parasitaemia is expected to be significantly lower upon infection with recently mosquito-transmitted parasites, compared to infection with serially blood-passaged parasites that are more virulent (48,49).”)

      • Figure 1K - At homeostasis, serum iron is low in TfR mice however increases to significantly higher than the WT mice at 8 days post infection. Do the authors have an explanation on why these dramatic changes in serum iron are seen?

      Response: During malaria infection, RBC lysis releases haem and iron into circulation, which leads to an increase in serum iron levels. This effect is observed in both wild-type and TfrcY20H/Y20H mice infected with P. chabaudi (Supplementary Figure 1F & Figure 1K). However, the significantly higher serum iron levels observed in infected TfrcY20H/Y20H mice can likely be explained by their decreased capacity for transferrin receptor-1 mediated iron uptake, leading to relatively slower uptake and storage of circulating transferrin-bound iron into tissues. This has been clarified in the manuscript (line 142-143):

      “The elevated serum iron observed in infected TfrcY20H/Y20H mice was consistent with their restricted capacity to take up transferrin-bound circulating iron into tissues.”

      • Figure S3 - Is it surprising that no effects on splenic neutrophils are seen? Were neutrophils quantified at any other point? These would also be expected to have a role in both the control of malaria infection and on any pathology.

      Response: We thank the reviewer for raising this interesting question. It is known that neutrophils can be sensitive to cellular iron deficiency (PMID: 36197985) and that neutrophils can play an important part in malaria infection (PMID: 31628160). However, the magnitude and significance of the neutrophil response to recently mosquito-transmitted P. chabaudi parasites has not been thoroughly investigated. A recent study demonstrated that monocytes and macrophages may be more important than granulocytes in the early response to recently mosquito-transmitted P. chabaudi infection (PMID: 34532703).

      Moreover, we performed neutrophil quantifications in our initial experiments and found that the splenic neutrophil response was not altered in TfrcY20H/Y20H mice eight days after infection. Additionally, no neutrophil infiltration was observed in the liver of either genotype upon P. chabaudi infection. In light of these findings, we did not characterise the neutrophil response further, as it appeared unlikely that neutrophils were the principal causal agent of either the altered immunity or pathology in this context. However, we agree with the reviewer that the larger question of whether neutrophil iron plays a role in the pathology of malaria is an interesting open question which we hope future studies can elucidate.

      A section was added to the discussion to address the role of innate immune cells in our model (line 354-363):

      “The inhibited innate immune response to P. chabaudi in TfrcY20H/Y20H mice likely contributed to both the increased pathogen burden and the decreased liver pathology. Splenic MNPs are important for controlling parasitaemia (34,35,72), but MNPs are also vital for maintaining tissue homeostasis and preventing tissue damage in malaria (43,73). Although other innate cells, such as neutrophils, NK cells and γδT cells are an important part of the immune response to malaria, only the MNP response was distinctly impaired in TfrcY20H/Y20H mice. Notably, neutrophils are known to be sensitive to iron deficiency (16,74) and to affect both immunity and pathology in malaria (75,76). However, in the context of recently mosquito-transmitted P. chabaudi it appears that monocytes and macrophages, rather than granulocytes, may be particularly important for parasite control and tissue homeostasis (43,72).”

      Changes to the text:

      • Fig S1E and F - Please add to the figure legend that these were measured at homeostasis.

      Response: This clarification has been added to the legend of Supplementary Figure 1 (line 954-957).

      • Figure 3 - In the legend, H and I are the wrong way around.

      Response: The legend of Figure 3 has been corrected accordingly (line 888-890).

      • Figure 4 - please add the units of concentration of FeSO4 to all panels

      Response: The units of concentration for FeSO4 and AFeC have been added to all panels of Figure 4 and 6, respectively.

      • Line 246 - The authors state: "there was some evidence of decreased malaria-induced hepatomegaly" however there is no significant difference between WT and TfR mice and both show significant hepatomegaly. I feel that this line should be reworded.

      Response: The sentence (line 252-254) has been reworded as follows:

      “Furthermore, while both genotypes developed malaria-induced hepatomegaly, there was a trend toward less severe hepatomegaly in TfrcY20H/Y20H mice (Figure S5C).”

      Significance (Required):

      This work is one of the first to attempt to define the requirements for cellular iron in malaria infection. This is a difficult topic, as infection and associated inflammation and the red blood cell destruction caused by malaria all have complex effects on iron within the body. This study fits well with previous observations showing that anemia can be protective as it both prevents parasite growth and limits immunopathology. This work advances the field by demonstrating a cell intrinsic role for iron in malaria infection. There is a broad possible audience for this work, including malaria researchers, immunologists and people interested in the role of iron, both at a cellular level and systemically.

      Reviewer #2

      Evidence, reproducibility and clarity (Required):

      In this manuscript, the authors have studied the role of iron deficiency in the host response to Plasmodium infection using a transgenic mouse model that carries a mutation in the transferrin receptor. They show that restricted cellular iron acquisition attenuated P. chabaudi infection-induced splenic and hepatic immune responses which in turn mitigated the immunopathology, even though the peak parasitemia was significantly high in the mutant mice. Interestingly, the course of parasite infection doesn't seem to be affected in the mutant mice compared to the wildtype mice despite the induction of poor immune responses. The authors show that the decreased cellular iron uptake broadly impacts both innate and adaptive components of the immune system. Conversely, free iron supplementation restored the immune cell functions.

      • The study is well performed, and the manuscript is well written. However, the authors should show how conserved the role of cellular iron is across other rodent malaria parasite species, at least with P. yoelii or P. berghei blood stage infection models. This question becomes critical to address in order to understand broad relevance to human malaria infections where both the host and parasites are genetically diverse.

      Response: We thank the reviewer for appreciating our study and for the thoughtful comments. We agree with the reviewer that the diverse genetic background of both parasites and hosts makes it difficult to draw broad conclusions about human malaria infection from animal studies performed in a laboratory setting. The recently mosquito-transmitted P. chabaudi chabaudi AS blood-stage infection model replicates many key features of mild to moderate malaria infection in humans, such as low parasitaemia, anaemia, cyto-adhesive sequestration in microvasculature, and self-resolving immunopathology. Importantly, the immune response elicited by recently mosquito-transmitted parasites also more closely mimics the immune response to a natural infection (PMID: 23719378). Therefore, we consider the recently mosquito-transmitted P. chabaudi chabaudi AS model as the most relevant to answer our particular research questions.

      Furthermore, specific pathogen-free parasitised erythrocyte stabilates made from recently mosquito-transmitted P. berghei or P. yoelii parasites are unfortunately not readily accessible (e.g. through the European Malaria Reagent Repository), in contrast to P. chabaudi. Consequently, preparing and characterising recently mosquito-transmitted strains to perform the experiments suggested by the reviewer would require a substantial amount of additional time and labour, which we deem out of scope for this study.

      In the design of our model we have also taken care to minimise the effects of anaemia, something which would be difficult or impossible to achieve using serially blood-passaged P. yoelii or P. berghei parasites. Both P. yoelii and P. berghei merozoites preferentially invade immature RBCs (PMID: 34322397) making readouts such as parasitaemia far more sensitive to small variations in erythropoietic output. In addition, the extensive RBC destruction caused by most serially blood-passaged murine Plasmodium strains would likely exaggerate any erythropoietic impairment caused by the TfrcY20H/Y20H mutation.

      Although we strongly believe that the chosen mouse model of malaria is the most appropriate for our study, ultimately, no mouse model can replicate all features of human malaria infection. Inevitably, the direct relevance of animal studies for human infection will always be somewhat opaque. Hence, we respectfully disagree with the reviewer that repeating the experiments with additional murine malaria parasite species would allow us to extrapolate conclusions about human malaria infection. Such experiments would also conflict with the 3Rs principles that govern work with animals in the UK (https://nc3rs.org.uk/), especially because most strains of P. yoelii and P. berghei cause severe or non-resolving infections and have a significant negative impact on animal welfare.

      In our opinion, the logical continuation of this study must be to utilise the insights from our research to inform future human studies on the relationships between iron deficiency and malaria-related immunopathology. However, we agree that this is an important topic and have added a section addressing the broad relevance of our findings to the discussion (line 393-396):

      “It remains to be seen what the broader importance of cellular iron is in human malaria infection, in particular within the diverse genetic context of both humans and parasites found in malaria endemic regions. Murine models of malaria are useful in providing hypothesis-generating results, but such findings ultimately ought to be confirmed and developed further through studies in human populations.”

      • Since restricted cellular iron uptake mitigates the immunopathology, the authors should explore whether this could also relieve the cerebral malaria condition that is caused by hyperinflammation in the brain. They should use the P. berghei ANKA parasite strain, which causes cerebral malaria in mice. I think this would increase the impact of the paper.

      Response: Although we agree that this would be an interesting line of inquiry, we think that it is outside of the scope of this study, which predominantly aims to characterise and study the effects of cellular iron deficiency in host cells, particularly immune cells, during mild to moderate malaria infection. The severe pathology underlying cerebral malaria differs greatly from that of a self-resolving blood-stage infection. Furthermore, the relevance to human cerebral malaria of the P. berghei ANKA model is controversial within the field (PMID: 21288352) and as a severe infection its use would again conflict with the 3Rs principles.

      Minor comments:

      • Line 222: repeating word, "iron iron-supplemented...."

      Response: The sentence has been corrected (line 228).

      • Figure 3C, S4C & S5F: Why was the Mann-Whitney test performed for these particular graphs, whereas the rest of the two-group comparisons were done using Welch's test? The authors should clearly mention this in the methods section.

      Response: We apologise if this was unclear in the manuscript. We routinely tested all our datasets for normality to identify the appropriate tests for each dataset. In the case of the graphs shown in Figure 3C, S4C and S5F, the dataset did not pass the D’Agostino-Pearson normality test and we therefore applied a non-parametric test (i.e. Mann-Whitney), in contrast to the other datasets that passed the test for normal or lognormal distribution. This has been further clarified in the methods section (line 581-586):

      “The D’Agostino-Pearson omnibus normality test was used to determine normality/lognormality. Parametric statistical tests (e.g. Welch’s t-test) were used for normally distributed data. For lognormal distributions, the data was log-transformed prior to statistical analysis. Where data did not have a normal or lognormal distribution, or too few data points were available for normality testing, a nonparametric test (e.g. Mann-Whitney test) was applied.”
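The test-selection rule quoted above can be sketched as a small decision function. This is only an illustrative sketch, not the authors' analysis code: the function name `choose_two_group_test`, the significance threshold `alpha = 0.05`, and the minimum sample size `min_n = 8` are all assumptions that do not appear in the rebuttal. The normality test itself is passed in as a callable returning a p-value, so that any implementation of the D’Agostino-Pearson test can be plugged in.

```python
import math
from typing import Callable, Sequence


def choose_two_group_test(
    a: Sequence[float],
    b: Sequence[float],
    normality_p: Callable[[Sequence[float]], float],
    alpha: float = 0.05,  # assumed threshold, not stated in the rebuttal
    min_n: int = 8,       # assumed minimum group size for normality testing
) -> str:
    """Pick a two-group test following the rule quoted from the methods.

    `normality_p` is any function returning the p-value of a normality
    test for one group (hypothetical interface chosen for this sketch).
    """
    groups = (a, b)

    # Too few data points for normality testing: fall back to nonparametric.
    if any(len(g) < min_n for g in groups):
        return "mann-whitney"

    # Both groups look normal: use a parametric test (Welch's t-test).
    if all(normality_p(g) > alpha for g in groups):
        return "welch"

    # Otherwise check lognormality: log-transform, then test again.
    if all(min(g) > 0 for g in groups):
        logged = [[math.log(x) for x in g] for g in groups]
        if all(normality_p(g) > alpha for g in logged):
            return "welch-on-log"

    # Neither normal nor lognormal: nonparametric test.
    return "mann-whitney"
```

If SciPy is available, one plausible choice for the callable is `lambda g: scipy.stats.normaltest(g).pvalue`, with the selected test then run via `scipy.stats.ttest_ind(a, b, equal_var=False)` or `scipy.stats.mannwhitneyu(a, b)`; whether the authors used these tools is not stated in the rebuttal.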

      • Have the authors explored whether gamma-delta T cell responses are affected in the mutant mouse strain compared to wildtype mice, as they are one of the early responders and the key cytokine producing cells against the Plasmodium blood stage infection?

      Response: We thank the reviewer for this valuable comment. We briefly explored the role of γδT cells, but did not observe a significant difference in splenic γδT cell numbers between wild-type and TfrcY20H/Y20H mice, eight days post-infection (Reviewer Figure 1). It is of course possible that γδT cell numbers were affected at an earlier stage, or that γδT cell function (e.g. cytokine production) was affected by cellular iron deficiency during P. chabaudi infection. However, γδT cells may also be less sensitive to cellular iron deficiency than conventional T cells, as has been previously demonstrated for developing T cells (PMID: 7957580).

      A section was added to the discussion to address the role of innate immune cells in our model (line 354-363):

      “The inhibited innate immune response to P. chabaudi in TfrcY20H/Y20H mice likely contributed to both the increased pathogen burden and the decreased liver pathology. Splenic MNPs are important for controlling parasitaemia (34,35,72), but MNPs are also vital for maintaining tissue homeostasis and preventing tissue damage in malaria (43,73). Although other innate cells, such as neutrophils, NK cells and γδT cells are an important part of the immune response to malaria, only the MNP response was distinctly impaired in TfrcY20H/Y20H mice. Notably, neutrophils are known to be sensitive to iron deficiency (16,74) and to affect both immunity and pathology in malaria (75,76). However, in the context of recently mosquito-transmitted P. chabaudi it appears that monocytes and macrophages, rather than granulocytes, may be particularly important for parasite control and tissue homeostasis (43,72).”

      Significance (Required):

      Overall, the study provides novel insights into the role of iron in the immune response to Plasmodium blood stage infection using a rodent malaria model and the interplay of infection, immunity and the development of pathology. As such it is an important study.

      Reviewer #3

      Evidence, reproducibility and clarity (Required):

      Herein, Wideman et al. provide novel and important evidence on the role of iron availability for mounting an efficient immune response in a malaria infection model. They employed TfRC Y20H/Y20H mice which develop iron deficiency due to impaired cellular ingestion of transferrin bound iron. They found that those mice develop higher peak parasitemia after vector borne exposure to Pl. chabaudi chabaudi which was paralleled by an impaired immune response as reflected by altered CD4 cell activation, reduced IFN-g formation or reduced B-cell responsiveness. Those deficiencies could be recovered upon ex vivo iron supplementation pointing to the importance of iron availability for mounting CD4+ and B-cell specific anti-plasmodial immune responses at the initial phase of infection. However, TFRC mutated mice were able to clear infection over time in a comparable fashion to wt mice.

      This excellent study is important in convincingly showing (by employing high quality immunological analyses) the importance of cellular iron deficiency on immune responses in an infection model of general interest. It also indicates that overwhelming immune response as seen in wt mice is associated with organ damage over time.

      Minor comments:

      • The authors should discuss why and how TFRC mutated mice were able to control infection over time in a comparable fashion as wt mice although peak parasitemia was significantly higher?

      Response: We thank the reviewer for the helpful feedback on our study and for posing this interesting question. It does indeed appear as if the immune response, while significantly inhibited in the TfrcY20H/Y20H mice, is still sufficient to clear the infection. It is plausible that the early cell-mediated immune response is inhibited to the degree that parasite control is impaired, resulting in higher peak parasitaemia in TfrcY20H/Y20H mice. In contrast, parasite clearance is comparable and contemporaneous in both genotypes. Based on the fact that parasite clearance occurs at a time when a substantial adaptive immune response is expected to emerge, we hypothesize that this significantly contributes to pathogen clearance. Thus, it seems likely that the humoral response in TfrcY20H/Y20H mice, even if inhibited, may still be effective enough to clear the parasites and prevent recrudescence.

      As malaria infection progresses, RBC loss and increasing anaemia also contribute to limiting exponential parasite growth. This occurs more or less equally in both genotypes, but it could be particularly important for parasite control in the TfrcY20H/Y20H mice that have an inhibited immune response.

      We have added a section to the discussion to address this (line 380-386):

      “Despite the higher peak parasitaemia in TfrcY20H/Y20H mice, both genotypes were able to clear P. chabaudi parasites at a comparable rate and prevent recrudescence. It follows that even a weakened humoral immune response appears to be sufficient to control P. chabaudi infection. However, our study did not investigate the effects of immune cell iron deficiency on the formation of long-term immunity, which may have been more severely affected. The impaired GC response, in particular, suggests that iron deficiency could counteract the formation of efficient immune memory to subsequent malaria infections.”

      • The authors and others have previously shown (Frost J et al. Sci Adv 2022, Hoffmann et al. EBioMedicine 2021) that iron deficiency results in reduced neutrophil numbers in different infection models. This could also have contributed to the observed effect in initial infection control but may have also been linked to the altered histopathology seen in Figure 7. However, no mention of neutrophil numbers in this model is made. It would be important if the authors could provide information on neutrophil numbers (only if this analysis has been already performed) and discuss this issue in association with their observation.

      Response: We appreciate that the reviewer has brought attention to this important topic. As they mention, iron deficiency can have a negative impact on the neutrophil response (PMID: 36197985, 34488018) but it can also cause a maladaptive excessive neutrophil response due to failed adaptive immunity (PMID: 33665641). In this study, we show that there is no difference in splenic neutrophil numbers between wild-type and TfrcY20H/Y20H mice, eight days after P. chabaudi infection (Figure S3B). Moreover, the histopathologists detected no liver neutrophil infiltration in either genotype, but rather observed infiltration of mononuclear leukocytes upon P. chabaudi infection. Hence, it appears unlikely that neutrophils were a major contributor to differences in either immunity or pathology in this specific context. However, we cannot definitively rule out that neutrophil numbers were affected earlier in the infection or that neutrophil function was impaired due to cellular iron deficiency.

      A section was added to the discussion to address the role of innate immune cells in our model (line 354-363):

      “The inhibited innate immune response to P. chabaudi in TfrcY20H/Y20H mice likely contributed to both the increased pathogen burden and the decreased liver pathology. Splenic MNPs are important for controlling parasitaemia (34,35,72), but MNPs are also vital for maintaining tissue homeostasis and preventing tissue damage in malaria (43,73). Although other innate cells, such as neutrophils, NK cells and γδT cells are an important part of the immune response to malaria, only the MNP response was distinctly impaired in TfrcY20H/Y20H mice. Notably, neutrophils are known to be sensitive to iron deficiency (16,74) and to affect both immunity and pathology in malaria (75,76). However, in the context of recently mosquito-transmitted P. chabaudi it appears that monocytes and macrophages, rather than granulocytes, may be particularly important for parasite control and tissue homeostasis (43,72).”

      • In addition, alternative mechanism leading to immune tolerance and reduced tissue damage such as induction of heme oxygenase-1, which is also affected by systemic iron availability, should be discussed.

      Response: An addition was made to the results section and to Figure S5 to address this reviewer comment (line 269-274):

      “In addition, we measured the expression of two genes that are known to have a hepatoprotective effect in the context of iron loading in malaria: Hmox1 (encodes haemoxygenase-1) and Fth1 (encodes ferritin heavy chain). Liver gene expression of Hmox1 was higher in TfrcY20H/Y20H mice, while the expression of Fth1 did not differ between genotypes, eight days after infection (Figure S5H-I). Thus, the higher expression of Hmox1 may have contributed to the hepatoprotective effect in TfrcY20H/Y20H mice.”

      A relevant sentence was also added to the discussion (line 313-318):

      “For example, HO-1 plays an important role in detoxifying free haem that occurs as a result of haemolysis during malaria infection, thus preventing liver damage due to tissue iron overload, ROS and inflammation (62). Interestingly, infected TfrcY20H/Y20H mice had higher expression of Hmox1, but levels of liver iron and ROS comparable to that of wild-type mice. Consequently, this may be indicative of increased haem processing that could have a tissue protective effect.”

      Significance (Required):

      Important and interesting study highlighting the central role of iron homeostasis for immune response to infection. General interest because iron deficiency has high prevalence in areas with high endemic burden of infection.

      Reviewer's expertise: infectious disease, immunity, iron homeostasis; both basic science and clinical expertise (more than 300 peer reviewed publications on these topics)

    1. Author Response

      Reviewer #1 (Public Review):

      The cerebral cortex, or surface of the brain, is where humans do most of their conscious thinking. In humans, the grooves (sulci) and bumps (convolutions) have a particular pattern in a region of the frontal lobe called Broca's area, which is important for language. Specialists study features imprinted on the internal surfaces of braincases in early hominins by casting their interiors, which produces so-called endocasts. A major question about hominin brain evolution concerns when, where, and in which fossils a humanlike Broca's area first emerged, the answer to which may have implications for the emergence of language. The researchers used advanced imaging technology to study the endocast of a hominin (KNM-ER 3732) that lived about 1.9 million years ago (Ma) in Kenya to test a recently published hypothesis that Broca's area remained primitive (apelike) prior to around 1.5 Ma. The results are consistent with the hypothesis and raise new questions about whether endocasts can be used to identify the genus and/or species of fossils.

      We would like to thank Rev. 1 for their comments on our paper.

      Reviewer #2 (Public Review):

      The authors tried to support the hypothesis that early Homo still had a primitive condition of Broca's cap (the region in fossil endocasts corresponding to Broca's area in the brain), being more similar to the condition in chimpanzees than in humans. The evidence from the described individual points to this direction but there are some flaws in the argumentation.

      We are grateful to Rev. 2 for their comments, although we only partially agree with some of them.

      First, we would like to rectify the statement of Rev. 2 that we “tried to support the hypothesis that early Homo still had a primitive condition of Broca's cap”. Our aim was to test this hypothesis, not to validate it.

      First, only one human and one chimpanzee were used for comparison, although we know that patterns of brain convolutions (and in addition how they leave imprints in the endocranial bones) are very variable.

      We understand the point raised by Rev. 2 about the variation of brain convolutions in humans and chimpanzees. We used atlases published by Connolly (1950), Falk et al. (2018) and de Jager et al. (2019, 2022) to analyse the endocast of KNM-ER 3732 and compare it to the extant human and chimpanzee cerebral conditions. However, in Figure 2, for the sake of clarity only one Homo and one Pan specimen were used to illustrate the comparison (as has been done in other published papers, e.g., Carlson et al., 2011 Science; Gunz et al., 2020 Sci Adv). In the revised version, we modified the manuscript to explain our approach further (line 156): “We used brain and endocast atlases published in Connolly (1950), Falk et al. (2018) and de Jager et al. (2019, 2022; see also www.endomap.org) for comparing the pattern identified in KNM-ER 3732 to those described in extant humans and chimpanzees. To the best of our knowledge, these atlases are the most extensive atlases of extant human and chimpanzee brains/endocasts available to date and are widely used in the literature to explore variability in sulcal patterns. In Figure 2, the extant human and chimpanzee conditions are illustrated by one extant human (adult female) and one extant chimpanzee (adult female) specimens from the Pretoria Bone Collection at the University of Pretoria (South Africa) and in the Royal Museum for Central Africa in Tervuren (Belgium), respectively (Beaudet et al., 2018).”

      Second, the evidence from this fossil specimen adds to the evidence of previously described individuals but still does not fully prove the hypothesis.

      We tempered our discussion by concluding that (line 116) “Overall, the present study not only demonstrates that Ponce de León et al.’s (2021) hypothesis of a primitive brain of early Homo cannot be rejected, but also adds information […]”.

      Third, there is a vicious circle in using primitive and derived features to define a fossil species and then using (the same or different) features to argue that one feature is primitive or derived in a given species. In this case, we expect members of early Homo to be derived compared to their predecessors of the genus Australopithecus and that's why it seems intriguing and/or surprising to argue that early Homo has primitive features. However, we should expect that there is some kind of continuum or mosaic in a time in which a genus "evolves into" another genus. This discussion requires far more discussions about the concepts we use, maybe less discussion about what is different between the two groups but more discussion about the evolutionary processes behind them.

      We fully agree with Rev. 2 on this aspect. We believe that identifying these differences/similarities between fossil and extant hominids constitutes the first step toward a better understanding of the evolutionary mechanisms. Our work suggests indeed a certain continuity between genera and raises questions on the genus concept and how to interpret the specimens currently attributed to early Homo. In the revised version of the manuscript we included a reference to this possible scenario (line 134): “[…] or to the absence of a definite threshold between the two genera based on the morphoarchitecture of their endocasts (Wood and Collard, 1999).”

      Fourth, the data of convolutional imprints presented are rather subjective when identifying which impressions represent which brain convolutions. Not seeing an impression does not necessarily mean that the corresponding brain feature did not exist. Interestingly, the manuscript does not mention and discuss at all the frontoorbital sulcus. This is a sulcus that usually runs from the orbital surface of the frontal lobe up to divide the inferior frontal gyrus in chimpanzees, a condition totally different than in humans who do not have a frontoorbital sulcus. Could such a sulcus be identified, this would provide a far more convincing argument for a primitive condition in this specimen. In Australopithecus sediba, e.g., the condition in this region seems to be a mosaic in which some aspects of the morphology seem to be more modern while one of the sulcal impressions can well be interpreted as a short frontoorbital sulcus. For this specimen, by the way, I would come back to my third point above: some experts in the field might argue that this specimen could belong to Homo rather than Australopithecus...

      We agree that the presence of a fronto-orbital sulcus would be more conclusive. However, this sulcus has not been identified in KNM-ER3732 and the region in which we would expect to find it is not preserved. As demonstrated by Ponce de León et al. (2021), because of the topographic relationships between sulci (and cranial structures), it is possible to interpret imprints on endocasts and the evolutionary polarity of some traits even in the absence of landmarks such as the fronto-orbital sulcus. In Australopithecus sediba the main derived feature of the endocast corresponds to the ventrolateral bulge in the left inferior frontal gyrus, and not to the sulcal pattern itself (Carlson et al., 2011 Science). However, the discussion around the taxonomic status of this taxon confirms the urgent need for reconsidering specimens from that time period and clarifying the mosaic-like or concerted evolution of the derived Homo-like traits within our lineage. Regarding the subjective nature of this approach, we invite readers to examine the specimen on MorphoSource (https://www.morphosource.org/concern/media/000497752?locale=en) and to request access from the National Museums of Kenya to the physical or virtual specimen in order to falsify our hypothesis.

      According to my arguments above, I think that this manuscript might revive interesting discussions about this topic but it is not likely to settle them because the data presented are not strong enough to fully support the hypothesis.

      We would be more than happy to consider new/other specimens with similar chronological and geographical contexts and investigate further this hypothesis in the future.

      Reviewer #3 (Public Review):

      The authors provide a detailed analysis of the sulcal and sutural imprints preserved on the natural endocast and associated cranial vault fragments of the KNM-ER3732 early Homo specimen. The analyses indicate a primitive ape-like organization of this specimen's frontal cortex. Given the geological age of around 1.9 million years, this is the earliest well-documented evidence of a primitive brain organization in African Homo.

      In the discussion, the authors re-assess one of the central questions regarding the evolution of early Homo: was there species diversity, and if yes, how can we ascertain it? The specimen KNM-ER1470 has assumed a central role in this debate because it purportedly shows a more advanced organization of the frontal cortex compared to other largely coeval specimens (Falk, 1983). However, as outlined in Ponce de León et al. 2021 (Supplementary Materials), the imprints on the ER1470 endocranium are unlikely to represent sulcal structures and are more likely to reflect taphonomic fracturing and distortion. Dean Falk, the author of the 1983 study, basically shares this view (personal communication). Overall, I agree with the authors that the hypothesis to be tested is the following: did early Homo populations with primitive versus derived frontal lobe organizations coexist in Africa, and did they represent distinct species?

      I greatly appreciate that the authors make available the 3D surface data of this interesting endocast.

      We are grateful to Rev. 3 for their comments and for contextualizing our finding. We would also like to point out that, although the 3D surface can be viewed on MorphoSource, permission from the National Museums of Kenya has to be requested for studying the specimen and getting access to the physical specimen and/or the 3D model.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their comments and insights; we feel the manuscript is now greatly improved. Please find below our answers to the reviewers' queries.

      Reviewer #1 (Evidence, reproducibility and clarity):

      The manuscript by Niccoli et al. describes the identification of a novel modifier of C9orf72-derived toxicity based on the manipulation of the brain metabolic pathways. The premise for this work is supported by strong literature describing the aberrant glucose metabolism in FTD, AD and other degenerative disorders. The idea tested here is whether increasing the import of pyruvate produced in glia into neurons is protective. They test three different types of importers and find that one of them, Bumpel, the orthologue of human SLC5A12, suppresses toxicity and reduces the accumulation of arginine-containing repeats, GP and PR. The authors investigate several potential mechanisms mediating this reduction of toxic DPRs, but do not find strong evidence linking pyruvate import to increased autophagy or mitochondrial metabolism.

      Overall, this is an interesting discovery based on a candidate approach that shows the power of Drosophila to efficiently identify novel mediators of neurodegeneration. The article is well written, although more detailed explanations of some experiments would be helpful. The weaknesses of the manuscript are the lack of a clear mechanism mediating the protective activity of pyruvate, the incomplete experiments lacking relevant controls, and the presentation of western blots.

      Specific comments:

      1. The reduced levels of DPRs require that the expression of C9 mRNA or the GR and PR constructs is examined by qPCR. In figure 3E, GP is not even detectable.

      We agree with the reviewer; ideally we would have measured the RNA by qPCR. However, the C9 repeats and the DPR constructs are highly repetitive, so it is impossible to do a qPCR for them. The upstream and downstream sequences are identical for the C9 and the bumpel constructs, and there is not, to our knowledge, any unique sequence we can use to measure levels of expression in the presence of bumpel.

      We did run a GFP control (Fig 2D) and did not see any difference and we have now carried out a qPCR for Gal4-GeneSwitch (Fig S3) to show that the levels of the driver do not change.

      1. I wonder if there are constructs available to silence Bumpel or overexpress the human orthologues of bumpel. These would be nice controls for the effects observed with the Bumpel overexpression

      This would be an extremely interesting experiment. However, bumpel is normally only expressed in glia, so we cannot down-regulate it in glia whilst upregulating 36R in neurons, as we are limited to one driver (since everything is driven by the Gal4/UAS system). Expression of C9 in glia does not have a clear phenotype (our observation), so we cannot drive both in glia. We tried over-expressing the human homologue SLC5A12, but it did not rescue the C9 phenotype (data not shown), possibly because it requires (like other human SLC5A-type transporters) PDZK1 as an extra co-factor (Srivastava S. et al, 2019), and this is not present in flies.

      1. The argument about bumpel modulating autophagy downstream of Atg1 is not supported by the experimental data

      We now have imaging data showing that bumpel modulates the formation of lysosomes, downstream of Atg1 (Fig 5). We also show that bumpel and Atg1 can act synergistically, leading to a much stronger rescue of C9 expression (see Fig 5I), which also suggests that the two are acting at different points in the same pathway. We also show that bumpel rescues the downregulation of TFEB targets (Fig 5J).

      1. Western blots throughout show no control lanes and on several occasions are created with cutout bands. The standard for this type of experiment should be more stringent, with entire gels showing all experimental conditions, which requires consistent methods and results rather than selecting the best bands from different gels.

      We apologise if this was misunderstood. The lanes shown are all from the same blot, where other samples were run too, and it would be confusing for the reader to include them. We have re-run samples where we had remaining sample from our quantifications, so that the lanes are now contiguous, and we provide original blot images in the supplemental information for those we could not re-run. The control for all experiments is the C9-expressing line without bumpel, and this is always present. If the reviewer means we are missing -RU controls, these do not produce any DPRs, so they are not included in western blot or ELISA quantifications as the signal is not above background.

      1. For figures 2B and 5C, please, show representative WBs

      These are ELISA quantifications, not western blots; we choose to run these when possible, as they are more quantitative.

      1. Figure 5D describes the survival curve as significantly rescued. Statistical tests can indicate differences, but that is in no way convincing. The test may show the curves are different, but the abeta Atg1 flies also seem to start falling early, so an argument could be made in both directions, as a suppressor or an enhancer.

      We agree the rescue is not strong enough, and we have now removed this lifespan.

      1. It is unclear why several results are placed in the supplemental materials. In general, all this material seems highly relevant and related to what is shown in the main figures

      We are happy to include them in the main manuscript if this would help the reader, and we have now placed all mitochondrial data in Fig 4.

      Minor comments:

      Please, define several abbreviations throughout

      We apologise for this oversight; we have now done this.

      A couple of sections could be improved by carefully sequencing human vs Drosophila background to advance the argument rather than going in circles. There is also a section on mitophagy in between two sections related to autophagy that could be sequenced better.

      We have restructured the sections, which we think has improved the flow.

      There is a sentence at the end of page 6 that seems misplaced

      We apologise for the oversight, and we have removed this sentence.

      Reviewer #1 (Significance):

      Overall, this is an interesting discovery based on a candidate approach that shows the power of Drosophila to efficiently identify novel mediators of neurodegeneration. The article is well written, although more detailed explanations of some experiments would be helpful. The weaknesses of the manuscript are the lack of a clear mechanism mediating the protective activity of pyruvate, the incomplete experiments lacking relevant controls, and the presentation of western blots.

      We thank the reviewer for the helpful comments. We have added some details in the methods section, and we apologise for not having made it clear that the westerns were all derived from the same blot (we have now placed the originals in the supplemental materials). Regarding mechanism, we now show that bumpel over-expression increases clearance of late-stage autolysosomes, possibly by increasing transcription of TFEB target lysosomal genes.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary:

      The project investigates the role in dementias of glial glucose uptake, conversion to lactate and shuttling via transporters to neurons to produce pyruvate to fuel TCA cycle production of ATP. The experiments are conducted in Drosophila melanogaster, which has become a powerful model system for understanding the neurodegeneration mechanisms associated with ALS/FTD-associated C9orf72 pathology. Bumpel misexpression is shown to rescue the early death phenotype in flies expressing a C9orf72 expansion and in flies expressing arginine-containing dipeptide repeat proteins. The report describes novel insight into the function of bumpel, demonstrating that this conserved orthologue of human SLC14A functions as a sodium exchange transporter for the monocarboxylates pyruvate and lactate. The authors conclude from these findings that increased neuronal pyruvate, but not its metabolites, rescues C9orf72-associated pathology.

      The authors next set out to describe the mechanism by which increased pyruvate rescues survival in C9orf72-expressing flies. Levels of autolysosomes were increased in C9orf72-expressing flies, and stimulation of autophagy by overexpression of Atg1 was shown to decrease levels of DPRs (though not to the same extent as bumpel expression). Expression of bumpel in C9orf72 flies led to a modest increase in LC3-II, indicating increased autophagy. Co-overexpression of bumpel and Atg1 did not have an additive effect, suggesting that bumpel activates autophagy downstream or independent of Atg1 activity. Finally, the authors extend their findings to amyloid models, suggesting a common protective mechanism for elevating neuronal pyruvate levels in neurodegenerative disease.

      Major comments

      Prior data suggest that bumpel is expressed in glia (for example Yildirim et al 2022). In their study the authors do not present any data to demonstrate that the transporter is normally expressed in neurons in flies. This calls into question the physiological relevance of their finding that neuronal upregulation of bumpel is protective against C9orf72-associated pathology in neurons, from which it is reasonable for a reader to conclude that bumpel may be a neuronal target for therapeutic intervention. However, the report demonstrates well that, regardless of whether the transporter is native to neurons, the increase in monocarboxylates it facilitates is protective against C9orf72 pathology, and thus the overall conclusion of the project is supported by experimental evidence. The point of upregulation of a natively expressed gene versus misexpression of a glial-enriched transporter should be considered in a bit more detail in the discussion text. The authors may consider speculating on the identity of members of the sodium-coupled monocarboxylate transporters that are enriched in neurons. Are any of the bumpel human orthologues expressed in neurons?

      We thank the reviewer for this comment and suggestion. The reviewer correctly points out that we do not show whether there is a defect in pyruvate import in C9-expressing flies. We could not identify a validated sodium-coupled pyruvate transporter in flies with strong neuronal expression, and we have added a comment in the discussion about this. There are a number of human homologues; some, such as SLC5A8, are expressed in neurons, thus providing a possible therapeutic target. We have added a sentence to this effect in the discussion.

      [OPTIONAL] cDNA overexpression of neuron-specific sodium-coupled monocarboxylate transporters in C9orf72 fly models would strengthen the conclusion regarding their physiological relevance for ALS/FTD. Fly lines for these are not available in repositories, but could be generated and tested at reasonable cost (<£700, ~3 month duration).

      This would be an ideal experiment, however, we could not find a neuronal sodium coupled transporter which is known to import monocarboxylates. There are a number of sodium coupled neuronal transporters, but they are mostly homologous to SLC5A6, which is a glucose coupled transporter. Going forward, we will screen a number of transporters to identify if there are any which import pyruvate.

      The role of bumpel expression in survival (Figure 1) could be a technical artifact due to dilution of Gal4 between the C9orf72 and bumpel-ORF transgenes. No expression control is shown (for example GFP, LacZ etc). This theory is unlikely, as no improvement in survival was seen for the SLC14A class of transporters, which have a matching site-directed transgene insertion. For clarity, this point relating to controls should be commented on in the text.

      The reviewer is correct: there could be a dilution of the Gal4. We do not like using GFP as a control, as we have often seen a worsening when expressing other highly stable proteins at high levels. We have generated an “empty” flyORF line (generated by injecting the empty plasmid into the identical attP site) and used it as a control to check for dilution effects. Bumpel still rescued relative to this control; we now include this in the supplementary material (Fig S1B).

      Reduced Mito-GFP levels are used to support a role for bumpel in increasing mitophagy. As mito-GFP is a marker for mitochondria but not specifically mitophagy, an alternative explanation for decreased levels could be reduced mitochondrial biogenesis. The text should be amended to clarify this point.

      The role of Pink1 RNAi in modifying mitophagy is a bit overstated. Whilst Pink1 is involved in stress-associated mitophagy, its role in basal mitochondrial turnover is less well defined. The text should be adapted.

      We have added qualifying statements regarding the possibility of reduced mitochondrial biogenesis, and the fact that Pink1’s role in basal mitophagy is not very clear. The use of the mitophagy inducer drug, Kaempferol, however, suggests that mitophagy is unlikely to be a cause of the DPR reduction.

      Minor comments

      The introduction describes the current state of C9orf72 fly models well. It would benefit from a few comparable lines for AD models. The first paragraph of the results may also be better placed in the introduction.

      We thank the reviewer for the suggestion; we have added a more in-depth introduction to Aβ and have moved the first paragraph of the results section to the introduction.

      Figure 1 presents survival for three SLC16A transporters and bumpel. The C9 control curve appears to be consistent between charts, likely indicating that the same control was used across experiments, rather than independent controls for each chart. The authors should consider showing either all SLC16A and bumpel data on a single chart, or clarifying in the figure legend that a common control dataset is used. A GFP control is used in later experiments (Figure 2).

      We have now indicated that the SLC16A transporters were run together in the figure legend.

      Choice of amyloid model needs a line of explanation, particularly with regard to extra/intracellular deposition of amyloid in this model.

      We have now added a few sentences describing this when the model is introduced

      Fruit Fly Injection method section needs a bit more detail to describe site of injection (head, body etc). This is not clear in the result section either.

      We have now added this; the injection was done in the abdomen.

      How were bumpel orthologues identified? What is the degree of conservation (sequence homology, etc.)?

      The bumpel orthologues are those identified as most similar by FlyBase. We have now added the degree of conservation in the text.

      The speculative mechanism for C9 pathology modification involves interaction of neurons and glia, monocarboxylate transporters and changes in autophagy activity. For clarity a diagram showing the model may be a helpful addition.

      We have now added a diagram explaining how we think the rescue is achieved

      Typos:

      Figure 1 Legend - "p values of ona way ANOVA"

      We apologise for the error, and have now corrected it

      Figure S2 Legend - Atg1 RNAi genotypes from S2 legend are mentioned erroneously

      We apologise for the error, and have now corrected it

      Repetition of text in results: "Bumpel, together with its paralogues kumpel and rumpel, is expressed in glia in flies, where it is thought to promote transport of substrates across the brain (31)."

      We apologise and have rectified this.

      "Modulation of Atg1 when bumpel was co-overexpressed, however, did not affect GP levels (Fig 4E, F)" - This should refer to Fig 4D, E.

      We apologise and have rectified this

      Reviewer #2 (Significance):

      The study will be broadly of interest to researchers working in the fields of neurodegeneration and metabolism, providing evidence for a protective role of elevated pyruvate in neurons that offers new understanding of pathology in C9orf72-associated motor neuron disease and frontotemporal dementia.

      Strengths:

      The study presents novel data to demonstrate that overexpression of the fly monocarboxylate transporter bumpel rescues an early death phenotype associated with the ALS/FTD gene C9orf72. Any novel therapeutic strategies for ALS are of interest to the field, and the strategy demonstrated here may be readily translated to human cell culture systems for proof-of-principle translational studies in a more physiologically relevant system. This study further demonstrates the utility of invertebrate models to generate novel understanding of C9orf72 pathology.

      Limitations:

      The study speculates that there is a link between pyruvate levels and increased autophagy; however, the mechanism by which this occurs is not defined in the present study. This is a limitation of the experiment, though it opens up an interesting question for future studies.

      We thank the reviewer for their comments, and we have now added experiments characterising the role of bumpel in autophagy, particularly showing its rescue of a late autolysosomal block.

      Reviewer expertise: The reviewer researches ALS and dementia associated neurodegeneration, utilising Drosophila, rodent and stem cell derived model systems.

      Reviewer #3 (Evidence, reproducibility and clarity):

      This is an interesting manuscript in which the authors provide evidence that elevated neuronal expression of the pyruvate transporter bumpel can partially rescue shortened lifespan in fly models of frontotemporal dementia and Alzheimer's disease. In addition, elevated neuronal bumpel expression can reduce accumulation of arginine containing FTD-linked dipeptide repeat proteins. Some evidence is presented that elevated neuronal bumpel expression may activate autophagy. These findings are novel and may have implications for therapeutic interventions based on pyruvate import/metabolism to treat neurodegenerative disorders. However, I have several concerns as follows:

      Major Comments:

      1. The authors provide no explanation as to why they targeted bumpel overexpression in neurons. Endogenous bumpel appears to be predominantly expressed in glial cells, so why not target these cells instead?

      We wanted to increase pyruvate import in neurons, so we over-expressed a number of pyruvate transporters that were available in the FlyORF stock centre (so that they would all be inserted into the same site and therefore be directly comparable); we were mainly interested in cell-autonomous effects of importing glycolytic metabolites. Over-expressing bumpel in glia would indeed be an extremely interesting experiment. Unfortunately, we do not have the ability to express C9 in neurons while over-expressing bumpel in glia, as we only have one over-expression system that works. We are working towards generating a new C9 model so that we can then use the Gal4 system to over-express bumpel in glia, but this is not yet available. Over-expression of C9 in glia is not toxic and is not a good model of disease.

      1. Data is shown that overexpressed bumpel can suppress GR and PR dipeptide repeat toxicity when these peptides are translated using an ATG start codon (Fig 2D,E). Does bumpel mediated neuroprotection also correlate with a reduction in DPR levels driven with an ATG start codon?

      This is a very interesting question. Unfortunately, whilst the Isaacs lab kindly made available the GR antibody for the initial ELISA experiment, we no longer have that antibody available and we do not have a working PR antibody. GR and PR westerns are not possible to carry out, as the proteins are too positively charged to run. We do show that bumpel can down-regulate Aβ from a UAS promoter, so its effect is not specific to RAN translation.

      1. The authors provide some evidence suggesting that overexpression of bumpel increases autophagy in the fly brain. However, knockdown of Atg1 while co-expressing bumpel (Fig 4E) did not result in increased GP protein levels. In addition, Atg1 knockdown did not attenuate the protective effects of bumpel overexpression (Fig 4I), suggesting that bumpel is working through a pathway independent of autophagy to promote DPR clearance and protection against toxic peptide accumulation. The authors need to modify the interpretation of their data and temper their claim that autophagy contributes to bumpel-mediated protective effects in the CNS.

      We apologise that the data were not strong enough. We have now added evidence that bumpel acts downstream of Atg1, on late-stage autolysosomal clearance. We also show that bumpel and Atg1 can act synergistically to improve the C9 phenotype when over-expressed; this is now described in Fig 5.

      1. Although the authors present evidence that increased bumpel expression can activate autophagy, the data is not convincing that the neuroprotective effects associated with bumpel are mediated through autophagy. Pyruvate, in some circumstances, can non-enzymatically scavenge hydrogen peroxide or in other cases trigger oxidative stress resistance through hormetic ROS signaling. The authors should consider these alternative possibilities.

      These are indeed possibilities, and we have added a sentence to that effect in the discussion. We have now also shown that bumpel affects late clearance of autolysosomes and leads to an increase in TFEB targets.

      1. The authors rely on overexpressing bumpel to attenuate C9 toxicity in flies. They should perform the opposite experiment and knock down bumpel to demonstrate that reduced bumpel expression results in potentiation of C9 and amyloid beta neurotoxicity. In addition, they should show that knockdown of bumpel expression has some effect on autophagy.

      This would be a very interesting experiment. Unfortunately, bumpel is expressed only in a few glial subtypes in a wild-type fly, and we cannot downregulate it in glia while over-expressing toxic proteins in neurons because of limitations of our expression system: both genes need to be over-expressed in the same cell type. We have tried downregulating bumpel in neurons and saw no effect on phenotype or on DPR levels, but bumpel expression in neurons is extremely low. Moreover, bumpel has two paralogs, rumpel and kumpel (also only present in glia), and all three need to be knocked out for phenotypes to become visible in glia (Yildirim et al, 2022). These experiments would be interesting but are outside our scope.

      We are in the process of generating new C9 models to be able to do these experiments, but these are currently outside the scope of this work.

      Minor Comments:

      1. Neuronal overexpression of bumpel appears to shorten lifespan of wild type flies (Fig 2A). It is possible that neuronal import of pyruvate may drive mitochondrial oxidative phosphorylation and ROS formation. The authors should comment on this possibility in the discussion.

      This is a very good point; we have added a sentence to that effect.

      1. In Fig 3 the authors used a mixture of sodium pyruvate and ethyl pyruvate to demonstrate the import properties of bumpel. The rationale for using ethyl pyruvate is unclear as this membrane-permeable metabolite can by-pass any transporters.

      The ethyl pyruvate was only used in the injection of flies, not for the FRET experiments looking at the import properties of bumpel. Since we were not over-expressing bumpel, we needed the pyruvate to by-pass the requirement for a transporter. We were showing that delivery of pyruvate by another method (other than by a transporter) was able to phenocopy the over-expression of bumpel, thus showing that the effect is mediated by pyruvate entry into the cell.

      1. In the introduction several acronyms are used (i.e. GRN, MAPT, TREM2) that are not defined.

      We apologise and have now rectified this.

      Reviewer #3 (Significance):

      To my knowledge, this is the first study to identify that bumpel can permit the import of pyruvate and lactate into neurons when ectopically expressed in the fly brain. The fact that increased neuronal pyruvate import can partially protect against toxic peptide accumulation is unexpected and quite novel. Although some evidence is presented that bumpel can trigger autophagy, it is not clear if autophagy is mediating bumpel neuroprotective effects. Alternative mechanisms related to pyruvate effects on ROS and oxidative stress resistance should be considered.

      We thank the reviewer for their comments, and have added clarifying statements regarding the potential role of ROS.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to thank the reviewers for their comments and suggestions, which were very helpful to improve our manuscript. The revised manuscript notably includes the following improvements:

      • To evaluate the relevance of identified candidate target genes, we integrated an additional screening step in our method, corresponding to the analysis of RNAseq datasets specific to blood or brain cells. RNAseq data from irradiated hematopoietic stem cells or splenic cells were analyzed and included in the new Table S19, and RNAseq data from Zika virus-infected neural progenitors were analyzed and included in the new Table S28. In addition, we also verified that the expression of a subset of blood-related genes was decreased in the bone marrow cells of p53Δ31/Δ31 mice, known to exhibit increased p53 activity and to phenocopy dyskeratosis congenita (new Figure S8).
      • Luciferase data were expanded to show that, for promoters exhibiting a significant p53-mediated repression in luciferase assays, the p53-dependent regulation was abrogated after mutation of the putative DREAM binding site (new Figures 2e and 2i).
      • We found putative DREAM binding sites for 151 targets, and the predicted binding sites were precisely mapped relative to the position of ChIP peaks of DREAM subunits (E2F4 and LIN9) and to transcription start sites of target genes. These additional analyses, shown in the new Figures 3a and 3b, further suggest the reliability of our predicted binding sites. Notably, hypergeometric tests of the distribution of DREAM binding sites relative to E2F4/LIN9 ChIP peaks reveal a significant >1300-fold enrichment of these sites at ChIP peaks.
      • We now present a detailed comparison of our results with those reported in other studies, notably the predicted E2F and CHR sites from the Target gene regulation database (new Figure S11), or the list of candidate DREAM targets suggested from Lin37 KO cells (new Figure S10 and new Table S35). This also leads us to discuss the different types of DREAM binding sites (bipartite sites (e.g. CDE/CHR or E2F/CLE) vs sites composed of a single E2F or a single CHR motif).
      • We integrated updates of the Human phenotype ontology website to include the latest lists of genes related to blood or brain ontology terms in our analysis. In the previous version of the manuscript we had analyzed a total of 811 genes downregulated ≥ 1.5 fold upon bone marrow cell differentiation. Our revised manuscript now includes the analysis of 883 genes.
      • Several improvements were made to present our results more clearly and in more detail: 1) additional evidence that the differentiation of Hoxa9ER cells correlates with p53 activation is now provided in the new Figure S1; 2) the precise values for gene expression after bone marrow cell differentiation, as well as p53 regulation scores from the Target gene regulation databases, are included in the new Tables S1, S5, S8, S11, S14, S20 and S23; 3) a Venn-like diagram was included to summarize the different steps of our approach in the new Figure 3c, with detailed lists of genes selected at each step in new Tables S17 and S26; 4) for genes associated with blood or brain genetic disorders, bibliographic references describing gene mutations and clinical traits were included in a new Table S36; 5) Figure 4a and Table S37 were improved to include evidence that increased BRD8 in glioblastoma cells leads to a decreased expression of several genes transactivated by p53.
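
      The hypergeometric enrichment test mentioned in the list above can be illustrated with a minimal Python sketch. It assumes that the search space is divided into N equally sized positions, of which K overlap E2F4/LIN9 ChIP peaks, and that k of the n predicted DREAM binding sites fall at peaks. The function names and all numbers below are hypothetical and are not taken from the manuscript.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: total positions, K: positions overlapping ChIP peaks,
    n: predicted binding sites, k: predicted sites found at peaks.
    """
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def fold_enrichment(k, N, K, n):
    """Observed fraction of sites at peaks divided by the expected fraction."""
    return (k * N) / (n * K)


# Hypothetical toy counts, for illustration only.
N, K, n, k = 1000, 50, 20, 10
print("fold enrichment:", fold_enrichment(k, N, K, n))
print("p-value:", hypergeom_sf(k, N, K, n))
```

      With these toy counts half of the predicted sites fall at peaks although peaks cover only 5% of the positions. The >1300-fold value reported above would follow from the manuscript's own counts, which are not reproduced here.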

      Reviewer #1 (Evidence, reproducibility and clarity):

      Summary

      In this paper the authors describe a data-driven approach to identify and prioritise p53-DREAM targets whose repression might contribute to the abnormal haematopoiesis and brain abnormalities observed in p53-CTD deleted mice. The premise is that in these mice (where they have previously demonstrated p53 to be hyperactive in at least a subset of tissues), the p53-p21-E2F/DREAM axis is at least in part responsible for the observed phenotypes, due to the repression of genes containing E2F and CDE/CHR elements. Their approach to home in on relevant genes is based on transcriptomic gene ontology analysis of genes repressed in these disease settings. They primarily use publicly available data from a HOXA9-ER regulated model of HSC expansion, in which they observe increases in p53-p21 expression upon differentiation and demonstrate that p53-p21 DREAM target genes are suppressed, as we would expect in this scenario where p53-p21 is activating withdrawal from the cell cycle. They then spend a lot of effort analysing these datasets, combining "gene-ontology", "disease phenotype" and "meta-ChIP-seq" analysis of public data to support the observation that mutations of genes suppressed in this manner are disproportionately linked to heritable haematopoietic and brain disorders. While these results are interesting in terms of framing a hypothesis about how mutations in p53-p21-DREAM regulated targets contribute to such conditions, they are to be expected given the now very well described impact of p53-p21 on both E2F4/DREAM targets.

      We agree with the referee that the impact of p53-p21 on both E2F4/DREAM targets is well described. However, discussions with many scientists or clinicians specialized in bone marrow failure syndromes or microcephaly diseases led us to realize that most were not familiar with the p53-DREAM pathway, so that a study bridging the gap between DREAM experts and bone marrow or microcephaly specialists would be particularly useful. In addition, we thought that strategies relying on disease-based ontology terms were likely to identify new targets, compared to previous studies that considered cell cycle regulation instead of disease phenotypes. Consistent with this, many genes we identified as candidate DREAM targets were not reported in previous studies. Furthermore, as detailed below, our positional frequency matrices led to the identification of DREAM binding sites that had not been predicted by previous approaches.

      The natural progression of this work would be to go on to show this occurs in relevant cells or tissues derived from the p53-CTD mice as well as look at modulating target genes to understand underlying mechanisms and consequences.
      Rather than this, they focus on validating, by qPCR, that a subset of these targets are indeed suppressed upon specific p53 activation by the MDM2 inhibitor Nutlin-3A in MEFs, and that mutation of predicted CDE/CHR elements in luciferase constructs leads to increased luciferase activity. While these findings support their predictions, the results are entirely expected based on what is known about such targets, and demonstrating that this occurs in MEFs does not closely relate to the haematopoietic and brain cells in which they suggest this regulation is important. In fact, in the discussion, the authors comment on the importance of cell type context specificity in terms of discordance between predictions of TF binding sites and public datasets.

      We agree that additional data from relevant cells or tissues were required to strengthen our conclusions. In the revised manuscript, we evaluated the relevance of candidate target genes related to blood ontology terms by integrating an additional screening step in our method, corresponding to the analysis of RNAseq datasets specific to blood cells. We analyzed dataset GSE171697, with RNAseq data from hematopoietic stem cells of unirradiated p53 KO, or unirradiated or irradiated WT mice, as well as dataset GSE204924, with RNAseq data from splenic cells of irradiated p53Δ24/- or p53+/- mice. The latter dataset appeared interesting because p53Δ24 is a mouse model prone to bone marrow failure and the spleen is a hematopoietic organ in mice. The analysis of these datasets is included in the new Table S19. In the datasets, increased p53 activity correlated with the downregulation of most of the 269 candidate DREAM targets. However, 56 genes which appeared upregulated in cells with increased p53 activity were considered poor candidate p53-DREAM targets and removed from further analyses, leading to a list of 213 genes that appeared as better candidate p53-DREAM targets related to blood abnormalities. Furthermore, we also verified that the expression of a subset of blood-related candidate genes was decreased in the bone marrow cells of p53Δ31/Δ31 mice (prone to bone marrow failure) compared to bone marrow cells from WT mice. This result is presented in the new Figure S8.

      As for genes related to brain development, we discussed in the previous version of the manuscript that most genes mutated in syndromes of microcephaly or cerebellar hypoplasia are involved in ubiquitous cellular functions (chromosome condensation, mitotic spindle activity, tRNA splicing…), which suggested that our analysis of transcriptomic changes associated with bone marrow cell differentiation might also be used to identify brain specific targets. However, we agree with the referee that confirmation of these brain specific targets in a more relevant cellular context was preferable. In the revised manuscript, we included the analysis of datasets GSE78711 and GSE80434, containing RNAseq data from human cortical neural progenitors infected by the Zika virus (ZIKV) or mock-infected, because ZIKV was shown to cause p53 activation in cortical neural progenitors and microcephaly. This analysis is detailed in the new supplementary Table S28. In both datasets, increased p53 activity correlated with the downregulation of most of the 226 candidate DREAM targets. Sixty-four genes which appeared more expressed in ZIKV-infected cells were considered poor candidate p53-DREAM targets and removed from further analyses, leading to a list of 162 candidate p53-DREAM targets related to brain abnormalities. We think this significantly increases the relevance of our analysis of brain-specific targets.
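
      The additional screening step described above (269 candidates reduced to 213 for blood, and 226 reduced to 162 for brain) amounts to discarding candidates that appear upregulated when p53 activity is increased. The following is only a minimal sketch of such a filter, not the procedure actually used for Tables S19 and S28: the function name, the use of a single log2 fold change per gene, and the placeholder gene names and values are all assumptions made for illustration.

```python
# Hypothetical sketch: discard candidates that are upregulated when p53
# activity is high. Gene names and fold changes are placeholders, not data
# from the study.

def split_candidates(log2_fold_change):
    """Split genes into (retained, removed) lists.

    log2_fold_change maps gene -> log2(expression with high p53 activity /
    expression with low p53 activity). A positive value means the gene is
    upregulated with p53 activity and is treated here as a poor
    p53-DREAM candidate.
    """
    retained = sorted(g for g, fc in log2_fold_change.items() if fc <= 0)
    removed = sorted(g for g, fc in log2_fold_change.items() if fc > 0)
    return retained, removed


example = {"GeneA": -1.8, "GeneB": 0.6, "GeneC": -0.3, "GeneD": 1.2}
retained, removed = split_candidates(example)
print("retained:", retained)
print("removed:", removed)
```

      How several datasets are combined into one decision per gene is not specified in the response; a real implementation would need an explicit rule for genes that behave differently across datasets.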

      Finally, they try and contextualise effects in glioblastoma data by correlating target gene expression with levels of BRD8 since it has recently been shown to attenuate p53 function in glioblastoma and show that some of the brain disease associated genes are expressed at higher levels in BRD8 high patient samples. It seems strange here that they do not also look at expression of p21 or other p53 targets that would help ascertain if p53 activity is indeed suppressed. Moreover, much more elegant methods for predicting transcription factor activity could be applied to this data.

      We agree with the referee. Indeed, when we had performed the analysis of glioblastoma cells, we first verified that increased BRD8 levels correlated with decreased p21 levels in these cells. However, we had not included this verification in the previous version of the manuscript. In this revision, we improved Figure 4 (and Table S37), which reports the analysis of glioblastoma cells, to address this point. In Figure 4a, we now show the variations in mRNA levels between BRD8Low and BRD8High tumors, for BRD8 itself, as well as 5 genes well-known to be transactivated by p53 (p21, MDM2, BAX, GADD45A and PLK3) and the 77 p53-DREAM targets associated with microcephaly or cerebellar hypoplasia. The data clearly show that tumors with high BRD8 exhibit a decrease in the expression of p53 transactivated targets, and an increase in p53-DREAM repressed targets.

      Major Comments
      The major result of this paper as it stands is the prioritisation of candidate genes in the p53-DREAM pathway involved in these conditions, together with the refined approach used to identify and prioritise these genes, and as such it is more of a starting point for further investigation. They fall short of demonstrating the relevance of their predictions physiologically in tissues from the mice and do not demonstrate functional importance of regulation of the targets they put forward. Given that these genes will be co-ordinately regulated, without a mechanistic experiment in a physiologically relevant model it is impossible to infer causality. For example, depleting individual targets in the HOXA9 model and evaluating impact on survival, proliferation and differentiation may be a (relatively) simple way to explore this, perhaps comparing to effects of p53 activating agents such as Nutlin-3A. Of note, the authors (Jaber 2016 PMID: 27033104) and several other groups (Fischer 2014 PMID: 25486564; McDade 2014 PMID: 24823795) had previously demonstrated the link between p53-p21 and suppression of DNA-repair/damage related genes (as is also observed here, in particular for the FA-related genes that they discuss briefly). I would have thought that this would be an obvious starting point for some mechanistic experiments, and in fact I note this has been demonstrated before (Li et al 2018 PMID: 29307578).

      The starting point of our study is not the prioritization of DREAM target genes, but rather the detailed phenotyping of p53Δ31/Δ31 mice that we performed in previous publications (Simeonova et al. Cell Rep 2013, Toufektchan et al. Nat. Commun. 2016), in which we mentioned phenotypical traits typical of dyskeratosis congenita and Fanconi anemia, including notably bone marrow failure and cerebellar hypoplasia.

      We understand that depleting individual targets in the Hoxa9 system and evaluating impact on survival, proliferation and differentiation might seem appropriate to explore their potential causality. However, our previous work on Fanc genes leads us to think that this might not be informative. Regarding this, we now clearly discuss in the revised version of the manuscript: “Finding a functionally relevant [DREAM binding site] for Fanca, mutated in 60% of patients with Fanconi anemia [59,60], may help to understand how a germline increase in p53 activity can cause defects in DNA repair. Importantly however, we previously showed that p53Δ31/Δ31 cells exhibited defects in DNA interstrand cross-link repair, a typical property of Fanconi anemia cells, that correlated with a subtle but significant decrease in expression for several genes of the Fanconi anemia DNA repair pathway, rather than the complete repression of a single gene in this pathway [25]. Thus, the Fanconi-like phenotype of p53Δ31/Δ31 cells most likely results from a decreased expression of not only Fanca, but also of additional p53-DREAM targets mutated in Fanconi anemia such as Fancb, Fancd2, Fanci, Brip1, Rad51, Palb2, Ube2t or Xrcc2, for which functional or putative [DREAM binding sites] were also found with our systematic approach.” We further discuss in the manuscript how this may also apply to telomere-, ribosome-, or microcephaly-related genes.

      The analysis of brain specific targets and the link to BRD8 sits largely as an aside and the analysis of patient data from glioblastomas is underdeveloped as noted above.

      As we previously mentioned, the revised manuscript includes the analysis of RNAseq datasets from human cortical neural progenitors infected by the Zika virus (ZIKV) or mock-infected, which significantly increases the relevance of our analysis of brain-specific targets. Furthermore, we improved Figure 4 to present more clearly the impact of BRD8 levels on the expression of genes transactivated by p53 or repressed by p53-DREAM.

      The computational methods applied are robust, albeit predominantly correlative, in terms of identifying regulation of potential causative target genes, validated across human and mouse cell lines, and this indicates a role of these genes in the relevant conditions. However, further validation through application in a bulk or single cell RNAseq patient cohort, or at least an in vivo model, would strengthen these conclusions and complement the work presented here, which is based on in vitro mouse and human cells. This is pertinent as this study improves upon previously published approaches by focusing on "clinically relevant target genes". Additionally, this would exhibit the potential applications of the findings presented.

      We thank the referee for this comment. As mentioned above, in the revised manuscript we analyzed RNAseq data from hematopoietic stem cells of unirradiated WT or p53 KO mice, or irradiated WT mice, and from splenic cells of irradiated p53Δ24/- or p53+/- mice, and quantified the expression of a subset of blood-related candidate genes in the bone marrow cells of p53Δ31/Δ31 mice (prone to bone marrow failure) and WT mice (new Figure S8 and Table S19). For genes related to brain development, we included the analysis of RNAseq data from human cortical neural progenitors infected by the Zika virus (ZIKV) or mock-infected (Table S28). These RNAseq analyses were added as an additional screening criterion in our approach, which significantly increased the relevance of the target genes identified.

      In terms of statistical analysis, the hypergeometric test should be applied to assess significant enrichment of genes for example with CDE/CHR regions within the previously identified lists.

      In the revised manuscript, we precisely mapped the DREAM binding sites in 50 bp windows within regions bound by E2F4 and/or LIN9, an analysis included in new Figure 3a. We then compared the distribution of DREAM binding sites at ChIP peaks with their distribution over the entire genome and found a > 1300-fold enrichment of these sites at ChIP peaks. This significant enrichment (f = 3 × 10⁻²³⁹ in a hypergeometric test) is most likely underestimated because mouse-human DNA sequence conservations were not determined for putative DBS over the full genome. These new analyses clearly reinforce our previous conclusions.
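
      For readers unfamiliar with the test requested by the referee, a hypergeometric enrichment calculation can be sketched as follows. This is an illustrative sketch only and not the code used for the analysis in Figure 3a: the function names are assumptions, and the window counts in the example are hypothetical numbers chosen to keep the arithmetic easy to follow.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """Upper-tail probability P(X >= k) of a hypergeometric distribution.

    population: total number of windows considered (e.g. genome-wide)
    successes:  windows containing a predicted binding site
    draws:      windows overlapping ChIP peaks
    k:          windows overlapping ChIP peaks AND containing a site
    """
    upper = min(successes, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / comb(population, draws)


def fold_enrichment(k, population, successes, draws):
    """Observed site frequency at peaks divided by the genome-wide frequency."""
    return (k * population) / (draws * successes)


# Hypothetical counts: 1000 windows, 20 with a site, 50 at peaks, 10 of them with a site.
p_value = hypergeom_sf(10, 1000, 20, 50)
enrichment = fold_enrichment(10, 1000, 20, 50)
print(f"fold enrichment = {enrichment:.1f}, p = {p_value:.2e}")
```

      Exact integer arithmetic via `math.comb` avoids underflow in the intermediate terms, which matters when the resulting probability is extremely small.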

      Minor Comments
      References are required for the genes listed which play a role in the diseases of interest.

      In the revised manuscript, references are provided for genes which play a role in the diseases of interest. Due to the large number of added references, these were included in a new supplementary table, Table S36.

      This paper would benefit from the inclusion of summary schematics and tables throughout (rather than relying only on somewhat unwieldy heatmaps which show little other than all these genes are co-ordinately regulated), this could include summaries of the methods applied, gene or CDE/CHR inclusion criteria, and Venn diagrams indicating the subsets of final genes identified through this approach.

      We thank the referee for this suggestion. In the revised manuscript we provide a Venn-like diagram of the different steps of our approach (new Figure 3c), as well as tables listing the genes retained after each step of the selection (new Tables S17 and S26) and these additions improve the clarity of our manuscript.

      Reviewer #1 (Significance):

      In its current form this is a very limited study that would require significant additional work to move conclusions beyond correlation and hypothesis generation.
      Overall, while limited largely to target prioritisation, this research nicely exemplifies how genes affected by the p53-DREAM pathway can be robustly identified, providing a potential resource for individuals working on this pathway or on abnormal haematopoiesis and brain abnormalities. These results are complementary to work previously published by Fischer et al, which has been referenced throughout the analysis (highlighting Target Gene Regulation Database p53 and DREAM target genes) and discussion.

      This paper will be of interest to researchers of blood/neurological diseases who can assess if these genes are dysregulated in their datasets, or those investigating the p53-DREAM pathway. This work represents a useful resource detailing genes affected by this pathway in these disease settings, however researchers of the p53-DREAM pathway may find this paper useful when planning an approach to identify and prioritise genes of interest.

      We thank the reviewer for considering that our study represents a useful resource for researchers working on the p53-DREAM pathway, abnormal haematopoiesis and brain abnormalities, because it was exactly the purpose of our work. As mentioned above, we think that a study bridging the gap between DREAM experts and bone marrow or microcephaly specialists should be particularly useful.

      We also agree with the referee that our approach could be used to identify DREAM targets relevant to other disease settings, and we now mention this clearly in the revised manuscript.

      While our results are complementary to work previously published by Fischer et al and included in the Target gene regulation database, in the revised manuscript we discuss the novelty of our results in more detail, notably by performing additional analyses. For example, our method identified bipartite DREAM binding sites for 151 candidate DREAM targets (of which 56 genes were not previously mentioned by Fischer et al.). We now provide a detailed mapping (using 50 bp windows) of the bipartite DREAM binding sites we identified relative to ChIP peaks for DREAM subunits, and we performed a similar mapping of the E2F and CHR sites included in the Target gene regulation database. Our predicted DREAM binding sites coincided with ChIP peaks more frequently (Figure 3a) than the predicted E2F or CHR sites from the Target gene regulation database (Figure S11), which further indicates the usefulness of our study as a resource.

      Reviewer #2 (Evidence, reproducibility and clarity):

      The authors used various systems, including Hoxa9-inducible BMCs, human and mouse cells, WT and p53 knockout MEFs, and glioblastoma cells, to screen p53-DREAM targets and observed distinct findings for each system. Since different cell types differ in p53 activation and p53 target gene expression, the authors might want to select proper cell type(s) to screen p53-DREAM target genes and design experiments to confirm that these genes are really p53-DREAM target genes.

      We agree that additional data from relevant cells or tissues were required to strengthen our conclusions. As mentioned in response to referee #1, in the revised manuscript we evaluated the relevance of candidate target genes related to blood ontology terms by integrating an additional screening step in our method, corresponding to the analysis of RNAseq dataset GSE171697, with data from hematopoietic stem cells of unirradiated or irradiated WT mice and unirradiated p53 KO mice, as well as RNAseq dataset GSE204924, with data from splenic cells of irradiated p53Δ24/- or p53+/- mice. As for genes related to brain development, we included the analysis of RNAseq datasets GSE78711 and GSE80434 for validation, two datasets from human cortical neural progenitors infected by the Zika virus or mock-infected. Together, the 4 datasets provide evidence for a p53-dependent downregulation in blood- and brain-relevant settings (new Tables S19 and S28).

      Importantly, in the revision we also compared our list of 151 genes appearing as the best p53-DREAM candidates with the results of Magès et al., who analyzed, in murine cells with a CRISPR-mediated KO of Lin37 (a subunit of DREAM), the transcriptomic changes that follow a reintroduction of Lin37. This comparison is detailed in the discussion section, with the new Figure S10 and Table S35. We mention: “Our list of 151 genes overlaps only partially with the list of candidate DREAM targets obtained with this approach, with 51/151 genes reported to be downregulated in Lin37-rescued cells [17]. To better evaluate the reasons for this partial overlap, we extracted the RNAseq data from Lin37 KO and Lin37-rescued cells and focused on the 151 genes in our list. For the 51 genes that Mages et al. reported as downregulated in Lin37-rescued cells, an average downregulation of 14.8-fold was observed (Figure S10, Table S35). Furthermore, when each gene was tested individually, a downregulation was observed in all cases, statistically significant for 47 genes, and with a P value between 0.05 and 0.08 for the remnant 4 genes (Table S35). By contrast, for the 100 genes not previously reported to be downregulated in Lin37-rescued cells, an average downregulation of 4.7-fold was observed (Figure S10, Table S35), and each gene appeared downregulated, but this downregulation was statistically significant for only 35/100 genes, and P values between 0.05 and 0.08 were found for 23/100 other genes (Table S35). These comparisons suggest that, for the additional 100 genes, a more subtle decrease in expression, together with experimental variations, might have prevented the report of their DREAM-mediated regulation in Lin37-rescued cells.”

      This comparison provides additional evidence that the 151 candidate target genes we identified are bona fide DREAM targets.

      Specific comments:
      The authors need to describe and define HSC and Diff in Figure 1.

      This has been corrected in the revised manuscript. “HSC” was replaced by “Hematopoietic Stem / Progenitor cells (+OHT)” and “Diff” was replaced by “Differentiated cells (5 days – OHT)”.

      Are the genes listed in Figures 1B and 1D p53 targets in bone marrow cells?

      In the revised manuscript, we analyzed RNAseq data to address this point. The question refers to lists of telomere-related genes (Figure 1b in both versions of the manuscript) and Fanconi-related genes (Figure 1d in the previous version, now Figure S2a), but could also apply to other lists of genes related to blood ontology terms (Figures S3-S5 in the revised manuscript). As mentioned in response to referee #1, in the revised manuscript we integrated an additional screening step in our method, corresponding to the analysis of RNAseq datasets specific to blood cells. We analyzed dataset GSE171697, with RNAseq data from hematopoietic stem cells of unirradiated WT or p53 KO mice, or irradiated WT mice, as well as dataset GSE204924, with RNAseq data from splenic cells of irradiated p53Δ24/- or p53+/- mice. The latter dataset appeared interesting because p53Δ24 is a mouse model prone to bone marrow failure and the spleen is a hematopoietic organ in mice. Furthermore, we also verified that the expression of a subset of blood-related candidate genes was decreased in the bone marrow cells of p53Δ31/Δ31 mice (prone to bone marrow failure) compared to bone marrow cells from WT mice, a result presented in the new Figure S8.

      Where is the detailed information for mouse and human cells in Figure 1 and Figure 2?

      In the first draft of the manuscript, supplementary tables provided precise values for ChIP binding. In the revised manuscript, we also provide the precise values for gene expression after bone marrow cell differentiation, as well as p53 regulation scores from the Target gene regulation database. This additional information is included in the new Tables S1, S5, S8, S11, S14, S20 and S23.

      Are the genes listed in Figure 3B also p53 target genes in other cell types such as bone marrow cells and glioblastoma?

      For genes in Figure 3B of the previous version of the manuscript (now Figure 2B in the revised version), we now provide evidence that the blood-related genes are less expressed in the bone marrow cells of p53Δ31/Δ31 mice (mice with increased p53 activity and prone to bone marrow failure) compared to bone marrow cells from WT mice. This result is presented in the new Figure S8. For the brain-related genes of the same Figure, evidence of their p53-mediated regulation is provided by the RNAseq datasets GSE78711 and GSE80434, from human cortical neural progenitors infected by the Zika virus or mock-infected (analyzed in the new Table S28). Evidence that a decreased p53 activity in glioblastomas correlates with increased expression of the brain-related genes of the same Figure is provided in supplementary Table S37.

      Does BRD8high have high p53 and p21?

      We now clearly show, in both Figure 4a and Table S37, that glioblastoma cells with high BRD8 exhibit a decreased expression of CDKN1A/p21 and other genes known to be transactivated by p53 (BAX, GADD45A, MDM2, PLK3), consistent with the fact that BRD8 attenuates p53 activity.

      Are the genes listed in Figure 4B all p53 target genes? Can some validation be done?

      For genes in Figure 4B, in the revision we focused on the genes that appeared more relevant, i.e. the 77 genes mutated in diseases with microcephaly or cerebellar hypoplasia. All the genes in Figure 4B are repressed in neural progenitors upon infection by the Zika virus, a virus known to cause p53 activation in those cells. This is reported in the new Table S28.

      Reviewer #2 (Significance):

      This is a potentially interesting study. The major limitation is the absence of validation from the screening. This study would definitely benefit the research community as long as some of the key findings are validated.

      We thank the referee for this comment. We hope the new evidence in this revision provides the validation requested by the referee.

      Reviewer #3 (Evidence, reproducibility and clarity):

      In their work submitted to Review Commons, Rakotopare et al. aim to identify p53-DREAM target genes associated with blood or brain abnormalities. To this end, they utilize published data generated with a cellular model that results in cell-cycle exit and differentiation of murine bone marrow progenitor cells upon inducible expression of Hoxa9. By analyzing this gene expression data set published by Muntean et al., they find that multiple of the 3631 genes which are downregulated more than 1.5-fold in differentiated BMCs are also mutated in several disorders connected to proliferation and differentiation defects during hematopoiesis and brain development. By screening ChIP-seq data sets available at ChIP-Atlas, they find that the promoters of many of these genes are bound by DREAM complex components, and most of them were identified as genes indirectly repressed by p53 before (Fischer et al. 2016, targetgenereg.org). They then use a computational approach to identify putative CDE/CHR DREAM-binding sites in the promoters of 372 genes associated with blood/brain abnormalities which are downregulated in differentiated BMCs and bound by DREAM components. Out of the 173 candidate genes, they select twelve to analyze whether mutation of the putative DREAM binding sites results in increased activity of the promoters in luciferase reporter assays. The authors conclude that their findings suggest a general role for the p53-DREAM pathway in regulating hematopoiesis and brain development.
      While the study supports a large body of publications proving that repression of cell cycle genes by the DREAM complex is crucial for cell cycle arrest and exit, it is noted that none of the main conclusions here are unexpected or particularly exciting. All the analyses are based on data sets that compare gene expression in highly proliferative cells with cells that underwent terminal cell cycle exit. Thus, a large portion of the genes that are downregulated in differentiated BMCs are cell cycle genes and well-established targets of DREAM and E2F:RB complexes. Furthermore, it is not surprising that some of these pro-proliferative genes are mutated in diseases connected to proliferation defects like anemias or microcephaly.

      We agree with the referee that the DREAM complex is well known to regulate cell cycle genes – in fact, this is what we mention in the first sentence of our introduction in both versions of our manuscript. However, as we already pointed out in response to Referee #1, many scientists or clinicians specialized in bone marrow failure syndromes or microcephaly diseases are not familiar with the p53-DREAM pathway, and we think our study will be particularly useful to them. Furthermore, our strategy relying on disease-based ontology terms rather than cell cycle regulation led to the identification of many DREAM targets that were not reported in previous studies, and our positional frequency matrices led to the identification of DREAM binding sites not predicted by previous approaches. As discussed below, our revised manuscript provides a more detailed comparison of our findings with those from previous studies.

      Additionally, I am not very enthusiastic about this manuscript because of several major concerns:

      1. The authors draw conclusions about the p53-DREAM pathway based on data that was generated in a cellular differentiation model without convincingly showing that p53 plays a central role in gene repression in this experimental setup.
      (A) Rakotopare et al. define p53-DREAM target genes based on RNA expression data from proliferating precursor cells and non-proliferating, differentiated BMCs (Muntean et al., 2010). This paper has not studied whether p53 gets activated in the particular experimental setup during Hoxa9-induced BMC differentiation. On page 4 of their manuscript, the authors state: "Consistent with the fact that BMC differentiation strongly correlates with p53 activation..." without citing any literature or explaining why this is supposed to be a fact. Furthermore, they imply that cell cycle gene repression in this model system depends on p53 because mRNA expression of the p53 targets p21 and Mdm2 was found to be increased in the differentiated cells (Fig. 1A, 5-fold and 2-fold, respectively). However, defining a large set of "p53-DREAM target genes" based on the moderate increase in mRNA levels of two genes that are known to be activated by p53 without showing any evidence that p53 is even involved in this effect during BMC differentiation is not appropriate.

      We agree that Muntean et al. did not study whether p53 gets activated when BMCs differentiate in the Hoxa9-ER system. We previously mentioned: “We observed that p53 activation correlated with cell differentiation in this system, because genes known to be transactivated by p53 (e.g. Cdkn1a, Mdm2) were induced, whereas genes repressed by p53 (e.g. Rtel1, Fancd2) were downregulated after tamoxifen withdrawal (Figure 1a)”. We had provided examples for 2 genes transactivated and 2 genes repressed, but clearly mentioned that they were given as examples. In the revised manuscript, we provide additional evidence with a new supplementary Figure that includes changes in expression for 15 additional genes known to be transactivated by p53, and 5 additional genes known to be repressed by p53 (Figure S1). In total, we now correlate HSC differentiation with p53 activation based on the expression of 24 well-known p53-regulated genes, which we hope is more convincing.

      In addition, we changed our phrasing and mention “Consistent with the notion that BMC differentiation strongly correlates with p53 activation in this system, 72 of these 76 genes have negative score(s) in the Target gene regulation (TGR) database”.

      (B) Interestingly, p53 is among the genes that get repressed at the mRNA level in differentiated BMCs (Fig. 1B; Trp53), and the authors also identify the DREAM components E2F4 and LIN9 as bound to the p53 promoter by screening ChIP-Atlas data (Fig. 1C). Given that p53 has never been described as a DREAM target, I find this rather surprising, and it makes me wonder whether appropriate parameters were selected for analyzing the ChIP data, particularly since the authors do not provide binding data for sets of non-cell cycle genes as a negative control.

      We retrieved ChIP data from the ChIP Atlas database without any specific parameters, thus in a completely unbiased manner. Importantly however, for reasons detailed in the manuscript, we clearly mentioned that total ChIP scores <979/4000 were considered too low to reflect significant DREAM binding. The ChIP score for Trp53 was 630, which rapidly led us to eliminate this gene from our screen.

      This ChIP score criterion was already mentioned in the previous version of our manuscript, but we think the addition of a Venn-like diagram (Figure 3c) and summary tables (S17 and S26) in the revised manuscript will probably make it easier to understand.
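
      The ChIP score criterion described above is a simple threshold on the total score. A minimal sketch of such a filter is given below for illustration; it is not the code used in the study. The cutoff (979, out of a maximum of 4000) and the Trp53 score (630) are the values quoted in this response, whereas the function name and the two other genes with their scores are hypothetical.

```python
# Cutoff quoted in the response: total ChIP scores below 979 (maximum 4000)
# were considered too low to reflect significant DREAM binding.
MIN_TOTAL_CHIP_SCORE = 979


def passes_chip_filter(total_score, cutoff=MIN_TOTAL_CHIP_SCORE):
    """Return True when the total ChIP score reaches the cutoff."""
    return total_score >= cutoff


# Trp53 score as quoted above; GeneA and GeneB are hypothetical placeholders.
total_scores = {"Trp53": 630, "GeneA": 2500, "GeneB": 979}
retained = sorted(g for g, s in total_scores.items() if passes_chip_filter(s))
print(retained)
```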

      (C) Finally, the authors utilize the targetgenereg.org database to show that many of the genes they describe as p53-repressed were already identified as p53 targets. This database (Fischer et al. 2016) was created by performing a meta-analysis integrating a plethora of RNA-seq and ChIP-seq datasets with the aim to identify whether a particular gene gets up- or downregulated by p53, shows cell-cycle-dependent expression, is a DREAM/MuvB or E2F:RB target, etc. For example, 57 datasets analyzing p53-dependent RNA expression in human and 15 datasets generated with mouse cells were included, and a positive or negative score shows in how many of these experiments the gene was found to be up (positive score) or downregulated (negative score). Combining a large number of datasets in such a study is very helpful to get an idea if a gene is indeed generally regulated by a transcription factor, or if it just showed up in a few experiments - either as a false positive or because the regulation depends on a particular biological setting. The authors find most of the genes they identify as repressed in differentiated BMCs also as downregulated by p53 in targetgenereg.org; however, it remains unclear what parameters they used to define a gene as p53-repressed. For example, in the caption of Fig. 1C, they state: "According to the Target gene regulation database, 72/76 genes are downregulated upon mouse and/or human p53 activation." The four exceptions are SLX1B (human score: 0, mouse score: na), PML (+41, +9), RAD50 (0, na), and TNKS2 (+17, +4). However, there are several other genes that do not appear to be generally repressed by p53, e.g. HMBOX1 (+4, -2); UPF1 (+1, -2), SMG6 (+18, -2), CTC1 (-5, +11), etc. Thus, without providing details regarding the parameters they use to define p53-target genes, such statements are rather misleading. An easy way to solve this problem would be to show the p53 scores in the tables together with the E2F4/LIN9 ChIP data.

      All the genes mentioned as downregulated by p53 had a negative TGR score in human and/or mouse cells. In the revised manuscript, we mention clearly what a negative TGR score means, by stating: “Consistent with the notion that BMC differentiation strongly correlates with p53 activation in this system, 72 of these 76 genes have negative p53 expression score(s) in the Target gene regulation (TGR) database [23], which indicates that they were downregulated upon p53 activation in most experiments carried out in mouse and/or human cells (Figure 1b, Table S1).” We agree with the referee that adding precise TGR scores is informative. In the revised manuscript, we provide the TGR scores for all the genes analyzed, as part of the new supplementary Tables S1, S5, S8, S11, S14, S20 and S23, together with their expression levels in undifferentiated or differentiated cells (as requested by Referee #2). The ChIP data are provided in separate tables (Tables S2, S3, S6, S7, S9, S10, S12, S13, S15, S16, S21, S22, S24 and S25).
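The criterion stated in this reply (a negative TGR score in human and/or mouse cells) can be made explicit with a short sketch. The scores are the ones quoted in the referee's comment above; the function names and the stricter "all species" variant are hypothetical and only illustrate why the referee's examples (HMBOX1, UPF1, SMG6, CTC1) pass the authors' criterion while the four exceptions do not.

```python
# TGR p53 expression scores quoted in the referee comment: (human, mouse).
# None stands for "na" (no score available).
scores = {
    "SLX1B": (0, None),
    "PML": (41, 9),
    "RAD50": (0, None),
    "TNKS2": (17, 4),
    "HMBOX1": (4, -2),
    "UPF1": (1, -2),
    "SMG6": (18, -2),
    "CTC1": (-5, 11),
}


def negative_in_any_species(human, mouse):
    """Criterion from the reply: at least one available score is negative."""
    return any(s is not None and s < 0 for s in (human, mouse))


def negative_in_all_species(human, mouse):
    """Hypothetical stricter variant: every available score must be negative."""
    available = [s for s in (human, mouse) if s is not None]
    return bool(available) and all(s < 0 for s in available)


called_any = sorted(g for g, s in scores.items() if negative_in_any_species(*s))
called_all = sorted(g for g, s in scores.items() if negative_in_all_species(*s))

print("negative in human and/or mouse:", called_any)
print("negative in every scored species:", called_all)
```

Under the "and/or" rule the four genes with mixed scores are counted as downregulated, whereas none of the eight genes listed here would pass the stricter variant, which is the ambiguity the referee asked the authors to resolve by reporting the scores themselves.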

      1. The authors define a large set of genes containing "CDE-CHR" promoter elements and thereby ignore how these elements are defined and what properties they have.
      (A) At the beginning of the introduction, the authors state: "The DREAM complex typically represses the transcription of genes whose promoter contain a bipartite CDE/CHR binding site, with a cell cycle-dependent element (CDE) bound by E2F4 or E2F5, and a cell cycle gene homology region (CHR) bound by LIN54, the DNA binding subunit of MuvB (Zwicker et al., 1995; Müller and Engeland, 2010)."
      This statement is incorrect. The authors ignore that the CDE/CHR tandem site is just one of four promoter elements that have been shown to recruit DREAM for the transcriptional repression of several hundred genes. It has been studied in detail that DREAM can bind to the following promoter sites:
      (I) CHR elements - bound by DREAM via LIN54; also bound by the activator MuvB complexes B-MYB-MuvB and FOXM1-MuvB which results in maximum gene expression in G2/M
      (II) CDE-CHR tandem elements - like (I) but binding of DREAM can be stabilized via E2F4/DP interacting with a truncated E2F binding site. Since CDE elements do not represent functional E2F sites, E2F:RB complexes do not bind.
      (III) E2F binding sites - bound by DREAM via E2F4/DP; also bound by E2F:RB complexes and activator E2Fs which results in maximum gene expression in G1/S
      (IV) E2F-CLE tandem elements - like (III) but binding of DREAM can be stabilized via LIN54 interacting with a non-canonical CHR-like element. Since CLE elements do not represent functional CHR sites, B-MYB-MuvB and FOXM1-MuvB do not bind.
      Thus, these promoter sites have different functions and can be clearly distinguished from each other based on their properties - a fact that is completely ignored by the authors.
Since the authors do not differentiate between G1/S and G2/M expressed genes and (CDE)-CHR and E2F-(CLE) sites, they identify CDE-CHR elements in G1/S genes that are functional E2F-(CLE) sites. A good example of this is the Rad51ap1 gene (and also the Rad51 gene that the Toledo lab described before as a CDE-CHR gene (Jaber et al. 2016)): these genes get expressed in G1/S and the promoters contain highly conserved E2F sites (parts of which the authors define as CDEs), and CLEs (which the authors define as CHRs). Furthermore, E2F:RB complexes bind to the promoters. Again: even though (CDE)-CHR and E2F-(CLE) sites both bind DREAM, they are otherwise functionally different in their ability to recruit non-DREAM complexes.

      We agree that, in the previous version of our manuscript, we should have presented the different types of DREAM binding sites in more detail, and we have corrected this in the revised manuscript. We now mention in the introduction that “The DREAM complex was initially reported to repress the transcription of genes whose promoter sequences contain a bipartite binding motif called CDE/CHR [19,20] (or E2F/CHR [21]), with a GC-rich cell cycle dependent element (CDE) that may be bound by E2F4 or E2F5, and an AT-rich cell cycle gene homology region (CHR) that may be bound by LIN54, the DNA-binding subunit of MuvB [19,20]. Later studies indicated that DREAM may also bind promoters with a single E2F binding site, a single CHR element, or a bipartite E2F/CHR-like element (CLE), and concluded that E2F and CHR elements are required for the regulation of G1/S and G2/M cell cycle genes, respectively [14,22].”

      We hope that the referee will agree with this complete yet concise way of presenting DREAM binding sites. Importantly, we agree that CDE/CHR and E2F/CLE are sites bound by different non-DREAM complexes, but both sites are bound by DREAM, so it makes perfect sense to use them together to define positional frequency matrices for DREAM binding predictions. We would also like to point out that terms used to define DREAM binding sites may vary in the literature. For example, to our knowledge Müller et al. were the first to propose a clear distinction between “CDE/CHR” and “E2F/CLE” sites (Müller et al. (2017) Oncotarget 8, 97737-97748), yet Müller recently co-authored a review in which these two distinct terms were not used, but were replaced by a single, apparently more generic term of “E2F/CHR” (Fischer et al., (2022) Trends Biochem. Sci. 47, 1009-1022). In the revised manuscript we now clearly mention that we designed our positional frequency matrices to search for “bipartite DREAM binding sites”, i.e. sites that might be referred to as CDE/CHR, E2F/CLE or E2F/CHR sites in various publications.

      (B) The authors identified putative CDE-CHR in the promoters of genes by building two position weight matrices (PWMs) based on 10 or 22 "validated CDE-CHR elements". However, since they include several genes that are clearly expressed in G1/S and contain E2F-(CLE) sites (e.g. Mybl2/B-myb, Rad51, Fanca, Fen1), it is not surprising that they identify a lot of putative CDE-CHR sites in genes that do not contain such elements.

      As discussed above, both CDE/CHR and E2F/CLE are bipartite DREAM binding sites, and we now clearly state that we used bipartite DREAM binding sites to generate our positional frequency matrices and predict DREAM binding.
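For readers unfamiliar with positional frequency matrices, the general idea can be sketched as follows. This is a generic illustration and not the authors' pipeline: the aligned sequences are invented toy strings (not real DREAM binding sites), and the pseudocount, the uniform background, and the function names are all assumptions made for the example.

```python
import math

BASES = "ACGT"

# Toy aligned sites, invented for illustration only.
aligned_sites = ["GGCGGTTTGAA", "GGCGCTTTGAA", "AGCGGTTTAAA", "GGCGGTTTGAA"]


def build_pfm(sites):
    """Count how often each base occurs at each aligned position."""
    pfm = [{b: 0 for b in BASES} for _ in range(len(sites[0]))]
    for site in sites:
        for i, base in enumerate(site):
            pfm[i][base] += 1
    return pfm


def log_odds_score(pfm, seq, pseudocount=1.0, background=0.25):
    """Sum of per-position log2(frequency / background), with a pseudocount."""
    n = sum(pfm[0].values())
    score = 0.0
    for counts, base in zip(pfm, seq):
        freq = (counts[base] + pseudocount) / (n + 4 * pseudocount)
        score += math.log2(freq / background)
    return score


def best_hit(pfm, promoter):
    """Slide the matrix along a sequence; return (best score, start offset)."""
    width = len(pfm)
    windows = range(len(promoter) - width + 1)
    return max((log_odds_score(pfm, promoter[i:i + width]), i) for i in windows)


pfm = build_pfm(aligned_sites)
promoter = "TTTT" + "GGCGGTTTGAA" + "CCCC"  # toy sequence with the consensus at offset 4
score, offset = best_hit(pfm, promoter)
print(f"best window starts at offset {offset} with score {score:.2f}")
```

Because the matrix is built from whatever sites are supplied, pooling CDE/CHR and E2F/CLE sites in the training set yields a matrix that scores both kinds of bipartite site, which is the design choice defended in the reply above.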

      (C) Finally, in the discussion, the authors state: "A recent update (2.0) of the Target gene regulation database of p53 and cell cycle genes (www.targetgenereg.org) was recently reported to include putative DREAM binding sites for human genes (Fischer et al., 2022). However, this update only suggests potential E2F or CHR binding sites independently, a feature of little help to identify CDE/CHR elements. For example, targetgenereg 2.0 suggests several potential E2F sites, but no CHR site close to the transcription start site of FANCD2, despite the fact that we previously identified a functionally CDE/CHR element near the transcription start site of this gene (Jaber et al., 2016)." This statement highlights again that the authors don't seem to be aware of what specific properties distinct DREAM binding sites have, and that analyzing promoters for CHR and E2F sites separately generates much more meaningful results than the approach they chose. Also, the FANCD2 promoter binds DREAM as well as E2F:RB complexes and contains a highly conserved E2F binding site - which Jaber et al. mutated together with a potential downstream CLE element and named it "CDE/CHR".

      In the revised manuscript, we provide a more detailed comparison between the bipartite DREAM binding sites predicted with our positional frequency matrices for 151 genes and the separate E2F and CHR predicted sites reported in the Target gene regulation database for the same set of genes. We now mention: “The Target gene regulation (TGR) database of p53 and cell-cycle genes was reported to include putative DREAM binding sites for human genes, based on separate genome-wide searches for 7 bp-long E2F or 5 bp-long CHR motifs [23]. We analyzed the predictions of the TGR database for the 151 genes for which we had found putative bipartite DBS. A total of 342 E2F binding sites were reported at the promoters of these genes, but only 64 CHR motifs. The similarities between the predicted E2F or CHR sites from the TGR database and our predicted bipartite DBS appeared rather limited: only 14/342 E2F sites overlapped at least partially with the GC-rich motif of our bipartite DBS, while 27/64 CHR motifs from the TGR database exhibited a partial overlap with the AT-rich motif. Importantly, most E2F and CHR sites from the TGR database mapped close to E2F4 and LIN9 ChIP peaks, but only 16% of E2Fs (54/342), and 33% of CHRs (21/64) mapped precisely at the level of these peaks (Figure S11), compared to 55% (83/151) of our bipartite DBS (Figure 3a). Thus, at least for genes with bipartite DREAM binding sites, our method relying on PFM22 appeared to provide more reliable predictions of DREAM binding than the E2F and CHR sites reported separately in the TGR database. Importantly however, predictions of the TGR database may include genes regulated by a single E2F or a single CHR that would most likely remain undetected with PFM22, suggesting that both approaches provide complementary results.”
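The proportions quoted in this passage can be checked with a few lines of arithmetic. The counts are taken directly from the text above; the labels and the helper name `percent` are only illustrative.

```python
# (count, total) pairs as quoted in the reply.
counts = {
    "TGR E2F sites overlapping the GC-rich motif": (14, 342),
    "TGR CHR motifs overlapping the AT-rich motif": (27, 64),
    "TGR E2F sites mapping at ChIP peaks": (54, 342),
    "TGR CHR motifs mapping at ChIP peaks": (21, 64),
    "bipartite DBS mapping at ChIP peaks": (83, 151),
}


def percent(count, total):
    """Percentage rounded to the nearest integer."""
    return round(100 * count / total)


for label, (count, total) in counts.items():
    print(f"{label}: {count}/{total} = {percent(count, total)}%")
```

The rounded values reproduce the 16%, 33% and 55% reported in the reply; the two overlap figures, which the text gives only as raw counts, correspond to roughly 4% and 42%.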

      1. The experimental approach chosen to validate CDE-CHR elements in a set of twelve promoters by luciferase reporter assays is not adequate.
      (A) Since the authors introduce point mutations in putative CDE and CHR elements in parallel, it is impossible to identify functional CDE elements. As explained above, a functional CDE is not required for binding of MuvB complexes and gene repression, and mutating the CHR alone would already lead to a loss of DREAM binding and to de-repression of a promoter. Thus, without mutating both sites of CDE-CHR elements separately, it is impossible to provide evidence that a putative CDE is functional.
      (B) As the putative CDE-CHR elements identified by the authors with a computational approach can overlap with functional E2F-(CLE) elements, the authors inactivate such sites by introducing mutations which leads to loss of DREAM binding and upregulation of the promoters, however, because of the problems described above, this experimental approach in the best case identifies DREAM binding sites, but does not differentiate between (CDE)-CHR and E2F-(CLE) elements.

      Yes, we agree with this comment. As discussed above, our goal was to identify DREAM-binding sites, not to differentiate between CDE/CHR and E2F/CLE elements. In other words, we wanted to identify genes regulated by p53 and DREAM, but not distinguish between genes regulated by p53, DREAM and E2F/Rb versus those regulated by p53, DREAM and BMyb-MuvB or FoxM1-MuvB.

      (C) The authors analyze the activities of wild-type and mutant promoters in proliferating NIH3T3 cells. Since the mutated promoters showed increased activity (about 2-3 fold), which would be expected when binding of DREAM gets abolished, they conclude: "...these experiments indicated that we could identify functional CDE/CHRs for 12/12 tested genes." In addition to the problems described above, a slight upregulation of promoter activities caused by the introduction of multiple point mutations close to the TSS is not sufficient to verify these elements. The increase in activity could occur independent of DREAM-binding by unrelated mechanisms. The authors should at least analyze the activities of the promoters with and without induction of p53. A loss of p53-dependent repression of the mutated promoters would prove that the elements are essential for p53-dependent repression. Furthermore, there are several experimental approaches to analyze whether DREAM binds to the putative promoter element and whether the introduced mutations disrupt binding (ChIP, DNA affinity purification, etc.).

      In the revised manuscript, we show that the promoters of 7 of the tested genes, when cloned in luciferase reporter plasmids and transfected into NIH3T3 cells, exhibited a significant (> 1.4 fold) repression upon p53 activation by cell treatment with Nutlin, the Mdm2 antagonist. For these promoters, we showed that the p53-dependent repression was abrogated by mutating the identified DREAM binding site, which provided direct evidence that our positional frequency matrices can identify functionally relevant DREAM binding sites essential for p53-mediated repression. These experiments were added in Figures 2e and 2i.

      Furthermore, as previously mentioned in response to referee #1, in the revised manuscript we precisely mapped the predicted DREAM binding sites for 151 genes in 50 bp windows within regions bound by E2F4 and/or LIN9, an analysis included in the new Figure 3a. The distribution of these sites clearly indicates that most predicted DREAM binding sites map precisely within a 50 bp window encompassing the ChIP peaks, which represents an enrichment of at least 1300-fold compared to the rest of the genome. This mapping strongly suggests that our predicted DREAM binding sites are functionally relevant.

      Importantly, as shown in the new Figure S11, we carried out a similar mapping of the predicted E2F and CHR sites reported in the Target gene regulation (TGR) database and found that our predicted DREAM binding sites co-mapped with E2F4/LIN9 ChIP peaks more frequently than the E2F and CHR sites of the TGR database, which supports the conclusion that our positional frequency matrices bring new and improved predictions for DREAM binding.

      1. Taken together, while over-simplifying mechanisms of cell cycle gene regulation, the authors largely ignore recent findings and publications regarding gene regulation by p53, E2F:RB, and DREAM/MuvB complexes:
      (A) Publications that show how DREAM binds to (CDE)-CHR sites and that experimentally defined a consensus motif for CHR elements (e.g. PMID: 27465258, PMID: 25106871).
      (B) Publications that identify p53-DREAM target genes by activating p53 in cells with or without functional DREAM complex (e.g. PMID: 31667499, PMID: 31400114).
      (C) Identification and comparison of (CDE)-CHR and E2F-(CLE) DREAM binding sites that have distinct functions in the activation of cell-cycle expression in G1/S and G2/M (e.g. PMID: 29228647, PMID: 25106871).
      These findings have been summarized in several review articles (e.g. PMID: 29125603, PMID: 28799433, PMID: 35835684). All of them describe the mechanisms I have mentioned above in detail, and since Rakotopare et al. cite one of the papers (Engeland 2018), I wonder even more why they did not design their experiments based on current knowledge.

      Points (A) and (C) of this comment were largely addressed in our response to points 2 and 3 of the same referee. Briefly, in the revised manuscript we clearly mention CDE/CHR, E2F/CLE and E2F/CHR sites, as well as the functional differences between E2F and CHR sites with regard to cell cycle regulation, but all these sites were considered together in our positional frequency matrices because our goal was to identify genes regulated by p53 and DREAM, not to distinguish between genes regulated by p53, DREAM and E2F/Rb versus those regulated by p53, DREAM and BMyb-MuvB or FoxM1-MuvB.

      Regarding point (B) of this comment, in the revised manuscript we performed a detailed comparison of our results with those of Mages et al., who analyzed, in murine cells with a CRISPR-mediated KO of Lin37 (a subunit of DREAM), the transcriptomic changes that follow a reintroduction of Lin37 (Mages et al. (2017) eLife 6, e26876). This comparison is detailed in the discussion section, with the new Figure S10 and Table S35. As mentioned in response to referee #2, this comparison is perfectly consistent with DREAM regulating the 151 genes for which we identified DREAM binding sites.

      Minor concerns:

      1. The authors state: "Importantly however, the relative importance of the p53-p21-DREAM pathway (called below p53-DREAM) remains controversial, because multiple mechanisms were proposed to account for p53-mediated gene repression (Peuget and Selivanova, 2021)." Even though Peuget & Selivanova do not agree that genes get repressed in response to p53 activation exclusively by the p21-DREAM pathway, they do not question that this mechanism is essential for the p53-dependent repression of a core set of cell cycle genes. Since I am also not aware of any publications that challenge the importance of the p53-p21-DREAM pathway, I do not agree with this statement.

      As the referee pointed out, in the first version of the manuscript we wrote that “the relative importance of the p53-p21-DREAM pathway (called below p53-DREAM) remains controversial, because multiple mechanisms were proposed to account for p53-mediated gene repression (Peuget and Selivanova, 2021)”. The term “relative” was crucial in this sentence, because we wanted to say that the relative proportion of genes regulated by DREAM remained controversial. It seems to us that the title of the review by Peuget & Selivanova (“p53-dependent repression: DREAM or reality?”) emphasizes this controversy. Nevertheless, in the revised manuscript, we now mention: “The relative importance of this pathway remains to be fully appreciated, because multiple mechanisms were proposed to account for p53-mediated gene repression [18]”. We hope the referee will find this phrasing more acceptable.

      1. Some parts of the manuscript are tiring to read - for example, pages 6, 7, and 8 which contain long listings and numbers of genes that are downregulated in differentiated BMC, found to be mutated in various disorders, bind DREAM components, were identified as downregulated by p53, etc. The authors may consider combining central parts of these data in a table that they show in the main manuscript which would make it easier to digest the information and at the same time significantly shorten the manuscript.

      We apologize if some parts of the article were tiring to read. We hope that the addition of Tables S17 and S26, as well as the Venn-like diagram in Figure 3c, will improve the reading of the manuscript.

      1. The supplementary tables (S1-S26) are combined in one Excel file with multiple tabs. The authors should label the tabs accordingly to make it easier for the reader to find a particular table.

      We labelled the Excel tabs in the revised manuscript, as suggested.

      1. At the end of page 6, the authors show that 17 genes found to be downregulated in differentiated BMCs are mutated in multiple bone marrow disorders, however, since they don't include references, it remains unclear where these mutations were originally described.

      In the revised manuscript, we included a supplementary table (Table S36) with appropriate references for blood and/or brain related phenotypes for the 106 genes associated with blood or brain abnormalities.

      1. On page 9, the authors state: "As a prerequisite to luciferase assays, we first verified that the expression of these genes, as well as their p53-mediated repression, can be observed in mouse embryonic fibroblasts (MEFs), because luciferase assays rely on transfections into MEFs (Figure 3b)." The authors don't explain why luciferase assays rely on transfections into MEFs and based on the caption of Fig. 3C, the luciferase assays were not performed in MEFs, but in NIH3T3 cells: "WT or mutant luciferase reporter plasmids were transfected into NIH3T3 cells..."

      According to the American Type Culture Collection (ATCC), the NIH3T3 cell line is a mouse embryonic fibroblast (MEF) cell line, which explains why we had tested the expression of candidate target genes in MEFs. However, this cell line exhibits an attenuated p53 pathway, which improves cell survival after transfection but leads to decreased p53-mediated repression. These points are now clearly mentioned in the text and in a new supplemental figure (Figure S9).

      Reviewer #3 (Significance):

      While the study supports a large body of publications proving that repression of cell cycle genes by the DREAM complex is crucial for cell cycle arrest and exit, it is noted that none of the main conclusions here are unexpected or particularly exciting. All the analyses are based on data sets that compare gene expression in highly proliferative cells with cells that underwent terminal cell cycle exit. Thus, a large portion of the genes that are downregulated in differentiated BMCs are cell cycle genes and well-established targets of DREAM and E2F:RB complexes. Furthermore, it is not surprising that some of these pro-proliferative genes are mutated in diseases connected to proliferation defects like anemias or microcephaly.

      Again, we agree with the referee that the DREAM complex is well known to regulate cell cycle genes, but many scientists or clinicians specialized in bone marrow failure syndromes or microcephaly diseases are not familiar with the p53-DREAM pathway, and we think our study will be particularly useful to them. As for DREAM specialists, our strategy relying on disease-based ontology terms rather than cell cycle regulation allowed us to identify many DREAM targets that were not reported in previous studies, and our positional frequency matrices allowed us to identify DREAM binding sites not predicted by previous approaches. We hope that, by considering all these points together, the referee will acknowledge that our study provides a valuable resource for different types of readerships.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      1) It is interesting MxDnaK1 seems to prefer cytosolic proteins while Mx-DnaK2 prefers inner membrane proteins. The domain-swapping experiments seem to suggest that the NBD is important for this difference. How NBD is important is not addressed. Is it due to ATP hydrolysis, NBD-SBD interaction, or co-chaperone interactions?

      Answer: Thanks for your comments. We speculate that the co-chaperone interaction might be the key factor contributing to substrate differences. According to the working principle of Hsp70, its functional diversity is largely determined by substrate differences. Co-chaperones, such as JDPs, play a crucial role in this process because they can bind substrates and facilitate their targeted delivery. Therefore, much of the functional diversity of the Hsp70s is driven by a diverse class of JDPs (refs. 1,2). We found that the NBD plays an important role in co-chaperone recognition by MxDnaKs. Additionally, it is generally accepted that the efficiency of ATP hydrolysis does not significantly impact the substrate recognition of Hsp70. Furthermore, if the NBD-SBD interaction were crucial, the substitution of either the NBD or the SBDβ domain might result in similar cell phenotypes, as both alterations disrupt the original NBD-SBDβ interaction. We believe the DnaK proteins and their co-chaperones together determine the substrate spectra. We made corresponding modifications in the revised manuscript. (Page 22; Line 488-494 in the marked-up manuscript)

      2) About the interactome analysis, since apyrase was added to remove ATP, it's surprising multiple Hsp40s were found in their analysis. Hsp70-Hsp40 interaction is known to require ATP. This may suggest some of the proteins found in their interactome analysis are artifacts. The authors should perform negative controls for their interactome analysis, such as using a control antibody for their CO-IP and analyze any non-specific binding to their resin.

      In addition, since JDPs were pulled down, is it possible some of the substrates identified are actually substrates for JDPs, not binding directly to DnaKs?

      Answer: This is an interesting question. As you correctly noted, the interaction between Hsp70 and Hsp40 requires ATP. In our experiment, we used apyrase to remove ATP in order to promote tight binding of substrates by DnaK. This methodology was initially described by Calloni et al. in 2012 (ref. 3); those authors also identified the co-chaperone protein DnaJ, but with a concentration higher than 77% of the interactors. In our opinion, incomplete removal of ATP could be the underlying cause of this phenomenon.

      We apologize for the insufficiently detailed description in the Methods. We did in fact implement negative controls for each MxDnaK in order to eliminate potential non-specific interactions with Protein A/G beads or antibodies. Specifically, we conducted a CO-IP experiment without antibodies to assess any non-specific binding to the Protein A/G beads. To further investigate non-specific binding to the antibodies against MxDnaK2 and MxDnaK1, we used the mxdnak2-deleted mutant (strain YL2216) and the MxDnaK1 swapping strain carrying the MxDnaK2 SBDα (strain YL2204), respectively. Because the SBDα of MxDnaK1 was used as the antigen to generate antibodies, YL2204 cannot be recognized by anti-MxDnaK1 (Figure S5). We believe these controls allowed us to evaluate and exclude non-specific interactions in our CO-IP. We have improved our description in the Methods. (Page 27; Line 596-607)

      While one of the main functions of JDPs is to interact with unfolded substrates and facilitate their delivery to Hsp70, there may still be substrates that do not directly bind to Hsp70. It’s thus possible that some of the substrates identified only bind to JDPs. We made corresponding modifications in the revised manuscript. (Page 14; Line 290-292)

      3) For Figure S7, the pull-down assay used His6-tagged JDPs. Ni resin is known to bind Hsp70s non-specifically. It's not surprising DnaK showed up in all the pull-down lanes, especially considering how much DnaK was over-expressed. For some pull-down lanes, the amount of DnaK is much more than that of JDPs, further indicating artifact. The author should include negative controls such as JDPs without His6-tag or any irrelevant protein with His6 tag.

      Answer: Thanks for your suggestion. As you and another reviewer pointed out, there were some flaws in the experimental design of the pulldown assay. These include the non-specific binding of Hsp70 proteins to nickel resin, the absence of a negative control without a tag, and the inappropriate selection of the MBP tag. Thus, we employed the nLuc assay as an alternative to the pulldown experiment to validate the interaction between DnaK and JDP (Figure S9). While our manuscript employed nLuc to confirm protein dimerization, it is worth noting that the nLuc assay was originally devised for investigating protein interactions (ref. 4).

      4) For the proposed dimer formation in Fig. 4C, there are multiple bands above the monomer bands. What are these forms? It seems the majority of the Cys residues that could form disulfide bonds are in the NBD of MxDnaK2 since constructs with MxDnaK2-NBD form some sort of high-MW bands above the monomer. Does MxDnaK1-NBD also contain Cys at the analogous positions? The fact that MxDnaK1 didn't show disulfide-bonded bands doesn't mean it doesn't form dimer. It depends on where the Cys residues are.

      It's nice the authors did Fig. 4D. However, the authors should include a positive control to show how strong the signal is for a true interaction before interpreting their results.

      Answer: Thank you very much for your comments. In at least three independent experiments, we consistently observed two unidentified bands within the molecular weight range of 70-100 kDa during the purification of His6-MxDnaK2. These bands appeared to be intermediate in size between the monomeric and dimeric forms of His6-MxDnaK2, and disappeared upon DTT treatment. The compositions of these bands have been determined by LC/MS. The upper band included MxDnaK2 (65.3 kDa) and the anti-FlhDC factor of E. coli (WP_001300634.1, 27 kDa). In the lower band, we detected MxDnaK2 and the 50S ribosomal protein L28 of E. coli (WP_000091955.1, 9 kDa). Based on these findings, we conclude that these two additional bands result from the interaction between His6-MxDnaK2 and these two E. coli proteins. The related explanations have been added to the legend of Figure 5. (Page 42; Line 938-942)

      We analyzed the presence of Cys residues in MxDnaK1 and MxDnaK2. The NBD region of MxDnaK2 contains two Cys residues, located at positions 15 and 319. MxDnaK1-NBD contains a Cys at position 316, which is analogous to Cys319 of MxDnaK2. The position analogous to Cys15 of MxDnaK2 is a Val in MxDnaK1, which might be an important factor contributing to the inability of MxDnaK1 to form oligomers.

      Thanks for your suggestion to add a positive control. We repeated the nLuc assays with a positive control (αSyn). According to the working principle of the nLuc assay, the amount of fluorescent substrate is limited. Therefore, even for proteins that interact with each other, the fluorescence value gradually decreases and reaches a plateau, similar to the negative control. This gradual decline in fluorescence is a significant indicator of protein interaction. In Figure 4D (Figure 5D in the revised version), we only presented the results of the first 20 minutes of detection. The complete two-hour detection results have been added in a supplementary figure (Figure S14).

      5) line 48: "human HSC70 and HSP70 are 85% identical, and the phenotypes of their knockout mutants are different, which is consistent with their largely nonoverlapping substrates" The authors completely ignored that the promoters of HSC70 and HSP70 are very different.

      Answer: We apologize for this oversight. Yes, HSC70 and HSP70 exhibit distinct expression patterns, which play important roles in their functional diversity. We modified the sentence in the new version. (Page 5; Line 58)

      6) Line 69: "The two PRK00290 proteins, not the other Myxococcus Hsp70s, could alternatively compensate the functions of EcDnaK (DnaK of E. coli) for growth." Please add references for this statement.

      Answer: Added, thanks.

      7) line 191: What's the mechanism for DnaK's role in oxidative stress? Is the disulfide bond formation in Fig. 4 related? Does disulfide-bond change the activity of DnaK?

      Answer: Thanks for your pertinent comments. Honestly, we do not know the mechanism underlying MxDnaK2's role in oxidative stress. In our previous studies, we determined that the deletion of mxdnaK2 resulted in a longer lag phase after H2O2 treatment. Here, our aim was to investigate the impact of region swapping on the cellular function of MxDnaK2. In other bacteria, the critical role that DnaK plays in resistance to oxidative stress stems from the pleiotropic functions of this chaperone. By shortening the time that proteins spend in the unfolded state, the DnaK/DnaJ chaperone system minimizes the risk of metal-catalyzed carbonylation of the side chains of proline, lysine, arginine, and threonine residues, but none of these functions has been linked to the dimerization of DnaK (refs. 5-7).

      8) Fig. S9 seems redundant.

      Answer: Deleted, thanks.

      9) line 263, "but the NBD exchange was almost equal to the deletion of the gene with respect to phenotypes." But, the mutant has >50% activity in Fig. 3F.

      Answer: We apologize for the confusing description. The “phenotypes” here refers to “cell phenotypes”. What we meant to say with this sentence is that the NBD swapping strain of either MxDnaK1 or MxDnaK2 presented cell phenotypes identical to those of the gene-deleted strain. As we have already provided a detailed description of this result earlier, we now consider this sentence to be redundant and have therefore deleted it. (Page 17; Line 355-356)

      10) line 221-226: the logic is not quite clear.

      Answer: We apologize for the confusing description. In M. xanthus DK1622, MxDnaK1 is essential for cell survival, and an insertion of a second copy of mxdnaK1 in the genome is required for deletion of the in-situ gene. Thus, to verify whether the NBD region is required for the essentiality of MxDnaK1, we performed the region swapping of the in situ MxDnaK1 gene in the att::mxdnaK1 mutant (a DK1622 mutant containing a second copy of mxdnaK1 at the attB site), and successfully obtained the MxDnaK1 mutant swapped with the MxDnaK2 NBD region. The experiment indicated that the NBD of MxDnaK1 is essential for the cellular functions of the chaperone. We have added the information and modified the sentences in the manuscript. (Page 15; Line 308-319)

      Minor concerns:

      Please check spelling. There are some typos such as "HPPES" in the Methods.

      Answer: Corrected. Many thanks.

      My areas of expertise are protein biochemistry, genetics, and structural biology on heat shock proteins.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Major comments:

      The manuscript is very nice and interesting, although some of the authors' conclusions are perhaps not well supported by their data. For example:

      1) In the pulldown experiments the lack of interaction between 2747-MxDnaK2, 3015-MxDnaK2 and 1145-MxDnaK1 should be shown in order to support the conclusion made in line 197-198,

      Answer: This is our carelessness. As you and another reviewer pointed out, there are some flaws in the experimental design of the pulldown assay. These include the non-specific binding of Hsp70 proteins to nickel resin, the absence of a negative control without a tag, and the inappropriate selection of the MBP tag. Thus, we employed the nLuc assay as an alternative to the pulldown experiment to validate the interaction between DnaK and JDP (including 2747-MxDnaK2, 3015-MxDnaK2 and 1145-MxDnaK1 interaction) (Figure S9). While our manuscript employed nLuc to confirm protein dimerization, it is worth noting that nLuc assay was originally devised for investigating protein interactions 4.

      2) The only evidence that the NBD of MxDnaK1 is essential for bacterial growth is that this mutation couldn´t be obtained in M. xanthus. However, it could be purified in E. coli. Could the authors do some experiments with the M. xanthus strain without the chromosomal MxDnaK1 and then introduce a plasmid with the mutated gene?

      Answer: We apologize for the confusing description. Actually, we concluded that the NBD is essential not only because the mutation could not be obtained. In M. xanthus DK1622, MxDnaK1 is essential for cell survival, and in-situ deletion of the gene could be obtained only after insertion of a second copy of mxdnaK1 in the genome at the attB site. To verify whether the NBD region is required for the essentiality of MxDnaK1, we performed the region swapping of the in situ MxDnaK1 gene in the att::mxdnaK1 mutant (a DK1622 mutant containing a second copy of mxdnaK1), and successfully obtained the MxDnaK1 mutant swapped with the MxDnaK2 NBD region. The experiment indicated that the NBD of MxDnaK1 is essential for the cellular functions of the chaperone. We have added the information and modified the sentences in the manuscript. (Page 15; Line 308-319)

      3) All the experiments with purified proteins were done with MxDnaKs bearing His-tags. It doesn't say explicitly its position, but as they employed a pET28A it is likely that the tag is at the N-terminus, which is close to the linker region. As this tag might interfere, it should be removed for the experiments, or at least a control done with the tag removed.

      Answer: We apologize for the lack of detailed description. As you pointed out, the His-tags are located at the N-terminus of DnaKs. The full lengths of MxDnaK1 and MxDnaK2 are 638 and 607 amino acids. The linker regions are located at amino acid positions 381-386 for MxDnaK1 and 387-392 for MxDnaK2. Therefore, we believe that the His-tag is not close to the linker regions. We have included the information in new manuscript. (Page 24; Line 544-546)

      The purified His6-DnaK proteins were employed for holdase activity assays and in vitro dimerization assays. Several previous studies have utilized the same holdase activity assay method with His-tagged DnaK 8,9. We therefore suggest that the His-tag did not interfere with the holdase activity of DnaK. To exclude the influence of the His-tag on oligomerization, we conducted a control with the tag removed in the in vitro dimerization assay, and the results showed no difference (Figure S13).

      4) The authors state that MxDnaK dimerized in vitro with the NBD, and to disrupt the dimer they used 100 mM DTT, which is a very high concentration. As the protein has the His-tag, it should be removed to corroborate that it is not interfering with the dimerization.

      Answer: Thanks for your suggestion. As mentioned above, to exclude the influence of the His-tag on oligomerization, we conducted a control with the tag removed in the in vitro dimerization assay, and the results showed no difference (Figure S13).

      5) Why were the pulldown experiments done with MBP-MxDnaKs? Can you show a negative control between the MBP and the JDPs to rule out this interaction? It will be more suitable to do the pulldown assays with the purified MxDnaK´s without the His-tags (and the His-tags JDP that were employed).

      Answer: Thanks for your suggestion. As mentioned above, there are some flaws in the experimental design of the pulldown assay. Thus, we employed the nLuc assay as an alternative to the pulldown experiment to validate the interaction between MxDnaKs and JDPs (Figure S9).

      Minor comments:

      • E. coli´s DnaK is only essential in heat shock conditions and for lambda phage cycle. If MxDnaK1 is similar to this Hsp70, why the substitution of its NBD for the NBD MxDnaK2 would be lethal for bacterial growth?

      Answer: Thanks for the comments. As you correctly point out, DnaK is nonessential in E. coli. But in some other bacteria, DnaK also plays an essential role in cell growth for different reasons 10-12. In our previous studies, we determined that MxDnaK1 is essential in M. xanthus DK1622, whereas MxDnaK2 is nonessential. In this study, we performed region swapping and found that only the NBD of MxDnaK1 was irreplaceable. In our opinion, this result indicates that the NBD plays an important role in the functional diversity between MxDnaK1 and MxDnaK2.

      • I think that the writing should be revised and in the supporting information the captions of the figures should include more information.

      Answer: Thanks a lot for the suggestion. We revised the manuscript and added more information in the legends of supplementary figures.

      Reviewer #2 (Significance):

      -General assessment: This is a nice piece of work which would benefit from revision to address the comments above. The authors showed the roles and differences between two DnaK in the same organism. They track these differences to the subdomains of the MxDnaK´s and co-chaperones. It will be interesting for future works to explore more deeply the co-chaperones and their interactions.

      -Advance: I think that this manuscript fills a gap regarding the role of DnaK duplicated in bacterial strains.

      -Audience: I would say that the audience is broad and includes scientists interested in protein folding and chaperones, as well as myxobacteria.

      1. Rosenzweig, R., Nillegoda, N. B., Mayer, M. P. & Bukau, B. The Hsp70 chaperone network. Nat Rev Mol Cell Biol 20, 665-680, doi:10.1038/s41580-019-0133-3 (2019).
      2. Kampinga, H. H. & Craig, E. A. The HSP70 chaperone machinery: J proteins as drivers of functional specificity. Nat Rev Mol Cell Biol 11, 579-592, doi:10.1038/nrm2941 (2010).
      3. Calloni, G. et al. DnaK functions as a central hub in the E. coli chaperone network. Cell Rep 1, 251-264, doi:10.1016/j.celrep.2011.12.007 (2012).
      4. Dixon, A. S. et al. NanoLuc Complementation Reporter Optimized for Accurate Measurement of Protein Interactions in Cells. ACS Chem Biol 11, 400-408, doi:10.1021/acschembio.5b00753 (2016).
      5. Fredriksson, A., Ballesteros, M., Dukan, S. & Nystrom, T. Defense against protein carbonylation by DnaK/DnaJ and proteases of the heat shock regulon. J Bacteriol 187, 4207-4213, doi:10.1128/JB.187.12.4207-4213.2005 (2005).
      6. Santra, M., Dill, K. A. & de Graff, A. M. R. How Do Chaperones Protect a Cell's Proteins from Oxidative Damage? Cell Syst 6, 743-751 e743, doi:10.1016/j.cels.2018.05.001 (2018).
      7. Fredriksson, A., Ballesteros, M., Dukan, S. & Nystrom, T. Induction of the heat shock regulon in response to increased mistranslation requires oxidative modification of the malformed proteins. Mol Microbiol 59, 350-359, doi:10.1111/j.1365-2958.2005.04947.x (2006).
      8. Chang, L., Thompson, A. D., Ung, P., Carlson, H. A. & Gestwicki, J. E. Mutagenesis reveals the complex relationships between ATPase rate and the chaperone activities of Escherichia coli heat shock protein 70 (Hsp70/DnaK). J Biol Chem 285, 21282-21291, doi:10.1074/jbc.M110.124149 (2010).
      9. Thompson, A. D., Bernard, S. M., Skiniotis, G. & Gestwicki, J. E. Visualization and functional analysis of the oligomeric states of Escherichia coli heat shock protein 70 (Hsp70/DnaK). Cell Stress Chaperones 17, 313-327, doi:10.1007/s12192-011-0307-1 (2012).
      10. Shonhai, A., Boshoff, A. & Blatch, G. L. The structural and functional diversity of Hsp70 proteins from Plasmodium falciparum. Protein Sci 16, 1803-1818, doi:10.1110/ps.072918107 (2007).
      11. Vermeersch, L. et al. On the duration of the microbial lag phase. Curr Genet 65, 721-727, doi:10.1007/s00294-019-00938-2 (2019).
      12. Burkholder, W. F. et al. Mutations in the C-terminal fragment of DnaK affecting peptide binding. Proc Natl Acad Sci U S A 93, 10632-10637, doi:10.1073/pnas.93.20.10632 (1996).
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: Kellner and Berlin present their research findings pertaining to the effect of GRIN2B variants that modify NMDA receptor function and pharmacology. While these mutations were published previously, the new manuscript provides a more thorough investigation into the effects that these variants pose when incorporated into heteromeric complexes with either wildtype GluN2B or GluN2A - NMDA receptors containing only a single mutated GluN2B subunits is more relevant to the disease cases because the associated patients are heterozygous for the variant. The authors achieved selective expression of receptor heteromeric complexes by utilising an established trafficking control system. The authors found that while a single variant subunit in the receptor complex is largely dominant in its effect on reducing glutamate potency of the NMDA receptor, it 's effect on receptor pharmacology varied. Unlike diheteromeric receptors containing mutated subunits, polyamine spermine potentiated GluN1/2B (but not GluN1/2A/2B) receptors that contained a single mutated GluN2B. In contrast, the neurosteroid, pregnenolone-sulfate (PS), was effective at potentiating the NMDA receptor currents (to varying degrees) regardless of the subunit composition. The potentiation of NMDA receptor currents by PS was also observed in neurons overexpressing the variants.

      The techniques used in this study were appropriate to address the objectives and the overall effects are large, and generally convincing. I like the way the results are presented, although have a few (easily addressable) comments.

      We thank the reviewer for the positive remarks on our manuscript.

      Major comments:

      #1 When incrementally adding drugs (e.g. traces in figures 5 and 6), it doesn't always appear like the response has plateaued before changing the solutions/drugs. Therefore, I am curious to what extent the effects observed are underestimated.

      The reviewer is correct to note that some responses do not necessarily reach a plateau, despite our efforts to reach steady-state (as shown in most figures, e.g., Figs. 1-4, 6b, etc.), in particular when applying pregnenolone-sulfate (PS) (Fig. 5a, all traces in middle and bottom rows). However, in several instances, this was unobtainable due to the very slow effect of the neurosteroid (its mode of action is from within the membrane) and the very large size of the cell (~1 mm). For these reasons, these experiments mandated excessively long exposures (~minutes) of oocytes to glutamate and PS (see scale bar, 20 secs) to try to reach steady-state; however, this also caused deterioration of some cells (which did not return to baseline and were therefore discarded). Thus, we eventually converged on settings whereby we did not expose oocytes to more than 4 minutes of the drug. Nevertheless, to try to estimate the extent of the underestimation (if any), we fitted the currents (standard mono-exponential fit, as previously reported1–3) (Suppl. 5a). We found that our application times of PS were, on average, three times the response’s time constant (tau) (Suppl. 5b), and we found a very weak relationship (R2 = 0.09) between the response to PS and the time of its application (Suppl. 5c). These are now explicitly mentioned in the text (line #203), and in the legend of Suppl. 5. Together, these suggest that the responses reached approximately 95% (1 - 1/e^3) of the steady-state value, and we are therefore confident that we have a very small, if any, underestimation of the extent of PS potentiation.
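
      As an illustrative aside (a minimal sketch, not the analysis code used for Suppl. 5), the ~95% figure follows directly from a mono-exponential approach to steady-state. The function names below are hypothetical and only the ratio of application time to tau matters:

```python
import math


def fraction_of_steady_state(application_time, tau):
    """Fraction of the steady-state response reached after `application_time`,
    assuming a mono-exponential rise with time constant `tau` (same units)."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    return 1.0 - math.exp(-application_time / tau)


def underestimation(application_time, tau):
    """Fraction of the steady-state response still missing at `application_time`."""
    return 1.0 - fraction_of_steady_state(application_time, tau)


if __name__ == "__main__":
    # An application lasting three time constants, as reported on average above.
    reached = fraction_of_steady_state(3.0, 1.0)
    print(f"reached: {reached:.3f}, missing: {underestimation(3.0, 1.0):.3f}")
```

      Under this assumption, an application lasting three time constants leaves roughly 5% of the steady-state response unmeasured, which is the bound quoted above.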

      2 Also, in relation to figure 6, to what extent does agonist application cause desensitization here? Looking at traces in Figure 6b it appears that there is some desensitization and it isn’t clear to what extent this persists during the solution changes.

      Agonist desensitization of NMDAR currents is a well-known phenomenon, but it is very well established that it is not always observed in cells, including neurons (e.g., 4–7). In general, we did not observe frequent desensitization (we provide a larger variety of traces of desensitizing and non-desensitizing currents; Fig. 6b, Suppl. 7e and Suppl. 8a). Nevertheless, we explicitly note that in neurons, currents that didn’t reach steady-state after application of 100 mM NMDA were excluded from analysis (Methods - Patch clamping of cultured neurons, line #474), and in most cases desensitization was minor (or absent) following application of 100 mM NMDA and 100 mM PS (Fig. 6b).

      3 Could the authors conduct/show the controls where NMDA alone (for 50-60s), or NMDA followed by PE-S (without ifenprodil).

      These recordings are now shown in Fig. 6b and Suppl. 8a, (as opposed to Suppl. 7e).

      #4 Finally, figure 5 shows the effect of the neurosteroid (and ifenprodil) on NMDA-evoked currents in neurons overexpressing the GluN2B variants. However, these currents probably reflect a mixture of extrasynaptic and synaptic receptors. To what extent are synaptic NMDA receptors affected by the variants?

      To show the extent of the effect of the variants over synaptic receptors, we recorded miniature NMDA-dependent EPSCs (mEPSCNMDA), as described in our previous report8. We find that the variants completely eliminate the appearance of mEPSCs (Suppl. 7a, b). The change in minis’ frequency is not the result of a presynaptic change or a change in synapse number9, as we have shown that AMPAR-mEPSC frequency was unaffected by the variants (i.e., synapse number and probability of presynaptic release are unchanged by the variants).

        To further address this, we also explored the relative synaptic vs. extrasynaptic distribution of the variants by using the established MK-801-protocol (to block all synaptic receptors during spontaneous activity, leaving extrasynaptic receptors unblocked)10,11. In neurons overexpressing the GluN2B-wt subunit, we obtained an extrasynaptic fraction of 38%, highly consistent with previous reports12,13. Overexpression of the variants, however, yielded a significantly higher fraction (~50%) of remaining current, seemingly suggesting more variant receptors at extrasynaptic loci (Suppl. 8b, c). However, due to the experimental settings we have chosen, the results from this experiment represent quite the inverse when involving extreme LoF variants. Firstly, 100 mM NMDA does not saturate variant receptors (whether pure, mixed di- or tri-heteromers, see Table 1). Secondly, normal neurotransmission does not open synaptic receptors containing mutant GluN2B-subunits, as attested by the complete absence of mEPSCs (see Suppl. 7a, b and 8,9). Thus, during the 10-minute exposure to MK-801, only (mostly) purely wt receptors are blocked by spontaneous synaptic activity, and thus the second bout of 100 mM NMDA solely exposes the remaining wt-receptors. An increase in this fraction reflects more wt-receptors at the extrasynapse than at the synapse. Thus, the observed increase in the fraction of extrasynaptic receptors in neurons overexpressing the variants implies that the number of wt-receptors necessarily decreases at the synapse and increases at the extrasynapse. We deem this to ensue from the incorporation of the variants at the synapse. This increase cannot be explained by an overall increase in membrane expression of wt-receptors in neurons overexpressing the variants, as these cells show a strong reduction in Imax (see Fig. 6c and Suppl. 7e). This is now detailed in the text (lines #270-290).
      

      Minor comments:

      5 Looking at the fits in the graph of Figure 2b it appears that the slope on the concentration response curves is less steep for the mixed 2B-diheteromeric NMDA receptors. How much are the Hill coefficients changing and can this be interpreted to provide more mechanistic insight? Wouldn't it make sense to include the Hill coefficients in Table 1?

      We agree with the reviewer’s observation. Actually, the mixed di-heteromers have a similar Hill coefficient (nH) to the purely di-heteromeric GluN2Bwt receptors (see Table 1), and these show the typical nH of ~1 (e.g., 14–16). The only diverging groups are the purely di-heteromeric variant-containing channels (G689C/S only containing receptors; nH~2). Although these may suggest positive cooperativity between the subunits, we are less inclined to infer insights from the latter owing to the fact that we limited our examination to 10 mM glutamate (we limit exposure of oocytes to 10 mM glutamate due to artifacts arising past this concentration, as discussed in Kellner et al.8: Fig. 2—figure supplement 1). This description is now mentioned in lines #149, 318, 319.
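
      For readers less familiar with how nH shapes a concentration-response curve, the following is a minimal sketch of the standard Hill equation. It is illustrative only: the function names are hypothetical, the parameter values are arbitrary, and it is not the fitting routine used for Table 1.

```python
def hill_response(conc, ec50, n_h, i_max=1.0):
    """Standard Hill equation: response at agonist concentration `conc`
    for a given EC50, Hill coefficient n_h and maximal response i_max."""
    if conc < 0 or ec50 <= 0 or n_h <= 0:
        raise ValueError("conc must be >= 0; ec50 and n_h must be > 0")
    return i_max * conc**n_h / (ec50**n_h + conc**n_h)


def ec90_to_ec10_ratio(n_h):
    """Fold-change in concentration needed to go from 10% to 90% of the
    maximal response; a smaller ratio means a steeper curve."""
    return 81.0 ** (1.0 / n_h)


if __name__ == "__main__":
    for n_h in (1.0, 2.0):
        print(f"nH = {n_h}: EC90/EC10 = {ec90_to_ec10_ratio(n_h):.0f}-fold")
```

      With nH ~1 the response spans an 81-fold concentration range between 10% and 90% of maximum, whereas nH ~2 compresses this to 9-fold, which is why curves with different Hill coefficients look visibly different in steepness even at matched EC50.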

      6 The authors illustrate the changes in potency by the shift in the concentration response curves, but is there any change in efficacy? A simple way to illustrate this would be also present a simple graph showing the maximum current amplitudes (i.e. to 10 mM glutamate) for each of the receptor complexes.

      We now provide these data in Suppl. 2a, b. We would like to note, however, that the tailed receptors (i.e., subunits with carboxy-termini tagged with C1/C2 tails, see Fig. 1a) generally express less than the native subunits (Suppl. 2c). This description is detailed in lines #162-166.

      #7 The authors characterize the 'apparent' affinity (or potency) of the receptor using concentration-response curves, but numerous points in the manuscript refer to changes in affinity. None of the experiments shown directly measure affinity (which would require ligand-binding assays) and so the use of the word affinity is inaccurate/misleading. I suggest the authors replace the instances of the word 'affinity' with 'potency'.

      We apologize for the confusion surrounding our use of the term affinity. In fact, we do initially define this term in the introduction (page #4): “apparent glutamate affinity (EC50)” to differentiate it from affinity (KD). Regardless, and to avoid confusion, we replaced all instances of the term with potency, as suggested by the reviewer.

      #8 In the third line of the abstract, the authors wrote, 'for which there are no treatments' in relation to GRINopathies. My understanding is that there are symptomatic treatments but that there are no disease-modifying treatments.

      Indeed, all current treatments are supportive, rather than providing a bona fide cure or being disease-modifying. These are now better defined in the abstract.

      #9 The authors have interchangeably used the terms NMDAR or GluNRs throughout the manuscript. I suggest sticking to one of these terms. I would suggest NMDARs since this is less likely to be misread as a specific NMDA receptor subunit.

      Agreed and corrected throughout manuscript.

      #10 Typos:
      1) Results paragraph 2 sentence one: 'We thereby produced GluN2B-wt, GluN2B-G689C and GluN2B-G689S subunits tagged with C1 or C2, co-expressed these along with the GluN1a-wt subunits in...'
      2) Results paragraph 2: '...but these were mainly noticeable when oocytes are were exposed to high (saturating) glutamate concentrations...'
      3) Last sentence in the second to last paragraph of the results section entitled 'Mixed di-and tri-heteromeric channels...': 'This , PS may serve to rescue...'
      4) Last sentence in last paragraph of the results section entitled 'Mixed di-and tri-heteromeric channels...': 'Despite the latter, we found no evidence for any direct effect of three different physiologically relevant concentrations of the drug on di- or tri-heteromeric receptors'

      All typos corrected.

      #11 Figures 1e, 2b, 3b: it would be helpful to add a legend to the graph so that the curves can be interpreted without having to read through the figure legend.

      Corrected.

      #12 The bar graphs in Figure 6 show individual data points but those in figures 4b and 5b don't. Can the authors please add the data points to these graph.

      Individual data points have been added.

      #13 It would be helpful to reviewers that future manuscripts by the authors include page numbers and line numbers.

      Included.

      Referees cross-commenting

      #14 Reviewers 2 and 3 highlight an important issue concerning figure 6 and the extent to which the overexpressed variants subunits can compete and assemble with endogenous NMDA receptors (unlike the system where the surface expression of specific receptor complexes is controlled). Indeed in the recent paper by the same authors, the two variants differed in their surface expression (in HEK cells), with G689C expressing particularly poorly. With reference to the second minor comment of Reviewer 1, the maximum current amplitudes would of course need to be normalized to cell surface expression of the receptor to gain any insight into efficacy.

      We provide maximal current amplitudes (Imax) as a proxy for expression level, as typically done (e.g.,8,17). These are now shown in Suppl. 2a, b (and see our response to comment #6, above). We would like to emphasize that we find it challenging to gain insights about efficacy of the variants in neuronal synapses, as we purposefully express non-C1/C2 tagged subunits in neurons (because we want the variants to assemble with endogenous subunits). Moreover, the C1/C2-tagged subunits (whether wt or variants) express less than their non-tagged NMDAR-counterparts. For instance, tagged GluN2B-wt subunits express at ~50% compared to non-tagged GluN2B wt subunits (Suppl. 2c). Thus, we find that efficacy of the C1/C2 tagged-subunits is less relatable to the non-tagged subunits (which are used in neurons and likely more relevant to the disease).

      Despite the latter, we deem that we have specifically addressed this issue by measuring miniature EPSCs (mEPSCs) (see our reply to comment #4, Suppl. 7a, b). Briefly, even though the non-tagged G689C expresses at ~40% compared to other subunits (in oocytes and mammalian cells8), in neurons it engenders a robust (and highly significant) negative effect over synaptic currents (mEPSCs), as strong as the G689S-variant which expresses much more robustly (non-tagged G689S expresses to the same extent as wt subunits). This demonstrates that the reduced efficacy of the tagged subunits is less relatable to the non-tagged subunits and, importantly, it does not hinder the variants’ ability to incorporate within the synapse and affect function (i.e., exert a dominant negative effect). Here, we extend these observations towards the major postnatal channel subtype, namely tri-heteromers (2A/2B*), and therefore demonstrate that the robust dominant negative effect of G689C and G689S variants is likely due to their ability to incorporate within the predominant receptor subtype at the synapse (Suppl. 8).

      Reviewer #1 (Significance (Required)):

      This study emphasizes the complex pattern of effects that variants can have on glutamate receptor function and pharmacology, especially considering the context of receptor subunit composition. The authors have followed up their previous findings on the same mutants (Kellner et al, 2021, Elife), but used a trafficking control system here to characterize properties of mutated receptor complexes that are most likely to exist in neurons. The authors show that the defective currents mediated by NMDA receptors containing a loss-of-function GluN2B variant can be enhanced by neurosteroids (and in the case of GluN1/2B receptors, polyamines also). Development and approval of neurosteroids for the clinic would be required for the findings to translate to a therapy for patients. Readers should also be aware that neurosteroids act on other receptors too (e.g. GABA receptors), which could complicate the outcome. The expertise of the reviewer is in glutamate receptors and synaptic transmission.

      We agree with the reviewer’s comment pertaining to challenges in translating PS to the clinic. Indeed, we explicitly mentioned its inhibitory effect on GABAA receptors (see line #366-367 and reference 18), as well as note its potential negative effect over GluN2C/D-containing receptors (line #365 and reference 19). We further describe alternative neurosteroids and means to bypass the limitations of PS, for instance by use of 24(S)-hydroxycholesterol6,18 or synthetic analogues (SGE-201, SGE-301)6. Lastly, we also propose a novel therapeutic approach, for which we did not find any mentions in the literature with regard to GRINopathies, consisting of the use of the FDA-approved Efavirenz (anti-retroviral compound20) to promote activity of cytochrome P450 46A1 (CYP46A1) to increase amounts of 24-S in the brain (discussion, lines #370-383).


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      #1 The objective of this paper is to assess whether a single mutated subunit of GRIN can affect the function of various forms of NMDA receptors. In particular, this study investigates the functional consequences of a GRIN variant when assembled within tri-heteromers, containing 2 GluN1, 1 GluN2A and 1 GluN2B subunits, the major postnatal receptor type. For this purpose, the authors artificially forced the subunits to associate in predefined complexes, using chimeras of GRIN subunits fused to GABAb receptor retention control sites at the endoplasmic reticulum. This trick allows control of the stoichiometry of the channels at the membrane and thus a focus on the function of a single type of NMDA receptor. The take-home message of the paper is that a single GluN2B‐variant, whether assembled with a GluN2B‐wt subunit to form a mixed di‐heteromer or with a GluN2A‐wt‐subunit (tri‐heteromer), strongly impairs receptor function, as reported by a decrease of the apparent glutamate affinity of the receptor.

      Altogether, this is a straightforward study of great interest for the GRIN community.

      We greatly appreciate the reviewer’s comment about the relevance of our work towards the GRIN-community.

      2 However, the way the background and purpose of the study (title and abstract) are presented is a bit confusing for non-specialists and could be easily improved. Technical information, which is crucial to validate the conclusions drawn from data analysis, should be added to the article. Some additional experiments are suggested to consolidate the work. Finally, additional discussion points are strongly encouraged.

      We apologize for not making the paper more accessible to a broader readership. We did so for the sake of brevity. Nevertheless, we have re-written major parts of the manuscript to address this issue and retitled the report: “Rescuing Tri-Heteromeric NMDA Receptor Function: The potential of Pregnenolone-Sulfate in Loss-of-Function GRIN2B Mutations”.

      Specific comments

      Abstract / Title:

      3 This work shows that a single GRIN variant impairs the function of various forms of NMDA receptors. Several sentences in the title and the abstract are confusing for a non-specialized audience. "Two extreme Loss‐of‐Function GRIN2B‐mutations are detrimental to triheteromeric NMDAR‐function, but rescued by pregnanolone‐sulfate." "Here, we have systematically examined how two de novo GRIN2B variants (G689C and G689S) affect the function of di‐ and tri‐heteromers." The number of variants tested is not of capital importance in the title, especially because one could believe that both are tested at the same time; similarly, when variants are named in the abstract, the fact that only 1 variant is studied at a time should be clarified (G689C OR G689S). Indeed, the problem is obvious to those familiar with GRIN disorders, but if this paper is to be published in a journal reaching a large audience rather than a specialized audience, the title of the paper should be modified.

      As noted in our reply to comment #1 of this reviewer, we apologize for not making the paper more accessible and have therefore changed the title and re-written major parts of the manuscript to address this issue. We would like to note that we appreciate the reviewer’s comment and intent to increase the readership of our manuscript.

      #4 "We find that the inclusion of a single GluN2B‐variant within mixed di‐ or tri‐heteromeric channels is sufficient to prompt a strong reduction in the receptors' glutamate affinity, but these reductions are not as drastic as in purely di‐heterometric receptors containing two copies of the variants. This observation is supported by the ability of a GluN2B‐selective potentiator (spermine) to potentiate mixed diheteromeric channels." Please clarify the link between the two sentences. How does spermine potentiation of mixed diheteromeric channels support the observation that the inclusion of a single GluN2B‐variant has less effect than the inclusion of two variants?

      Our intention was to highlight that mixed di-heteromeric channels (2B/2B*) are less “damaged” (this is the link) than purely di-heteromeric channels (2B*/2B*). Explicitly, mixed di-heteromers show less reduction in glutamate potency AND are also spermine-responsive, whereas purely mutant di-heteromers (2B*/2B*) show reduced potency, BUT do not respond to spermine at all. We have rephrased the sentences in our current manuscript to be clearer:

      For instance: The positive responses of mixed di-heteromers, compared to the null effect over pure di-heteromers, are likely the result of the restored pH-sensitivity of mixed di-heteromers (Suppl. 3). This was surprising as the minimal and essential rules of engagement for potentiation by spermine are not well established, particularly in the case of tri-heteromers21,22 (see discussion, lines #341-353).

      Methods

      #5 This entire study is based on the use of a unique ER‐retention technique to limit expression of a desired receptor‐population at the membrane of cells. According to the ER-retention system of the GABAB receptor used in this study, while C1/C1-fused subunits are retained in the ER, C2/C2 reach the cell surface and the association of C1/C2 in the ER enables cell-surface targeting of the heterodimer. However, GB2 does not contain any retention signal and can reach the cell surface in the absence of GB1, as a functionally inactive homo-dimer (doi: 10.1042/BJ20041435). If there is an experimental trick that prevents the addressing of C2/C2 to the cell membrane, it should be specified and explained. This is critically important for understanding which receptor populations the data are derived from: receptors containing C1/C2 fused subunits only as stated by the authors, or C1/C2 and C2/C2 fused subunits?

      We base our experiments on two seminal reports (refs. 23,24) that developed this unique method (which we also refer to in the text, lines #112-116). Briefly, the method employs the binding motifs of GABAB1 (GB1) and GB2 subunits and ER-retention motifs (these are now better detailed in the Methods section, line #448). Previous reports explicitly demonstrate that C1/C1- OR C2/C2-containing receptors do not reach the plasma membrane (or do so only minimally), and we have reproduced these data with our variants (C1/C1: Suppl. 1a-d; C2/C2: Fig. 1a-c).

      Figures

      #6 NMDA-receptor current amplitude should be normalized by the membrane expression of the receptors. A preliminary experiment should measure the effective cell surface expression of each of the subunits in the different transfection conditions.

      To address the effective cell surface expression, we employed Imax as a proxy for functional expression (e.g.,8,17). These are now shown in Suppl. 2a, b (and see our response to reviewer 1, comments #6 and 14). As expected, we find significantly reduced efficacy of the variants compared to wt-receptors, and the purely mutant di-heteromeric receptors exhibit the weakest efficacy. We have also addressed this issue by measuring miniature EPSCs (mEPSCs) (see our reply to reviewer 1, comment #4). We find that the variants abolish mEPSCNMDA frequency (Suppl. 7a, b). This shows that the variants’ reduced efficacy translates to elimination of synaptic activity (dominant negative effect) (also seen in Suppl. 8).

      #7 Fig.1a

      The scheme should include C2-C2 complexes and mention whether these complexes are expressed at the cell surface (see previous and following comments).

      As noted in our reply to comment #5 of this reviewer (above), C2/C2-containing receptors do not reach the plasma membrane (Fig. 1a-c). To avoid confusion, we have now added this scheme to the cartoon presented in Fig. 1a and have provided a more detailed description of the method and clones produced in the Methods section (line # 448).

      #8 Fig.1b and c

      Current from cells transfected with GluN2B‐wt‐C1 and GluN2B‐wt‐C2 should be compared to current expressed in cells expressing untagged receptor subunits: GluN2B‐wt Current from cells transfected with GluN2B‐wt‐C1 alone should be shown as well (although expected to be retained in the ER) (as performed for GluN2A‐wt‐C1 GluN2B‐wt‐C1 in suppl Fig. 1a)

      Current comparisons of oocytes expressing tagged GluN2B‐wt‐C1 and GluN2B‐wt‐C2 and non-tagged GluN2B‐wt are now demonstrated in Suppl. 2c. The results indicate that the “tags” (C1 and C2) affect the expression of the subunits. We have also added a sample trace of current from a cell expressing the GluN2B‐wt‐C1 alone (Fig. 1b).

      #9 How could you explain the null current from cells transfected with GluN2B‐wt‐C2 alone (Fig. 1b middle, and 1c)? Since GB2 does not contain any retention signal and can reach the cell surface in the absence of GB1, GluN2B‐wt‐C2 is supposed to reach the cell surface. This is a very important point to clarify (I am probably missing a technical detail) because if the subunit tagged with C2 does reach the cell surface, then all the results and conclusions drawn from the C1-C2 conditions are wrong and could be attributed to a mix of complexes containing either C1-C2 or C2-C2.

      We now realize that the reviewer was missing a crucial technical detail, namely how the clones are designed. Briefly, all clones have ER retention motifs and cannot reach the plasma membrane unless they assemble as C1/C2 (refs. 23,24). Also, please see our replies to comments #5 and 7 of this reviewer (and Methods section, line #448).

      My following comments are based on the assumption that only receptors containing C1-C2 tagged subunits reach the membrane (as assumed by the authors and suggested in Figure 1b middle), but explanations should absolutely be provided to convince the reader.

      (See our above replies to comments #5, 7 and 9; and references 23,24.)

      Fig. 4a and 5a

      #10 Please keep the current scale constant between all current illustrations within the same figure (4a and 5a). Indeed, not only is the spermine- or PS-induced potentiation an important result (presently quantified in the histograms of Fig. 4b and 5b), but knowing whether the amount of current recorded in cells expressing one mutant subunit in the presence of PS (for example GluN2A‐wt‐C1 GluN2B‐G689S‐C2) is comparable to the current recorded in wt receptor-expressing cells (GluN2A‐wt‐C1 GluN2B‐wt‐C2) in the absence of PS would be an excellent added value for the paper. A dedicated figure could quantify this rescue effect of PS by measuring and comparing the mean currents recorded in these conditions (one current illustration is not sufficient given variations between similar samples). By the way, 5 mM glutamate might be an excessive concentration. At 1 mM, the expected synaptic concentration of glutamate following an action potential, the response of the mutated receptor is, according to Figure 3 and Suppl. 1, much lower than that of the WT, which is already almost maximal. In these conditions, a two-fold PS-induced potentiation of the GluN2A‐wt‐C1 GluN2B‐G689S‐C2 current could be equivalent to control currents recorded in GluN2A‐wt‐C1 GluN2B‐wt‐C2 cells.

      We have rescaled all current amplitudes in Figs. 4 and 5 to be identical in size for easier comparison.

      We have added all current amplitudes to examine the rescue effect of the two drugs in cells overexpressing a specific channel subtype, as requested (Suppl. 4). We find that, indeed, the potentiated currents of the mutant receptors reach (or even surpass) the basal Imax (i.e., current before potentiation) of the non-mutated receptors (Suppl. 4, dashed statistics bar).

      In neurons, we address this in two ways. First, we show that the total NMDA-current is reduced by expression of the variants, and this current is “normalized” by PS (Fig. 6a-c). Similar reductions in Imax (by the variants) are shown in Suppl. 7e (to provide more examples). Second, neurotransmission (i.e., 1 mM glutamate25,26) is not sufficient for activating mutant receptors, certainly not pure di-heteromers (see Table 1, EC50, and Suppl. 7a, b: no mEPSCs)27–29. Therefore, 5 mM was required. Together, these strongly suggest that PS may normalize the currents of different receptors that respond to PS (under physiological settings and not 1 or 5 mM NMDA). As suggested by the reviewer, there are many subtypes, and some may be activated by ambient glutamate (as suggested by application of PS onto neurons without opening the receptors by NMDA; see Suppl. 7c, d).

      #11 Fig. 6

      Figure 6 is not convincing because cultured hippocampal neurons do express endogenous NMDA receptors. To what extent the recording currents are affected by endogenous, non-mutated GluN2B subunits? Western Blots showing an extinction of endogenous subunits expression when transfected tagged subunits are competitively expressed would be required.

      We have previously shown that the two variants incorporate very efficiently within synapses, causing a very robust elimination of synaptic currents (by measuring miniature NMDA-dependent EPSCs; minis) (see Fig. 8 in Kellner et al., eLife, 2021 (ref. 27), and see the review by Sabo et al. (ref. 9)). A change in minis’ frequency can be interpreted as either a presynaptic change or a change in synapse number; however, we observed that AMPAR-mEPSC frequency was unaffected by these variants. This implies that synapse number and probability of release are unchanged by the variants. As the experiments are performed in wild-type neurons (which express wild-type GluN2A and -2B), the dramatic effects we observed on minis suggest a dominant-negative effect of these disease-associated GluN2B variants. This is consistent with our observations that mutant subunits can co-assemble with wild-type GluN2B and/or GluN2A subunits. We have now reproduced this experiment (in fact, we employ this strategy prior to each experiment to ensure expression of the variants) (Suppl. 7a, b). This thereby shows that there are no available wt-receptors at the synapse.

      As there are various pools of NMDARs at synaptic and extrasynaptic sites, we did not think that a western blot would sufficiently differentiate between the latter, and thereby would not provide insight about extinction of wt-receptors (which could be simply pushed to other sites compared to synapse). Moreover, the intracellular pool of receptors is much larger than the amount of NMDARs that can be detected at the membrane (e.g., 30,31), and therefore electrophysiology seemed to be a better means to monitor membrane receptors only:

      Thus, to examine the distribution of the variants between synaptic and extrasynaptic loci, we employed a standard procedure consisting of the use of the activity-dependent blocker MK-801 (Methods). Briefly, neurons were persistently bathed in TTX during which they were probed for Imax using 100 µM NMDA (to refrain from activating other GluRs), followed by application of MK-801 for 10 minutes to exclusively block synaptic receptors (that open following action-potential independent miniature neurotransmission). This thereby spares all extrasynaptic receptors from being blocked by MK-801, which are subsequently revealed by a second application of 100 µM NMDA (Suppl. 8a, inset)12. In neurons overexpressing the GluN2B-wt subunit, we obtained an extrasynaptic fraction of 38%, highly consistent with previous reports12,13. Overexpression of the variants, however, yielded a significantly higher fraction (~50%) of remaining current (Suppl. 8b, c), but instead of reflecting a larger pool of extrasynaptic receptors, the experiment represents quite the inverse when involving LoF variants. Firstly, 100 µM NMDA does not saturate variant receptors (whether pure, mixed di- or tri-heteromers, see Table 1). Secondly, normal neurotransmission does not open synaptic receptors containing mutant GluN2B-subunits, attested by the complete absence of mEPSCs (see Suppl. 7). Thus, during the 10-minute exposure to MK-801, only wt receptors are blocked by spontaneous synaptic activity, and the second bout of 100 µM NMDA solely exposes the remaining wt-receptors at the extrasynapse. Thus, the observed increase in the fraction of extrasynaptic receptors, in neurons overexpressing the variants, implies that the number of wt-receptors is necessarily decreased at the synapse and increased at the extrasynapse, most likely due to the incorporation of the variants at the synapse.
This increase cannot be explained by an overall increase in membrane expression of wt-receptors in neurons overexpressing the variants, as these cells show, yet again, a strong reduction in Imax as seen above (see Fig. 6c and Suppl. 7e) (lines #270-291). These thereby suggest that purely wt-receptors are not necessarily eliminated from the membrane (extinct), but rather pushed outside of the synapse.

      #12 Fig. 6b “PE-S” on the graph should be replaced by “PS”

      Typo corrected.

      Discussion

      #13 The authors are surprised by the fact (Fig.2) that 1 variant reduces the apparent glutamate affinity of the receptor, but not as much as 2 variants, despite the fact that "NMDARs opening requires all four subunits to be liganded (i.e., occupied by a ligand) which implies that the least affine subunit should have dominated the final affinity of the receptor". I agree that the difference is noticeable, however the glutamate affinity for receptors containing 1 variant is much closer to that of receptors containing 2 variants than that of wild-type receptors. Hence, the results obtained do not seem so surprising and could result, as rightly explained by the authors from a possible cooperativity between the subunits.

      We agree with the reviewer that the glutamate potency of receptors containing 1 variant subunit is much closer to that of receptors containing 2 variant subunits. However, we maintain our surprise because we expected it to be equal (not just close) to the potency of the least affine subunit (the limiting factor). This is based on the notion that all four subunits need to be liganded for channel opening4,32–34. We gently raise the possibility of potential cooperativity (Table 1, see Hill-coefficient and 33,35,36), as well as mention that this may also stem from the variants’ lower proton sensitivity (Suppl. 3), which has also been shown to promote motions of the ATD (amino terminal domain) and increase open probability (positive cooperativity)36. Nevertheless, we are very careful with interpreting the Hill coefficient, as we limited exposure of oocytes to 10 mM glutamate due to artifacts arising past this concentration (see Kellner et al.8: Fig. 2—figure supplement 1). Thus, even the slightest underestimation of the maximal response would surely affect the slope. This description is now mentioned in lines #149, 318, 319.

      #14 On the other hand, the data in Figure 6 are much more difficult to interpret and reconcile with the nature of the expressed receptor subunits (which this time is not controlled) nor their association within the same receptor. However, this aspect, which is essential to the understanding of the consequences of 1 variant on neuronal signalling, is not discussed: Whatever the stoichiometry of the complexes in the heterozygous disease, the mutated and wild type GluN2B subunits coexist in the same cell: Either within the same di-heteromeric complexes GluN2B-wt + GluN2B-mutant, or in separate complexes but nevertheless expressed in the same cell, in di heteromeric (GluN2B-wt + GluN2B-wt and GluN2B-mutant + GluN2B-mutant); or tri-heteromeric (GluN2A-wt + GluN2B-wt and GluN2A-wt + GluN2B-mutant) complexes. Assuming that half of the complexes remain wild-type, e.g. (GluN2A-wt + GluN2B-wt and GluN2A-wt + GluN2B-mutant) we would expect (Fig. 6) a small decrease in NMDA current (carried only by the half that expresses the mutated subunit, and whose function is not zero but only decreased by about 20% in response to 5 mM Glutamate, Fig. 3b). The same reasoning applies to the di-heteromeric conditions (GluN2B-wt + GluN2B-wt; GluN2B-mutant + GluN2B-mutant), here again the decrease observed Fig. 6b is difficult to reconcile with the responses measured Fig. 2b.
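      The arithmetic behind this expectation can be written out as a rough sketch. It assumes, as in the reasoning above, that half of the receptors stay wild-type and that the half carrying a mutated subunit keeps about 80% of its response at 5 mM glutamate; the symbols $I_{\mathrm{exp}}$ (expected whole-cell current) and $I_{\mathrm{wt}}$ (wild-type current) are introduced here only for illustration:

      $$\frac{I_{\mathrm{exp}}}{I_{\mathrm{wt}}} \approx 0.5 \times 1 + 0.5 \times 0.8 = 0.9$$

      that is, roughly a 10% decrease in current.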

      In other words, how can one explain a 50% decrease of the currents, instead of the 10% expected by the previous reasoning? In this experiment we do not know which subunits are expressed, their proportions, nor how they are associated in functional complexes, which makes the interpretation of the data impossible. The only (far-fetched) explanation for a 50% decrease would be that all (or the vast majority) of the complexes contain 1 wild-type subunit associated with 1 variant; a homogeneous 50% reduction in current could then be expected. But this extreme condition would only be possible in the case of di-heteromers, which is unlikely to be the case in Fig. 6, as GluN2A currents are measured in the presence of ifenprodil. To conclude:

      1) the comparison of the currents in transfected and non-transfected neurons does not make sense in figure 6b which is not convincing because we do not know the nature of the currents actually measured. A comparison in controlled condition would make more sense (as I suggested in the criticism of figures 4, 5).

      2) The reality of the combinations of expression and association between subunits within different complexes expressed in the same cell must be considered and taken into account in the interpretation of the data. Undoubtedly, the means of restoring the NMDA current will be different depending on the presence of mutated subunits in all functional channels or not.

      Indeed, neurons express a variety of different combinations of channel stoichiometry, including following transfection with the variants. We do find that the effect on whole-cell current is indeed ~50% (Fig. 6b, c), and it is thereby safe to assume that 50% remain “wt”, but we do not know how they distribute between synaptic and extrasynaptic loci. Our results, however, argue against 50% remaining receptors at the synapse: mEPSCNMDA disappear (Suppl. 7a, b and see reply to comment #11 of this reviewer), but wt-receptors are still at the membrane, and they seem to be moving out of the synapse (Suppl. 8). Thus, we can only state with higher certainty that the variant subunits are very efficient in incorporating within mixed or pure receptors, especially at the synapse.

      We also consider that the reduction in the whole-cell current observed in Fig. 6b, c is not due to the remaining 50% GluN2B-wt-containing receptors, but rather likely due to other variants, notably GluN2A, which are more prominent at postnatal stages37, such as in our case. In support, we see a large remaining current after saturating ifenprodil application (Suppl. 7e, f)38. Thus, the variants incorporate within all 2A/2B membrane receptors, at the synapse and outside it (i.e., extrasynaptic) (see Suppl. 8c).
      

      **Referees cross-commenting**

      The referees' comments are highly relevant. In particular, referee 3's comment 1 seems very interesting because it may help to better understand the discrepancy in the results observed in neurons, i.e. a 50% decrease in the currents induced by the expression of the mutant and wild type subunits in the same cells, whereas theoretically one would expect a 10% decrease of this current (cf. referee 2's 2nd comment in the discussion section). This comment 1 of referee 3 indeed stresses the fact that the control (non-transfected neurons) to which the heterozygous condition is compared is not the correct control, which should rather be neurons transfected with wild type receptor subunits. More generally, this comment underlines the importance of monitoring the effective membrane expression of the different subunits in each of the experimental conditions in order to be able to compare conditions and draw conclusions.

      We initially did not perform this control as the literature paints a clear picture whereby expression of the GluN2B-subunit (without adding excess of the GluN1 subunit) does not instigate a robust increase in surface expression of NMDARs (and thus current remains the ~same) 4,39–43, and see our reply to comment #14 (above), and reviewer 3 comment 1 (below). Nevertheless, we have now performed this test by overexpressing GluN2B-wt. In support of previous reports, we do not find any statistical difference in current size between non-transfected neurons and neurons solely overexpressing the GluN2B-wt subunit (Fig. 6a, b). Furthermore, application of PS onto naïve or GluN2Bwt expressing neurons yields identical currents (Imax) and potentiation (Fig. 6c, d). These argue that we did not obtain “overexpression”.

      We suggest that the 50% reduction in current size between neurons expressing the mutant and wt expressing neurons stems from the integration of mutant subunits and their dominant negative effect. Evidence for this incorporation is provided by the very strong reduction in synaptic currents (suppl 7a, b), and the supposedly higher abundance of wt-containing receptors in extrasynaptic regions (see reviewer 1 comment 4 and suppl 8). This is

      Reviewer #2 (Significance required):

      The novelty of the study is to evaluate the consequences of a single mutated subunit within NMDA receptors affected by a GRIN variant, to mimic the heterozygous condition of GRIN encephalopathies. This is of potential value for the field, and the interest could also be extended to other genetic diseases (at least the experimental way to study the functioning of only one desired stoichiometric configuration). The strength of this paper is precisely to isolate technically and to study the functioning of a desired stoichiometric configuration only. The main limitation of the paper is the interpretation that the authors make of their data in a physio-pathological context. This work could be of interest for a general audience, provided the title and summary are slightly modified. My area of expertise could not be closer to the topic of the article: Glutamate receptors; GRIN; molecular tinkering, cell culture, electrophysiology, receptor stoichiometry...

      We thank the reviewer for noting the value in our work and its potential contribution and interest to the field and other diseases. Per reviewer’s suggestion, we have modified the title and text to suit a larger audience.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This paper is a follow-up of an earlier paper published by the group (Kellner et al., eLife 2021), which aimed at characterizing the functional properties of two de novo GluN2B mutations in patients suffering from severe pediatric diseases, GluN2B-G689C and -G689S. NMDA receptors (NMDARs) are tetramers composed of two GluN1 and two GluN2 subunits. A single receptor can incorporate either two identical GluN2 subunits (di-heteromers) or two different GluN2 subunits (tri-heteromers), leading to a large diversity of NMDAR subtypes. The main NMDAR subtypes in the adult forebrain are GluN1/GluN2A and GluN1/GluN2B di-heteromers, as well as GluN1/GluN2A/GluN2B tri-heteromers. While the exact proportions of these three subtypes are still contentious, there is evidence that N1/2A/2B tri-heteromers form the major population of synaptic NMDARs in the adult forebrain. In addition, patients bearing pathogenic mutations are often heterozygous for the mutation, giving rise to mixed NMDARs incorporating one mutated and one intact GluN2 subunit. In their previous paper, Kellner et al. had shown that purely di-heteromeric GluN1/GluN2B-G689C and -G689S mutants display a drastic (> 1,000-fold) decrease of glutamate sensitivity and a decrease of surface expression. In the current paper, the authors characterize the effects of the -G689C and -G689S mutations on N1/2A/2B tri-heteromeric receptors, as well as on mixed di-heteromeric GluN1/GluN2B receptors containing one copy of the wild-type GluN2B subunit and one copy of the mutated GluN2B subunit. They show that one copy of the mutant subunit, either within mixed diheteromeric or tri-heteromeric receptors, is sufficient to decrease drastically glutamate sensitivity, although the shift in glutamate EC50 is not as strong as in pure di-heteromeric receptors (≈ 500-fold).
They furthermore explore strategies to counteract the hypofunction induced by these mutations by testing the effect of positive allosteric modulators (PAMs). They show that spermine, a GluN2B-specific PAM, can potentiate the activity of mixed diheteromeric N1/2B but not N1/2A/2B tri-heteromers. However pregnenolone sulfate (a 2A/2B-specific PAM) can potentiate both the activity of mixed diheteromeric and tri-heteromeric NMDAR populations, either in oocytes or cultured neurons.

      I have very few major comments to make. The experiments are straightforward and the adequate controls have been made. Here are my two only major comments:

      We thank the reviewer for the very detailed overview of our work and for appreciating our controls and methods.

      #1 About the experiment on cultured neurons. The authors compare the currents of cultured neurons transfected with GluN2B-G689C and -G689S to non transfected neurons. The adequate control is rather neurons transfected with the wild-type GluN2B subunit to even out any phenomenon linked to transfection of the neuron. Given the overexpression that can occur after transfection, the effect of the mutations on the size of NMDAR currents might be even stronger than what the authors show. However in that case PS might not completely rescue mutant NMDAR currents to wild-type levels.

      We initially did not perform this control as the literature paints a clear picture whereby expression of the GluN2B-subunit (without adding excess of the GluN1 subunit) does not instigate a robust increase in surface expression of NMDARs (and thus current remains the ~same) 4,39–43, and see our reply to comment #14 (above), and reviewer 3 comment 1 (below). Nevertheless, we have now performed this test by overexpressing GluN2B-wt. In support of previous reports, we do not find any statistical difference in current size between non-transfected neurons and neurons solely overexpressing the GluN2B-wt subunit (Fig. 6a, b). Furthermore, application of PS onto naïve or GluN2B-wt-expressing neurons yields identical currents (Imax) and potentiation (Fig. 6c, d). These argue that we did not obtain “overexpression”. Thus, our results and interpretations hold true, and are therefore not an underestimation of the effects of PS in neurons.

      #2 How come high concentrations of glutamate (>100µM) produce additional current on wt GluN1/GluN2B (with retention signals) compared to 100 µM glutamate, which is supposed to be saturating? It does not seem to stem from an osmotic effect since 10 mM glutamate does not produce any current on uninjected oocytes. Knowing that this "artefactual" effect might also occur in the mutant receptors, how do you take this effect into account when calculating the glutamate EC50s of the mutants? Given the drastic shift in EC50 produced by the mutant, taking into account this artefact is not going to change the conclusion, but the actual EC50s will be affected.

      GluN1/GluN2B-wt receptors (with or without retention signals) are indeed saturated at 100 µM glutamate. However, excessively large concentrations of glutamate (>100 µM) may yield artefacts even in non-injected oocytes (in 10 mM, this occurs in ~20% of the cells, see Kellner et al. 2021 (ref. 8), Fig. 2 and Suppl. 1c, d) as well as in GluN2B-wt injected oocytes (supplementary Table 1 in 44). This is not due to osmolarity, as rightly mentioned by the reviewer (and below), but rather possibly to endogenous glutamate receptors and transporters that do not readily contribute to current amplitude (these are extremely small currents), but can cause deterioration of the cell (and enhance ‘leak’) when activated for prolonged times by very large concentrations (e.g.,45). In fact, we explicitly report these to highlight potential artefacts, as these are often overlooked in the field. Regardless, most reports do not go past 100 µM glutamate, not even when describing GRIN2 mutations since most mutations do not cause such drastic shifts in potency as we observed (to the best of our knowledge only one report describes such an extreme LoF mutation for a GluN2A variant46). Of note, these effects are not seen when glycine is applied at high concentrations (supporting lack of effect by osmolarity)47. Thus, we refrained from testing concentrations past 10 mM, aware that it may yield a slight underestimation of glutamate potency (and perhaps the reason for the larger Hill coefficient, nH; see our reply to reviewer #1, comment #5). Importantly, despite the potential underestimation of the EC50, it does not change our conclusions as all groups are measured side-by-side (thus, the same underestimation equally applies to all other groups as well). We now mention this in more detail in the methods under the section “Two Electrode Voltage Clamp recordings in Xenopus laevis oocytes”.
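      The effect of capping the highest tested concentration on the fitted EC50 can be illustrated with a minimal sketch of the standard Hill equation. The EC50 values and Hill slope below are hypothetical, chosen only for illustration and not taken from the manuscript, and `hill` and `apparent_ec50` are names introduced here. When the curve has not plateaued at the top concentration, normalizing to the response at that concentration gives an apparent EC50 below the true one:

```python
def hill(conc, ec50, n_h, i_max=1.0):
    """Hill equation: current evoked by an agonist concentration."""
    return i_max * conc**n_h / (ec50**n_h + conc**n_h)


def apparent_ec50(true_ec50, n_h, top_conc):
    """EC50 read off a curve normalized to the response at top_conc.

    Solves hill(c) = 0.5 * hill(top_conc) for c in closed form.
    """
    r = 0.5 * hill(top_conc, true_ec50, n_h)  # half of the apparent maximum
    return true_ec50 * (r / (1.0 - r)) ** (1.0 / n_h)


if __name__ == "__main__":
    TOP_MM = 10.0  # highest concentration tested (mM)
    # Hypothetical EC50 values (mM) and Hill slope; illustrative only.
    for true_ec50 in (0.002, 1.0, 3.0):
        frac = hill(TOP_MM, true_ec50, n_h=1.2)
        app = apparent_ec50(true_ec50, 1.2, TOP_MM)
        print(f"true EC50 {true_ec50} mM: {frac:.2f} of Imax at {TOP_MM} mM, "
              f"apparent EC50 {app:.4g} mM")
```

      In this sketch the bias is negligible for a receptor that saturates well below the cap and grows as the true EC50 approaches the cap, which is why a side-by-side comparison under the same cap is the relevant one.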

      Minor comments:

      #3 In the first paragraph of the "Results" section, when describing the design of the constructs used to force a heteromeric stoichiometry in recombinant systems, the authors write as if they had designed the constructs themselves "Briefly, we tagged...are retained in the ER (Fig. 1a)". Please rewrite this paragraph to show that you used constructs that had been previously designed by another group (Hansen et al., 2014).

      We apologize. We did not mean to express that we have developed the method and indeed refer readers to the seminal works of those who did (Stroebel et al., 2014 and Hansen et al. 2014, lines #109-116). We did not go into details for the sake of brevity. We have rewritten this part to give proper acknowledgement to the method’s developers (also see Methods, line# 448).

      #4 I do not see any evidence of "positive cooperativity" between subunits in ref. 32. Ref. 32, to the best of my knowledge, states that in N1/2A/2B tri-heteromers, the 2A subunit sets the biophysical properties of the tri-heteromer. But there is no account of mixed di-heteromers. In addition, the cooperativity between the glutamate and glycine binding sites is negative.

      The reviewer is correct, and we apologize for the mis-citation. Indeed, the cooperativity between glutamate- and glycine-binding is typically reported as negative48,49, and our intention was to highlight the strong cooperativity (whether positive or negative) observed between NMDAR-subunits and meant to cite the works of: 33,35,50 (lines . We have now rephrased the sentence: The divergence from this scenario suggests that the slight amelioration in potency could stem from positive cooperativity between the subunits50 (but see Hill coefficients in Table 1). Indeed, mixed receptors show restored proton sensitivity (Suppl. 3), which has been suggested to be coupled to other receptor features, notably increase in open probability.

      #5 Interpretation of spermine action within the Results section: it is striking indeed to observe that the mutations in the context of a mixed di-heteromer still allow spermine potentiation, while they abolish this potentiation in pure di-heteromers. As rightly said in the discussion, the regain of spermine potentiation in the mixed compared to the pure diheteromers is likely due to a more favorable transduction of spermine signaling to the pore, likely via a higher pH sensitivity of mixed di-heteromers compared to di-heteromers. I would thus avoid the terms of "one single intact interface" for the mixed di-heteromer, since both spermine binding sites are likely intact in this NMDAR configuration. How is pH sensitivity affected in the mixed di-heteromers?

      We have performed detailed pH dose-responses for the various channel types (Suppl. 3). We find that GluN2B mixed di-heteromers exhibit a similar IC50 to pure GluN2B-wt di-heteromers, thus explaining their ability to undergo potentiation by spermine via alleviation of proton inhibition. We therefore further suggest that mixed di-heteromers have higher pH-sensitivity compared to pure mutant di-heteromers and that this may also contribute to their higher spermine sensitivity. Lastly, we observed that all GluN2A-wt-containing tri-heteromeric receptors were non-responsive to spermine (Fig. 4a). In fact, under our experimental conditions tri-heteromers underwent slight inhibition by spermine, regardless of the identity of the GluN2B subunit (whether wt or variant) (Fig. 4b). Thus, as the tri-heteromers used here exhibit identical pH-sensitivity to 2B-di-heteromers, the only diverging aspect is the missing interface between the GluN1a and GluN2B subunits, demonstrating that potentiation by spermine requires at least one GluN2B-subunit with an intact proton sensitivity, and mandates two intact interfaces between GluN1-wt and GluN2B-wt subunits (Table 1)21.

      6 In the methods section, the oocyte recording solution (likely Ringer and not Barth) does not contain any potassium. This is probably a typo. Could you correct the composition of your Ringer?

      Corrected. We record NMDAR currents using a Barth solution containing (in mM): 100 NaCl, 0.3 BaCl2, 5 HEPES, pH 7.3 (adjusted with KOH, at ~2.5 mM) (as in 4,51).

      7 There are several typos, especially in the Discussion.

      We have corrected the typos throughout the publication.

      **Referees cross-commenting**

      I overall agree with the comments of reviewers 1 and 2. In particular, I agree that it is pointless to compare the absolute currents of non transfected neurons vs mutant-transfected neurons without an idea of receptor cell-surface expression.

      We have performed this experiment (Fig. 6); please see our reply to this reviewer’s comment #1.

      I would like however to give some precisions about some comments of Reviewer 2. About the ER retention technique to express tri-heteromers: I didn't know that the C2 signal could be addressed to the membrane on its own. The lack of leak current stemming from C1-C1 or C2-C2 combinations has been demonstrated in the paper establishing the technique (Hansen et al, 2014), as well as in another paper that developed an analog technique based on GABAB retention signals (Stroebel et al., J Neurosci 2014). So it is fair to consider that the authors were not surprised by the lack of current when co-expressing two GluN2B subunits carrying the C2 signal.

      We thank you for this addition and support for our observations.

      About the comparison about absolute currents wt vs mutants, +/- spermine (Fig. 4a and 5a). I agree with reviewer 2 that being able to compare absolute currents of wt without spermine to mutant + spermine would be very interesting to see if spermine can actually rescue mutant hypofunction. However, to the defense of the authors, comparing absolute current values of recordings from Xenopus oocytes is meaningless. Indeed the variability of currents for the same construct and same day of experiment is too high (there can be up to a ten-fold difference between the lowest and the highest current of oocytes expressing the same construct the same experimental day). A way to investigate this aspect would be to estimate the open probability of the different constructs with or without spermine via the inhibition kinetics of an open channel blocker (e.g. MK801) and measure surface expression by Western blot, but I am not sure these experiments are worth it for the spermine experiment.

      We agree with this reviewer about current size. It is quite variable among cells and would therefore introduce an additional variable and variability: the expression of these modified (C1/C2-tagged) subunits is dually affected by the mutation itself (Kellner et al. 2021) and by the introduction of the tagging (which really hampers their trafficking to the membrane, Suppl. 2c), with unknown contribution of each variable. We therefore do not think these comparisons add value to our conclusions; yet, to grant reviewer #2's request, we have added Suppl. 4, which shows the rescue effect of the different drugs.

      Reviewer #3 (Significance (Required)):

      This paper is not of high significance since most of the characterization of the 2B-G689C and -G689S de novo mutants found in patients has already been published (Kellner et al., eLife 2021). However, this paper is worth publishing since it brings new data on the effect of the mutations on tri-heteromeric and mixed di-heteromeric NMDAR populations, which are likely the most abundant NMDAR populations in the patient's brain at adult stage. Tri-heteromeric and mixed NMDAR populations have often been overlooked when studying pathogenic NMDAR mutations due to the difficulty to express them specifically in recombinant systems. This paper (in addition to other papers in the field, see for instance Elmasri et al., Brain Sci. 2022; Li et al., Hum. Mutat. 2019) shows that the effect of the mutations on the receptor biophysical and pharmacological properties (but also on trafficking) differ whether the receptor contains one or two copies of the mutant subunit. This paper is of interest to scientists interested in NMDA receptor structure-function and pharmacology, as well as clinicians interested in GRINopathies (pathologies linked to NMDAR mutations).

      I, the reviewer, am an expert in NMDAR structure-function and pharmacology. I believe I have sufficient expertise to evaluate the entirety of the paper.

      We thank the reviewer for appreciating and acknowledging the merits of our work for publication.

      References:

      1. Berlin, S. et al. Gαi and Gβγ Jointly Regulate the Conformations of a Gβγ Effector, the Neuronal G Protein-activated K+ Channel (GIRK). J. Biol. Chem. 285, 6179–6185 (2010).
      2. Kahanovitch, U., Berlin, S. & Dascal, N. Collision coupling in the GABAB receptor–G protein–GIRK signaling cascade. FEBS Lett. 591, 2816–2825 (2017).
      3. Berlin, S. et al. A Collision Coupling Model Governs the Activation of Neuronal GIRK1/2 Channels by Muscarinic-2 Receptors. Front. Pharmacol. 11, (2020).
      4. Berlin, S. et al. A family of photoswitchable NMDA receptors. eLife 5, e12040 (2016).
      5. Reyes-Guzman, E. A., Vega-Castro, N., Reyes-Montaño, E. A. & Recio-Pinto, E. Antagonistic action on NMDA/GluN2B mediated currents of two peptides that were conantokin-G structure-based designed. BMC Neurosci. 18, 44 (2017).
      6. Paul, S. M. et al. The Major Brain Cholesterol Metabolite 24(S)-Hydroxycholesterol Is a Potent Allosteric Modulator of N-Methyl-D-Aspartate Receptors. J. Neurosci. 33, 17290–17300 (2013).
      7. Yakovlev, A. V., Kurmasheva, E. D., Ishchenko, Y., Giniatullin, R. & Sitdikova, G. F. Age-Dependent, Subunit Specific Action of Hydrogen Sulfide on GluN1/2A and GluN1/2B NMDA Receptors. Front. Cell. Neurosci. 11, 375 (2017).
      8. Kellner, S. et al. Two de novo GluN2B mutations affect multiple NMDAR-functions and instigate severe pediatric encephalopathy. eLife 10, e67555 (2021).
      9. Sabo, S. L., Lahr, J. M., Offer, M., Weekes, A. L. & Sceniak, M. P. GRIN2B-related neurodevelopmental disorder: current understanding of pathophysiological mechanisms. Front. Synaptic Neurosci. 14, (2023).
      10. Martel, M.-A. et al. The Subtype of GluN2 C-terminal Domain Determines the Response to Excitotoxic Insults. Neuron 74, 543–556 (2012).
      11. Papouin, T. et al. Synaptic and Extrasynaptic NMDA Receptors Are Gated by Different Endogenous Coagonists. Cell 150, 633–646 (2012).
      12. Harris, A. Z. & Pettit, D. L. Extrasynaptic and synaptic NMDA receptors form stable and uniform pools in rat hippocampal slices. J. Physiol. 584, 509–519 (2007).
      13. Moldavski, A., Behr, J., Bading, H. & Bengtson, C. P. A novel method using ambient glutamate for the electrophysiological quantification of extrasynaptic NMDA receptor function in acute brain slices. J. Physiol. 598, 633–650 (2020).
      14. Curras, M. C. & Dingledine, R. Selectivity of amino acid transmitters acting at N-methyl-D-aspartate and amino-3-hydroxy-5-methyl-4-isoxazolepropionate receptors. Mol. Pharmacol. 41, 520–526 (1992).
      15. Laube, B., Hirai, H., Sturgess, M., Betz, H. & Kuhse, J. Molecular Determinants of Agonist Discrimination by NMDA Receptor Subunits: Analysis of the Glutamate Binding Site on the NR2B Subunit. Neuron 18, 493–503 (1997).
      16. Esmenjaud, J. et al. An inter‐dimer allosteric switch controls NMDA receptor activity. EMBO J. 38, (2019).
      17. Liu, S. et al. A Rare Variant Identified Within the GluN2B C-Terminus in a Patient with Autism Affects NMDA Receptor Surface Expression and Spine Density. J. Neurosci. 37, 4093–4102 (2017).
      18. Geoffroy, C., Paoletti, P. & Mony, L. Positive allosteric modulation of NMDA receptors: mechanisms, physiological impact and therapeutic potential. J. Physiol. 600, 233–259 (2022).
      19. Malayev, A., Gibbs, T. T. & Farb, D. H. Inhibition of the NMDA response by pregnenolone sulphate reveals subtype selective modulation of NMDA receptors by sulphated steroids. Br. J. Pharmacol. 135, 901–909 (2002).
      20. Petrov, A. M. et al. CYP46A1 Activation by Efavirenz Leads to Behavioral Improvement without Significant Changes in Amyloid Plaque Load in the Brain of 5XFAD Mice. Neurotherapeutics 16, 710–724 (2019).
      21. Mony, L., Zhu, S., Carvalho, S. & Paoletti, P. Molecular basis of positive allosteric modulation of GluN2B NMDA receptors by polyamines. EMBO J. 30, 3134–3146 (2011).
      22. Stroebel, D., Casado, M. & Paoletti, P. Triheteromeric NMDA receptors: from structure to synaptic physiology. Curr. Opin. Physiol. 2, 1–12 (2018).
      23. Hansen, K. B., Ogden, K. K., Yuan, H. & Traynelis, S. F. Distinct Functional and Pharmacological Properties of Triheteromeric GluN1/GluN2A/GluN2B NMDA Receptors. Neuron 81, 1084–1096 (2014).
      24. Stroebel, D., Carvalho, S., Grand, T., Zhu, S. & Paoletti, P. Controlling NMDA Receptor Subunit Composition Using Ectopic Retention Signals. J. Neurosci. 34, 16630–16636 (2014).
      25. Clements, J. D., Lester, R. A. J., Tong, G., Jahr, C. E. & Westbrook, G. L. The Time Course of Glutamate in the Synaptic Cleft. Science 258, 1498–1501 (1992).
      26. Budisantoso, T. et al. Evaluation of glutamate concentration transient in the synaptic cleft of the rat calyx of Held: Glutamate concentration in synapse. J. Physiol. 591, 219–239 (2013).
      27. Kellner, S. et al. Two de novo GluN2B mutations affect multiple NMDAR-functions and instigate severe pediatric encephalopathy. eLife 10, e67555 (2021).
      28. McAllister, A. K. & Stevens, C. F. Nonsaturation of AMPA and NMDA receptors at hippocampal synapses. Proc. Natl. Acad. Sci. 97, 6173–6178 (2000).
      29. Ishikawa, T., Sahara, Y. & Takahashi, T. A Single Packet of Transmitter Does Not Saturate Postsynaptic Glutamate Receptors. Neuron 34, 613–621 (2002).
      30. Washbourne, P., Liu, X.-B., Jones, E. G. & McAllister, A. K. Cycling of NMDA Receptors during Trafficking in Neurons before Synapse Formation. J. Neurosci. 24, 8253–8264 (2004).
      31. Yan, Y.-G. et al. Clustering of surface NMDA receptors is mainly mediated by the C-terminus of GluN2A in cultured rat hippocampal neurons. Neurosci. Bull. 30, 655–666 (2014).
      32. Kussius, C. L. & Popescu, G. K. Kinetic basis of partial agonism at NMDA receptors. Nat. Neurosci. 12, 1114–1120 (2009).
      33. Sun, W., Hansen, K. B. & Jahr, C. E. Allosteric interactions between NMDA receptor subunits shape the developmental shift in channel properties. Neuron 94, 58-64.e3 (2017).
      34. Benveniste, M. & Mayer, M. L. Kinetic analysis of antagonist action at N-methyl-D-aspartic acid receptors. Two binding sites each for glutamate and glycine. Biophys. J. 59, 560–573 (1991).
      35. Lü, W., Du, J., Goehring, A. & Gouaux, E. Cryo-EM structures of the triheteromeric NMDA receptor and its allosteric modulation. Science 355, eaal3729 (2017).
      36. Vyklicky, V., Stanley, C., Habrian, C. & Isacoff, E. Y. Conformational rearrangement of the NMDA receptor amino-terminal domain during activation and allosteric modulation. Nat. Commun. 12, 2694 (2021).
      37. Stroebel, D., Casado, M. & Paoletti, P. Triheteromeric NMDA receptors: from structure to synaptic physiology. Curr. Opin. Physiol. 2, 1–12 (2018).
      38. Borza, I. & Domany, G. NR2B Selective NMDA Antagonists: The Evolution of the Ifenprodil-Type Pharmacophore. Curr. Top. Med. Chem. 6, 687–695 (2006).
      39. Tang, Y. P. et al. Genetic enhancement of learning and memory in mice. Nature 401, 63–69 (1999).
      40. Gonda, S. et al. GluN2B but Not GluN2A for Basal Dendritic Growth of Cortical Pyramidal Neurons. Front. Neuroanat. 14, (2020).
      41. Sceniak, M. P. et al. A GluN2B mutation identified in Autism prevents NMDA receptor trafficking and interferes with dendrite growth. J. Cell Sci. jcs.232892 (2019) doi:10.1242/jcs.232892.
      42. Philpot, B. D. et al. Effect of transgenic overexpression of NR2B on NMDA receptor function and synaptic plasticity in visual cortex. Neuropharmacology 41, 762–770 (2001).
      43. Barria, A. & Malinow, R. Subunit-Specific NMDA Receptor Trafficking to Synapses. Neuron 35, 345–353 (2002).
      44. Platzer, K. et al. GRIN2B encephalopathy: novel findings on phenotype, variant clustering, functional consequences and treatment aspects. J. Med. Genet. 54, 460–470 (2017).
      45. Green, T., Rogers, C. A., Contractor, A. & Heinemann, S. F. NMDA Receptors Formed by NR1 in Xenopus laevis Oocytes Do Not Contain the Endogenous Subunit XenU1. Mol. Pharmacol. 61, 326–333 (2002).
      46. Swanger, S. A. et al. Mechanistic Insight into NMDA Receptor Dysregulation by Rare Variants in the GluN2A and GluN2B Agonist Binding Domains. Am. J. Hum. Genet. 99, 1261–1280 (2016).
      47. Madry, C., Betz, H., Geiger, J. R. P. & Laube, B. Supralinear potentiation of NR1/NR3A excitatory glycine receptors by Zn2+ and NR1 antagonist. Proc. Natl. Acad. Sci. 105, 12563–12568 (2008).
      48. Regalado, M. P., Villarroel, A. & Lerma, J. Intersubunit Cooperativity in the NMDA Receptor. Neuron 32, 1085–1096 (2001).
      49. Durham, R. J. et al. Conformational spread and dynamics in allostery of NMDA receptors. Proc. Natl. Acad. Sci. 117, 3839–3847 (2020).
      50. Vyklicky, V., Stanley, C., Habrian, C. & Isacoff, E. Y. Conformational rearrangement of the NMDA receptor amino-terminal domain during activation and allosteric modulation. Nat. Commun. 12, 2694 (2021).
      51. Kellner, S. et al. Two de novo GluN2B mutations affect multiple NMDAR-functions and instigate severe pediatric encephalopathy. eLife 10, e67555 (2021).
    1. Now, when data is transformed into evidence, when we isolate or distill the features of a data set, or when we generate a visualization or present the results of a statistical procedure, we are not presenting the artifact. These are abstractions. The data itself has an artifactual quality to it. What one researcher considers noise, or something to be discounted in a dataset, may provide essential evidence for another.

      When it comes to data analysis, I usually think of data as a source of information rather than it being a research object by itself. The term “raw data” has been used in all my classes, starting from accounting and finishing with introduction to digital culture and information. Yes, we’ve talked about biases that come up in different data sets, but usually this conversation is related to so-called “post-production” of data – either us, students, using it, or someone else and we reverse engineered where it came from. So, reading about an approach to data, even ‘raw’ data, as a constructed artifact is very refreshing. It’s extremely important to look at how the raw data was collected and what was left out by collectors initially to have a full image of what’s going on.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1

      Major:

      - The statement (line 149: 'Together, our data suggest that systemic ecdysone levels are unlikely to be involved in modulating tumour-induced muscle detachment or to mediate the role of fatbody Insulin signalling in regulating muscle detachment.') is derived from an experiment with sterol-free diet (in which 20HE is genetically addressed) and a pleiotropic experiment (PG>RasG12V). Neither in that paper nor in the current manuscript have 20HE levels been directly addressed.

      Therefore, this statement needs further experimental support and discussion. Ecdysone is a critical hormone during development and especially growth-related effects central to this study. The authors should consider doing pharmacology or augment their claims here with genetic manipulation experiments of 20HE related genes in larvae (Leopold, Rewitz, Rideout, Drummond-Barbosa, Schuldiner labs) and adult animals using genetics, pharmacology or direct assessment of 20HE levels (RIPA, Edgar and Reiff labs).

      The main point we were trying to convey is that we do not think global ecdysone levels play a role in modulating fatbody insulin or TGF-β signalling, which in turn affects muscle detachment. We are not claiming that ecdysone levels are unchanged in control vs. tumour-bearing animals. In fact, we predict that 20HE levels will differ between tumour-bearing and control animals (as tumour-bearing animals undergo developmental delay), but this is not the main point of our conclusions. We believe that our conclusions are supported by the experiment demonstrating that global ecdysone alterations (via feeding sterol-free food) did not affect how fatbody Akt activation altered TGF-β signalling and enhanced muscle integrity (Figure S1). Therefore, we don’t think measuring 20HE would help to support our conclusions. Pharmacological inhibition via feeding ecdysone inhibitors would effectively demonstrate a similar point to feeding sterol-free food, which we have already performed. We are happy to try direct manipulation of 20HE-related genes (eip75B-RNAi) in the fatbody to see if this affects muscle detachment or pAkt and pMad levels in tumour-bearing animals.

      - In Fig. 7, the authors used a sog-LacZ stock to show transcriptional activation in fatbody cells. This stock is based on a P-element insertion in the corresponding regulatory regions and is supposed to express lacZ with an nls. I can clearly see lacZ in nuclei in Fig. 7H, whereas this is very hard to see in nuclei in Fig. 7I in the tumour model. In addition, lacZ is known for its high stability and is not the best option. As this finding is vital for central claims of this study, it should be complemented by either qPCR for sog on fat body cells or another readout, obtained by converting one of the two Mimic lines (BL42189/44958) into GFP sensors for sog.

      We will add a counterstain to these images. We will also perform qPCR in the fatbody of control and cachectic animals to assess whether Sog transcription is altered. We agree that converting one of the Mimic lines to a GFP sensor would be a good option, but this experiment would require getting new fly lines into Australia, which takes at least 2 months because of quarantine laws. We don’t believe this experiment would change the general conclusions of the paper, and would therefore prefer not to do it.

      - I have similar problems with Fig. 7B-F, as phosphorylated Mad should be translocated to the nucleus. In 7F the authors measure pMad over DAPI, which is the right way, but it is hard to see pMad in the nucleus apart from Fig. 7B, whereas in D and E, where the authors measure higher levels, I cannot identify clear pMad in nuclei. These images either need to show the DAPI channel, or more representative images should be chosen, like in Fig. 4, with arrows pointing to measured nuclei. In Fig. 7C, something went wrong with the compression of the image.

      We will show more representative examples and fix Fig 7C.

      - The proper function of RNAi stocks targeting genes like sog, mad, etc. is vital for this study as these lines are used throughout the study. Functional evidence of specific knockdown efficiency should be provided or references given in which these stocks were shown to provide functional knockdown on transcript or protein level.

      We agree with the reviewer that this is an important point. We will demonstrate the knockdown of sog and mad (and other RNAis) used in the study by either referring to published data or demonstrating knockdown ourselves.

      - Fig.S7 discusses appearance of gbb/Bmp7 and Sog/CHRD in human patients. The analysis the authors performed shows a correlation between both factors, but is hampered by the fact that datasets for peripheral tissues of cachexia patients are unavailable. The authors may consider sorting these after tumor entities in which cachexia occurs frequently vs. low occurrence and then check for both genes.

      We will try this analysis.

      Fig. 5 M-P: pMad is not indicated in the panels, only in the legend.

      We will fix this error.

      - Please follow FlyBase nomenclature, e.g. dlg1 for discs large 1 and unify in the whole manuscript and figure for all genes.

      We will fix this error.

      - For endogenous fusion proteins like Viking-GFP (e.g. vkg::GFP) choose a format to clearly decipher them from transcriptional readout stocks like sog-lacZ.

      We will fix this error.

      - The quantifications in most figures are quite small with tiny lettering and XY axis are difficult to read in letter/A4 size.

      We will enlarge font size.

      Minor:

      1. Adjust in-figure caption alignments

      2. Line 104: add comma RasV12, dlgRNAi

      3. Line 114: replace 'little' with 'not significant (n.s.)'

      4. Line 334: 'sogRNAi overexpression' to my knowledge, RNAi are expressed, not overexpressed.

      5. Line 454: italicize r4>

      6. Fig S4E: remove frame

      7. Figure 6: It would be better to number and explain the pathway presented in the figure in the text and figure legend.

      8. Just a personal preference. Lettering of images in images is commonly done horizontally, here it appears like a mix between vertical and horizontal.

      We will fix these minor errors.

      Reviewer #2

      Major comment

      Their genetic experiments clearly showed that the reduction of insulin signaling activity in the fatbody induces upregulation of TGF-β signaling and Collagen accumulation. Then, how does TGF-β signaling induce Collagen accumulation?

      From the experiments we have carried out, we do not have insights into how TGF-β signalling induces Collagen accumulation.

      They showed that Rab10 knockdown and SPARC overexpression reduced the accumulation of fatbody ECM. Are Rab10 and SPARC expression regulated by TGF-β signaling?

      We can address this point by assessing if Rab10 and SPARC expression is altered in cachectic fatbody.

      Minor comments

      Line 90: "Disc Large (Dlg) RNAi in the eye" must be "Discs Large (Dlg1) RNAi in the eye imaginal discs".

      We will fix this error.

      Figures 1D and 1L are from the same image. Also, Figures 1C and 1M are from the same image. Are both of them necessary to be shown in the different panels?

      The duplication of 1C and 1M was an error; we thank the reviewer for picking this up and will fix it. We will use different images for 1D and 1L.

      Why are the staining patterns of anti-pAkt shown in Figures 1L and 1U so different? pAkt is not detected in the nuclei in Fig. 1L but its nuclear signal is clear in Fig. 1U.

      We will show more representative images of these stainings.

      Figure 1: Images of counter staining for nuclei like DAPI should be also included for all these fatbody images.

      We will show counter staining for DAPI.

      Line 101: "Tumour specific ImpL2 inhibition was sufficient to reduce fatbody pAkt levels." Is this correct? ImpL2 inhibition in tumors should elevate the pAKT level in fatbody.

      This was a mistake, we will fix this error.

      Figures S1-S4: These figures and their legends do not correspond to each other.

      We thank the reviewer for picking up this error; there was a mistake in inserting the images into the text, and S2 and S3 were swapped. We will fix this error.

      Line 189: The pAkt level in the muscle of tumour-bearing animals should be examined to confirm the activity of the insulin signaling is downregulated.

      We will include this data.

      Line 189: If the authors conclude that muscle insulin signaling predominantly regulates translation and atrophy, OPP assay for the muscle cells should be examined in the same experimental settings.

      We will carry out OPP assay upon Akt overexpression in the muscle.

      Line 247: The expression level of Rab10 and SPARC should be examined in the fatbody of tumour-bearing animals to see whether Rab10 is upregulated and SPARC is downregulated.

      Line 247: If Rab10 upregulation and SPARC downregulation are the causes of the accumulation of ECM proteins in the fatbody of tumour-bearing animals, how the overexpressed Collagen proteins can be secreted from the fatbody cells?

      We are not sure, but the overexpression of Collagen proteins is at an extremely high level; it is therefore possible that some of it can be processed and secreted despite Rab10 upregulation and SPARC downregulation. We have also carried out an experiment to overexpress Collagen proteins in the muscle; in this case, the manipulation did not produce a rescue. This indicates that processing of Collagen in the fatbody is important; however, we do not know how the processing is regulated.

      Line 347: Sog is a secreted BMP antagonist. Thus, it can be expected that the Sog overexpression downregulates TGF-β signaling in fatbody and muscle tissues. If the rescued phenotypes with Sog overexpression can be explained by this logic, pMad level should be examined in these experiments.

      We have shown this data in Figure R-T. We will refer back to this data in Line 347.

      Reviewer #3

      Major comments:

      - Are the key conclusions convincing?

      Most of the conclusions are convincing. It is not clear however whether the ECM accumulation in the fat body of tumor animals is fibrotic and whether it is extracellular or in the cell cortex.

      - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      -The authors state in line 71 'This deposition of disorganized ECM leads to fibrotic ECM accumulation.' The authors haven't really provided evidence for the ECM being fibrotic. The authors could either rephrase this or provide additional experimental evidence of fibrosis in the fat body.

      We will tone down the claim that the ECM accumulation is fibrotic.

      - Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      -The authors state in line 147 "Finally, in tumor-bearing animals fed a sterol-free diet, that underwent a prolonged 3rd instar stage due to reduced ecdysone levels (Parkin and Burnet, 1986), we activated insulin signalling in the fatbody via Akt overexpression (QRasV12, scribRNAi). We found that this manipulation caused a significant decrease in pMad levels in the fatbody and a rescue of muscle detachment (Figure S1 D-I), similar to animals fed a standard diet (Figure 1 O-Q, Figure 2 F-H)." Since the extent of the muscle integrity defect in tumors with an additional sterol-free diet is not already known, it would be important to show a non-tumor control for comparison in Fig. S1F. This would also make clear to what extent the defect is rescued by Akt overexpression.

      We will include a non-tumour control for Fig S1F.

      -The authors state in line 158 'Upon the knockdown of Impl2, we found that tumor gbb was not significantly altered (Figure S3A).' Even though this shows an indication that Gbb levels are not reduced, the n number is too low to state that it is non-significant. The authors should increase the n number here.

      N=3 is generally enough to see a difference. We will include data collected in parallel showing that Impl2 RNAi is sufficient to induce a reduction in Impl2 RNA levels. This will demonstrate that n=3 is sufficient to detect a reduction in transcript levels, if there is one.

      -The authors state in line 171 'Conversely, knockdown of gbb alone or knockdown of gbb together with ImpL2 significantly rescued the Nidogen overaccumulation defects observed at the plasma membrane of fatbody from tumor-bearing animals, while ImpL2RNAi alone did not (Figure S2 Q-U).' This is a somewhat misleading representation, since again no non-tumor control was used, so the extent of the rescue by gbb knockdown is not obvious. In Fig. S2P, Nidogen levels in the tumor seem ~100% higher than in control. But in Fig. S2U, in which no control was included, the tumor + gbb knockdown seems ~20% lower than tumor. So it is probably a more moderate rescue, but that is only possible to assess by including a non-tumor control in Fig. S2U. Also, the images in Fig. S2Q-T don't seem representative, since they appear to show a much bigger difference in fluorescence intensity than ~20%. Please show more representative images.

      We will include a non-tumour control for S2Q-T and show more representative pictures.

      -The authors state in line 174 'Finally, co-knockdown of gbb and ImpL2 in the tumor significantly rescued the reduction in OPP and Nidogen levels observed in the muscles of tumor-bearing animals (Figure S3 B-I).'

      Again, the single knockdowns and the non-tumor control are not shown in FigS3E and I and should be included for comparison and to see the contribution of each knockdown and to be able to judge the extent of the rescue.

      We will include the single knockdowns and a wildtype control.

      -Regarding Fig3O: Is there a significant tumor muscle attachment defect here? In this graph the tumor only looks about 10% lower than the WT (rather than 40% in Fig2E). The other issue is the extremely low n number for WT. I would recommend increasing the n number for WT here and to indicate in the graph whether the tumor is significantly different to WT (or non-significant, in which case RabRNAi wouldn't actually 'rescue' the defect). In the present form, this graph is not very convincing.

      We will increase the n number for WT for this experiment. The reduction here is 10% rather than 40% because this experiment was done at day 6, which we will indicate in the figure legend. The 40% reduction in Fig. 2E is because those samples were processed at day 7. The Rab10RNAi experiment was carried out at day 6 because, by day 7, the Rab10RNAi rescue is so good that most of the tumour-bearing animals have pupated; thus, the experiment could only be carried out at day 6.

      - Regarding Fig3W: A non-tumor control would be important to include to be able to judge the extent of muscle attachment defects and the extent of the rescue for UAS-Sparc. This will allow to assess the severity of muscle integrity defect in this particular experiment (since it appears to vary in different experiments e.g. muscle defect in tumor 40% in Fig2E and ~10% in Fig3O) and to assess the extent of rescue for the various genotypes.

      We will include a non-tumour control for 3W.

      -The authors show an accumulation of ECM in the fat body of tumors. It is not clear, whether this ECM accumulates intracellularly near the cell surface or extracellularly. The authors should assess this, maybe by doing electron microscopy.

      We do not have an EM facility that can accommodate this experiment, thus doing EM is not an option for us. However, we can address whether the accumulation of ECM is intracellular or extracellular by performing an experiment in which we carry out antibody staining against Viking-GFP without permeabilizing the cells. If Viking is detected without permeabilization, it would indicate that the accumulations are extracellular. This approach has previously been used to address this question in Zang et al., eLife, 2015.

      - Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      -These suggested experiments should be quite straightforward since they are mostly just repeating previous experiments with the appropriate controls and n numbers. I would think that they can be done within a few months. The electron microscopy should not take more than a few weeks and not be costly.

      - Are the data and the methods presented in such a way that they can be reproduced?

      -The details on the age of the animals used in each experiment are not easy to find and are not written very clearly. They should be included in each figure legend rather than summarised in the methods.

      We will add the number of days in the figure legend.

      -Also, in line 788 in the methods, several stocks are indicated as coming from particular labs (e.g. UAS-FOXO (Kieran Harvey), UAS-GFP (Kieran Harvey), UAS-lacZRNAi (Kieran Harvey), UAS-RasV12 (Helena Richardson), UAS-cg25C;UAS-Vkg (Brian Stramer)).

      However, it is not clear whether these labs actually made these stocks and if so whether it has already been described in their papers how the lines were made. If the lines are unpublished, the detailed information should be given on how the lines were made. Or if the lines are published, the authors should provide the reference.

      We will fix these references.

      - Are the experiments adequately replicated and statistical analysis adequate?

      In general, the n number is rather low in several experiments, especially n of 3 for many controls. And as I mentioned before, rescues of tumor phenotypes are often shown without including a non-tumor control, making it hard to judge the extent of the rescue. Sometimes this information can be found in other figures, but the reader should not have to search for it. And also the severity of the phenotype can vary from experiment to experiment.

      We will include a non-tumour control when appropriate to address this.

      Minor comments:

      - Specific experimental issues that are easily addressable.

      - Are prior studies referenced appropriately?

      Yes, as far as I can tell.

      - Are the text and figures clear and accurate?

      -In the literature, people usually call it 'fat body' rather than 'fatbody'.

      We will fix this error.

      -The authors state in line 265 "Vkg accumulated in the membranes of fatbody where p60 was overexpressed using r4-GAL4 (Figure 5 A-C)."

      This must be a typo. I think it is shown in Fig5E-G. Unless it's labelled wrongly in the figure and B, C and D show p60 rather than TorDN.

      We will fix this error.

      -The authors state in line 188 'This manipulation significantly rescued muscle integrity (Figure S4 A-C) and muscle atrophy (Figure S4 D-F), without affecting muscle ECM levels (Figure S4 G-H).' According to the graph in FigS4H, this does actually 'affect muscle ECM levels' significantly, in that it reduced Nidogen levels further. The authors could rephrase this.

      We will reword this statement.

    1. Author Response

      The following is the authors’ response to the original reviews.

      This important work reports the identification of a list of proteins that may participate in the clearance of paternal mitochondria during fertilization, which is known to be essential for normal fertilization and embryonic and fetal development. While the main method used is state of the art and the supporting data are solid, the rigor of the biochemical assays and functional validation is inadequate. This work will be of interest to developmental and reproductive biologists working on fertilization. Key revisions (for the authors) include 1) Use a mitochondria-enriched fraction instead of whole sperm for the assays, and add more control samples to monitor what was lost during sperm and oocyte treatments before the co-incubation step. 2) Functional validation of the key proteins identified.

      We thank the Editors of eLife, as well as the Special Issue Guest-Editors and Reviewers, for a favorable assessment and helpful recommendations for key revisions. Provisional revisions included in our revised article are detailed below. We agree with the Editors' comment about the use of mitochondrion-enriched fractions and additional functional validation of key proteins. In fact, we are developing experimental protocols for oocyte extract co-incubation with isolated sperm heads and tails, and eventually with purified mitochondrial sheaths, to separate the ooplasmic sperm nucleus remodeling factors from the mitophagic ones. Such experiments, as well as functional validations using porcine zygotes, are contingent upon the anticipated post-pandemic rebound in the availability of porcine oocytes, which are obtained from ovaries harvested on slaughterhouse floors; harvesting requires a workforce that is currently unavailable, which has hampered our access to this necessary resource.

      Reviewer #1 (Peer Review):

      Could the authors make clear how much the presented pictures reflect the described localisation? There is no information on the number of spermatozoa and embryos observed nor the fraction of these embryos showing the presented pattern of localisation. This must be included.

      Two hundred spermatozoa were counted per replicate of the cell-free system co-incubation and 20 zygotes per replicate, with 3 replicates of immunolabelling for each phase/picture, which were examined to establish the typical localization patterns. The displayed patterns were observed in 65 to 88% of examined spermatozoa/zygotes, varying depending on protein, replicate, and phase of immunolabelling. In all cases, the signal displayed is the typical pattern seen in most cells. This information has been added to the Materials and Methods section for clarification.

      It is not clear whether the authors also examined the localization of other proteins and obtained a different pattern than anticipated from the proteomic approach, or whether they only tested these 6 proteins and obtained 100% correlation.

      These are the 6 proteins which were selected based on extensive literature review into known functions of all identified proteins, as well as extensive research into available and reliable antibodies to detect such proteins within our porcine systems. Even so, no particular localization patterns were anticipated; instead, we presented the patterns actually observed and even some patterns which defied our expectations (i.e., the localization of BAG5 in the sperm acrosome).

      The authors use "MS" in the text to indicate both "mitochondrial sheath" and "mass spectrometry". This is confusing.

      The authors agree and the usage of MS as an acronym for either has been removed entirely to avoid confusion.

      In the introduction, the authors cite Ankel-Simons and Cummins, 1996 as a reference for the number of sperm mitochondria in mammalian species. This is incorrect, since the quoted paper is about the number of mtDNA molecules and itself refers to an earlier publication.

      This has been revised and the appropriate citation has been used.

      Reviewer #2 (Peer Review):

      Major:

      1) Earlier studies from this group have shown that the porcine cell-free system is useful for observing spermatozoa interacting with ooplasmic proteins in a single trial and can recapitulate the sperm mitophagy events that take place in a zygote at fertilization without affecting the later cell-division process. However, post-fertilization sperm mitophagy is a complex, time-associated event involving many processes that occur sequentially and interactively, which means that ooplasmic proteins might be involved in this process without directly interacting with sperm, or may associate with the sperm-ooplasmic protein complex at different time points. Identifying "the candidate players" from the list of 185 proteins is certainly already a great advance in knowledge; however, with the time resolution (4 and 24hr) in the current study and without functional validation experiments at this stage, it is still difficult to postulate the importance of these identified proteins. Functional validation experiments are, in my opinion, critically important for better interpretation of the data.

      The authors agree with this reviewer’s sentiments and do plan to conduct further functional analysis. This project was able to generate a list of candidate sperm-mitophagy-promoting proteins, and we were further able to show that many of these proteins were detectable both via mass spectrometry and via immunocytochemistry in spermatozoa exposed to our cell-free system. Furthermore, similar localization patterns were found in spermatozoa that were detected within newly fertilized zygotes. These results boost our confidence in our cell-free system and show that our list of candidate proteins is truly a useful list for future localization and functional analyses. We are certainly aware that we have not captured every protein that may play a role in post-fertilization sperm mitophagy and that the proteins captured are just candidates until proven otherwise. Likewise, we have almost certainly captured multiple candidate proteins that will not be shown to play a role in post-fertilization sperm mitophagy. It is nonetheless plausible that at least some of these candidate proteins do play a role in mitophagy, and some of them likely participate (perhaps in yet-to-be-described roles) in other fertilization events, in which we would be extremely interested as well.

      2) As shown in Figure 1, whole sperm was used in the co-incubation and the later MS analysis; thus, proteins identified in the current study might be relevant to fertilization processes other than post-fertilization sperm mitophagy, as they may be associated with parts of the sperm other than the midpiece (e.g. the sticky sperm head; PSMG2, for example, associates with the sperm midpiece and tail at 4hr co-incubation, but only with the sperm head at 24hr co-incubation). Although the authors applied immunohistochemistry to show the localization of this protein, that evidence is indirect. How do the authors functionally differentiate the roles of these 6 identified proteins in sperm mitophagy from their roles in other processes, and how do they confirm (or associate) the relevance of these proteins to the sperm mitophagy process?

      The authors agree that the 6 proteins which were further studied by using immunocytochemistry may be playing roles in other processes such as pronuclear formation. We discussed some potential roles, including and beyond post-fertilization mitophagy, in the Supplemental Discussion. After reviewer comments, we moved the Supplemental Discussion back into the main Discussion section. Thus, this section now considers additional putative pathways in which the said 6 proteins could participate, though we concede that thorough functional studies must still be performed.

      3) Class 3 proteins were present in both the gametes or only the primed control spermatozoa, but were decreased in the spermatozoa after co-incubation, which the authors interpreted as sperm-borne mitophagy determinants and/or sperm-borne proteolytic substrates of the oocyte autophagic system. This data categorization may need to be revised to sperm-borne proteolytic substrates of the oocyte autophagic system only, not sperm-borne mitophagy determinants. The argument for this disagreement is that if the protein is a sperm-borne mitophagy determinant, then after co-incubation, to execute the mitophagy process, this protein should still be associated with the sperm at least at the early stage (4hr) (constant under MS detection when comparing control with 4hr treated) rather than being released from the sperm. Or alternatively, they could result in class 3 proteins (but not all those 6 were in class 3). Nevertheless, if these proteins serve as substrates, they can be used (consumed) and show a decrease under MS detection.

      This argument for redefining the Class 3 proteins more accurately is understood and we agree. The definition is revised in the paper.

      4) MVP is of particular interest among the 6 proteins that were further investigated. Unlike the other proteins, MVP was highly significant (p<0.001) after 4hr incubation, but the significance decreased after 24hr (p=0.19). An interpretation of this dynamic change in the context of the mitophagy process would help readers understand the relevance and the role of MVP.

      The differences in significance are likely influenced by the abundance of MVP detectable by mass spectrometry. As the time of cell-free system incubation increases, the variability between replicates also seemed to increase, likely due to the sustained proteolytic activity taking place in our system. This work was based on three replicates of mass spectrometry for each time point; additional replicates likely would have reduced the p-value for the 24hr cell-free data set, for MVP and potentially other proteins also. At both time points, MVP was only detectable in spermatozoa after they had been exposed to the cell-free system treatment. This is the criterion that truly interested us, more than the actual differences in content between the time points, and it is why MVP was added to our list of candidate proteins.

      5) In Figure 3, the association of ooplasmic MVP with the sperm midpiece is not convincing enough, as the sperm midpiece and tail often show some level of non-specific signal under fluorescent microscopy. It is also difficult to reach a conclusion about the dynamic association of ooplasmic MVP with the sperm midpiece in Fig. 3F-G solely on the basis of the data presented in the manuscript. An additional negative control of sperm MVP staining from the primed and treated sperm would be helpful. Additionally, a quantitative comparison (15 vs 25hr) of sperm-associated MVP signals from the fertilized embryo, or a stack image from different angles, would clarify the doubts raised here.

      For all images and all replicates, serum controls were also generated. These controls were then viewed under the fluorescent microscope, and light intensities and exposure thresholds for each fluorescent light channel were set based on the background intensity that came from these non-immune serum-treated control samples. We set our light intensity/acquisition time below the threshold at which the non-specific signal began to appear. All the presented patterns are based on setting this peak intensity threshold, and as such the signal we see should be the true signal. Furthermore, 200 spermatozoa were counted per treatment per replicate of the cell-free system co-incubation and 20 zygotes per replicate, with 3 replicates of immunolabelling for each protein and data point, which were used to establish the typical localization patterns that were observed. The displayed patterns were observed in 65-88% of examined spermatozoa/zygotes. Invariably, the signal displayed in the manuscript is the typical pattern that was seen in a majority of cells. This information has now been added to the Materials & Methods section for clarification.

      6) Same concerns for the other 5 proteins (PSMG2, PSMA3, FUNDC2, SAMM50, BAG5) as indicated above.

      See response to Question 5.

      7) The patterns of these 6 proteins under the immunofluorescent study are confusing, as the pattern varies after co-incubation (treated) and, mostly, the signal of these proteins observed from the fertilized embryos is not really associated with sperm midpieces. Therefore, the evidence that these proteins are involved in post-fertilization sperm mitophagy is, at this moment, weak based on the data presented. The relevance of these proteins to post-fertilization events or early embryo development is certain, but in my opinion the evidence is not strong enough to support "sperm mitophagy."

      The authors agree that some of these proteins seem to be playing roles beyond post-fertilization sperm mitophagy and that there is a need for true functional studies before the authors can state with certainty that these proteins play a role in any of the discussed fertilization events. We state this in the discussion: “Considering the dynamic proteomic remodeling of both the oocyte and spermatozoa which takes place during early fertilization, these 185 proteins which have been identified likely play roles in processes beyond sperm mitophagy.” It should be noted that the authors went into greater detail about potential alternative protein functions based on the present data and literature review in the Supplemental Discussion. Based on this comment and other reviewer comments we have now included the Supplemental Discussion as part of the main Discussion section, and this will hopefully help clarify some of the authors’ thoughts about the 6 candidate proteins which were further analyzed during this study.

      Minor:

      1) To my understanding, statistical significance (relevance) is normally set at a p-value of either <0.1 or <0.05. The reason for loosening the p-value threshold to 0.2 in the current study needs to be justified, as this is not a common statistical criterion, and the candidates that pass only this loosened criterion should be interpreted with care.

      The loosening of the statistical threshold to 0.2 in this study only applied to our Class 1 proteins. To fall into Class 1, a protein had to be present only in samples that had been exposed to the cell-free system. For these Class 1 proteins, this was the case in all 3 replicates at each stated time point. We found this pattern of detection to be important whether the p-value fell under 0.1 or 0.2. As such, we loosened our statistical threshold for our Class 1 proteins. Any proteins added to our candidate list will be subject to further investigation before definitive conclusions can be drawn, and as such we think that capturing more proteins was more important for the goals of this study than limiting the number of proteins captured, especially for those Class 1 proteins. An explanation of this has been added to the Materials & Methods section Mass Spectrometry Data Statistical Analysis.

      2) First cell cleavage of porcine embryo normally occurs within 48hr post-insemination or activation; therefore, the 4 and the 24hr time points used in the current study require justification included in the discussion or methods and material section.

      First cleavage of porcine embryos normally occurs around 24-28 hours post-insemination. Thus, for both the cell-free system and the embryo studies we were capturing an advanced 1-cell-stage zygote/zygote-like system with our 24-hour and 25-hour time points.

      3) In Figure 2, the colors used at different time points and in the two different classes (sometimes) represent different protein categories. Quick comparisons would be easier for readers if the same color represented the same protein category throughout the graph. (E.g., proteins for early zygote development are shown in red in "A", but blue in "B")

      This has been corrected and the color scheme for Figure 2 has been revised for easier comparisons.

      Reviewer #3 (Peer Review):

      I am not used to seeing a supplementary discussion in a manuscript. I also believe it should be incorporated into normal discussion.

      The Supplemental Discussion has been incorporated into the main Discussion now.

      It would be very helpful to add a figure in which the proposed interactome of the identified factors with the sperm mitochondria before and after incubation is drawn schematically, and which also shows the factors that were not identified in either case (when compared with somatic mito- or autophagy). This would make the discussion easier to follow and would beautifully summarize and illustrate the importance of, and the progress the authors have made with, this assay.

      We made a diagram that depicts the changes in protein localization patterns over time within our cell-free system. This diagram has been added to the manuscript as Figure 9.

      Reviewer #1 (Public Review):

      In this manuscript, the authors used an unbiased method to identify proteins from porcine oocyte extracts associated with permeabilised boar spermatozoa in vitro. The identification of the proteins is done by mass spectrometry. A previous publication of this lab validated the cell-free extract purification methods as recapitulating early events after sperm entry in the oocyte. This novel method with mammalian gametes has the advantage that it can be done with many spermatozoa at a time, allowing the identification of proteins associated with many permeabilised boar spermatozoa at once. This allowed the authors to establish a list of proteins either enriched or depleted after incubation with the oocyte extract, or even only associated with spermatozoa after incubation for 4h or 24h. The total number of proteins identified in their test is around two hundred, with very few present in the sample only when spermatozoa were incubated with the extracts. The proteins identified using this approach and these criteria are likely associated with spermatozoa remnants after their entry and either removed or recruited for the transformation of spermatozoa-derived structures. Using WB and histochemistry labelling of spermatozoa and early embryos with specific antibodies, the authors confirmed the association/dissociation of 6 proteins suspected to be involved in autophagy.

      While this unique approach provides a list of potential proteins involved in sperm mitochondria clearance, it is (only) a starting point for many future studies and does not demonstrate that any of these proteins indeed has a role in the processes leading to sperm mitochondria clearance, since the proteins identified may also be involved in other processes going on in the oocyte at this time of early development.

      We thank reviewer 1 for positive comments. We added a sentence in the Discussion addressing the obvious shortcoming of the present study, as further functional validations of candidate mitophagy factors are planned.

      Concerning the localisation of the 6 proteins further analysed, the authors must state how representative the presented pictures are of the observed patterns. They must include details on the fraction of spermatozoa and embryos displaying the presented pattern.

      We now specify that the patterns depicted in the manuscript are typical and representative of data from at least three replicates of immunolabeling in spermatozoa and zygotes. For each of these replicates, 200 spermatozoa were examined per replicate of the cell-free system co-incubation or 20 zygotes per replicate. The displayed patterns were observed in 65-88% of examined spermatozoa/zygotes. Invariably, the signal displayed in the manuscript is the typical pattern that was seen in a majority of cells. This information has now been added to the Materials & Methods section for clarification.

      Reviewer #2 (Public Review):

      Mitochondria are essential cellular organelles that generate ATP as the energy source for maintaining regular cellular functions. However, the degradation of sperm-borne mitochondria after fertilization is a conserved event, known as mitophagy, that ensures the exclusively maternal inheritance of the mitochondrial DNA genome. Defects in post-fertilization sperm mitophagy will lead to fatal consequences in patients. Therefore, understanding the cellular and molecular regulation of the post-fertilization sperm mitophagy process is critically important. In this study, Zuidema et al. applied mass spectrometry in conjunction with a porcine cell-free system to identify potential autophagic cofactors involved in post-fertilization sperm mitophagy. They identified a list of 185 proteins that might be candidates for mitophagy determinants (or their co-factors). Although 6 (out of 185) proteins were further studied, based on their known functions, using a porcine cell-free system in conjunction with immunocytochemistry and Western blotting to characterize the localization and modification changes of these proteins, no further functional validation experiments were performed. Nevertheless, the data presented in the current study are of great interest and could be important for future studies in this field.

      We thank reviewer 2 for positive comments. As we explain in our response to the Editors and Reviewer 1, further validation studies will be resumed once the availability of slaughterhouse ovaries for such studies improves. Examples of such functional validation of the pro-mitophagic proteins SQSTM1 and VCP are included in our previous studies (DOI: 10.1073/pnas.1605844113 and DOI: 10.3390/cells10092450) that led to the development of the cell-free system reported here, and are cited in the present study.

      Reviewer #3 (Public Review):

      In this manuscript, a cytosolic extract of porcine oocytes is prepared. To this end, the authors aspirated follicles from ovaries obtained from the slaughterhouse, first maturing the oocytes to the metaphase II stage (one polar body). Cumulus cells (hyaluronidase treatment) and the zona pellucida (pronase treatment) were removed, and the resulting naked mature oocytes (1000 per portion) were extracted in a buffer containing a divalent cation chelator, beta-mercaptoethanol, protease inhibitors, and a creatine kinase phosphocreatine cocktail for energy regeneration. The preparation was subsequently triple frozen/thawed in liquid nitrogen and crushed by 16 kG centrifugation. The supernatant (1.5 mL) was harvested, and 10 microliters of it were used for interaction with 10,000 permeabilized boar sperm (each 10 microliters of extract thus represents the cytosol fraction of 6.67 oocytes). In this assay, the sperm were treated with DTT and lysoPC to prime the sperm's mitochondrial sheath. After incubation and washing, these preps were used for Western blot (see point 2), for fluorescence microscopy, and for proteomic identification of proteins.
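
      The 6.67 oocyte-equivalent figure quoted above follows directly from the stated volumes. A minimal sketch of that arithmetic, assuming the contents of all 1000 oocytes are evenly distributed in the 1.5 mL of supernatant (the variable names are illustrative and do not come from the manuscript):

```python
# Illustrative arithmetic only; variable names are hypothetical.
oocytes_per_portion = 1000      # naked mature oocytes extracted per portion
supernatant_ul = 1500.0         # 1.5 mL of supernatant harvested
extract_per_assay_ul = 10.0     # extract volume used per assay
sperm_per_assay = 10_000        # permeabilized sperm per assay

# Assumes oocyte contents are spread evenly through the supernatant.
oocyte_equivalents = oocytes_per_portion / supernatant_ul * extract_per_assay_ul
sperm_per_oocyte_equivalent = sperm_per_assay / oocyte_equivalents

print(f"{oocyte_equivalents:.2f} oocyte equivalents per assay")
print(f"{sperm_per_oocyte_equivalent:.0f} sperm per oocyte equivalent")
```

      Under that assumption each assay exposes roughly 1,500 sperm to the cytosol of a single oocyte, which gives a sense of scale for the reviewer's concern about protein yield per assay.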

      Points for consideration:

      1) The treatment of sperm cells with DTT and lysoPC will permeabilize sperm cells but will also cause the liberation of soluble proteins as well as proteins that may interact with sperm structures via oxidized cysteine groups (disulfide bridges between proteins that will be reduced by DTT).

      This is certainly a possibility. The lysoPC and DTT permeabilization steps were designed to mimic the natural processing (plasma membrane removal and sperm protein disulfide bond reduction) that the spermatozoa would undergo during fertilization. We do realize that this is chemically induced processing and thus not a perfect recapitulation of fertilization processes. However, in this study and in previous studies with this system, we were able to show alignment between proteomic interactions taking place in the cell-free system and within the zygotes.

      2) Figure 3: Did the authors really make Western blots with the stated amounts of sperm cells and oocyte extracts? The description in the figures is not clear. This point relates to point 1. The proteins should also be detected in the following preparations: (1) the oocyte extract only (done); (2) unextracted nude oocytes, to see which potentially relevant proteins are lost by the extraction procedure (not done); (3) the permeabilized (LPC and DTT treated and washed) sperm only (not done); (4) intact sperm (done); (5) the material recovered after the assay, in which 10,000 permeabilized sperm and the equivalent of 6.67 oocyte extracts were incubated and washed 3 times (or higher amounts after this incubation; not done). Note that the amount of sperm from one assay (10,000) will likely give insufficient protein for proper Western blotting and/or Coomassie staining. In the materials and methods, I cannot find how the permeabilized sperm were subjected to Western blotting after incubation. I only see how 50 oocyte extracts and 100 million sperm were processed separately for Western blot.

      The authors did make Western blots with the number of spermatozoa and oocytes stated in the materials and methods, a total protein equivalent of 10 to 20 million spermatozoa (equivalent to ~20-40 µg of total protein load) and 100 MII oocytes (equivalent to ~20 µg of total protein load). These numbers have been corrected in the Materials & Methods. Also, we did find in the Materials & Methods section that the Co-Incubation of Permeabilized Mammalian Spermatozoa with Porcine Oocyte Extracts section refers to using cell-free exposed spermatozoa for electrophoresis; however, for none of the presented Western blot work was this true. Rather, all of the presented Western blots as per their descriptions are utilizing ejaculated or capacitated sperm or oocytes. This line has been removed from the Materials & Methods to reduce confusion.

      Regarding preparation (2), we have previously assessed the difference between oocyte extract and intact oocytes in this manner internally and we are certainly losing proteins due to the oocyte extraction process. We make caveats in this vein throughout the article such as: “Furthermore, this cell-free system while useful does not perfectly capture all the events which take place during in vivo fertilization. The cell-free system is intended to mimic early fertilization events but is presumably not the exact same as in vitro fertilization.”

      3) Figures 4, 5, 6, 7, and 8: see point 2. Beyond these conditions, I also miss condition 1, despite the fact that the imaged ooplasm does show positive staining.

      For all the presented Western blots, the tissue type is stated in the image description and the protocol which was used to prepare these samples is stated in the Materials & Methods.

      4) These points 1-3 are all required for understanding what is lost in the sperm and oocyte treatments prior to the incubation step as well as the putative origin of proteins that were shown to interact with the mitochondrial sheath of the oocyte extract incubated permeabilized sperm cells after triple washing. Is the origin from sperm only (Figs 5-8) or also from the oocyte? Is the sperm treatment prior to incubation losing factors of interest (denaturation by DTT or dissolving of interacting proteins preincubation Figs 3-8)?

      The authors understand that there are proteins and interactions lost on both sides of the cell-free system equation and we have added a sentence to the Discussion to caveat this limitation in the system.

      5) Mass spectrometry of the permeabilized sperm incubated with oocyte extracts and subsequent washing has been chosen to identify proteins involved in the autophagy (or cofactors thereof). The interaction of a number of such factors with the mitochondrial sheath of sperm has been shown in some cases from sperm and others for an oocyte origin. Therefore, it is surprising that the authors have not sub-fractionated the sperm after this incubation to work with a mitochondrial-enriched subfraction. I am very positive about the porcine cell-free assay approach and the results presented here. However, I feel that the shortcomings of the assay are not well discussed (see points 1-5) and some of these points could easily be experimentally implemented in a revised version of this manuscript while others should at least be discussed.

      We agree that the use of a mitochondrial-enriched subfraction for further analysis would be interesting and useful. We are actively developing experimental protocols for oocyte extract co-incubation with isolated sperm heads and tails, and eventually with purified mitochondrial sheaths. However, such experiments are contingent upon our access to porcine oocytes, which has continued to be a struggle since the COVID-19 pandemic compromised our ability to obtain oocytes in large, cheap, and reliable quantities. This was a continuous problem in preparing materials for this very paper and has continued to be an issue for our laboratory as well as many others at our university and across the country. We continue to make the most of oocytes every time we can get access to them, but the unfortunate reality is that this access has become sparse and unreliable over the past three years.

    1. Costanza-Chock explains that we should be designing algorithms that are just.45 This means shifting from the ahistorical notion of fairness to a model of equity.

      This reminds me of a metaphor my high school used to properly explain the difference between equality and equity. Let's say there's a fence, and on the other side is a baseball game that you and your friend are trying to peek over and watch. You each get a box to stand on, and now you can see over the fence! Your friend, however, is shorter than you and still can't reach. Although you may have the same box to stand on (equality), in order to get the same opportunity to watch the game you have to put effort into making sure that everyone actually receives that truly equal opportunity, e.g. another box for your friend.

      Costanza-Chock's example of college admissions to explain equality vs. equity also makes me think about what kinds of digital barriers exist in place to prevent restorative justice. Issues such as accessibility, class, and status keep coming up for me and now I'm wondering: How does class background influence the attempts made by digital humanities scholars who try to perform this restoration?

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer 1 (Recommendations For The Authors):

      1) The strikingly different conclusion from the previous Bourane study seems to stem from the experimental approaches. Rather than using genetic crosses that target all neurons from the hindbrain and spinal cord that express Npy at any point in development, Boyle et al target their manipulations specifically to the lumbar region of the superficial dorsal horn in adult mice using direct viral injections. Thus, Boyle is almost certainly manipulating far fewer neurons than the original study. How then are their behavioral effects so much greater? At the minimum, the authors need to discuss this discrepancy head on. Better would be a direct molecular/anatomical comparison of the neurons targeted by each approach. This could be done using Npy-Cre mice crossed to a Rosa-LSL-reporter strain and quantifying the overlap with the same markers used here. Perhaps the intersectional approach with Lbx1 resulted in labeling of a different population of neurons than the adult AAV injections? Although likely outside the scope, given this work directly questions the main conclusion of the Bourane paper, it will be important to see a replication of the original finding of selectivity to mechanical itch.

      We agree that our approach should be manipulating a smaller population of neurons, and that it is therefore surprising that we see greater behavioural effects. Please see our response to "Weakness 1" of Reviewer 2 for consideration of this point. We have already provided a direct molecular comparison as requested by the reviewer, and this appears in Figure 1 supplement 1. Here we used tissue from NPY::Cre mice that had been crossed with Ai9 mice (i.e. a Rosa-LSL-reporter) and had received intraspinal injections of AAV.flex.GFP. We then characterised the neurochemistry of tdTomato+ cells that were GFP+ or GFP-negative.

      2) The authors state that, "91.6% ± 0.3% of cells classed as Cre-positive cells were also Npy-positive, and these accounted for 62.1% ± 0.6% of Npy-positive cells" If I am reading this correctly, does that mean that 40% of the Npy+ cells are Cre negative? If so, how is this possible?

      This interpretation is correct. For quantification of RNAscope data we used a cut-off level of 4 transcripts, and cells with fewer than 4 transcripts were classed as negative. It is likely that some of the NPY cells classified as negative for Cre would have had some Cre mRNA (sufficient to cause recombination), but at a level below this threshold. It is also possible that some NPY+ cells would fail to express Cre, since this is a BAC transgenic mouse, rather than a knock-in.
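
      The thresholding described in this response can be illustrated with a minimal sketch. Only the cut-off of 4 transcripts comes from the response; the function names, the dictionary layout, and the transcript counts are hypothetical and are not the authors' data or analysis code.

```python
# Sketch of cut-off based classification of RNAscope transcript counts.
# Only the cut-off (4 transcripts) is taken from the author response;
# all counts below are invented for illustration.
CUTOFF = 4


def is_positive(transcripts, cutoff=CUTOFF):
    """A cell with fewer than `cutoff` transcripts is classed as negative."""
    return transcripts >= cutoff


def overlap_percentages(cells):
    """Return (% of Cre+ cells that are Npy+, % of Npy+ cells that are Cre+)."""
    cre_pos = [c for c in cells if is_positive(c["cre"])]
    npy_pos = [c for c in cells if is_positive(c["npy"])]
    both = [c for c in cre_pos if is_positive(c["npy"])]
    return 100 * len(both) / len(cre_pos), 100 * len(both) / len(npy_pos)


example_cells = [
    {"cre": 5, "npy": 9},   # positive for both probes
    {"cre": 3, "npy": 12},  # Npy+ with sub-threshold Cre: classed as Cre-negative
    {"cre": 0, "npy": 7},   # Npy+ with no detectable Cre
    {"cre": 6, "npy": 1},   # Cre+ but Npy-negative
]
pct_of_cre, pct_of_npy = overlap_percentages(example_cells)
```

      The second example cell shows the case raised in the response: a cell may carry some Cre mRNA yet still be counted as Cre-negative because it falls below the cut-off, which lowers the apparent fraction of Npy-positive cells that express Cre.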

      3) Similarly, the authors state that "great majority of FP-expressing neurons in laminae I-III were immunoreactive (IR) for NPY (78.5% ± 3.6%), and these accounted for 74.6% ± 1.9% of the NPY-IR neurons in this area". So does this mean 20% of the recombination is non-specific/in other cell types that could be involved in pain/itch sensation?

      Our finding that 91.6% of cells with Cre mRNA were also positive for Npy mRNA (see above) indicates that Cre expression was largely restricted to NPY cells. The failure to detect NPY peptide in some of these cells probably results from the relatively low level of peptide seen in the cell bodies of peptidergic neurons, which results from the rapid transport of peptides into their axons.

      4) Comparing Fig 3B and Fig4B it seems the control baseline von Frey responses are different. In fact, baseline response in Fig4b is quite like the CNO effect in Fig 3B. Unless I'm misunderstanding something, this seems quite odd?

      We agree that there is a difference between the baseline responses. We are not aware of any particular reason for this, and we think that it reflects a degree of variability that is seen with the von Frey test. Interestingly, the baseline values for the SNI cohort (Fig 4E) lie between the values in Fig 3B and Fig 4B.

      5) In Fig 4E, the behavior of the CNO treated mice is quite variable. Can the authors comment as to how this might be happening? Does the effect correlate with viral transduction?

      We did not see any obvious correlation between the extent of viral transduction and the behaviour of individual mice.

      6) Fig6, the PDyn-Cre experiment, is a bit of a non sequitur?

      Please see our response to "Weakness 2" of Reviewer 2 for consideration of this point.

      7) The conclusion is unusually long. I recommend trimming it to make it more concise.

      We presume that this refers to the Discussion. However, this was ~1550 words, and we do not feel that that is unusually long.

      Reviewer 2 (Public Review):

      Weaknesses

      1) There is inadequate discussion about previous studies of NPY interneurons. Specifically, the authors should address why a more restricted subset of these neurons (this study) have broader effects than seen previously.

      We have expanded the discussion on the discrepancies between our findings and those reported previously. We state at the outset that we are targeting a more restricted population (lines 509-10), and we now go into more detail concerning both similarities and differences between our findings and the reasons that we think may underlie any discrepancies (various changes between lines 522-575).

      2) I cannot see the reason for including results from manipulation of Dyn+ interneurons in this paper. First, the title does not reflect roles of spinal Dyn+ population. In addition, without further experiments characterizing relationships between NPY and Dyn interneurons in modulating itch and/or nociception, Dyn datasets seem to deviate from the main theme.

      We had previously shown that activating Dyn-INs suppressed pruritogen-evoked itch (Huang et al 2018), but it was important to test whether silencing these cells would have the opposite effect. Our finding of overlap in function (i.e. both NPY-INs and Dyn-INs suppress itch, and that both innervate GRPR cells) provides strong evidence against the idea that neurochemically-defined interneuron populations have highly specific functions, and we now state this in the Discussion. The anatomical experiments (which follow on from the functional studies) provide important new information concerning synaptic circuitry of the dorsal horn, by showing that NPY-INs preferentially innervate GRPR cells, and provide around twice as many synapses on these cells, compared to the Dyn-INs. Interestingly, this correlates with the relatively large optogenetically-evoked IPSCs that we saw when NPY-INs were activated, compared to those reported by Liu et al (2019) when galanin-expressing cells (which largely correspond to Dyn-INs) were activated. By including these findings in the paper, we are able to make comparisons between these two populations.

      3) While the authors provided convincing evidence that GRPR+ neurons serve as a downstream effector of NPY+ neuron evoked itch, the relationship between GRPR and NPY neurons in modulating pain is not examined. Therefore, Fig. 7B is pure speculation and should be removed.

      We feel that our recent findings that GRPR neurons correspond to vertical cells, that they respond to noxious stimuli, and that activating them results in pain-related behaviours, makes it reasonable to speculate that the NPY/GRPR circuit may also be involved in the anti-nociceptive action of NPY cells. The legend for Fig 7B already refers to this as a "potential circuit", and we have toned down the corresponding part of the discussion to say that our findings "raise the possibility" that this is the case (lines 605-7). We feel that this part of the figure is important, as otherwise our summary diagram ignores some of the main findings of the paper, and we hope that this is now acceptable.

      Recommendations For The Authors

      1) Fig. 1G: the "misexpression" of tdTomato neurons was much more prominent in deep dorsal horn laminae but not in the superficial ones. Was this representative? Can the authors perform a laminae specific characterization?

      We did test for this possibility in 2 NPY::Cre;Ai9 mice that had received intraspinal injections of AAV.flex.GFP, and found that there was a modest difference - 62% of tdTomato+ cells in laminae I-II, but only 39% of those in lamina III, were GFP+. This suggests that "misexpression" may have differed slightly between these regions. However, since the difference was quite modest, and we were only able to analyse tissue from two mice in this way, we did not include these findings in the paper.

      2) I have a lot of problems interpreting the c-Fos data in Fig. 2 E and F. For the mCherry- population, how was the quantification performed? From the image, it does not look like 20-30% of cells express c-Fos; at a minimum a clear stain of neurons would be needed. Similarly, the identification of NPY cells is not particularly convincing (e.g., middle arrowhead lower 2 panels in C).

      We have provided further details on how the analysis was performed (changes made to lines 1016-29). NeuN staining was used to reveal all neurons, and a modified optical disector method was performed from somatotopically appropriate regions of the dorsal horn. As noted by the Reviewer, NeuN staining was required to allow identification of mCherry-negative cells. However, we have not included the NeuN immunoreactivity in the image, as this would add considerably to the complexity. These images are from single optical sections, and therefore the overall numbers of cells are low (in comparison to what would be seen in a projected image). The intensity of mCherry staining varied between cells. However, for all mCherry-positive cells (including the example referred to by the Reviewer), there was clear staining in the membrane, which could be followed in serial sections.

      3) Please add individual data points for all quantifications.

      These have been added.

      Reviewer 3 Recommendations For The Authors:

      1) It is somewhat surprising that there is no effect on CPP after activating spinal NPY neurons in neuropathic mice, given the almost complete rescue of hypersensitivity to baseline values in the nociceptive tests. Based on the methods, it appears that conditioning was carried out already 5 min after CNO injection. Yet, suppression of c-fos activity in excitatory spinal dh neurons was observed 30min after CNO injection. Also, it is not clear to me when CNO was injected prior to the nociceptive or CQ testing?

      Have the authors considered that conditioning from 5-35 min after CNO injection might be too short after CNO injection to achieve a profound analgetic effect?

      In a previous study (Polgár et al 2023), we had observed the timecourse of CNO-evoked itch and pain behaviours in mice in which GRPR cells expressed hM3Dq. We found that these started within 5 minutes of i.p. CNO injection (e.g. Fig S2 in that paper). In addition, the timecourse of action of gabapentin and CNO (both given i.p.) are likely to be similar, and there was a preference for the chamber paired with gabapentin. We are therefore confident that the conditioning period with CNO was adequate. We now explain this in the Methods section (lines 846-52). The timing of CNO injections for the nociceptive and CQ tests is now described (lines 749-55).

      2) The authors claim that tonic pain was not affected based on the conditioned place preference test. Efficacy in withdrawal response tests and in the CPP differ by more than duration of the stimulus. I'd suggest using more cautious wording here.

      We agree that caution is needed in interpreting the results of the CPP experiments. We have therefore replaced "does" with "may" in the Results section (line 336) and "did" with "may" in the Discussion (line 620).

      3) On page 9 the authors state "...suggesting that they suppress the transmission of pain- and itch-related information in the dorsal horn." However, pain is not affected in the loss of function experiments suggesting some qualitative differences in the role of the NPY neurons in itch and pain. This should also be reflected more clearly in this statement and in the discussion e.g. "suppress itch" and "can suppress pain".

      We accept the point made by the Reviewer. We have slightly altered the wording in lines 249-51 and 610 to reflect this.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We thank the four reviewers for their generally positive feedback on the manuscript. Below, we provide a point-by-point response to each reviewer.

      We are performing new FCS and gradient measurements as suggested by the reviewers. We are confident we can have these completed within three months (accounting for the summer break).


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      *This manuscript reports a very thorough and careful study of the mobility of Bicoid in the early embryo, explored with single-point fluorescence correlation spectroscopy. Although previous groups have looked into this question in the past, the work presented here is novel and interesting because of the different Bicoid mutants and constructs the authors have examined, in particular with the goal of understanding the role of the protein DNA-binding homeodomain. The authors convincingly show that there is a significant increase in Bicoid dynamics from the anterior to the posterior region of the embryo, and that the homeodomain plays an important role in regulating the protein's dynamics. Their experiments are very well designed and carefully analyzed. The authors also modelled gradient formation to see whether this change in dynamics might play a role in setting the shape of the gradient. I am not sure I fully agree with their conclusion that it does, as mentioned in my comment below. However, it is an interesting discussion to have, and I think this paper makes a significant advance in our understanding of Bicoid's behavior in the early embryo. *

      We thank the Reviewer for their positive comments and their suggestions for improving the manuscript. We will resolve the concerns raised by the reviewer with clarity in the revision. We will also add additional comment in the Discussion regarding the interpretation of our results.

      *Major comments: *

      • 1) Gradient profile quantification: Some of the conclusions made by the authors rely on the comparison between their model of gradient formation (as captured in the equations in lines 232 and 233) and the Bcd intensity profile measured in the embryos. Since the differences in gradient shape predicted by the different models are very small (see Fig. 3B, which is on a log scale and therefore emphasizes small differences, and Fig. 3C), it is very important to understand how reliable the experimental concentration profiles are.*

      This is a fair comment. It is worth noting that the key differences between the 1- and 2-component models are only apparent at large distances (and hence low concentrations) from the source.

      We performed the quantification of the gradients in a manner similar to the Gregor lab, whereby the midsagittal plane is analysed. We used 488nm illumination (rather than 2-photon, as the Gregor lab does) so our measurements are likely noisier. However, we are not investigating the variability in the gradient here, but the mean extent. We currently correct background with a uniform subtraction, but we appreciate that is not the optimal method.

      In the revised manuscript, we will repeat the above experiments using a 2-photon microscope. Further, we will image lines expressing His::mcherry without eGFP under the same imaging conditions to more accurately estimate the background signal. While we expect this to improve the data quality, we do not envisage significant change to the observed profiles based on prior experience.

      At the moment, I do not find the evidence that [Bcd] concentration profile is more consistent with a 2-component diffusion model than a 1-component model very strong. A few comments related to this: * * 1a. Line 249, it is mentioned that: "observations ... incompatible with the SDD model". Which observations exactly are incompatible with the SDD model?

      The key points are in the preceding paragraph. We will improve the model presentation in the Results and also include further contextualisation in the Discussion.

      1b. In Fig. 3D, only the prediction of the 2-component model is shown. What would the simple 1-component diffusion model look like? Is it really incompatible with the data?

      We agree with this comment and will provide the 1-component fit to the gradient profiles. We expect it to fit well for the anterior half of the embryo but fail at larger distances (as has been previously shown).
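
      For orientation, the 1-component fit referred to here is usually taken to be the steady state of the standard synthesis-diffusion-degradation (SDD) model. The expressions below are the textbook form; the symbols (D for the diffusion coefficient, k for the degradation rate, C_0 for the concentration at the source) are assumed notation and are not taken from the manuscript.

```latex
% Steady state of the 1-component SDD model, for x > 0 away from a source at x = 0
D \frac{d^{2}C}{dx^{2}} - k\,C = 0
\quad\Longrightarrow\quad
C(x) = C_{0}\, e^{-x/\lambda},
\qquad
\lambda = \sqrt{D/k}
% On a logarithmic scale this is a straight line:
\ln C(x) = \ln C_{0} - \frac{x}{\lambda}
```

      A systematic departure of the measured log-profile from a single straight line at large x is therefore the signature that a 1-component fit cannot capture.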

      Regarding the FCS data, we also show one and two component fits. We will show the alternative fits – a 2 particle fit is clearly an improvement (see also related response to reviewer 2).

      1c. Line 243: "The increased fraction in the fast form ... consistent with experimental observation of Bcd in the most posterior" (Mir et al.). I am not sure how this is significant, since the simple model also predicts there will be Bcd in the posterior - the only difference is how much is there (as shown in Fig. 3C), and it's a very small difference.

      The absolute differences are not large between the two models, but due to the observed clustering (Mir et al. 2018), even small differences can have very large effects. In the revision we will provide estimates of the actual concentration differences.

      We are performing new experiments with the Fritzsche lab at Oxford to estimate if there is clustering of Bcd. We will also repeat our FCS experiments to validate our key conclusion of AP differences in diffusion of Bcd. These should be completed by the end of the summer.

      1d. Since the difference between models is in the posterior region where Bcd concentration is very low, when comparing the models to the data the question of background subtraction is essential. How was the subtracted background (mentioned line 612) estimated?

      See above response to the first comment.

      1e. Along the same line, were the detectors on the Zeiss LSM analog or photon counting detectors, and how confident can we be that signal is exactly proportional to concentration?

      We used PMTs and did not directly do photon counting. But the intensity is still proportional to the concentration. It is possible to estimate the absolute concentration value, e.g., Zhang et al., 2021 (https://doi.org/10.1016/j.bpj.2021.06.035). However, our main conclusions – especially regarding the spatially varying Bcd dynamics – are not dependent on this.

      1f. Can the gradients created by the two Bcd mutants (FIg. 4B) be quantified as well, and are they any different from the original Bcd gradient?

      We agree this would be useful. We will provide the gradient quantifications of the bcd mutants in the revision.

      1e. What is the pink line in Figure 5C (I am assuming the green one is the same as in Fig. 3D)? It could be better to not use normalization here, or normalize everything respective to the eGFP::Bcd data to make comparison in relative concentrations in the posterior for different constructs more evident (also maybe different colors for the three different data sets would help clarity).

      This is a fair comment, and we will create graphs with new data for better visualisation.

      1f. Discussion, lines 402-403: Does the detailed shape of the Bcd in the posterior region matter at all, since the posterior is not a region where Bicoid is active, as far as we know? Could a varying Bcd dynamics have other consequences that would be more biologically relevant?

      Bcd is now known to act at 70% EL (Singh et al., Cell Reports 2022). So, the gradient is relevant for a large extent of the embryo length, though it is not known if there is any effect in the most posterior region.

      2) Model for gradient formation (lines 231-238): * * 2a. Whether the molecules of Bcd can change from their fast to slow form is never questioned. How do we know (or why might we suspect) they do exchange?

      This is a good point. Within the nucleus, and based on our mutant data, we suspect the fast/slow forms correspond to unbound/bound DNA states.

      In the cytoplasm, the dynamics are less clear. Bcd can bind to cytoskeletal elements (Cai et al., PLoS One 2017) as well as to Caudal mRNA. Therefore, it seems reasonable to have different effective dynamic modes – yet, how such switching occurs remains unclear.

      Ultimately, our model approximates multiple dynamic modes that are integrated to drive Bcd motion. Including switching between states is a reasonable assumption based on what is known about cytoskeletal and protein dynamics, but we do not have a specific mechanism.

      It is challenging to estimate a specific kon / koff rate, as the dynamic changes also depend on the diffusion – which itself is changing. For now, we believe our level of abstraction is appropriate given what is known about the system. It will be very interesting to explore the specific interactions underlying such behaviour in the future, but that is beyond this current manuscript.

      2b. The values used in the model for alpha, beta_0 and rho_0 should be mentioned. Maybe having a table with all the parameters in the method section, or even in the supplementary section, would help. The exact values of alpha and beta matter, because if they are large (fast exchange) a single exponential gradient is to be expected, if they are 0 (no exchange) a double exponential gradient is to be expected, with intermediate behavior in between. Which case are we in here?

      We agree and will add a more complete table in the revision.
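
      The reviewer's point about exchange rates can be explored with a small numerical sketch. This is not the manuscript's model (its equations, and the concentration dependence via beta_0 and rho_0, are not reproduced in this exchange); it is a generic linear two-pool model with constant exchange rates, and every parameter value below is a hypothetical placeholder.

```python
import numpy as np


def two_component_steady_state(alpha, beta, d_slow=1.0, d_fast=20.0,
                               mu=1e-3, length=500.0, n=201):
    """Steady state of a slow pool S and a fast pool F in one dimension.

    Assumed model (illustrative only):
        0 = d_slow * S'' - mu * S - alpha * S + beta * F + source
        0 = d_fast * F'' - mu * F + alpha * S - beta * F + source
    alpha is the S -> F rate, beta the F -> S rate, mu the degradation rate.
    Production (total rate 1, split equally) enters the cell at x = 0.
    """
    x = np.linspace(0.0, length, n)
    dx = x[1] - x[0]

    # Discrete Laplacian with no-flux (reflecting) ends, built edge by edge.
    lap = np.zeros((n, n))
    for i in range(n - 1):
        lap[i, i] -= 1.0
        lap[i, i + 1] += 1.0
        lap[i + 1, i + 1] -= 1.0
        lap[i + 1, i] += 1.0
    lap /= dx ** 2

    eye = np.eye(n)
    operator = np.block([
        [d_slow * lap - (mu + alpha) * eye, beta * eye],
        [alpha * eye, d_fast * lap - (mu + beta) * eye],
    ])
    source = np.zeros(2 * n)
    source[0] = 0.5 / dx  # slow form produced in the anterior-most cell
    source[n] = 0.5 / dx  # fast form produced in the anterior-most cell

    solution = np.linalg.solve(operator, -source)
    return x, solution[:n], solution[n:]


# Fast exchange versus no exchange (rates are arbitrary illustrative values).
x, slow_fast_exchange, fast_fast_exchange = two_component_steady_state(10.0, 10.0)
x, slow_no_exchange, fast_no_exchange = two_component_steady_state(0.0, 0.0)
```

      Plotting the logarithm of the total (slow plus fast) against x for the two cases shows the behaviour the reviewer describes: with exchange much faster than degradation the two pools stay in a fixed ratio and the total follows a single exponential, whereas with no exchange the total is the sum of two exponentials with different length scales.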

      3) Discussion about anomalous diffusion (lines 386-388): The 2-component model used by the authors to interpret their FCS data seems very well justified here (excellent fits with very small residuals). I agree with the authors' conclusion that "the dynamics of Bcd within the nucleus are more complicated than a simple model of bound versus unbound Bcd", but I don't see how that can lead to a diagnostic of anomalous diffusion instead. Maybe it is just a matter of exactly explaining what is meant by anomalous diffusion here (since this term is often used to mean different things). A more likely scenario I think, is that there are more than just two Bcd components in the system.

      This is a good point, and we can’t easily differentiate two/multi- component fits from anomalous diffusion ones. This is a known problem. But we have recently shown in a collaboration with the Laurent Heliot lab (Furlan et al, Biophys J 2019), that anomalous diffusion is a good stable indicator of changes, even if it might not be the right model. We use anomalous diffusion as it stably predicts changes. We do not claim, however, that diffusion is anomalous. We will improve the discussion of these points in the revised manuscript.
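
      The difficulty of separating the two descriptions is easier to see with the fit functions written out. The sketch below gives the commonly used 3D free-diffusion forms, assuming equal molecular brightness for both components and no triplet term; the function names and the default structure factor `kappa` are assumptions for illustration and are not the authors' fitting code.

```python
import numpy as np


def g_two_component(tau, n, frac_fast, tau_fast, tau_slow, kappa=5.0):
    """Two freely diffusing components observed in a 3D Gaussian volume."""
    tau = np.asarray(tau, dtype=float)

    def diffusion_term(tau_d):
        return 1.0 / ((1.0 + tau / tau_d)
                      * np.sqrt(1.0 + tau / (kappa ** 2 * tau_d)))

    mixture = (frac_fast * diffusion_term(tau_fast)
               + (1.0 - frac_fast) * diffusion_term(tau_slow))
    return mixture / n


def g_anomalous(tau, n, tau_d, exponent, kappa=5.0):
    """One component with an anomalous exponent (exponent = 1 is normal diffusion)."""
    tau = np.asarray(tau, dtype=float)
    ratio = (tau / tau_d) ** exponent
    return 1.0 / (n * (1.0 + ratio) * np.sqrt(1.0 + ratio / kappa ** 2))
```

      Both functions decay from 1/n at zero lag, and an exponent below 1 stretches the decay over more decades of lag time, much as a mixture of a fast and a slow component does. This is why a fitted exponent can track changes in mobility without establishing that the underlying motion is anomalous.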

      4) Line 440 and after: What is the evidence that the transition between the two forms might vary non-linearly with Bcd concentration? How would that help adapt to different embryo sizes? It would be good to be more explicit here instead of just referring to another paper.

      We will improve this discussion. The central point is that the action of Bicoid is unlikely to simply depend linearly on concentration as in that case the ratio of fast to slow forms would be constant across the embryo. Related to the above comment, it is important to emphasise that we are using a phenomenological model, not one based on a specific mechanism.

      5) Since an important aspect of this work is the study of different Bcd constructs in vivo, it is important that these constructs are very clearly described, so the section on the generation of the fly lines (Methods) should be expanded. In particular: * * 5a. It seems that the eGFP:: NLS control used here was different from that first described in Ref. 64 (and used for FCS experiments in Ref. 30 and 36)? If so, what NLS sequence was used here, and precisely what type of eGFP was used (in particular, was the A206K mutation that prevents dimerization present in the eGFP used)? If it is the same construct as in Ref. 64, it should be mentioned explicitly. * * 5b. Were the mutant N51A and R54A lines gifts as well, or have they been described before? If so, previous publications should be referenced. If not, how the plasmid was introduced in the embryo should be briefly explained.

      We agree and will expand on the fly lines in the revision.

      6) Concentration calibration measurements (Methods Fig. 2, line 568 and on). It is well known that background noise is going to interfere with the measurement of N when the signal becomes equivalent to the background noise (Koppel 1974, Phys Rev A 10:1938-1945, and for a recent discussion of this effect for morphogens in fly embryos: Zhang et al., 2021, Biophysical Journal 120,4230-4241). It is almost certain that in the low signal regions of the embryo (e.g. posterior cytoplasm) this is affecting the reported concentration, and should be at least acknowledged.

      We agree with the reviewer. We will provide the SBR. We will also correct the N values based on the method followed in Zhang et al., 2021, Biophysical Journal 120,4230-4241.
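
      As a rough guide to the size of this effect, the standard correction for uncorrelated background rescales the apparent particle number by the squared fraction of the detected intensity that is genuine signal. The sketch below implements that textbook relation; whether it matches the exact procedure of Zhang et al. is not confirmed here, and the function and argument names are hypothetical.

```python
def correct_particle_number(n_apparent, total_intensity, background_intensity):
    """Correct an FCS particle number for uncorrelated background.

    Background lowers the correlation amplitude by (signal / total)^2,
    so the apparent N = 1 / G(0) is too large by the inverse of that factor.
    """
    signal_fraction = (total_intensity - background_intensity) / total_intensity
    return n_apparent * signal_fraction ** 2


def signal_to_background(total_intensity, background_intensity):
    """SBR defined as background-subtracted signal divided by background."""
    return (total_intensity - background_intensity) / background_intensity
```

      With a signal-to-background ratio of 1 the apparent N is four times the corrected value, which is why low-signal regions such as the posterior cytoplasm are the most affected.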

      *7) Reference 3 is mis-characterized in two different ways in the manuscript: * * 7a. Line 50: The conclusion in Ref. 3 was not that the gradient was due to a diffusive process, on the contrary Gregor et al. argued that Bcd was too slow to form such a long-range gradient by diffusion. Studies that do present data consistent with a morphogen gradient formation mechanism driven by diffusion are reference 5, reference 30, Zhou et al., Curr. Biol. 2012;22(8):668-75 and Müller et al., Science 336 (2012) 721-724. *

      Gregor et al., do not argue against a diffusion process – indeed, they utilise a SDD model in their paper. However, they do extensively discuss how the predicted dynamics from the SDD model are not compatible with gradient formation as observed after n.c. 13. This problem was resolved to some degree by FCS measurements of Bcd (e.g., Dostatni lab, Development 2011) and the use of a Bcd tandem reporter which showed that production and degradation change during n.c. 14 (Durrieu et al., MSB 2018). We will improve the framing of these results in the revision.

      7b. The diffusion coefficient estimated from FRAP measurements and reported in Ref. 3 (D = 0.4 micron^2/s) is mentioned a couple of times in the manuscript (line 66, line 395, line 411). However, this number is simply incorrect. When fast components (such as the ones clearly detected here by FCS) are present, they diffuse out of the photobleached area during the photobleaching step. If that is not corrected for during the analysis (and it wasn't in Ref. 3), then the recovery time measured is just equal to the photobleaching time, and has nothing to do with either the fast or slow fraction of the studied molecule - it has no other meaning than to give a lower bound on the value of the actual effective diffusion coefficient of the molecule. This effect (called the halo effect) is well known in the FRAP community (see e.g. Weiss 2004, Traffic 5:662-671), it has been experimental demonstrated to occur for Bcd-eGFP in the conditions used in Ref. 3 (Reference 30), and the actual diffusion coefficient that should have been extracted from the data presented in Ref. 3 has been recalculated by another group to be instead D = 0.9 micron^2/s (Castle et al., 2011, Cell. Mol. Bioeng. 4:116-121). It would therefore be better to report the corrected value from Castle et al. to help the field converge towards an accurate description of Bcd mobility.

      We fully agree and will use the improved FRAP estimated value for Bcd.

      *Minor comments and suggestions: *

      • 8) Figure 1: From panel A, it seems that what is called "Anterior" and "Posterior" is about 150 micron away from the embryo mid-section, i.e. about 100 micron from either the anterior pole or the posterior pole (so not the tip of the embryo, but somewhere in the anterior half or posterior half). Maybe this should be made clear in the text. *

      We have made changes in Figure 1A to indicate the region within which the FCS measurements are carried out. We have added the relevant details in the legend of figure 1 lines 137-138.

      *9) Fig. 2A; It might be good to put this graph on a log scale, so that cytoplasmic values are seen more clearly. Also, what about reporting on nuclear to cytoplasmic ratios? *

      We will rework this graph and make the necessary changes.

      *10) Fig. 2: It could be interesting to plot D_effective as a function of the measured concentration of Bicoid in different locations, since the (interesting) suggestion is made several times that [Bcd] could be a determinant of the protein mobility. *

      Our work provides an indication that Bcd concentration is connected to the diffusion. We did this by measuring at two locations. To extend this to a rigorous model would require substantial new measurement along the whole length of the embryo. While interesting, this represents a very large investment of time and lies beyond the current manuscript.

      *11) Figure 3B&C: Is the curve for 2-component diffusion (without concentration dependence) for steady-state missing? *

      We will clarify in the revision.

      *12) Lines 78 and 471: What do the authors mean by "new reagents"? The word reagent evokes a chemical reaction, but there are none here. Do the authors mean new constructs? or new mutants? *

      We have changed lines 78 and 479 from “new reagents” to “new Bcd mutant eGFP lines”.

      *13) Lines 57-59: Another good reference for FCS measurements performed to study the dynamics of a morphogen (in this case Dpp) is Zhou et al., Curr. Biol. 2012;22(8):668-75 *

      We added this reference in no.70.

      *14) Lines 109-111: A word must be missing. Precisely determined what? *

      Precisely measure within cytoplasm, and nuclear compartments and also during interphase stages. We have changed to “precisely measure in the cytoplasmic and nuclear regions during the interphase stages of nuclear cycles (n.c.)12-14.” in line no.111-112.

      *15) Line 278: The increase in the slow mode is expected. Maybe explicitly mention why. *

      In line 286, we have added “due to the loss of Bcd binding to the DNA”.

      *16) Line 282: "with the fast component increasing", maybe replace with "with the diffusion coefficient of the fast component increasing" or "with the fraction of the fast component increasing". *

      We have changed line 289 to “with the diffusion component of fast component increasing towards the posterior”.

      *17) Line 517: Is there a reason why the dorsal surface is always placed in the coverslip? *

      We have added these details in line 528-529 in Methods.

      *18) Line 524 and on: FCS measurements: What was the duration of each individual FCS measurement? It is great that the exact number of measurements are reported in the supplementary! *

      Thank you for the compliment. Typically, cytoplasmic measurements last 60 s and nuclear measurements 20-40 s. We have added this in lines 528-529. We have also added a column to indicate the duration of each of the measurements in the supplementary tables.

      *19) An Airy unit of 120 um seems large in combination with an objective with a NA of 1.2, is there a reason for that? What was the radius of the resulting detection volume? *

      Olympus microscopes have a 3x magnification stage in their confocals. This leads to the change in the Airy unit. Otherwise, it would be 40 µm.

      *20) Thank you for detailing the reasons behind the choice of excitation power, an important and often omitted details. Where in the excitation path were the values of the laser power measured (before or after the objective?)? *

      Thank you for the compliment. The laser power is measured before the objective. We removed the objective and measured the laser power in the objective path.

      *21) Line 585: "since the brightness of eGFP::Bcd..." do the authors mean the molecular brightness of a single eGFP::Bcd molecule, or the total fluorescence signal? *

      It is the total fluorescence signal. We have edited line no.592.

      *22) It would be good for reference to mention the approximate value of the molecular brightness recorded for these eGFP constructs at the laser power used. *

      We will measure and tabulate in the revised manuscript.

      *23) Reference 766: The year (and maybe other things) is missing. *

      We have corrected this reference.

      *24) Figure 2 (Methods): The concentrations shown on the figure should be in nM not uM. *

      Thanks for noticing; we have changed this.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      MAJOR POINTS

      • 1) FCS measurements and fits *
      • a) Please state the duration of each individual FCS measurement. *

      In the cytoplasm, the measurements were carried out for 60 secs and in nuclei it is between 20-40s. We could not measure for 60s in the nuclei as the nuclear position fluctuates from its initial position. We will add another column to indicate the duration of FCS measurements in the supplementary tables.

      b) The authors acknowledge potential issues with fluorophore photophysics and use different lag time ranges for the calibration dye Atto-488 (0.001 ms in Method Fig. 2) and eGFP (0.1 ms in the main figures). Given the strong influence of different parameters on data interpretation and conclusions, Method Fig. 2 should be repeated with purified eGFP. This is particularly relevant for the noisy FCS measurements in posterior regions.

      Performing the experiment with purified eGFP will be a volume calibration. We routinely performed this before each imaging session, and that should be fluorophore independent. As noted by Reviewer 1, it is also important to be clear about background correction. We will provide brightness data for eGFP and background values in the revised manuscript. We can then use this to estimate the corrected concentrations.

      We use 0.1 ms to start, as at that point any contribution from the photophysics should have decayed (0.1 ms is about 3-5 times the decay time of the photophysical process, Sun et al., Analytical Chem 2015).
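
      The background correction and concentration estimate mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used for the manuscript: it assumes an uncorrelated background, a 3D Gaussian detection volume, and hypothetical function and parameter names (`w0_um` for the lateral beam waist, `k` for the axial-to-lateral structure factor).

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def background_corrected_n(n_apparent, total_rate, background_rate):
    """Correct the apparent particle number for uncorrelated background.

    total_rate is the measured count rate (signal + background) and
    background_rate is the count rate measured without fluorophore.
    Uncorrelated background lowers the ACF amplitude by (1 - B/F)^2,
    so the apparent N = 1/G(0) overestimates the true N by that factor.
    """
    return n_apparent * (1.0 - background_rate / total_rate) ** 2


def effective_volume_litres(w0_um, k):
    """Effective volume of a 3D Gaussian focus, pi^(3/2) * w0^2 * z0,
    with z0 = k * w0. Input in micrometres; 1 um^3 = 1e-15 L."""
    volume_um3 = math.pi ** 1.5 * k * w0_um ** 3
    return volume_um3 * 1e-15


def concentration_nM(n_particles, w0_um, k):
    """Convert a particle number in the focus to a concentration in nM."""
    molar = n_particles / (AVOGADRO * effective_volume_litres(w0_um, k))
    return molar * 1e9
```

      The values of `w0_um` and `k` would come from the dye calibration performed before each imaging session; any numbers passed in here are placeholders.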

      c) Please explain why no data is shown for "AN" around 0.1 ms lag time in Fig. 1B in contrast to all other figures.

      We will add the data for AN from 0.01 in the revised figures.

      d) Please state what the estimated diffusion coefficients with one-component model fits are. Please also explain why the fits in Fig. S1E do not reach a value of 1 and why they plateau higher than the experimental data at long lag times. Please constrain the fits to G=1 at 0.1 ms tau and G=0 at 1 s tau to make a fair comparison.

      The experimental ACF curves reach 0 at long lag times as would be expected. The one-component fits, however, don’t describe the data well and as a result they do not reach 1 and 0 at short and long lag times, respectively. The fitting is done using a mean-squared estimation of the best approximation of the particular model function to the data. Fixing the parameters can be done, but it will further reduce fit accuracy and deviations will be larger. We will perform this analysis and tabulate the one component fits in supplementary 1 with necessary corrections.
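
      For reference, the one- and two-component models discussed here can be written out as below. This is a sketch of the standard 3D free-diffusion forms for a Gaussian detection volume, assuming equal molecular brightness for both components and omitting the triplet term; the default structure factor `k=5.0` and the function names are illustrative assumptions, not values from the manuscript.

```python
import math


def _diffusion_term(tau, tau_d, k):
    """Shape of the 3D diffusion ACF for one species (equals 1 at tau = 0)."""
    return 1.0 / ((1.0 + tau / tau_d) * math.sqrt(1.0 + tau / (k ** 2 * tau_d)))


def acf_one_component(tau, n, tau_d, k=5.0, g_inf=0.0):
    """One-component model: amplitude 1/N, diffusion time tau_d, offset g_inf."""
    return _diffusion_term(tau, tau_d, k) / n + g_inf


def acf_two_component(tau, n, tau_d1, tau_d2, f1, k=5.0, g_inf=0.0):
    """Two-component model: fraction f1 with tau_d1, fraction (1 - f1) with tau_d2."""
    mix = f1 * _diffusion_term(tau, tau_d1, k) + (1.0 - f1) * _diffusion_term(tau, tau_d2, k)
    return mix / n + g_inf


def diffusion_coefficient(tau_d, w0):
    """D = w0^2 / (4 * tau_d); with w0 in um and tau_d in s, D is in um^2/s."""
    return w0 ** 2 / (4.0 * tau_d)
```

      Because the amplitude `1/n` and the offset `g_inf` are free parameters in a least-squares fit, a model with too few components is not forced through the normalised data at either end of the lag-time range, which is the behaviour described above for the one-component fits.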

      e) Please assess the validity of all multi-component fits by comparing the relative quality of the models to the number of estimated parameters using the Akaike information criterion or similar approaches.

      We will provide the values denoting the quality of the fits in the revision. We will provide the 3D 1 particle fit, the 3D 1 particle fit with triplet, the 3D 2 particle fit and the 3D 2 particle fit with triplet and will provide appropriate measures of fit quality.
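
      One way such a comparison could be set up is sketched below, using the Akaike information criterion in its least-squares form, AIC = n ln(RSS/n) + 2k, which assumes independent Gaussian errors of equal variance. The parameter counts and residual sums of squares in the example are made-up placeholders for illustration only; they are not results from our data.

```python
import math


def aic_from_rss(rss, n_points, n_params):
    """Least-squares AIC (up to an additive constant shared by all models)."""
    return n_points * math.log(rss / n_points) + 2 * n_params


def rank_models(fits, n_points):
    """Return {model name: delta AIC}, best model first with delta = 0.

    fits maps a model name to (residual sum of squares, number of parameters).
    """
    scores = {name: aic_from_rss(rss, n_points, k) for name, (rss, k) in fits.items()}
    best = min(scores.values())
    ordered = sorted(scores.items(), key=lambda item: item[1])
    return {name: score - best for name, score in ordered}


# Hypothetical RSS values and assumed parameter counts, for illustration only.
example_fits = {
    "3D 1 particle": (0.80, 3),
    "3D 1 particle + triplet": (0.60, 5),
    "3D 2 particle": (0.20, 5),
    "3D 2 particle + triplet": (0.199, 7),
}
example_deltas = rank_models(example_fits, n_points=200)
```

      In this toy example the extra triplet parameters lower the RSS only marginally, so the penalty term outweighs the gain; a real comparison would use the residuals of the actual fits, weighted as in the fitting procedure.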

      f) Please also present the Bcd-GFP fits with 0.001 ms that are mentioned in line 590, and present the results for the data that did not give comparable tau_D1 and tau_D2 values mentioned in line 593.

      We will provide all the curves from 0.001ms in the supplementary. We did not provide these details as we have followed the methods from Abu Arish et al., 2010. As our cytoplasmic and nuclear TauD values match with Abu Arish et al., 2010 and Porcher et al., 2010, we thought the excess data would be redundant.

      3) Bicoid gradient and modeling * a) Little et al. 2011 observed that the Bcd gradient decreases around n.c. 13. Can the authors of the present work observe a similar concentration decrease using FCS? This is important to i) validate the FCS concentration measurements, and ii) to resolve the controversy regarding "previous claims based on imaging the Bcd profile within nuclei, which predicted decrease in Bcd diffusion in later stages".*

      This is a good point regarding conclusions from the previous literature. The Little et al. paper inferred that diffusion had to decrease from fitting to the gradient profiles. However, subsequent analysis from our lab (Durrieu et al., MSB 2018 [which uses a different method involving a tandem reporter for Bicoid] and this manuscript) strongly suggests that Bicoid remains dynamic, at least through n.c. 13 and early n.c. 14. One way to test this is to use SPIM-FCS, where longer time courses can be taken (though with slower time resolution in the FCS). We have performed preliminary experiments with SPIM-FCS and we will revisit these data to see if we can find evidence for changes in the diffusion.

      We will also extend the Discussion to make the results clearer in terms of previous models and literature.

      b) Please explain why the experimental Bcd-GFP gradient data does not reach a value of 1 (e.g. in Fig. 3D) despite normalization. Please also explain why the fits become flatter in Fig. 5B compared to the steep fit in Fig. 3D.

      Both lines were measured under identical conditions. Therefore, we normalised to the maximum value of both experiments. We will redo, normalising to each individual experiment. Regarding Fig. 5C, the Bcd::eGFP curve is identical to Fig. 3D. The flatter curve is the line with eGFP tagged to a NLS alone.

      c) For modeling, please take into account observations that the Bcd source is graded with a wide distribution (30-40% EL, see Spirov et al. 2009, Little et al. 2011, Cai et al. 2017 etc.). The extent of the source used in the present work (x_s=20 um, line 620) is at least five times too small.

      Care must be taken in defining the source extent. The most careful measurements are reported in Little et al., PLoS Biology 2011, who performed single molecule FISH. They conclude “We demonstrate that all but a few mRNA particles are confined to the anterior 20% of the egg”. Further, the peak in the particle density is around 20-30um from the anterior (Figure 3, Little et al., PLoS Biology 2011), with the vast majority of counts being within 10% of the anterior pole. Further, Durrieu et al. MSB 2018, showed using a Bcd tandem reporter that there was unlikely to be an extended gradient of bcd mRNA (maximum extent of around 50um). Here, we used a simple source domain, which was arguably a little narrow, but not significantly so. We will increase the value in the revision, but the claim that there is an extended bcd mRNA gradient (Spirov et al., Development 2009) has not been substantiated by later experiments.

      • d) Please discuss in the paper how well the simulations in Fig. 3B agree with the experimental data.*

      We will provide these details in the revision.

      • e) Please provide a precise estimate for the statement "Even with an effective diffusion coefficient of 7 μm2s-1, few molecules would be expected at the posterior given the estimated Bcd lifetime (30-50 minutes)" to turn this into a quantitative argument. How many molecules are expected to reach posterior in which model, and how does it compare to experimental observations?*

      This can be estimated based on the root-mean-square distance for diffusive processes. We will provide this in the revision.
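
      The kind of estimate we have in mind can be sketched as follows. This is an illustration only, assuming one-dimensional diffusion along the AP axis, that the quoted Bcd lifetime is the mean lifetime (the inverse of the degradation rate), and an embryo length of roughly 500 µm; the function names are hypothetical.

```python
import math


def rms_displacement(d_um2_per_s, t_s, dims=1):
    """Root-mean-square displacement sqrt(2 * dims * D * t), in um."""
    return math.sqrt(2 * dims * d_um2_per_s * t_s)


def sdd_decay_length(d_um2_per_s, lifetime_s):
    """Steady-state SDD decay length, lambda = sqrt(D * lifetime), in um."""
    return math.sqrt(d_um2_per_s * lifetime_s)


def relative_level(x_um, decay_length_um):
    """Steady-state SDD level at distance x relative to the source boundary."""
    return math.exp(-x_um / decay_length_um)


D_EFFECTIVE = 7.0                 # um^2/s, value quoted by the reviewer
LIFETIMES_S = (30 * 60, 50 * 60)  # 30 and 50 minutes

decay_lengths = [sdd_decay_length(D_EFFECTIVE, t) for t in LIFETIMES_S]
posterior_levels = [relative_level(450.0, lam) for lam in decay_lengths]
```

      Under these assumptions the decay length comes out at about 112 µm and 145 µm for the two lifetimes, so the predicted level 450 µm from the source is a few percent of the anterior level at most. Converting this into molecule numbers requires the anterior concentration and the detection volume, which we will include in the revision.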

      • f) The sentence "we find that a model of Bcd dynamics that explicitly incorporates fast and slow forms of Bcd (rather than a single "effective" dynamic mode) is consistent with a range of observations that are otherwise incompatible with the standard SDD model" needs to be toned down and corrected since a simple SDD appears to be sufficient to account for the observed gradients. If the authors disagree, please specifically point out in the paragraph around line 249 what observations exactly are incompatible with a standard SDD model.*

      This is similar to the point raised by Reviewer 1. While the standard SDD model can explain the overall gradient shape, it is not compatible with the observed time scales and Bcd puncta tracked in the posterior pole. We will improve the Discussion around this point to make the distinctions between the models clearer.

      • 5) Data presentation *
      • a) In line 27 and 122 it would be better to rephrase the wording "find/found" and give credit to previous papers that first made these observations. *

      We will edit in the revision.

      • b) For the statement "This suggests that the dynamics of the fast fraction were not captured by previous FRAP measurements", please explain why this should not be the case even though the fast fraction is shown to be larger than the slow fraction in the current work.*

      We will edit in the revision.

      • c) Similarly, the sentence "The dynamics of the slower mode correspond closely to measured Bcd dynamics from FRAP" likely needs to be corrected since it neglects the contribution of the faster mode, which is fluorescent as well and should also contribute to the dynamics from FRAP.*

      This is similar to the point raised by Reviewer 1 and we will edit in the revision.

      d) In the absence of further evidence (see above), the sentences "We establish that such spatially varying differences in the Bcd dynamics are sufficient to explain how Bcd can have a steep exponential gradient in the anterior half of the embryo and yet still have an observable fraction of Bcd near the posterior pole" and "These results explain how a long- ranged gradient can form while retaining a steep profile through much of its range" in the abstract need to be toned down.

      We are not sure here what needs to be toned down. Our results show that there are (at least) two dynamic forms of Bcd and, combined, they are capable of forming a long-ranged gradient while also ensuring the gradient remains steep in the anterior (because the diffusion coefficient itself varies across the embryo). We will go through these statements and make sure the meaning is clear.

      e) The authors state that "However, we show that eGFP::Bcd in its fastest form can move quickly (~18 μm2s-1), and the fraction of eGFP::Bcd in this form increases at lower concentrations", but this has not been directly shown. Please tone down this statement or directly test the prediction that Bcd has a higher fraction of the fast form in earlier nuclear cycles when Bcd concentration is smaller.

      This is a good suggestion, and we will test whether early nuclear cycles of the anterior domain show faster dynamics.

      *MINOR POINTS * * 1) Introduction * * a) Please explain explicitly what exactly the contention in Bcd, Nodal and Wingless dynamics is in the cited references. *

      We will add in the revision.

      *b) In line 95, it would be better to state that this is a variation of the SDD model rather than "a new model". *

      We changed from “a new model” to “an improved version of SDD model” in the current version of the manuscript.

      *2) Methods *

      *a) The authors state that "The same software was also used to calculate the cross-correlation function", but I couldn't find any cross-correlation analyses. Please clarify. *

      It is line 538. There is no cross correlation. We changed this to the autocorrelation function.

      b) Please correct the "uM" typo to "nM" in the legend of Method Fig. 2A.

      We have changed this in the current version.

      • c) In the sentence "Further, since the brightness eGFP:Bcd in the anterior and posterior cytoplasm is lower compared to the nuclei", "brightness" probably needs to be changed to "concentration" since the molecular brightness is unlikely to change. *

      We have edited line 591.

      • d) Please explain the background-correction method mentioned in line 612. Please also state at what temperature the experiments were performed.*

      We will add a better background correction in the revision. Currently, we use the signal measured outside the embryo as the background. The measurements are carried out at 25 °C.

      *3) Results * * a) Please provide labels for anterior, posterior, dorsal and ventral in Fig. 1A. * * b) Please explain the colors in Fig. 5C. * * c) Please explain the dashed lines in Fig. 3C. *

      We have edited Figure 1A and Figure 5C. We will edit Figure 3C in further revision.

      *OPTIONAL *

      *1) If possible, it would be helpful to mention whether the transgenic animals have any abnormal phenotypes or whether they can rescue the bcd mutant. *

      We will update in the revision.

      *2) To validate the concentration measurements, it would be ideal if the authors could determine the Bcd concentration gradient using FCS along the anterior-posterior axis. This would also address whether there are further unexpected changes in diffusivity in medial regions and along the anterior-posterior axis that would have to be considered for modeling. *

      To measure the Bcd concentration using FCS along the whole axis would be a very challenging undertaking. To get the data for the two positions analysed already represents a significant amount of work. We have done SPIM-FCS measurements, and we will be repeating our FCS measurements in the Fritzsche lab at Oxford. Combined, we believe this provides sufficient corroboration of our results.

      *3) Local photoconversion experiments, e.g. in Bcd-Dendra2 embryos if available, would provide compelling support for the relevance of the measurements in the current work. *

      This is a nice idea, but this would represent a substantial project in its own right and lies beyond the current work.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      *In my estimation the experimental work is rigorous and the results fully support the conclusions of the authors. I was surprised, however, that the HD-only form localizes via very different and simpler dynamics than does full-length Bcd, but nevertheless forms at least a qualitatively similar gradient. That leads to the question as to whether the existence of the fast and slow forms and their different ratios in different parts of the embryo actually are physiologically relevant. I don't see a straightforward way to test this experimentally, because the mutations that affect Bcd gradient formation also affect essential functions of the protein that if abrogated produce severe downstream effects on embryonic development and lethality. However I would like to see this point at least addressed in the discussion. The data and the methods are presented in such a manner that they can be reproduced, and the number of replicates and statistical analysis is overall robust. *

      We thank the Reviewer for the positive and constructive review. They, like both previous reviewers, raise the issue of the model and how it fits with the data. As outlined above, we will improve this part of the data presentation and also the Discussion to make sure the main results are clear.

      We agree that the underlying importance of the different dynamic forms of Bicoid – and why they change across the embryo – remains unknown. We believe that our careful characterisation of such behaviour is important nonetheless, as it reveals that: (1) morphogen dynamics are more complicated than typically modelled, and this may be just as relevant for ligands moving through extracellular space; and (2) dynamics can vary in space/time, providing an additional possible mechanism of control for regulating morphogen gradient profiles.

      Of course, we would like to explore potential physiological relevance. Further exploration of the homeodomain and its role in regulating dynamics is a potential route, but that belongs in future work.

      *Minor comments: *

      • The presentation of the graphical data measuring Bcd levels along the a-p axis (Fig 1C, 1D, 4C-F and others) needs to be improved, because the grey lines that represent ACF curves are essentially invisible. This is partly because there is usually extensive overlap between the grey lines and other lines. This may be solved by using a more vivid colour than grey for the ACF curves, or perhaps the ACF lines could be made thicker but with some transparency so that overlapping data can be seen. In any event this aspect of the presentation needs to be improved. *

      We have made the ACF lines thicker to distinguish from the model fit.

      *In Figs 2D and 2I measurements of statistical significance between the proportion of protein in fast and slow modes need to be added. *

      We will add in the revision.

      *Relevant to line 174 and Fig 2, NLS should be defined when first used, the source of the NLS should be given (is it from Bcd?) and the rationale for looking at eGFP::NLS should be made explicit. *

      We have added details on how the eGFP::NLS is generated in the methods.

      *In Fig 3D the dashed lines need to be defined. I assume these are experimental error bars but this is not stated. *

      We now state this in the legends.

      *On lines 344-5, shouldn't this conclusion concern the HD rather than the NLS? *

      Yes, thanks for pointing this out; the statement relates only to NLS, not NLSHD. We removed this statement from line 351.

      *On line 432, CAP is not an acronym, the correct term is 5' 'cap' or 'cap structure'. Also Cho et al. PMID 15882623 should be added to the references here. *

      We changed the corresponding section and added the references.

      *On lines 446, 456, 469, and throughout: replace 'blastocyst' with 'blastoderm'. The former term is generally used for embryos that undergo full cellular divisions and cleavage in early embryogenesis, not for syncytial embryos such as Drosophila. *

      We have changed blastocyst to blastoderm throughout the manuscript.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Major comments: The averaged autocorrelation curves were fitted to models of diffusion with one and two components. The one-component model was insufficient to reproduce the data and the two-component model seems to fit the data. Have the authors tested models with more than two components? Could it be possible to distinguish more Bcd populations?

      While it is possible to fit with further components, it rarely provides useful further insight. In particular, the error in measuring three tau_D’s is typically very large. In addition, the improvement in the fit will be marginal, and thus the extra components cannot be justified statistically. Of course, we cannot exclude a third (or further) dynamic mode, but within the resolution of our FCS measurements, two components with a triplet term are in general the maximum that can be accommodated without overfitting. We will provide evidence for this claim in the supplement of the revised manuscript.

      In Figure 2E, the same concentration of eGFP::NLS is estimated to exist in the cytoplasm and nucleus. Since the NLS should target eGFP to the nucleus, what is the explanation for this observation? Is it possible that the method used to estimate the concentration of molecules is underestimating the concentration in the nucleus or the opposite in the cytoplasm?

      This is a good observation. There are two possible explanations. First, the regular division cycles “reset” the nuclear levels. Therefore, differences may not be so large. Second, FCS measurements of concentration can be noisy, as they depend on the very short time scales in the measurement. We will double check our measurements and clarify this in our revision.

      *In the simulation of the SDD model (Figure 3B), simulations at 10 min, 25 min and 120 min are shown. Assuming that 120 min corresponds to early nc14, are simulations at earlier timepoints corresponding to nc12 and nc13 indistinguishable from the profile at 120 min? This demonstration would further support the option to merge the data from all nuclear cycles. *

      This is a good point. Here, we were primarily focused on showing the time evolution of the model, rather than directly mapping onto experiment. We will clarify in the revision.

      *The results obtained with the BcdN51A mutant show an increase in diffusion speed, while retaining similar proportions of fast and slow populations. In the slow fraction, a new population is found. Assuming that the BcdN51A molecules cannot bind specifically to DNA due to the mutation, what would this newly found population correspond to? Could the authors explore the possibility of nonspecific binding to DNA? The article would also win by discussing more on this aspect or other options. *

      This is an interesting question. Dslow for anterior nuclei of N51A mutants increases (Dslow from ~0.2um2/s to ~1.5 um2/s), and the proportion is similar to the slow fraction of WT Bcd in the anterior nuclei (F=50%). The Dslow values of bcdWT suggest that 0.2um2/s is a result of DNA binding. For bcdN51A, Dslow of 1.5 um2/s is suggestive of nonspecific interaction of bcdN51A to the DNA. Such a nonspecific interaction is also noticed in the case of NLS::eGFP, where we see a significant amount (Dslow~ 1-1.5 um2/s , F=20%) of slow form in the anterior nuclei, likely due to non-specific interaction with the DNA.

      It is worth noting that the inactive homeodomain of the transcription factor Sex combs reduced (Scr) also interacts non-specifically with DNA at high concentration (Vukojevic et al., PNAS 2010). Non-specific interaction of the eGFP fluorophore is also reported to be higher in the nuclei of AT-1 cells, which suggests that the “obstacle-free accessible space” is low in the nuclei (Wachsmuth et al., JMB 2000). Therefore, although we do not understand the specific mechanism, our results for the N51 mutants are consistent with previous observations of intranuclear dynamics.

      The experimental rationale behind the BcdMM reporter needs to be better explained as it is not clear. It was previously shown that the N51A mutation disturbs zygotic hb activation and Caudal gradient formation (see Figure 3 in Niessing et al., 2000). Since N51A already causes a strong phenotype by disturbing hb expression and Cad gradient formation, what is the reasoning behind adding extra mutations to this background? Since the mutations in the PEST domain and YIRPYL motif are involved in cad translational repression, would it not be more interesting to add them to the R54A mutation and further study the repression of cad? It would also shed light on the unexpected no difference or even decrease in diffusion in the cytoplasm of the R54A mutant which should increase if indeed the cad mRNA binding is being repressed.

      Our rationale was to remove more elements of Bcd to see if there was some degree of redundancy – at least in terms of the dynamics.

      The Bicoid homeodomain N51A mutation is known to cause de-repression of caudal and to inhibit hunchback expression. Mechanistically, nuclear Bcd activates hb transcription. In the cytoplasm, however, Bcd interacts with other proteins and forms a complex that represses caudal translation. Bcd binds to caudal mRNA through its HD at one end of the complex. At the other end, other proteins in the complex are bound to the 5’ cap region of caudal mRNA. Our rationale for generating the MM mutation was that the N51A mutation may not be sufficient for Bcd to be released from the protein complex. Therefore, additional mutations to N51A may release Bcd from interactions with either DNA or with other proteins through the PEST domain and YIRPYL motif.

      *Have the authors confirmed that their BcdR54A indeed inhibits cad translation? *

      We have not yet tested whether eGFP:bcdR54A inhibits cad translation. We will add the data in the revision.

      *How many embryos of BcdMM were analysed? The authors should also provide a table with all the values in SI as they have done for all the other reporters. *

      We will add this data with the revision.

      *The claims with eGFP::NLSBcdHD need to be supported by data from multiple embryos. Even if multiple ACF curves are obtained from one embryo, analysing only one embryo is not sufficient. This would clarify the fact that this reporter seems to be able to reproduce the mobility of Bcd in the nucleus. *

      We agree and we are arranging to collect more data. This should be completed by the end of the summer.

      *According to the methods, all reporters were expressed in a bcd null background, made with the bcd1 allele. This allele is also known as bcd085 and according to Driever and Nusslein-Volhard, 1988 (PMID: 3383244), this allele only causes an intermediate phenotype. This indicates that a truncated version of the protein probably still exists on the embryo. Do the conclusions obtained here still hold if a truncated version of the Bcd protein exists in addition to their reporters? *

      We used the bcdE1 mutant, a null mutant of bcd. This was used by Gregor et al., Cell 2007 in their generation of the original Bcd::eGFP. We have also recently generated a more complete bcdKO mutation (Huang et al., eLife 2017). Our embryos do not have a clear phenotype that we can relate to the specific bcd- background used. Nonetheless, we agree it is an important point to be clear about the genetic background and we will clarify in the revised manuscript.

      Minor comments: * * In line 45: "Morphogens are signalling molecules", the authors should consider removing the word "signalling" since not all morphogens are, especially the one being studied, Bicoid. * * In lines 80-81 (and also throughout the text): "We measure the Bcd dynamics at multiple locations along the embryo AP-axis", should be more accurate and changed to anterior and posterior of the embryo. Using "multiple locations along the AP axis" is ambiguous and not exact for what was done.

      Yes, this is a fair comment. We have edited these sections in the current manuscript.

      *Throughout the article, the authors refer multiple times to "modes for/of Bcd transport". Since they or others have not proven that Bcd is being transported, which would involve at least another factor, the authors should replace transport by movement, diffusion or a similar word with which they are comfortable. *

      We have changed transport to movement wherever relevant in the text.

      *Suggestion: The authors claim that the Bcd gradient is exponential up to 60% of embryo length. Would this information allow a more precise calculation of the gradient decay length in the exponential region than the 80-100µm stated on line 202? *

      This is an interesting point, but our results suggest that the idea of the decay length is not so applicable in the posterior region. There, the Bcd dynamics are generally quicker, thereby increasing the decay length λ. Of course, we cannot discount possible spatial variation in degradation. However, in previous work, our Bcd tandem reporter (which is sensitive to changes in degradation) did not reveal spatial variation in degradation.

      In lines 258-259, the sentence "Further, Bcd binds to caudal mRNA, repressing its expression in the cytoplasm" should be improved to clarify the role of Bcd in caudal mRNA translation repression and references should be added. This should also be corrected in the following paragraph.

      We will add the necessary corrections in the revision.

      *In line 262, "mutations" should be singular since it corresponds to only one amino acid mutation. *

      We have corrected this.

      *Figure 4J needs to be corrected as the fractions of the slow and fast populations do not correspond to what is shown in Table 3. For example, Fslow fraction of AC is ~45% in the figure while it is 36% in Table 3. The problem occurs in all fractions. *

      We apologise; there was a mislabelling in the corresponding figure, where AN was in the place of AC. We have edited figure 4J and removed the mislabelling.

      *In the discussion, in lines 379-380, "Given the changing fractions of the fast and slow populations in space, the interactions between the populations are likely non-linear". What is the reasoning for non-linearity and not interchangeability? *

      If the interactions between the two populations were linear, then the fraction in each form would be constant across the embryo. Some degree of nonlinearity is required in order to have spatially varying relative populations.

      *In line 432 caudal should be italicized. *

      We have edited this.

      *In the discussion, the authors conclude that "In the nucleus, the two populations can be largely (though not completely) explained by Bcd binding to DNA". The discussion would win by explaining all the possible options. *

      We will add the necessary changes in the discussion. This is also related to above reviewer comments.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper studies color vision in anemonefish. The central conclusion of the paper is that anemonefish use signals from their UV cones to discriminate colors that would not otherwise be distinguishable; this differs from other fish in which UV cones extend the range of wavelengths of sensitivity but do not add a dimension to color vision. The work fits into a rich history of studies investigating how color vision fits into an animal's ecological niche. My primary concerns regard the microspectrophotometry data from single cones and some aspects of the presentation of the behavioral data.

      Microspectrophotometry

      The spectral properties of the cone types are a key issue for interpreting the results. These were measured using MSP, and fits are shown in Figure 2. The raw data shown in Fig. S1 appears more complicated than indicated in the main text. The templates miss the measurements across broad wavelength bands in each cone type. Particularly concerning is the high UV absorbance across cone types and the long-wavelength absorbance in the UV cone. It is not clear how this picture supports the relatively simple description of cone types and spectral sensitivities given in the main text and which forms the basis of the modeling.

      Microspectrophotometry is an inherently noise-prone measurement technique, particularly for very small photoreceptor outer segments such as that of single cones, which are also difficult to detect as intact, isolated (nonoverlapping) cells. As such, the absorbance curve fitting and derived lambda max (λmax) values should be treated as estimates. The accuracy of these estimates is adequate for this type of study, and visual modelling results have been shown to be robust against small errors (±10 nm λmax) in photoreceptor sensitivity for multiple species [see Lind, O. & Kelber, A. (2009). Vis Res. 49(15), 1939-1947; and Bitton, PP. et al. (2017). PLOS ONE, 12: e0169810]. We consider it highly unlikely that small shifts in cone λmax from measurement error would make a meaningful difference to the colour discrimination thresholds.

      It should be noted that the raw data shown in the original Supplementary Figure 1 included all scans overlain with an average absorbance curve for presentation purposes; however, the actual lambda max values for the different cone types were obtained by fitting photopigment absorbance curve templates to the individual scans and then averaging across scans. For clarity and transparency, we have now provided three multipanel plots (see Figure 1 – figure supplements 1-3) showing the individual pre- and post-bleach scans of absorbance spectra, fitted absorbance curve templates, and R² values from the best visual pigment template fit.
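      The template-fitting step described above can be illustrated with a minimal sketch. It assumes the A1 alpha-band template of Govardovskii et al. (2000), which the response does not name, and uses a simple grid search over λmax scored by R². The function names, the wavelength window, and the synthetic scan are all hypothetical and are not the authors' data or code.

```python
import math

def govardovskii_a1(wavelength_nm, lmax_nm):
    """Alpha-band absorbance of an A1 pigment (Govardovskii et al. 2000 form),
    normalised to approximately 1 at lmax. The beta band is ignored here."""
    x = lmax_nm / wavelength_nm
    a = 0.8795 + 0.0459 * math.exp(-((lmax_nm - 300.0) ** 2) / 11940.0)
    return 1.0 / (
        math.exp(69.7 * (a - x))
        + math.exp(28.0 * (0.922 - x))
        + math.exp(-14.9 * (1.104 - x))
        + 0.674
    )

def r_squared(observed, predicted):
    """Coefficient of determination between a scan and a template."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def fit_lmax(wavelengths, absorbance, candidates=range(350, 601)):
    """Grid-search the candidate lmax (nm) whose template maximises R^2."""
    best = None
    for lmax in candidates:
        predicted = [govardovskii_a1(w, lmax) for w in wavelengths]
        score = r_squared(absorbance, predicted)
        if best is None or score > best[1]:
            best = (lmax, score)
    return best

# Hypothetical normalised scan: a 498 nm pigment plus a small deterministic ripple
# standing in for measurement noise.
wavelengths = list(range(420, 651, 5))
scan = [govardovskii_a1(w, 498) + 0.01 * math.sin(w / 7.0) for w in wavelengths]
lmax_hat, r2 = fit_lmax(wavelengths, scan)
print(f"estimated lmax = {lmax_hat} nm, R^2 = {r2:.4f}")
```

      Under this scheme each scan yields its own λmax estimate and R², and the per-cone-type value would be the mean of the per-scan estimates rather than a fit to the averaged curve.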

      It is worth noting that most of the cone absorbance spectra found in our study closely resemble, in both λmax and quality, those measured in another anemonefish species (Amphiprion akindynos) [see Supplementary Figure 1 in Stieb S. et al. (2019). Sci Rep. 9, 16459]. These cone λmax values can also be reconciled with previous estimates of opsin λmax based on amino acid sequences and with the cone opsin expression in the A. ocellaris retina characterised in Mitchell LJ et al. (2021). GBE, 13: evab184.

      Evidence that the unusual long-wavelength absorbance detected in a couple of the single cone (pre-bleach) measurements was not of visual pigment origin comes from the post-bleach scans, in which it persisted (i.e., it did not show a photobleaching response); it was likely due instead to contaminants (e.g., blood, RPE pigment). UV absorbance in some of the double cone measurements (above that expected of the pre-bleach beta peak from chromophore spectral absorption) can be attributed to noise in the scans, as is quite typical of MSP, and/or to partial (accidental) bleaching from stray light sources. Although utmost care was taken to minimise contamination and unintended bleaching, it is sometimes unavoidable.

      We refer the Reviewer to multiple published studies for further examples of typical MSP measurements that share similar levels of noise to ours e.g., see Figure 1 in Knott B. et al. (2013). JEB, 216:4454-4461; Figure 3 in Schott, RK et al. (2015). PNAS, 113(2): 356-361; Figure 2 in Dalton BE et al. (2014). Proc R Soc B. 281; Figure 5 in Tosetto, JE et al. (2021). Brain Behav Evol. 96: 103-123.

      Presentation

      The results are not presented in a straightforward way - at least for this reviewer. What is missing for me is a clear link between the psychometric curves in Figure 3A and the discrimination thresholds indicated in Figure 3B and Figure 4. Figure 3A is only discussed in the text on line 289 - after Figure 4 has been introduced and discussed. It would have been very helpful for me if the psychometric curves were first introduced and described, then the relation to Figure 3B was clearly indicated (perhaps with a single psychometric curve as an example). Similarly for Figure 4 the relationship between specific psychometric curves and the threshold plotted would be quite helpful. Currently it takes a careful reading to understand why being below the dashed line in Figure 4 is important.

      We have made the following changes, including the introduction of the psychometric curves earlier in the results (lines 236-249) and moved the psychometric function comparison before the mention of Figure 4. Additionally, to make the association between the plotted colour loci and psychometric curves clearer, we have added a smaller psychometric curve plot adjacent to the colour space (in Figure 3B) using red as an example which has an averaged psychometric curve overlying the individual fish curves. The figure caption (lines 250-274) explains that the plotted colour loci and given thresholds are mean values calculated from the individual fish behavioural data.

      We have also added a brief reminder that the theoretical limit of colour discrimination is predicted by the RNL model as 1∆S, where in our task fish should be just able to distinguish targets from grey distractors (see lines 222-224). To clarify, the plotted values in Figure 4B are both the individual fish thresholds (points) and average threshold (black bar) per colour set. The individual threshold values are taken at a correct choice probability of 50% from fitted psychometric curves of fish behavioural performance (shown in Figure 3A).
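      How a threshold is read off a fitted psychometric curve can be sketched as follows. This assumes a logistic psychometric function with a guess rate and an optional lapse rate; the functional form, the parameter names, and the numerical values are illustrative assumptions and are not taken from the manuscript.

```python
import math

def psychometric(delta_s, midpoint, slope, guess_rate, lapse_rate=0.0):
    """Probability of a correct choice as a logistic function of Delta S."""
    core = 1.0 / (1.0 + math.exp(-slope * (delta_s - midpoint)))
    return guess_rate + (1.0 - guess_rate - lapse_rate) * core

def threshold_at(p_correct, midpoint, slope, guess_rate, lapse_rate=0.0):
    """Invert the logistic: the Delta S at which the curve reaches p_correct."""
    span = 1.0 - guess_rate - lapse_rate
    scaled = (p_correct - guess_rate) / span
    if not 0.0 < scaled < 1.0:
        raise ValueError("criterion lies outside the range of the fitted curve")
    return midpoint + math.log(scaled / (1.0 - scaled)) / slope

# Hypothetical fitted parameters for one fish and one colour set.
params = dict(midpoint=1.2, slope=3.0, guess_rate=0.05)
threshold = threshold_at(0.5, **params)
print(f"threshold at 50% correct = {threshold:.3f} Delta S")
```

      The per-fish thresholds obtained this way would be the plotted points, with their mean giving the average threshold per colour set.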

      RNL model

      The data is fit and interpreted in the context of the receptor noise limited model. The paragraph in the discussion about complementary color pairs suggests that this model is incorrect (text around line 332). Consideration of how the results depend on the RNL model is important, especially given the interpretation here.

      The inability of the RNL model to account for the observed asymmetry between color discrimination thresholds implies that they cannot be solely attributed to photoreceptor noise. We can therefore infer from the asymmetry that thresholds are set by a higher-level process, whether that involves post-receptor processes within the inner retina or in the brain remains to be investigated. As explained in lines 396-397 one possibility is that activation of the UV receptor suppresses noise in the visual pathway or enhances the saliency of colors for anemonefish. The high sensitivity to violet-green, which was found in all six of the fish tested, is consistent with the heightened saliency of this color (lines 397-399).

      Figure 3B

      This is the key figure in the paper. But several issues make seeing the data in this figure difficult. First, the important part of the figure is buried near the origin and hard to see. Can you show a surface that connects the thresholds in the different chromatic directions, or otherwise highlight the regions of discriminable and not discriminable colors?

      See previous comment. In short, we have taken the advice of the Reviewer and added highlighted areas around the regions of discriminable colors in Figure 3B to help visually separate them from the non-discriminable regions of colors (from grey). Additionally, we have added an inset showing an enlarged image of the area surrounding the centre of colour space.

      Reviewer #2 (Public Review):

      Mitchell and colleagues examined the contribution of a UV-sensitive cone photoreceptor to chromatic detection in Amphiprion ocellaris, a type of anemonefish. First, they used biophysical measurements to characterize the response properties of the retinal receptors, which come in four spectrally-distinct subtypes: UV, M1, M2, and L. They then used these spectral sensitivities to construct a 4-dimensional (tetrahedral) color space in which stimuli with known spectral power distributions can be represented according to the responses they elicit in the four cone types. A novel five-LED display was used to test the fish's ability to detect "chromatic" modulations in this color space against a background of random-intensity, "achromatic" distractors that produce roughly equal relative responses in the four cone types. A subset of stimuli, defined by their high positive UV contrast, were more readily detected than other colors that contained less UV information. A well-established model was used to link calculated receptor responses to behavioral thresholds. This framework also enabled statistical comparisons between models with varying number of cone types contributing to discrimination performance, allowing inferences to be drawn about the dimensionality of color vision in anemonefish.

      The authors make a compelling case for how UV light in the anemonefish habitat is likely an important ecological source of information for guiding their behavior. The authors are to be commended for developing an elegant behavioral paradigm to assess visual performance and for incorporating a novel display device especially suited to addressing hypotheses about the role of UV light in color perception. While the data are suggestive of behavioral tetrachromacy in anemonefish, there are some aspects of the study that warrant additional consideration:

      1) One challenge faced by many biological imaging systems is longitudinal chromatic aberration (LCA) - that is, the focal power of the system depends on wavelength. In general, focal power increases with decreasing wavelength, such that shorter wavelengths tend to focus in front of longer wavelengths. In the human eye, at least, this focal power changes nonlinearly with wavelength, with the steepest changes occurring in the shorter part of the visible spectrum (Atchison & Smith, 2005). In the fish eye, where the visible spectrum extends to even shorter wavelengths, it seems plausible that a considerable amount of LCA may exist, which could in turn cause UV-enriched stimuli to be more salient (relative to the distractor pixels) due to differences in perceived focus rather than due solely to differences in their respective spectral compositions. Such a mechanism has been proposed by Stubbs & Stubbs (2016) as a means for supporting "color vision" in monochromatic cephalopods (but see Gagnon et al. 2016). It would be worth discussing what is known about the dispersive properties of the crystalline lens in A. ocellaris (or similar species), and whether optical factors could produce sufficient cues in the retinal image that might explain aspects of the behavioral data presented in the current study.

      This is an interesting point, and we appreciate the reviewer’s thoughtful comment regarding this topic, especially as LCA increases exponentially in the UV. Although we certainly cannot disprove such a mechanism in the present study, we are highly sceptical that LCA could be used by reef fish or that it is involved in the heightened saliency of UV stimuli. Previous work has found that LCA is mostly corrected for in the teleost eye of both marine and freshwater species by graded, multifocal lenses that focus different wavelengths at the same depth as their maximally sensitive cone photoreceptors [e.g., for evidence in African cichlids see Kröger, R. H. H. et al. (1999). J Comp Physiol. A, 184, 361-369; Malkki, P. E. & Kröger, R. H. H. (2005). J Opt. A, 7, 691-700; and for various reef fishes see Karpestam, B. et al. (2007). J Exp Biol., 210, 16: 2923-2931]. In essence, LCA is corrected in the eyes of many teleosts by accurately tuning longitudinal spherical aberration through a graded density lens. We draw particular attention to the latter reference, which comparatively examined the optical properties of reef fish lenses, including those of diurnal, planktivorous damselfishes (from the same family as anemonefishes, Pomacentridae). They found that not only were the lenses of these species highly UV-transmissive (as we show in anemonefish), but all were multifocal and capable of focusing both visible (non-UV) and UV wavelengths. By contrast, all of the coastal cephalopod species examined thus far contain only one type of visual pigment, which is packed in long photoreceptors (150-450µm long outer segments) across the entire retina (Chung and Marshall 2016, Proceeding B). Theoretically, given these long photoreceptors, LCA and the resulting differences in focal length onto different patches of photoreceptors, or onto different depths of the outer segment, might provide cues for colour discrimination, even though no behavioural evidence yet exists to support this hypothesis. Unlike the cephalopod case, the four spectrally distinct cone types arranged in a mosaic pattern, along with their very short outer segments (5-10µm), likely make LCA less effective in the anemonefish retina.

      We have added a short paragraph (Lines 400-412) discussing the possibility of an optical mechanism contributing to heightened UV saliency with a particular focus on LCA and our thoughts on why we consider it an unlikely mechanism in anemonefish.

      2) The authors provide a quantitative description of anemonefish visual performance within the context of a well-developed receptor-based framework. However, it was less clear to me what inferences (if any) can be drawn from these data about the post-receptoral mechanisms that support tetrachromatic color vision in these organisms. Would specific cone-opponent processes account for instances where behavioral data diverged from predictions generated with the "receptor noise limited" model described in the text? The general reader may benefit from more discussion centered on what is known (or unknown) about the organization of cone-opponent processing in anemonefish and related species.

      In short, we do not know the specific opponent interactions of anemonefish cones. The RNL model assumes all possible opponent interactions in its calculations. From our results, very little can be said about the post-receptor mechanisms involved in their putative tetrachromatic vision. We would like to avoid overreaching beyond what our data can show. A future directions section has now been added to the discussion (lines 467-497), which briefly mentions the known UV opponency in larval zebrafish and that future investigation in anemonefish should attempt to disentangle the specific opponent (chromatic) and non-opponent (achromatic) circuits in the anemonefish retina.
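      The statement that the RNL model considers all possible opponent interactions can be made concrete with a short sketch of the standard receptor-noise-limited distance (after Vorobyev and Osorio), written for any number of receptor classes. Every pairwise receptor difference enters the sum, weighted by the noise of the remaining channels. The quantum catches and the assignment of noise values to cone classes below are hypothetical; only the 0.11 and 0.14 values echo the estimates quoted elsewhere in this response.

```python
import math
from itertools import combinations

def rnl_delta_s(quanta_a, quanta_b, noise):
    """Receptor-noise-limited chromatic distance between stimuli A and B.

    quanta_a, quanta_b: quantum catches per receptor class.
    noise: receptor noise (Weber fraction) per receptor class.
    """
    n = len(noise)
    if n < 2 or len(quanta_a) != n or len(quanta_b) != n:
        raise ValueError("need matching inputs for at least two receptor classes")
    # Log-ratio receptor contrasts (Weber-Fechner assumption).
    df = [math.log(a / b) for a, b in zip(quanta_a, quanta_b)]
    var = [e * e for e in noise]
    # Every receptor pair (i, j) contributes an opponent comparison,
    # weighted by the noise variances of all the other receptors.
    numerator = 0.0
    for i, j in combinations(range(n), 2):
        others = math.prod(var[k] for k in range(n) if k not in (i, j))
        numerator += others * (df[i] - df[j]) ** 2
    denominator = sum(
        math.prod(var[k] for k in range(n) if k != i) for i in range(n)
    )
    return math.sqrt(numerator / denominator)

# Hypothetical catches for a target and a grey distractor (UV, M1, M2, L).
# Which classes take 0.11 versus 0.14 is an assumption made for illustration.
noise = (0.11, 0.14, 0.14, 0.14)
target = (1.30, 1.00, 1.00, 1.00)
grey = (1.00, 1.00, 1.00, 1.00)
print(f"Delta S = {rnl_delta_s(target, grey, noise):.2f}")
```

      Dropping one receptor class from the inputs gives the corresponding trichromatic prediction, which is how nested three- and four-cone versions of the model can be compared against the same behavioural thresholds.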

      Reviewer #3 (Public Review):

      The comments below focus mainly on ways in which the data and analysis, as currently presented, do not to this reviewer compel the conclusions the authors wish to draw. It is possible that further analysis and/or clarification in the presentation would more persuasively bolster the authors' position. It also seems possible that a presentation with more limited conclusions, but with clarity on exactly what has been demonstrated and where additional future work is needed, would make a strong contribution to the literature.

      • Fig 3A. It might be worth emphasizing a bit more explicitly that the x-axis (delta S) is the result of a model fit to the data being shown, since this then means that if RNL model fit the data perfectly, all of the thresholds would fall at deltaS = 1. They don't, so I would like to see some evaluation from the authors' experience with this model as to whether they think the deviations (looks like the delta S range is ~0.4 to ~1.6 in Figure 4B) represent important deviations of the data from the model, the non-significant ANOVA notwithstanding. For example, Figure 4B suggests that the sign of the fit deviations is driven by the sign of the UV contrast and that this is systematic, something that would not be picked up by the ANOVA. Quite a bit is made of the deviations below, but that the model doesn't fully account for the data should be brought out here I think. As the authors note elsewhere, deviations of the data from the RNL model indicate that factors other than receptor noise are at play, and reminding the reader of this here at the first point it becomes clear would be helpful.

      We have now stated more explicitly in the figure caption for Figure 3A, that the delta S values presented were calculated by fitting fish behavioral data to the RNL model. To test the overall effect that the sign of the UV contrast had on the discrimination threshold, we have now included ‘contrast’ (positive or negative) as another fixed effect in the linear mixed effects model. We have now included details of this test in the results which shows the systematic effect (lines 338-340). Additionally, as suggested we now briefly introduce in the results the idea that factors other than receptor noise are causing the observed deviations in data from the RNL model.

      • (Line 217 ff, Figure 4, Supplemental Figure 4). If I'm understanding what the ANOVA is telling us, it is that, across color directions and fish (I think these are the two factors based on line 649), the predictions deviate significantly from the data, relative to the inter-fish variability, for the trichromatic models but not the tetrachromatic model. If that's not correct, please interpret this comment to mean that more explanation of the logic of the test would be helpful.

      The interpretation of the ANOVA by the Reviewer is mostly correct. We had the variables color set and Fish ID, with threshold delta S as the dependent variable. This showed that deviations from the predicted threshold were significant relative to the inter-fish variability for the trichromatic models. Missing details describing the ANOVA have now been added to the methods (lines 789-798).

      Assuming that the above is right about the nature of the test, then I don't think the fact that the tetrachromatic model has an additional parameter (noise level for the added receptor type) is being taken into account in the model comparison. That is, the trichromatic models are all subsets of the tetrachromatic model, and must necessarily fit the data worse. What we want to know is whether the tetrachromatic model is fitting better because its extra parameter is allowing it to account for measurement noise (overfitting), or whether it is really doing a better job accounting for systematic features of the data. This comparison requires some method of taking the different number of parameters into account, and I don't think the ANOVA is doing that work. If the models being compared were nested linear models, then an F-ratio test could be deployed, but even this doesn't seem like what is being done. And the RNL model is not linear in its parameters, so I don't think that would be the right model comparison test in any case.

      Typical model comparison approaches would include a likelihood ratio test, AIC/BIC sorts of comparisons, or a cross-validation approach.

      If the authors feel their current method does persuasively handle the model comparison, how it does so needs to be brought out more carefully in the manuscript, since one of the central conclusions of the work hinges at least in part on the appropriateness of such a statistical comparison.
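      One of the parameter-aware comparisons mentioned above can be sketched with the least-squares form of the Akaike information criterion, which penalises each additional fitted parameter. The residuals and parameter counts below are invented for illustration and are not results from the study.

```python
import math

def least_squares_aic(residuals, n_params):
    """AIC for a least-squares fit with Gaussian errors, up to an additive
    constant that is the same for every model fitted to the same data."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params

# Hypothetical residuals (observed minus predicted threshold) for two models.
tri_residuals = [0.42, -0.35, 0.51, -0.28, 0.39, -0.44, 0.30, -0.37, 0.46]
tetra_residuals = [0.21, -0.18, 0.25, -0.15, 0.19, -0.22, 0.16, -0.20, 0.23]

aic_tri = least_squares_aic(tri_residuals, n_params=2)
aic_tetra = least_squares_aic(tetra_residuals, n_params=3)
# A lower AIC is preferred; the extra parameter must buy enough reduction
# in residual error to offset its penalty of 2.
print(f"AIC trichromatic = {aic_tri:.2f}, AIC tetrachromatic = {aic_tetra:.2f}")
```

      With as few observations as in this toy example, a small-sample correction (AICc) or cross-validation would usually be the safer choice.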

      Our visual model comparisons were aimed at assessing whether a trichromatic or tetrachromatic model best fit the colour discrimination data. The trichromatic and tetrachromatic models assume two and three opponency pathways, respectively. If the fish were trichromatic rather than tetrachromatic, we would expect the RNL model to fit the data better with two opponency mechanisms than with three. We made this assessment because of the possibility that not all the cones contribute to colour vision, and that some are used exclusively for achromatic tasks (e.g., luminance vision or motion detection). However, given our finding that the data best fit the tetrachromatic model (i.e., the behavioural discrimination thresholds more closely matched the theoretical prediction of 1∆S), it is likely that anemonefish used all four cones for colour vision.

      We have also now repeated our analysis using unweighted delta S values, which are calculated using general n-dimensional models of colour vision (using the PAVO2 package). These models essentially follow the same initial steps as the RNL model (and many others) but omit the receptor noise correction stage. After comparing (using ANOVA, see lines 303-311) the predicted thresholds with the data in this non-RNL space, we again found that the tetrachromatic model predictions did not deviate significantly from the data relative to individual fish performance; however, we also found that the trichromatic model without M2 cone input no longer differed from the predicted values. In this case, it seems that the extra noise parameter did contribute to the difference in fit. Whether this is a biologically meaningful comparison (as all photoreceptors contain noise) is an open question. We have added a short statement explicitly framing our interpretation of anemonefish having a 3-D colour space as being in accordance with the closeness of RNL model predictions (lines 370-371, 506-508).

      • Also on the general point on conclusions drawn from the model fits, it seems important to note that rejecting a trichromatic version of the RNL model is not the same as rejecting all trichromatic models. For example, a trichromatic model that postulates limiting noise added after a set of opponent transformations will make predictions that are not nested within those of RNL trichromatic models. This point seems particularly important given the systematic failures of even the tetrachromatic version of the RNL model.

      This is a good point. We have limited our conclusions to specifically address trichromatic models generated within the framework of the RNL model by adding in the conclusion section that fish psychophysical thresholds were best explained by the RNL model when all four cone types contributed to colour vision (see lines 370-371, 506-508). In this same sentence, we have also added in parentheses that “suggesting (but not proving) tetrachromacy” (line 508). We have also edited the abstract to state that our results were “…best described by a tetrachromatic model using all four cone types…”, rather than stating we have shown tetrachromacy (lines 36-37).

      • More generally, attempts to decide whether some human observers exhibit tetrachromacy have taught us how hard this is to do. Two issues, beyond the above, are the following. 1) If the properties of a trichromatic visual system vary across the retina, then by imaging stimuli on different parts of the visual field an observer can in principle make tetrachromatic discriminations even though the visual system is locally trichromatic at each retinal location. 2) When trying to show that there is no direction in a tetrachromatic receptor space to which the observer is blind, a lot of color directions need to be sampled. Here, 9 directions are studied. Is that enough? How would we know? The following paper may be of interest in this regard: Horiguchi, Hiroshi, Jonathan Winawer, Robert F. Dougherty, and Brian A. Wandell. "Human trichromacy revisited." Proceedings of the National Academy of Sciences 110, no. 3 (2013): E260-E269. Although I'm not suggesting that the authors conduct additional experiments to try to address these points, I do think they need to be discussed.

      We agree with the reviewer that colour discriminability achieved by tetrachromatic vision could in theory be achieved by the combined effect of localised, distinct forms of trichromacy. Evidence in other fishes suggests that such multiple forms of trichromacy across the retina likely exist in many species. However, the behavioural effects of this retinal setup remain to be studied, likely because such work is extremely difficult. We have added a new section titled “future directions” (Lines 474-489), in which we discuss the possibility that distinct forms of trichromacy in the anemonefish retina could in theory achieve colour discrimination on par with tetrachromatic vision. We also give suggestions on how this could be investigated.

      Although we tried to include as many colour directions as practically possible in our experiment, we have certainly not provided an exhaustive range that completely encompasses anemonefish colour space. Whether 9 colour directions are adequate to assess the dimensionality of their color vision is difficult to say. As addressed in the previous comment, we now acknowledge this limitation by refining our conclusion, saying that our results do not prove tetrachromacy.

      • Line 277 ff. After reading through the paper several times, I remain unsure about what the authors regard as their compelling evidence that the UV cone has a higher sensitivity or makes an omnibus higher contribution to sensitivity than other cones (as stated in various forms in the title, Lines 37-41, 56-57, 125, 313, 352 and perhaps elsewhere).

      At first, I thought the key point was that the receptor noise inferred via the RNL model was slightly lower (0.11) for the UV cone than for the double cones (0.14). And this is the argument made explicitly at line 326 of the discussion. But if this is the argument, what needs to be shown is that the data reject a tetrachromatic version of the RNL model where the noise value of all the cones is locked to be the same (or something similar), with the analysis taking into account the fewer parametric degrees of freedom where the noise parameters are so constrained. That is, a careful model comparison analysis would be needed. Such an analysis is not presented that I see, and I need more convincing that the difference between 0.11 and 0.14 is a real effect driven by the data. Also, I am not sanguine that the parameters of a model that in some systematic ways fails to fit the data should be taken as characterizing properties of the receptors themselves (as sometimes seems to be stated as the conclusion we should draw).

      We have run various modelling scenarios in which receptor noise was adjusted for each channel; the UV channel was consistently found to be more sensitive than the other channels. In (the original) Supplementary Figure 6 (now Figure 4 – figure supplements 1 and 2), we show predicted dS values calculated using receptor noise levels in the manner that the Reviewer suggests, ranging from 0.05 to 0.15 and, most importantly, including scenarios where receptor noise was held equal across cone types and others where it was varied between single cones and double cones. None of the models adjusted the data so that sensitivity was equal across all four channels, which means that, by an unknown mechanism, the UV channel is more sensitive, but this is unrelated to noise levels. Our best-fit receptor noise values of 0.11 (for single cones) and 0.14 (for double cones) are estimates and should be treated as such until actual receptor noise measurements are made.

      Then, I thought maybe the argument is not that the noise levels differ, but rather that the failures of the model are in the direction of thresholds being under predicted for discriminations that involve UV cone signals. That's what seems to be being argued here at lines 277 ff, and then again at lines 328 ff of the discussion. But then the argument, as I read it in more detail in both places, switches from being about the UV cones per se to being about positive versus negative UV contrast. That's fine, but it's distinct from an argument that favors omnibus enhanced UV sensitivity, since both the UV increments and decrements are conveyed by the UV cone; it's an argument for differential sensitivity for increments versus decrements in UV mediated discriminations. The authors get to this on lines 334 of the discussion, but if the point is an increment/decrement asymmetry the title and many of the terser earlier assertions should be reworked to be consistent with what is shown.

      To clarify our argument, we found that the colour discrimination thresholds were systematically lower than predicted by the RNL model for colours which elicited higher UV cone stimulation relative to other cone types. We refer to these colours as UV positive, based on the sign of their contrast against grey distractors, which was produced by a higher UV/V LED channel (i.e., in a positive direction). Colours with UV negative chromatic contrast, by comparison, had lower UV cone stimulation relative to the other cone types. Therefore, our interpretation of the importance of UV cone signals for colour discrimination is congruent with the results. In the discussion, we suggest a possibility that activation of the UV receptor suppresses noise downstream in the visual pathway or enhances the saliency of colours (see lines 397-398). This activation of the UV receptor would, of course, be at its highest for colours with positive UV chromatic contrast.

      Note that we have added to the discussion the possibility that colour preferences or a difference in attentiveness might have contributed to differences in discrimination thresholds (see discussion lines 412-413, 427-428, 433-435, 456-466, and 469-473). However, we consider it a less likely explanation due to a couple of reasons, including 1) a lack of difference in responsiveness across colour sets in their timing to peck the target, and 2) any non-learnt bias would have likely been overridden or at least weakened by training prior to the experiment where colours were rewarded equally (see lines 462-466).

      We have edited the results (lines 334-352) to make our point clearer and by changing the subtitle to be more explicit: “Lower discrimination thresholds induced by positive UV contrast”. The subsection begins by explaining the different types of UV chromatic contrast by elevation angle and, finally, how this division among colour sets was a major determinant of colour discrimination thresholds.

      Perhaps the argument with respect to model deviations and UV contrast independent of sign could be elaborated to show more systematically that the way the covariation with the contrasts of the other cone stimulations in the stimulus set goes, the data do favor deviations from the RNL in the direction of enhanced sensitivity to UV cone signals, but if this is the intent I think the authors need to think more about how to present the data in a manner that makes it more compelling than currently, and walk the reader carefully through the argument.

      We have added to the results the linear mixed-effects model output with ‘contrast’ (positive/negative) added as a fixed effect. This analysis shows that the sign of the UV contrast was a strong predictor of threshold (see our responses to previous comments and lines 399-401, 790-799).

      • On this point, if the authors decide to stick with the enhanced UV sensitivity argument in the revision, a bit more care about what is meant by "the UV cone has a comparatively high sensitivity (line 313 and throughout)" needs more unpacking. If it is that these cones have lower inferred noise (in the context of a model that doesn't account for at least some aspects of the data), is this because of properties of the UV cones, or the way that post-receptoral processing handles the signals from these cones mimicking a cone effect in the model. And if it is thought that it is because of properties of the cones, some discussion of what those properties might be would be helpful. As I understand the RNL model, relative numbers of cones of each type are taken into account, so it isn't that. But could it be something as simple as higher photopigment density or larger entrance aperture (thus more quantum catches and higher SNR)?

      It is unknown what aspect of the cone morphology or physiology sets the activation or inactivation threshold. Electrophysiological data collected from the UV cones of other fish species e.g., in goldfish and zebrafish [see Hawryshyn & Beauchamp (1985). 25, Vis Res.; and Yoshimatsu et al. (2020). 107, Neuron.] show that they have exceptionally high sensitivity. What has not been shown is that having a UV cone can improve colour discrimination.

      Previous quantitative cone opsin gene expression analysis showed that the single cone opsins (SWS1 and SWS2B) are expressed at lower levels than all double cone opsin genes. This difference in expression combined with the smaller size of single cone outer segments than the double cones make it unlikely that a larger photoreceptor size, higher volume or packing density of visual pigment is responsible. Contrary to our findings, these aspects of the different cone types (if they had an effect) would instead predict that double cones have a higher SNR, and non-UV colours would be more discriminable. We have now added these details to the discussion (see lines 391-397).

      • Line 288 ff. The fact that the slopes of the psychometric functions differed across color directions is, I think, a failure of the RNL model to describe this aspect of the data, and tells us that a simple summary of what happens for thresholds at delta S = 1 does not generalize across color directions for other performance levels. Since one of the directions where the slope is shallower is the UV direction, this fact would seem to place serious limits on the claim that discrimination in the UV direction is enhanced relative to other directions, but it goes by here without comment along those lines. Some comment here, both about implications for fit of RNL model and about implications for generalizations about efficacy of UV receptor mediated discrimination and UV increment/decrement asymmetries, seems important.

      The variation in the psychometric functions is difficult to interpret and cannot be explained by the RNL model. What the RNL model predicts is delta S based on low level factors (namely receptor noise). In the discussion, we completely agree with the notion that the asymmetry in thresholds from predicted values, and the variation in psychometric slopes cannot be explained by the RNL model, e.g., this is heavily implied by “colour discrimination thresholds cannot be directly attributed to noise in the early stages of the visual pathway…” (lines 388-390). To clarify the inability of the RNL model to account for this aspect of the data, we have included a statement (see line 390).

      It is a good point that this could be an indication of heterogeneity in colour space. Heterogeneity in discrimination thresholds across animal colour space (both surrounding the threshold area and for more saturated regions) has been explored in detail using trichromatic triggerfish by Green N. F. et al. (2022). JEB, 7(225):jeb243533. We have added this idea to the discussion (see lines 490-498). For UV, it seems that two of the five fish (#34 and 20) had noticeably shallower curves than the others tested for UV (fish #19, 33, 36). Both also varied more in their ability to distinguish targets, as shown by their wider confidence intervals. One of these two fish (#34) was retested for UV at the end of the experiment, and in the secondary assessment had a steeper psychometric curve more in line with the other fish in the experiment (see Figure 3 – figure supplement 1 and added lines 247-250). Based on this discrepancy in performance between assessments, it is also possible that individual learning effects had a role in impacting the shape of the psychometric curve. Note, this had minimal effect on colour discrimination thresholds and any differences were in the direction of change observed across colour sets in the experiment (i.e., lower dS for UV positive directions).

      • Line 357 ff. Up until this point, all of the discussion of differences in threshold across stimulus sets has been in terms of sensitivity. Here the authors (correctly) raise the possibility that a difference in "preference" across stimulus sets could drive the difference in thresholds as measured. Although the discussion is interesting and germane, it does to some extent further undercut the security of conclusions about differential sensitivity across color directions relative to the RNL model predictions, and that should be brought out for the reader here. The authors might also discuss how a future experiment might differentiate between a preference explanation and a sensitivity explanation of threshold differences.

      We have now added a paragraph (see lines 469-473) discussing that future work should test for color preferences and suggest how this could be done using a similar foraging task. We also include our thoughts immediately prior on why it is unlikely that a colour preference was a major contribution towards the results. In short, we consider it unlikely as fish showed no evidence of reduced latency for pecking at targets across the colour sets and because the training regime prior to the experiment equally rewarded fish for all colours and would likely have overridden a strong preference (at least in this specific foraging context).

      • RNL model. The paper cites a lot of earlier work that used the RNL model, but I think many readers will not be familiar with it. A bit more descriptive prose would be helpful, and particularly noting that in the full dimensional receptor space, if the limiting noise at the photoreceptors is Gaussian, then the isothreshold contour will be a hyper-ellipsoid with its axes aligned with the receptor directions.

      We have now added an explanation of the RNL model (see lines 141-151), particularly of its assumptions that it only receives chromatic input and that discrimination is limited by noise arising in the photoreceptors and not by any specific opponent mechanisms. We also mention the expected hyper-ellipsoid shape of isothreshold contours if receptor noise is Gaussian. Note, while we appreciate the importance of the reader understanding the basic functionality of the model, we wanted to avoid overloading the introduction with details on the RNL model, which is not the focus of the paper. The RNL model has been well established in the field of visual ecology and animal vision research for well over a decade and has been thoroughly dissected by previous methodological reviews. We refer to one of these more recent reviews by Olsson et al. (2018) Behav Ecol. 29(2):273-282, and direct the reader to the methods section for further details on the RNL model.
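
      As a rough illustration of the model described here (not the authors' code), the chromatic distance delta S under the RNL assumptions can be sketched in Python. The function names and all numerical values below are hypothetical. The sketch assumes that the receptor contrasts have already been computed (for example as log quantum-catch ratios) and that each receptor channel has a fixed noise value. The general form discards any shift common to all receptors (the achromatic direction) and weights what remains by the inverse squared noise; for three receptors it reduces to the standard trichromatic closed form, which is included as a cross-check.

```python
import math


def rnl_delta_s(delta_f, noise):
    """Chromatic distance (delta S) for n receptor channels.

    delta_f: receptor contrasts between two stimuli, one per channel.
    noise:   noise value e_i for each channel (illustrative values only).
    """
    weights = [1.0 / e ** 2 for e in noise]
    total = sum(weights)
    # Noise-weighted mean contrast: the component shared by all channels,
    # which the model treats as achromatic and ignores.
    shared = sum(w * f for w, f in zip(weights, delta_f)) / total
    return math.sqrt(sum(w * (f - shared) ** 2 for w, f in zip(weights, delta_f)))


def rnl_delta_s_trichromat(delta_f, noise):
    """Closed-form delta S for exactly three channels, used as a check."""
    f1, f2, f3 = delta_f
    e1, e2, e3 = noise
    numerator = (
        e1 ** 2 * (f3 - f2) ** 2
        + e2 ** 2 * (f3 - f1) ** 2
        + e3 ** 2 * (f1 - f2) ** 2
    )
    denominator = (e1 * e2) ** 2 + (e1 * e3) ** 2 + (e2 * e3) ** 2
    return math.sqrt(numerator / denominator)


if __name__ == "__main__":
    # Hypothetical contrasts and noise values for a four-channel observer.
    contrasts = [0.10, -0.05, 0.02, 0.00]
    channel_noise = [0.11, 0.14, 0.14, 0.14]
    print(rnl_delta_s(contrasts, channel_noise))
```

      Under these assumptions a target sits at threshold when delta S equals 1, and a stimulus change that affects all channels equally gives a delta S of zero.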

      • Use of cone isolating stimuli? For showing that all four cone classes contribute to what the authors call color discrimination, a more direct approach would seem to be to use stimuli that target stimulation of only one class of cone at a time. This might require a modified design in which the distractors and target were shown against a uniform background and approximately matched in their estimated effect on a putative achromatic mechanism. Did the authors consider this approach, and more generally could they discuss what they see as its advantages and disadvantages for future work.

      The Reviewer is correct in that a targeted approach of isolated cone stimulation would be the optimal approach to demonstrating tetrachromatic colour vision. However, the extreme spectral overlap in the absorption curves of anemonefish cones, particularly in the mid-wavelength region makes this problematic in using the current LED display. We added to the discussion ways that this could be studied in the future (see lines 474-489). This might be possible (but still challenging) using a monochromator, but such technology severely limits the diversity of stimuli which can be created and usually restricts experiments to a simple paired choice design (or grey card experiment). The traditional paired choice experiment requires animals to be trained to distinguish a specific colour, while the Ishihara-like task trains animals to distinguish targets using an odd-one-out approach. This latter approach is highly efficient, as it does not require retraining when testing a new colour (i.e., fish learnt the task not a specific colour). Here, we wanted to assess colour discrimination in multiple directions to compare performance, and the flexible LED display combined with a generalisable task was important.

      The above assumes that anemonefish do not use multiple trichromatic systems. If they do, standard experimental stimuli (e.g., a monochromator or an LED display) would be unsuitable, as they illuminate the whole retina. To definitively test the range of opponent interactions, it would be necessary to make electrophysiological measurements targeting the transmitting neurons using a retinal multielectrode array (MEA) approach or by in-vivo calcium imaging (lines 484-486).

      We understand that our results are not a direct test of the dimensionality of anemonefish colour vision and should not be interpreted as such, as we do not have direct evidence of tetrachromacy. To recognize this limitation of our data, we have drawn back some of our conclusive statements that claimed to have demonstrated tetrachromacy.

    2. Reviewer #3 (Public Review):

      The comments below focus mainly on ways that the data and analysis as currently presented do not to this reviewer compel the conclusions the authors wish to draw. It is possible that further analysis and/or clarification in the presentation would more persuasively bolster the authors' position. It also seems possible that a presentation with more limited conclusions but clarity on exactly what has been demonstrated and where additional future work is needed would make a strong contribution to the literature.

      * Fig 3A. It might be worth emphasizing a bit more explicitly that the x-axis (delta S) is the result of a model fit to the data being shown, since this then means that if RNL model fit the data perfectly, all of the thresholds would fall at deltaS = 1. They don't, so I would like to see some evaluation from the authors' experience with this model as to whether they think the deviations (looks like the delta S range is ~0.4 to ~1.6 in Figure 4B) represent important deviations of the data from the model, the non-significant ANOVA notwithstanding. For example, Figure 4B suggests that the sign of the fit deviations is driven by the sign of the UV contrast and that this is systematic, something that would not be picked up by the ANOVA. Quite a bit is made of the deviations below, but that the model doesn't fully account for the data should be brought out here I think. As the authors note elsewhere, deviations of the data from the RNL model indicate that factors other than receptor noise are at play, and reminding the reader of this here at the first point it becomes clear would be helpful.

      * Line 217 ff (Figure 4, Supplemental Figure 4). If I'm understanding what the ANOVA is telling us, it is that the predictions deviate significantly from the data across color directions and fish (I think these are the two factors based on line 649), relative to the inter-fish variability, for the trichromatic models but not the tetrachromatic model. If that's not correct, please interpret this comment to mean that more explanation of the logic of the test would be helpful.

      Assuming that the above is right about the nature of the test, then I don't think the fact that the tetrachromatic model has an additional parameter (noise level for the added receptor type) is being taken into account in the model comparison. That is, the trichromatic models are all subsets of the tetrachromatic model, and must necessarily fit the data worse. What we want to know is whether the tetrachromatic model is fitting better because its extra parameter is allowing it to account for measurement noise (overfitting), or whether it is really doing a better job accounting for systematic features of the data. This comparison requires some method of taking the different number of parameters into account, and I don't think the ANOVA is doing that work. If the models being compared were nested linear models, then an F-ratio test could be deployed, but even this doesn't seem like what is being done. And the RNL model is not linear in its parameters, so I don't think that would be the right model comparison test in any case.

      Typical model comparison approaches would include a likelihood ratio test, AIC/BIC sorts of comparisons, or a cross-validation approach.
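
      To make the suggestion concrete, a minimal Python sketch of two of these comparisons follows. It is illustrative only: the function names and the log-likelihood values are hypothetical and do not come from the manuscript. It assumes each model has been fit by maximum likelihood to the same data, and that the restricted model is nested in the full model with exactly one fewer free parameter, so that the likelihood ratio statistic can be referred to a chi-square distribution with one degree of freedom (an asymptotic approximation that may be poor for small samples).

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate a preferred model."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; penalises parameters more as n_obs grows."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def lr_test_one_extra_param(ll_restricted, ll_full):
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the test statistic and its asymptotic p-value. For a chi-square
    variable with one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    statistic = 2.0 * (ll_full - ll_restricted)
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical maximised log-likelihoods for a 3-parameter and a
    # 4-parameter model fit to the same 100 observations.
    ll_three, ll_four = -120.0, -118.5
    print("AIC:", aic(ll_three, 3), aic(ll_four, 4))
    print("BIC:", bic(ll_three, 3, 100), bic(ll_four, 4, 100))
    print("LRT:", lr_test_one_extra_param(ll_three, ll_four))
```

      The point of each criterion is the same: the richer model is preferred only if its improvement in fit exceeds what the extra parameter would be expected to buy by chance.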

      If the authors feel their current method does persuasively handle the model comparison, how it does so needs to be brought out more carefully in the manuscript, since one of the central conclusions of the work hinges at least in part on the appropriateness of such a statistical comparison.

      * Also on the general point on conclusions drawn from the model fits, it seems important to note that rejecting a trichromatic version of the RNL model is not the same as rejecting all trichromatic models. For example, a trichromatic model that postulates limiting noise added after a set of opponent transformations will make predictions that are not nested within those of RNL trichromatic models. This point seems particularly important given the systematic failures of even the tetrachromatic version of the RNL model.

      * More generally, attempts to decide whether some human observers exhibit tetrachromacy have taught us how hard this is to do. Two issues, beyond the above, are the following. 1) If the properties of a trichromatic visual system vary across the retina, then by imaging stimuli on different parts of the visual field an observer can in principle make tetrachromatic discriminations even though the visual system is locally trichromatic at each retinal location. 2) When trying to show that there is no direction in a tetrachromatic receptor space to which the observer is blind, a lot of color directions need to be sampled. Here, 9 directions are studied. Is that enough? How would we know? The following paper may be of interest in this regard: Horiguchi, Hiroshi, Jonathan Winawer, Robert F. Dougherty, and Brian A. Wandell. "Human trichromacy revisited." Proceedings of the National Academy of Sciences 110, no. 3 (2013): E260-E269. Although I'm not suggesting that the authors conduct additional experiments to try to address these points, I do think they need to be discussed.

      * Line 277 ff. After reading through the paper several times, I remain unsure about what the authors regard as their compelling evidence that the UV cone has a higher sensitivity or makes an omnibus higher contribution to sensitivity than other cones (as stated in various forms in the title, Lines 37-41, 56-57, 125, 313, 352 and perhaps elsewhere).

      At first, I thought the key point was that the receptor noise inferred via the RNL model was slightly lower (0.11) for the UV cone than for the double cones (0.14). And this is the argument made explicitly at line 326 of the discussion. But if this is the argument, what needs to be shown is that the data reject a tetrachromatic version of the RNL model where the noise value of all the cones is locked to be the same (or something similar), with the analysis taking into account the fewer parametric degrees of freedom where the noise parameters are so constrained. That is, a careful model comparison analysis would be needed. Such an analysis is not presented that I see, and I need more convincing that the difference between 0.11 and 0.14 is a real effect driven by the data. Also, I am not sanguine that the parameters of a model that in some systematic ways fails to fit the data should be taken as characterizing properties of the receptors themselves (as sometimes seems to be stated as the conclusion we should draw).

      Then, I thought maybe the argument is not that the noise levels differ, but rather that the failures of the model are in the direction of thresholds being under predicted for discriminations that involve UV cone signals. That's what seems to be being argued here at lines 277 ff, and then again at lines 328 ff of the discussion. But then the argument as I read it in more detail in both places switches from being about the UV cones per se to being about positive versus negative UV contrast. That's fine, but it's distinct from an argument that favors omnibus enhanced UV sensitivity, since both the UV increments and decrements are conveyed by the UV cone; it's an argument for differential sensitivity for increments versus decrements in UV mediated discriminations. The authors get to this at line 334 of the discussion, but if the point is an increment/decrement asymmetry the title and many of the terser earlier assertions should be reworked to be consistent with what is shown.

      Perhaps the argument with respect to model deviations and UV contrast independent of sign could be elaborated to show more systematically that the way the covariation with the contrasts of the other cone stimulations in the stimulus set goes, the data do favor deviations from the RNL in the direction of enhanced sensitivity to UV cone signals, but if this is the intent I think the authors need to think more about how to present the data in a manner that makes it more compelling than currently, and walk the reader carefully through the argument.

      * On this point, if the authors decide to stick with the enhanced UV sensitivity argument in the revision, a bit more care about what is meant by "the UV cone has a comparatively high sensitivity (line 313 and throughout)" needs more unpacking. If it is that these cones have lower inferred noise (in the context of a model that doesn't account for at least some aspects of the data), is this because of properties of the UV cones, or the way that post-receptoral processing handles the signals from these cones mimicking a cone effect in the model. And if it is thought that it is because of properties of the cones, some discussion of what those properties might be would be helpful. As I understand the RNL model, relative numbers of cones of each type are taken into account, so it isn't that. But could it be something as simple as higher photopigment density or larger entrance aperture (thus more quantum catches and higher SNR)?

      * Line 288 ff. The fact that the slopes of the psychometric functions differed across color directions is, I think, a failure of the RNL model to describe this aspect of the data, and tells us that a simple summary of what happens for thresholds at delta S = 1 does not generalize across color directions for other performance levels. Since one of the directions where the slope is shallower is the UV direction, this fact would seem to place serious limits on the claim that discrimination in the UV direction is enhanced relative to other directions, but it goes by here without comment along those lines. Some comment here, both about implications for fit of RNL model and about implications for generalizations about efficacy of UV receptor mediated discrimination and UV increment/decrement asymmetries, seems important.

      * Line 357 ff. Up until this point, all of the discussion of differences in threshold across stimulus sets has been in terms of sensitivity. Here the authors (correctly) raise the possibility that a difference in "preference" across stimulus sets could drive the difference in thresholds as measured. Although the discussion is interesting and germane, it does to some extent further undercut the security of conclusions about differential sensitivity across color directions relative to the RNL model predictions, and that should be brought out for the reader here. The authors might also discuss how a future experiment might differentiate between a preference explanation and a sensitivity explanation of threshold differences.

      * RNL model. The paper cites a lot of earlier work that used the RNL model, but I think many readers will not be familiar with it. A bit more descriptive prose would be helpful, and particularly noting that in the full dimensional receptor space, if the limiting noise at the photoreceptors is Gaussian, then the isothreshold contour will be a hyper-ellipsoid with its axes aligned with the receptor directions.

      * Use of cone isolating stimuli? For showing that all four cone classes contribute to what the authors call color discrimination, a more direct approach would seem to be to use stimuli that target stimulation of only one class of cone at a time. This might require a modified design in which the distractors and target were shown against a uniform background and approximately matched in their estimated effect on a putative achromatic mechanism. Did the authors consider this approach, and more generally could they discuss what they see as its advantages and disadvantages for future work.

    1. Author Response

      Reviewer #1 (Public Review):

      Precise regulation of gamete fusion ensures that offspring will have the same ploidy as the parents. However, breaking this regulation can be useful for plant breeding. Haploid induction followed by chemical-induced genome doubling can be used to fix desirable genotypes, while triparental hybrids where two sperm cells with two different genotypes fertilize an egg cell can be advantageous for bypassing hybridization barriers to create interspecies hybrids with increased fitness. This manuscript follows up on a previous study from the same research group that used a clever high throughput polyspermy detection assay (HIPOD) to show that wild-type Arabidopsis naturally forms triparental hybrids at very low frequencies (less than 0.05% of progeny) and that these triparental hybrids can bypass dosage barriers in the endosperm (Nakel, et al., 2017). Mao and co-authors hypothesized that mutants that conferred polytubey, the attraction of multiple pollen tubes by mutant female gametophytes, would also increase the rate of triparental hybrids. They used a double mutant in the endopeptidase genes ECS1 and ECS2 which had previously been reported to induce supernumerary pollen tube attraction to test this hypothesis with their two-component HIPOD system in which one pollen donor constitutively expresses the mGAL4-VP16 transcription factor while the second pollen donor carries an herbicide resistance gene regulated by the GAL4-responsive UAS promoter. Triparental hybrids are detected as herbicide-resistant progeny from wild-type Arabidopsis flowers that have been pollinated by the two paternal genotypes. The authors convincingly show that the ecs1 ecs2-1 double mutant more than doubled the frequency of triparental, triploid hybrids in HIPOD crosses. 
They next tested the hypothesis that this increase in triparental hybrids was due to a gametophytic effect by using an ecs1-/- ecs2-1/ECS2 maternal parent in the HIPOD assay and testing whether the ecs2-1 mutant allele was preferentially inherited in triparental hybrids. The mutant allele was inherited at a much higher rate than expected, confirming their hypothesis.

      The triparental hybrid results with the ecs1 ecs2 mutant were not that surprising since the presence of extra sperm cells gives more opportunities for triparental hybrids to form, especially if gamete fusion is misregulated. However, an unexpected result came when the authors used aniline blue staining to analyze the ecs1 ecs2 polytubey phenotype. They confirmed that the double mutant had increased levels of polytubey compared to wild-type ovules, but they also noticed that 13% of seeds were not developing normally. This phenotype was confirmed with a second ecs2 allele and was complemented with both ECS1 and ECS2 transgenes under their native promoters. Microscopic analysis revealed normal gametophyte morphology before fertilization, but 8% of pollinated ovules failed to develop an embryo and 7% failed to develop endosperm, suggesting single fertilization events. In a logical set of experiments, they followed up on this result by crossing ecs1 ecs2 with pollen carrying a fluorescent reporter that would be expressed in developing embryos and endosperm. In this experiment, they were again surprised. Some of the wild-type-looking seeds lacked a paternal contribution (i.e. no fluorescent signal from the paternal reporter construct) in the embryo. This prompted them to look more closely at the progeny, upon which they detected small plants that were haploid. They confirmed the haploid nature by chromosome spreads. Finally, they used interaccession crosses between ecs1 ecs2 (Col-0) and Landsberg to verify that haploid progeny only carried maternal alleles of markers on all five chromosomes, indicating that the ecs1 ecs2 genotype can induce maternal haploids.

      This interesting study highlights the importance of following up on unexpected results. The conclusions are well-supported by the data and quite exciting. Paternal haploid inducers have been discovered in several species, but this is one of only two examples of maternal haploid induction. While the percentage of maternal haploids is very low, this phenomenon could be useful for plant breeding.

      Weaknesses

      The data in the manuscript is intriguing, but the question of how the same mutant combination promotes the formation of both triploid and haploid progeny remains unanswered and is not thoroughly discussed, nor is any model suggested for how the ECS1/2 peptidases could play a role in regulating gamete fusion and/or repressing parthenogenesis. A second unanswered question is whether the maternal haploids are a result of failed plasmogamy or karyogamy between the egg and sperm leading to parthenogenesis or a result of paternal genome elimination after plasmogamy. In figure 3B, the authors attempted to test whether plasmogamy occurs between the male and female gametes in ecs1 ecs2 ovules by crosses with pollen that expresses a mitochondrial marker under control of the pRPS5a promoter which is active in sperm cells as well as embryos and endosperm of fertilized ovules. This experiment allowed them to detect sperm cells that had not fused with the egg and central cell at 2 days after pollination. They also counted the percentage of seeds that expressed the mitochondrial marker in both embryo and endosperm at 2 DAP and found that ecs1 ecs2 mutants had a 20% reduction of visible mitochondria in embryo sacs compared to wildtype. They conclude that the result indicates a potential plasmogamy defect. However, the dependability of this marker is questionable since only ~55% of wild-type seeds had detectable signal in the embryo and endosperm. The authors imply that this experiment could be used to test plasmogamy, but it is not clear how any conclusions related to the abnormal seed phenotype could be drawn from examining the rate of signal in both the embryo and endosperm. Since the mitochondrial marker was not expressed from a sperm-specific promoter, the fluorescent signal at 2DAP is likely due to new gene expression from pRPS5a in the fertilized embryo and endosperm, not an indication of the presence of sperm-derived mitochondria. 
Perhaps an earlier timepoint could be used as well as a sperm-specific promoter instead of pRPS5a to answer the question of whether plasmogamy is happening in the ecs1 ecs2 ovules.

      Thanks for the suggestion. Here we provide two new data sets as evidence that ecs1 ecs2 mutant plants indeed exhibit single fertilization that leads to fertilization recovery.

      We assessed fertilization failure by checking the decondensation of HTR10-RFP-labelled sperm nuclei at 8-10 HAP (Figure 3B), and determined the frequency of heterofertilization through a dual pollination experiment (Figure 3C-E) (see above).

      Reviewer #2 (Public Review):

      The manuscript reports the triploid and haploid productions using an ecs1ecs2 mutant as the maternal donor, in addition to the evaluation of the sexual process observed in the mutant. The indicated data show exquisite quality. To improve the content, I recommend carefully reconsidering the descriptions because some of the insights would cause a stir in the controversy regarding ECS1&2 functions in plant reproduction.

      Strengths

      Triploid production by a combination of ecs1ecs2 mutant and HIPOD system has potential as a future plant breeding tool. Moreover, it's intriguing that both triploid and haploid productions were achieved using the same mutant as a maternal donor. I think authors can claim the value of their results more by adding descriptions about the usefulness of the aneuploid plants in plant breeding history.

      The evidence of the persistent synergid nucleus (Figure 3A) is critical insight reported by this study. As Maruyama et al. (2013) reported by live cell imaging, synergid-endosperm fusion had occurred at the two endosperm nuclei stage. It would be valuable to claim the observed fact by citing Maruyama's previous observation.

      Weakness

      As the authors suggested, the higher triploid frequency observed in ecs1ecs2 than in WT was likely caused by increased polyspermy. However, it could also be that a reduction in the number of normal seeds in ecs1ecs2 (whether due to failure of fertilization or to embryo development arrest) accounts for the increased frequency of triploids compared to WT.

      The results in Figure 3C-E suggested the single fertilization for both egg and central cells at similar frequencies. This is an exciting result, but it is still possible that the fertilized egg or central cell degenerated after fertilization resulting in the disappearance of paternally inherited fluorescence. Evaluation of fertilization patterns at 7-10HAP in ecs1ecs2 mutant may provide more confident insight, although unfused sperm cell was evaluated at 1DAP (Figure 3-figure supplement 1B). The fertilization states can be distinguished depending on the HTR10RFP sperm nuclei morphology and their positions, as reported by Takahashi et al (2018).

      Thank you for your suggestion. We added the requested experiment (see Figure 3B in the revised manuscript). In addition, we conducted a dual pollination experiment, which provides evidence for the activation of the fertilization recovery machinery (Figure 3C-E) (see above).

      Several recent studies have reported exciting insights on ECS1&2 functions; however, various results from different laboratories have raised controversy. Though, the commonly found feature is the repression of polytubey. For readers, it would be helpful to organize the explanation about which insights are concordant or different.

      Thank you for your suggestion. We now use terms such as "in line with" or "in contrast to" to indicate where our data confirm or contradict previous data.

      In addition, a drawing that explains the time course in the process from pollination to seed development (up to 6DAP) based on WT would help to understand which point is evaluated in each data.

      Thank you for your suggestion. We added a model figure (Figure 4E) at the end of the manuscript that brings the concepts together and facilitates understanding.

      Reviewer #3 (Public Review):

      In this manuscript, Mao et al. reported that the two proteases ECS1 and ECS2 participate in both polyspermy block and gamete fusion in Arabidopsis thaliana. The authors could observe polytubey phenotype which has been reported previously and obtain both triparental plants and haploids in ecs1 ecs2 mutants. Therefore, they proposed that the triparental plants resulted from the polytubey block defect, whereas the haploids were caused by the gamete fusion defect. Together with two other previous reports, I think it is very interesting to see these two proteases participating in so many different but connected processes. Although they did not provide the molecular mechanism of how ECS participated in polyspermy block and gamete fusion, their findings provide more options for and thus promote plant breeding. The work may have a wide application in the future and will be of broad interest to cell biologists working on gamete fusion and plant breeders.

      We thank the reviewer for their positive comments.

      Although most of the conclusions in this paper are well supported by the data, it could be improved with a minor revision including providing clearer data analysis and descriptions, images with higher resolution, and more discussions.

    1. Author Response

      Reviewer #2 (Public Review):

      Here, a simple model of cerebellar computation is used to study the dependence of task performance on input type: it is demonstrated that task performance and optimal representations are highly dependent on task and stimulus type. This challenges many standard models which use simple random stimuli and concludes that the granular layer is required to provide a sparse representation. This is a useful contribution to our understanding of cerebellar circuits, though, in common with many models of this type, the neural dynamics and circuit architecture are not very specific to the cerebellum, the model includes the feedforward structure and the high dimension of the granule layer, but little else. This paper has the virtue of including tasks that are more realistic, but by the paper’s own admission, the same model can be applied to the electrosensory lateral line lobe and it could, though it is not mentioned in the paper, be applied to the dentate gyrus and large pyramidal cells of CA3. The discussion does not include specific elements related to, for example, the dynamics of the Purkinje cells or the role of Golgi cells, and, in a way, the demonstration that the model can encompass different tasks and stimuli types is an indication of how abstract the model is. Nonetheless, it is useful and interesting to see a generalization of what has become a standard paradigm for discussing cerebellar function.

      We appreciate the Reviewer’s positive comments. Regarding the simplifications of our model, we agree that we have taken a modeling approach that abstracts away certain details to permit comparisons across systems. We now include an in-depth discussion of our simplifying assumptions (Assumptions & Extensions section in the Discussion) and have further noted the possibility that other biophysical mechanisms we have not accounted for may also underlie differences across systems.

      Our results predict that qualitative differences in the coding levels of cerebellum-like systems, across brain regions or across species, reflect an optimization to distinct tasks (Figure 7). However, it is also possible that differences in coding level arise from other physiological differences between systems.

      Reviewer #3 (Public Review):

      1) The paper by Xie et al is a modelling study of the mossy fiber-to-granule cell-to-Purkinje cell network, reporting that the optimal type of representations in the cerebellar granule cell layer depends on the type of task. The paper stresses that the findings indicate a higher overall bias towards dense representations than stated in the literature, but it appears the authors have missed parts of the literature that already reported on this. While the modelling and analysis appear mathematically solid, the model is lacking many known constraints of the cerebellar circuitry, which makes the applicability of the findings to the biological counterpart somewhat limited.

      We thank the Reviewer for suggesting additional references to include in our manuscript, and for encouraging us to extend our model toward greater biological plausibility and more critically discuss simplifying assumptions we have made. We respond to both the comment about previous literature and about applicability to cerebellar circuitry in detail below.

      2) I have some concerns with the novelty of the main conclusion, here from the abstract: ’Here, we generalize theories of cerebellar learning to determine the optimal granule cell representation for tasks beyond random stimulus discrimination, including continuous input-output transformations as required for smooth motor control. We show that for such tasks, the optimal granule cell representation is substantially denser than predicted by classic theories.’ Stated like this, this has in principle already been shown, i.e. for example: Spanne and Jörntell (2013) Processing of multi-dimensional sensorimotor information in the spinal and cerebellar neuronal circuitry: a new hypothesis. PLoS Comput Biol. 9(3):e1002979. Indeed, even the 2 DoF arm movement control that is used in the present paper as an application, was used in this previous paper, with similar conclusions with respect to the advantage of continuous input-output transformations and dense coding. Thus, already from the beginning of this paper, the novelty aspect of this paper is questionable. Even the conclusion in the last paragraph of the Introduction: ‘We show that, when learning input-output mappings for motor control tasks, the optimal granule cell representation is much denser than predicted by previous analyses.’ was in principle already shown by this previous paper.

      We thank the Reviewer for drawing our attention to Spanne and Jörntell (2013). Our study shares certain similarities with this work, including the consideration of tasks with smooth input-output mappings, such as learning the dynamics of a two-joint arm. However, our study differs substantially, most notably in that we parametrically vary the degree of sparsity in the granule cell layer to determine the circumstances under which dense versus sparse coding is optimal. Despite a careful reading, we can find no result in Spanne and Jörntell (2013) that reports the performance of a network as a function of average coding level. Instead, Spanne and Jörntell (2013) propose that inhibition from Golgi cells produces heterogeneity in coding level which can improve performance, an interesting finding that is complementary to ours. We therefore do not believe that the quantitative computations of optimal coding level that we present are redundant with the results of this previous study. We also note that a key contribution of our study is a mathematical analysis of the inductive bias of networks with different coding levels, which supports our conclusions.

      We have included a discussion of Spanne and Jörntell (2013) and (2015) in the revised version of our manuscript:

      "Other studies have considered tasks with smooth input-output mappings and low-dimensional inputs, finding that heterogeneous Golgi cell inhibition can improve performance by diversifying individual granule cell thresholds (Spanne and Jörntell, 2013). Extending our model to include heterogeneous thresholds is an interesting direction for future work. Another proposal states that dense coding may improve generalization (Spanne and Jörntell, 2015). Our theory reveals that whether or not dense coding is beneficial depends on the task."

      3) However, the present paper does add several more specific investigations/characterizations that were not previously explored. Many of the main figures report interesting new model results. However, the model is implemented in a highly generic fashion. Consequently, the model relates better to general neural network theory than to specific interpretations of the function of the cerebellar neuronal circuitry. One good example is the findings reported in Figure 2. These represent an interesting extension to the main conclusion, but they are also partly based on arbitrariness as the type of mossy fiber input described in the random categorization task has not been observed in the mammalian cerebellum under behavior in vivo, whereas in contrast, the type of input for the motor control task does resemble mossy fiber input recorded under behavior (van Kan et al 1993).

      We agree that the tasks we consider in Figure 2 are simplified compared to those that we consider elsewhere in the paper. The choice of random mossy fiber input was made to provide a comparison to previous modeling studies that also use random input as a benchmark (Marr 1969, Albus 1971, Brunel 2004, Babadi and Sompolinsky 2014, Billings 2014, Litwin-Kumar et al., 2017). This baseline permits us to specifically evaluate the effects of low-dimensional inputs (Figure 2) and richer input-output mappings (Figure 2, Figure 7). We agree with the Reviewer that the random and uncorrelated mossy fiber activity that has been extensively used in previous studies is almost certainly an unrealistic idealization of in vivo neural activity; this is a motivating factor for our study, which relaxes this assumption and examines the consequences. To provide additional context, we have updated the following paragraph in the main text Results section:

      "A typical assumption in computational theories of the cerebellar cortex is that inputs are randomly distributed in a high-dimensional space (Marr, 1969; Albus, 1971; Brunel et al., 2004; Babadi and Sompolinsky, 2014; Billings et al., 2014; Litwin-Kumar et al., 2017). While this may be a reasonable simplification in some cases, many tasks, including cerebellum-dependent tasks, are likely best-described as being encoded by a low-dimensional set of variables. For example, the cerebellum is often hypothesized to learn a forward model for motor control (Wolpert et al., 1998), which uses sensory input and motor efference to predict an effector’s future state. Mossy fiber activity recorded in monkeys correlates with position and velocity during natural movement (van Kan et al., 1993). Sources of motor efference copies include motor cortex, whose population activity lies on a low-dimensional manifold (Wagner et al., 2019; Huang et al., 2013; Churchland et al., 2010; Yu et al., 2009). We begin by modeling the low dimensionality of inputs and later consider more specific tasks."

      4) The overall conclusion states: ‘Our results....suggest that optimal cerebellar representations are task-dependent.’ This is not a particularly strong or specific conclusion. One could interpret this statement as simply saying: ‘if I construct an arbitrary neural network, with arbitrary intrinsic properties in neurons and synapses, I can get outputs that depend on the intensity of the input that I provide to that network.’ Further, the last sentence of the Introduction states: ‘More broadly, we show that the sparsity of a neural code has a task-dependent influence on learning...’ This is very general and unspecific, and would likely not come as a surprise to anyone interested in the analysis of neural networks. It doesn’t pinpoint any specific biological problem but just says that if I change the density of the input to a [generic] network, then the learning will be impacted in one way or another.

      We agree with the Reviewer that our conclusions are quite general, and we have removed the final sentence as we agree it was unspecific. However, we disagree with the Reviewer’s paraphrasing of our results.

      First, we do not select arbitrary intrinsic properties of neurons and synapses. Rather, we construct a simplified model with a key quantity, the neuronal threshold, that we vary parametrically in order to assess the effect of the resulting changes in the representation on performance. Second, we do not vary the intensity/density of inputs provided to the network – this is fixed throughout our study for all key comparisons we perform. Instead, we vary the density (coding level) of the expansion layer representation and quantify its effect on inductive bias and generalization. Finally, our study’s key contribution is an explanation of the heterogeneity in average coding level observed across behaviors and cerebellum-like systems. We go beyond the empirical statement that there is a dependence of performance on the parameter that we vary by developing an analytical theory. Our theory describes the performance of the class of networks that we study and the properties of learning tasks that determine the optimal expansion layer representation.
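
      The distinction drawn here, between the fixed input statistics and the coding level set by the threshold, can be illustrated with a toy example. The following is a minimal sketch in plain Python, not the code used in the study: it assumes a rectified-linear expansion layer with one shared scalar threshold, and the names `J_eff`, `expansion_layer`, `coding_level` and the sizes `D`, `M`, `P` are hypothetical choices made only for illustration.

```python
import random

def expansion_layer(J_eff, x, theta):
    """Rectified-linear layer: h_i = max(sum_j J_eff[i][j] * x[j] - theta, 0)."""
    return [max(sum(w * xj for w, xj in zip(row, x)) - theta, 0.0) for row in J_eff]

def coding_level(J_eff, inputs, theta):
    """Average fraction of expansion-layer units that are active (h > 0)."""
    active = total = 0
    for x in inputs:
        h = expansion_layer(J_eff, x, theta)
        active += sum(1 for v in h if v > 0.0)
        total += len(h)
    return active / total

rng = random.Random(0)
D, M, P = 3, 200, 100  # hypothetical sizes: input dimension, layer size, number of patterns
J_eff = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(M)]
inputs = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(P)]

# The inputs stay fixed; only the shared threshold changes.
levels = {theta: coding_level(J_eff, inputs, theta) for theta in (0.0, 1.0, 2.0)}
for theta, f in levels.items():
    print(f"theta={theta:.1f}  coding level f={f:.3f}")
```

      In this sketch, raising the threshold can only silence units, so the coding level cannot increase even though the inputs themselves are unchanged.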

      To clarify our main contributions, we have updated the final paragraph of the Introduction. We have also removed the sentence that the Reviewer objects to, as it was less specific than the other points we make here.

      "We propose that these differences can be explained by the capacity of representations with different levels of sparsity to support learning of different tasks. We show that the optimal level of sparsity depends on the structure of the input-output relationship of a task. When learning input-output mappings for motor control tasks, the optimal granule cell representation is much denser than predicted by previous analyses. To explain this result, we develop an analytic theory that predicts the performance of cerebellum-like circuits for arbitrary learning tasks. The theory describes how properties of cerebellar architecture and activity control these networks’ inductive bias: the tendency of a network toward learning particular types of input-output mappings (Sollich, 1998; Jacot et al., 2018; Bordelon et al., 2020; Canatar et al., 2021; Simon et al., 2021). The theory shows that inductive bias, rather than the dimension of the representation alone, is necessary to explain learning performance across tasks. It also suggests that cerebellar regions specialized for different functions may adjust the sparsity of their granule cell representations depending on the task."

      5) The interpretation of the distribution of the mossy fiber inputs to the granule cells, which would have a crucial impact on the results of a study like this, is likely incorrect. First, unlike the papers that the authors cite, there are many studies indicating that there is a topographic organization in the mossy fiber termination, such that mossy fibers from the same inputs, representing similar types of information, are regionally co-localized in the granule cell layer. Hence, there is no support for the model assumption that there is a predominantly random termination of mossy fibers of different origins. This risks invalidating the comparisons that the authors are making, i.e. such as in Figure 3. This is a list of example papers, there are more: van Kan, Gibson and Houk (1993) Movement-related inputs to intermediate cerebellum of the monkey. Journal of Neurophysiology. Garwicz et al (1998) Cutaneous receptive fields and topography of mossy fibres and climbing fibres projecting to cat cerebellar C3 zone. The Journal of Physiology. Brown and Bower (2001) Congruence of mossy fiber and climbing fiber tactile projections in the lateral hemispheres of the rat cerebellum. The Journal of Comparative Neurology. Na, Sugihara, Shinoda (2019) The entire trajectories of single pontocerebellar axons and their lobular and longitudinal terminal distribution patterns in multiple aldolase C-positive compartments of the rat cerebellar cortex. The Journal of Comparative Neurology.

      6) The nature of the mossy fiber-granule cell recording is also reviewed here: Gilbert and Miall (2022) How and Why the Cerebellum Recodes Input Signals: An Alternative to Machine Learning. The Neuroscientist. Further, considering the re-coding idea, the following paper shows that detailed information, as it is provided by mossy fibers, is transmitted through the granule cells without any evidence of re-coding: Jörntell and Ekerot (2006) Journal of Neuroscience; and this paper shows that these granule inputs are powerfully transmitted to the molecular layer even in a decerebrated animal (i.e. where only the ascending sensory pathways remain): Jörntell and Ekerot 2002, Neuron.

      We agree that there is strong evidence for a topographic organization in mossy fiber to granule cell connectivity at the microzonal level. We thank the Reviewer for pointing us to specific examples. We acknowledge that our simplified model does not capture the structure of connectivity observed in these studies.

      However, the focus of our model is on cerebellar neurons presynaptic to a single Purkinje cell. Random or disordered distribution of inputs at this local scale is compatible with topographic organization at the microzonal scale. Furthermore, while there is evidence of structured connections at the local scale, models with random connectivity are able to reproduce the dimensionality of granule cell activity within a small margin of error (Nguyen et al., 2022). Finally, our finding that dense codes are optimal for learning slowly varying tasks is consistent with evidence for the lack of re-coding: for such tasks, re-coding may be absent because it is not required.

      We have dedicated a section to this issue in the Assumptions and Extensions portion of our Discussion:

      "Another key assumption concerning the granule cells is that they sample mossy fiber inputs randomly, as is typically assumed in Marr-Albus models (Marr, 1969; Albus, 1971; Litwin-Kumar et al., 2017; Cayco-Gajic et al., 2017). Other studies instead argue that granule cells sample from mossy fibers with highly similar receptive fields (Garwicz et al., 1998; Brown and Bower, 2001; Jörntell and Ekerot, 2006) defined by the tuning of mossy fiber and climbing fiber inputs to cerebellar microzones (Apps et al., 2018). This has led to an alternative hypothesis that granule cells serve to relay similarly tuned mossy fiber inputs and enhance their signal-to-noise ratio (Jörntell and Ekerot, 2006; Gilbert and Miall, 2022) rather than to re-encode inputs. Another hypothesis is that granule cells enable Purkinje cells to learn piece-wise linear approximations of nonlinear functions (Spanne and Jörntell, 2013). However, several recent studies support the existence of heterogeneous connectivity and selectivity of granule cells to multiple distinct inputs at the local scale (Huang et al., 2013; Ishikawa et al., 2015). Furthermore, the deviation of the predicted dimension in models constrained by electron-microscopy data as compared to randomly wired models is modest (Nguyen et al., 2022). Thus, topographically organized connectivity at the macroscopic scale may coexist with disordered connectivity at the local scale, allowing granule cells presynaptic to an individual Purkinje cell to sample heterogeneous combinations of the subset of sensorimotor signals relevant to the tasks that Purkinje cell participates in. Finally, we note that the optimality of dense codes for learning slowly varying tasks in our theory suggests that observations of a lack of mixing (Jörntell and Ekerot, 2002) for such tasks are compatible with Marr-Albus models, as in this case nonlinear mixing is not required."

      7) I could not find any description of the neuron model used in this paper, so I assume that the neurons are just modelled as linear summators with a threshold (in fact, Figure 5 mentions inhibition, but this appears to be just one big lump inhibition, which basically is an incorrect implementation). In reality, granule cells of course do have specific properties that can impact the input-output transformation, PARTICULARLY with respect to the comparison of sparse versus dense coding, because the low-pass filtering of input that occurs in granule cells (and other neurons) as well as their spike firing stochasticity (Saarinen et al (2008). Stochastic differential equation model for cerebellar granule cell excitability. PLoS Comput. Biol. 4:e1000004) will profoundly complicate these comparisons and make them less straight forward than what is portrayed in this paper. There are also several other factors that would be present in the biological setting but are lacking here, which makes it doubtful how much information in relation to the biological performance that this modelling study provides: What are the types of activity patterns of the inputs? What are the learning rules? What is the topography? What is the impact of Purkinje cell outputs downstream, as the Purkinje cell output does not have any direct action, it acts on the deep cerebellar nuclear neurons, which in turn act on a complex sensorimotor circuitry to exert their effect, hence predictive coding could only become interpretable after the PC output has been added to the activity in those circuits. Where is the differentiated Golgi cell inhibition?

      Thank you for these critiques. We have made numerous edits to improve the presentation of the details of our model in the main text of the manuscript. Indeed, granule cells in the main text are modeled as linear sums of mossy fiber inputs with a threshold-linear activation function. A more detailed description of the model for granule cells can now be found in Equation 1 in the Results section:

      "The activity of neurons in the expansion layer is given by $h = \varphi(J^{\mathrm{eff}} x - \theta)$ (Equation 1), where $\varphi$ is a rectified linear activation function, $\varphi(u) = \max(u, 0)$, applied element-wise. Our results also hold for other threshold-polynomial activation functions. The scalar threshold $\theta$ is shared across neurons and controls the coding level, which we denote by $f$, defined as the average fraction of neurons in the expansion layer that are active."

      Most of our analyses use the firing rate model we describe above, but several Supplemental Figures show extensions to this model. As we mention in the Discussion, our results do not depend on the specific choice of nonlinearity (Figure 2-figure supplement 2). We have also considered the possibility that the stochastic nature of granule cell spikes could impact our measures of coding level. In Figure 7-figure supplement 1 we test the robustness of our main conclusion using a spiking model where we model granule cell spikes with Poisson statistics. When measuring coding level in a population of spiking neurons, a key question is at what time window the Purkinje cell integrates spikes. For several choices of integration time windows, we show that dense coding remains optimal for learning smooth tasks. However, we agree with the Reviewer that there are other biological details our model does not address. For example, our spiking model does not capture some of the properties the Saarinen et al. (2008) model captures, including random sub-threshold oscillations and clusters of spikes. Modeling biophysical phenomena at this scale is beyond the scope of our study. We have added this reference to the relevant section of the Discussion:

      "We also note that coding level is most easily defined when neurons are modeled as rate, rather than spiking units. To investigate the consistency of our results under a spiking code, we implemented a model in which granule cell spiking exhibits Poisson variability and quantify coding level as the fraction of neurons that have nonzero spike counts (Figure 7-figure supplement 1; Figure 7C). In general, increased spike count leads to improved performance as noise associated with spiking variability is reduced. Granule cells have been shown to exhibit reliable burst responses to mossy fiber stimulation (Chadderton et al., 2004), motivating models using deterministic responses or sub-Poisson spiking variability. However, further work is needed to quantitatively compare variability in model and experiment and to account for more complex biophysical properties of granule cells (Saarinen et al., 2008)."

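      The spiking measure of coding level described above (the fraction of neurons with a nonzero spike count in an integration window) can be sketched as follows. This is an illustrative toy in plain Python, not the model used for Figure 7-figure supplement 1; the firing rates, window lengths, and the function names `poisson_sample` and `spiking_coding_level` are assumptions made for the example.

```python
import math
import random

def poisson_sample(lam, rng):
    """Draw one Poisson count with mean lam (Knuth's method; suited to small lam)."""
    if lam <= 0.0:
        return 0
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def spiking_coding_level(rates, window, rng):
    """Fraction of neurons that emit at least one spike within the window."""
    counts = [poisson_sample(r * window, rng) for r in rates]
    return sum(1 for c in counts if c > 0) / len(counts)

rng = random.Random(0)
rates = [rng.uniform(0.0, 5.0) for _ in range(500)]  # hypothetical rates in Hz
for window in (0.05, 0.2, 1.0):                      # hypothetical windows in seconds
    f = spiking_coding_level(rates, window, rng)
    print(f"window={window:.2f} s  coding level={f:.3f}")
```

      For a neuron with rate r and window T, the probability of at least one spike is 1 - exp(-rT), so the measured coding level depends on the integration window as well as on the underlying rates.
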
      A second concern the Reviewer raises is our implementation of Golgi cell inhibition as a homogeneous rather than heterogeneous input onto granule cells. In simplified models, adding heterogeneous inhibition does not dramatically change the qualitative properties of the expansion layer representation, in particular the dimensionality of the representation (Billings et al., 2014, Cayco-Gajic et al., 2017, Litwin-Kumar et al., 2017). We have added a section about inhibition to our Discussion:

      "We also have not explicitly modeled inhibitory input provided by Golgi cells, instead assuming such input can be modeled as a change in effective threshold, as in previous studies (Billings et al., 2014; Cayco-Gajic et al., 2017; Litwin-Kumar et al., 2017). This is appropriate when considering the dimension of the granule cell representation (Litwin-Kumar et al., 2017), but more work is needed to extend our model to the case of heterogeneous inhibition."

      Regarding the mossy fiber inputs, as we state in response to paragraph 3, we agree with the Reviewer that the random and uncorrelated mossy fiber activity that has been used in previous studies is an unrealistic idealization of in vivo neural activity. One of the motivations for our model was to relax this assumption and examine the consequences: we introduce correlations in the mossy fiber activity by projecting low-dimensional patterns into the mossy fiber layer (Figure 1B):

      "A typical assumption in computational theories of the cerebellar cortex is that inputs are randomly distributed in a high-dimensional space (Marr, 1969; Albus, 1971; Brunel et al., 2004; Babadi and Sompolinsky, 2014; Billings et al., 2014; Litwin-Kumar et al., 2017). While this may be a reasonable simplification in some cases, many tasks, including cerebellum-dependent tasks, are likely best-described as being encoded by a low-dimensional set of variables. For example, the cerebellum is often hypothesized to learn a forward model for motor control (Wolpert et al., 1998), which uses sensory input and motor efference to predict an effector’s future state. Mossy fiber activity recorded in monkeys correlates with position and velocity during natural movement (van Kan et al., 1993). Sources of motor efference copies include motor cortex, whose population activity lies on a low-dimensional manifold (Wagner et al., 2019; Huang et al., 2013; Churchland et al., 2010; Yu et al., 2009). We begin by modeling the low dimensionality of inputs and later consider more specific tasks.

      We therefore assume that the inputs to our model lie on a D-dimensional subspace embedded in the N-dimensional input space, where D is typically much smaller than N (Figure 1B). We refer to this subspace as the “task subspace” (Figure 1C)."
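
      The idea of a D-dimensional task subspace embedded in an N-dimensional input space can be illustrated with a short sketch. It assumes a simple linear embedding x = A z with a random matrix A; the embedding matrix, the Gaussian sampling, and the names `embed`, `A`, `Z`, `X` are hypothetical and chosen only for illustration, not taken from the manuscript.

```python
import random

def embed(A, z):
    """Map a D-dimensional task variable z to an N-dimensional input x = A z."""
    return [sum(a * zj for a, zj in zip(row, z)) for row in A]

rng = random.Random(0)
N, D, P = 50, 3, 10  # hypothetical sizes: input units, task dimension, patterns
A = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(N)]  # N x D embedding
Z = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(P)]  # task variables
X = [embed(A, z) for z in Z]                                     # input patterns

print(f"{len(X)} patterns, each with {len(X[0])} input units, spanning at most {D} dimensions")
```

      Every pattern produced this way is a linear combination of the D columns of `A`, so activity across the N input units is correlated even though each unit looks noisy on its own.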

      The Reviewer also mentions the learning rule at granule cell to Purkinje cell synapses. We agree that considering online, climbing-fiber-dependent learning is an important generalization. We therefore added a new supplemental figure investigating whether we would still see a difference in optimal coding levels across tasks if online learning were used instead of the least squares solution (Figure 7-figure supplement 2). Indeed, we observed a similar task dependence as we saw in Figure 2F. We have added a new paragraph in the Discussion under Assumptions and Extensions describing our rationale and approach in detail:

      "For the Purkinje cells, our model assumes that their responses to granule cell input can be modeled as an optimal linear readout. Our model therefore provides an upper bound to linear readout performance, a standard benchmark for the quality of a neural representation that does not require assumptions on the nature of climbing fiber-mediated plasticity, which is still debated. Electrophysiological studies have argued in favor of a linear approximation (Brunel et al., 2004). To improve the biological applicability of our model, we implemented an online climbing fiber-mediated learning rule and found that optimal coding levels are still task-dependent (Figure 7-figure supplement 2). We also note that although we model several timing-dependent tasks (Figure 7), our learning rule does not exploit temporal information, and we assume that temporal dynamics of granule cell responses are largely inherited from mossy fibers. Integrating temporal information into our model is an interesting direction for future investigation."

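      To make the contrast between a batch least-squares readout and online learning concrete, the sketch below trains a linear readout with a generic delta (least-mean-squares) rule. This is a stand-in chosen for illustration and is not claimed to be the climbing fiber-mediated rule used in Figure 7-figure supplement 2; the function names, learning rate, and toy data are all assumptions.

```python
def predict(w, h):
    """Linear readout of expansion-layer activity h with weights w."""
    return sum(wi * hi for wi, hi in zip(w, h))

def lms_train(H, y, lr=0.1, epochs=200):
    """Online delta rule: after each sample, w += lr * (target - prediction) * h."""
    w = [0.0] * len(H[0])
    for _ in range(epochs):
        for h, target in zip(H, y):
            err = target - predict(w, h)  # stands in for an error signal (assumption)
            w = [wi + lr * err * hi for wi, hi in zip(w, h)]
    return w

def mse(H, y, w):
    """Mean squared error of the readout on the given samples."""
    return sum((t - predict(w, h)) ** 2 for h, t in zip(H, y)) / len(y)

# Toy data whose targets are exactly linear in h (true weights 1 and 2).
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
w = lms_train(H, y)
print("weights:", [round(wi, 4) for wi in w], " mse:", mse(H, y, w))
```

      On targets that a linear readout can fit exactly, the online updates approach the same weights that a least-squares fit would give; the two procedures differ in how the weights are reached rather than in the form of the readout.
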
      Finally, regarding the function of the Purkinje cell, our model defines a learning task as a mapping from inputs to target activity in the Purkinje cell and is thus agnostic to the cell’s downstream effects. We clarify this point when introducing the definition of a learning task:

      "In our model, a learning task is defined by a mapping from task variables x to an output f(x), representing a target change in activity of a readout neuron, for example a Purkinje cell. The limited scope of this definition implies our results should not strongly depend on the influence of the readout neuron on downstream circuits."

      8) The problem of these, in my impression, generic, arbitrary settings of the neurons and the network in the model becomes obvious here: ‘In contrast to the dense activity in cerebellar granule cells, odor responses in Kenyon cells, the analogs of granule cells in the Drosophila mushroom body, are sparse...’ How can this system be interpreted as an analogy to granule cells in the mammalian cerebellum when the model does not address the specifics lined up above? I.e. the ‘inductive bias’ that the authors speak of, defined as ‘the tendency of a network toward learning particular types of input-output mappings’, would be highly dependent on the specifics of the network model.

      We agree with the Reviewer that our model makes several simplifying assumptions for mathematical tractability. However, we note that our study is not the first to draw analogies between cerebellum-like systems, including the mushroom body (Bell et al., 2008; Farris, 2011). All the systems we study feature a sparsely connected, expanded granule-like layer that sends parallel fiber axons onto densely connected downstream neurons known to exhibit powerful synaptic plasticity, thus motivating the key architectural assumptions of our model. We have constrained anatomical parameters of the model using data as available (Table 1). However, we agree with the Reviewer that when making comparisons across species there is always a possibility that differences are due to physiological mechanisms we have not fully understood or captured with a model. As such, we can only present a hypothesis for these differences. We have modified our Discussion section on this topic to clearly state this.

      "Our results predict that qualitative differences in the coding levels of cerebellum-like systems, across brain regions or across species, reflect an optimization to distinct tasks (Figure 7). However, it is also possible that differences in coding level arise from other physiological differences between systems."

      9) More detailed comments: Abstract: ‘In these models [Marr-Albus], granule cells form a sparse, combinatorial encoding of diverse sensorimotor inputs. Such sparse representations are optimal for learning to discriminate random stimuli.’ Yes, I would agree with the first part, but I contest the second part of this statement. I think what is true for sparse coding is that the learning of random stimuli will be faster, as in a perceptron, but not necessarily better. As the sparsification essentially removes information, it could be argued that the quality of the learning is poorer. So from that perspective, it is not optimal. The authors need to specify from what perspective they consider sparse representations optimal for learning.

      This is an important point that we would like to clarify. It is not the case that sparse coding simply speeds up learning. In our study and many related works (Barak et al. 2013; Babadi and Sompolinsky 2014; Litwin-Kumar et al. 2017), learning performance is measured based on the generalization ability of the network – the ability to predict correct labels for previously unseen inputs. As our study and previous studies show, sparse codes are optimal in the sense that they minimize generalization error, independent of any effect on learning speed. To communicate this more effectively, we have added the following sentence to the first paragraph of the Introduction:

      "Sparsity affects both learning speed (Cayco-Gajic et al., 2017), and generalization, the ability to predict correct labels for previously unseen inputs (Barak et al., 2013; Babadi and Sompolinsky, 2014; Litwin-Kumar et al., 2017)."

      10) Introduction: ‘Indeed, several recent studies have reported dense activity in cerebellar granule cells in response to sensory stimulation or during motor control tasks (Knogler et al., 2017; Wagner et al., 2017; Giovannucci et al., 2017; Badura and De Zeeuw, 2017; Wagner et al., 2019), at odds with classic theories (Marr, 1969; Albus, 1971).’ In fact, this was precisely the issue that was addressed already by Jörntell and Ekerot (2006) Journal of Neuroscience. The conclusion was that these actual recordings of granule cells in vivo provided essentially no support for the assumptions in the Marr-Albus theories.

      In our reading, the main finding of Jörntell and Ekerot (2006) is that individual granule cells are activated by mossy fibers with overlapping receptive fields driven by a single type of somatosensory input. However, there is also evidence of nonlinear mixed selectivity in granule cells in support of the re-coding hypothesis (Huang et al., 2013; Ishikawa et al., 2015). Jörntell and Ekerot (2006) also suggest that the granule cell layer shares similar topographic organization as mossy fibers, organized into microzones. The existence of topographic organization does not invalidate Marr-Albus theories. As we have suggested earlier, a local combinatorial expansion can coexist with a global topographic organization.

      We have described these considerations in the Assumptions and Extensions portion of the Discussion:

      "Another key assumption concerning the granule cells is that they sample mossy fiber inputs randomly, as is typically assumed in Marr-Albus models (Marr, 1969; Albus, 1971; Litwin-Kumar et al., 2017; Cayco-Gajic et al., 2017). Other studies instead argue that granule cells sample from mossy fibers with highly similar receptive fields (Garwicz et al., 1998; Brown and Bower, 2001; Jörntell and Ekerot, 2006) defined by the tuning of mossy fiber and climbing fiber inputs to cerebellar microzones (Apps et al., 2018). This has led to an alternative hypothesis that granule cells serve to relay similarly tuned mossy fiber inputs and enhance their signal-to-noise ratio (Jörntell and Ekerot, 2006; Gilbert and Chris Miall, 2022) rather than to re-encode inputs. Another hypothesis is that granule cells enable Purkinje cells to learn piece-wise linear approximations of nonlinear functions (Spanne and Jörntell, 2013). However, several recent studies support the existence of heterogeneous connectivity and selectivity of granule cells to multiple distinct inputs at the local scale (Huang et al., 2013; Ishikawa et al., 2015). Furthermore, the deviation of the predicted dimension in models constrained by electron-microscopy data as compared to randomly wired models is modest (Nguyen et al., 2022). Thus, topographically organized connectivity at the macroscopic scale may coexist with disordered connectivity at the local scale, allowing granule cells presynaptic to an individual Purkinje cell to sample heterogeneous combinations of the subset of sensorimotor signals relevant to the tasks that Purkinje cell participates in. Finally, we note that the optimality of dense codes for learning slowly varying tasks in our theory suggests that observations of a lack of mixing (Jörntell and Ekerot, 2002) for such tasks are compatible with Marr-Albus models, as in this case nonlinear mixing is not required."

      We have also included the Jörntell and Ekerot (2006) study as a citation in the Introduction:

      "Indeed, several recent studies have reported dense activity in cerebellar granule cells in response to sensory stimulation or during motor control tasks (Jörntell and Ekerot, 2006; Knogler et al., 2017; Wagner et al., 2017; Giovannucci et al., 2017; Badura and De Zeeuw, 2017; Wagner et al., 2019), at odds with classic theories (Marr, 1969; Albus, 1971)."

      11) Results: 1st para: There is no information about how the granule cells are modelled.

      We agree that this information should have been more readily available. We now more completely describe the model in the main text. Our model for granule cells can be found in Equation 1 in the Results section and also the Methods (Network Model):

      "The activity of neurons in the expansion layer is given by: h = φ(J^eff x − θ), (2)

      where φ is a rectified linear activation function φ(u) = max(u,0) applied element-wise. Our results also hold for other threshold-polynomial activation functions. The scalar threshold θ is shared across neurons and controls the coding level, which we denote by f, defined as the average fraction of neurons in the expansion layer that are active."
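
      As a rough illustration of the quoted equation (a minimal sketch, not the code used in the study), the snippet below applies a rectified linear expansion layer and measures how the shared threshold θ sets the coding level f. The dense Gaussian weight matrix `J`, the layer sizes, and the two threshold values are assumptions made only for this example; they stand in for the effective weights J^eff and the anatomical parameters of the actual model.

```python
import random


def relu(u):
    """Rectified linear activation: phi(u) = max(u, 0)."""
    return max(u, 0.0)


def expansion_layer(J, x, theta):
    """Compute h = phi(J x - theta) element-wise for one input vector x."""
    return [relu(sum(w * xi for w, xi in zip(row, x)) - theta) for row in J]


def coding_level(J, inputs, theta):
    """Average fraction of expansion-layer neurons that are active (h > 0)."""
    active = 0
    total = 0
    for x in inputs:
        h = expansion_layer(J, x, theta)
        active += sum(1 for v in h if v > 0)
        total += len(h)
    return active / total


rng = random.Random(0)
N, M, P = 20, 200, 50  # hypothetical sizes: inputs, expansion neurons, patterns

# Assumption: dense Gaussian weights scaled so pre-activations have unit variance.
J = [[rng.gauss(0, 1) / N ** 0.5 for _ in range(N)] for _ in range(M)]
inputs = [[rng.gauss(0, 1) for _ in range(N)] for _ in range(P)]

# A low threshold gives a dense code; a high threshold gives a sparse code.
f_low_threshold = coding_level(J, inputs, theta=0.0)
f_high_threshold = coding_level(J, inputs, theta=1.5)
```

      Because a neuron that is active at a high threshold is also active at any lower one, raising θ can only lower f, which is the sense in which the threshold controls the coding level.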

      12) 2nd para: ‘A typical assumption in computational theories of the cerebellar cortex is that inputs are randomly distributed in a high-dimensional space.’ Yes, I agree, and this is in fact in conflict with the known topographical organization in the cerebellar cortex (see broader comment above). Mossy fiber inputs coding for closely related inputs are co-localized in the cerebellar cortex. I think for this model to be of interest from the point of view of the mammalian cerebellar cortex, it would need to pay more attention to this organizational feature.

      As we discuss in our response to paragraphs 5 and 6, we see the random distribution assumption at the local scale (inputs presynaptic to a single Purkinje cell) as being compatible with topographic organization occurring at the microzone scale. Furthermore, as discussed earlier, we specifically model low-dimensional input as opposed to the random and high-dimensional inputs typically studied in prior models.

      "A typical assumption in computational theories of the cerebellar cortex is that inputs are randomly distributed in a high-dimensional space (Marr, 1969; Albus, 1971; Brunel et al., 2004; Babadi and Sompolinsky, 2014; Billings et al., 2014; Litwin-Kumar et al., 2017). While this may be a reasonable simplification in some cases, many tasks, including cerebellum-dependent tasks, are likely best-described as being encoded by a low-dimensional set of variables. For example, the cerebellum is often hypothesized to learn a forward model for motor control (Wolpert et al., 1998), which uses sensory input and motor efference to predict an effector’s future state. Mossy fiber activity recorded in monkeys correlates with position and velocity during natural movement (van Kan et al., 1993). Sources of motor efference copies include motor cortex, whose population activity lies on a low-dimensional manifold (Wagner et al., 2019; Huang et al., 2013; Churchland et al., 2010; Yu et al., 2009). We begin by modeling the low dimensionality of inputs and later consider more specific tasks. We therefore assume that the inputs to our model lie on a D-dimensional subspace embedded in the N-dimensional input space, where D is typically much smaller than N (Figure 1B). We refer to this subspace as the “task subspace” (Figure 1C)."

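      The "task subspace" assumption quoted above can be sketched as follows. This is only an illustrative construction under stated assumptions: the random orthonormal basis, the Gaussian latent variables, and the sizes N = 50 and D = 3 are hypothetical choices for the example, not parameters taken from the study.

```python
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def orthonormal_basis(N, D, rng):
    """Draw D random orthonormal vectors in R^N via Gram-Schmidt."""
    basis = []
    while len(basis) < D:
        v = [rng.gauss(0, 1) for _ in range(N)]
        for b in basis:
            c = dot(v, b)
            v = [vi - c * bi for vi, bi in zip(v, b)]
        norm = dot(v, v) ** 0.5
        if norm > 1e-9:
            basis.append([vi / norm for vi in v])
    return basis


def embed(z, basis):
    """Map D latent task variables z to an N-dimensional input pattern."""
    N = len(basis[0])
    return [sum(zk * b[i] for zk, b in zip(z, basis)) for i in range(N)]


def residual_norm(x, basis):
    """Norm of the part of x lying outside the span of the basis."""
    r = list(x)
    for b in basis:
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return dot(r, r) ** 0.5


rng = random.Random(1)
N, D = 50, 3  # hypothetical sizes, with D much smaller than N

basis = orthonormal_basis(N, D, rng)
z = [rng.gauss(0, 1) for _ in range(D)]         # low-dimensional task variables
x = embed(z, basis)                             # input confined to the task subspace
off_subspace = [rng.gauss(0, 1) for _ in range(N)]  # generic high-dimensional input
```

      Here `residual_norm(x, basis)` is zero up to rounding error, whereas a generic random input such as `off_subspace` has most of its norm outside the subspace, which is the distinction between low-dimensional task inputs and the random high-dimensional inputs of earlier models.
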
      References

      Albus, J.S. (1971). A theory of cerebellar function. Mathematical Biosciences 10, 25–61.

      Apps, R., et al. (2018). Cerebellar Modules and Their Role as Operational Cerebellar Processing Units. Cerebellum 17, 654–682.

      Babadi, B. and Sompolinsky, H. (2014). Sparseness and expansion in sensory representations. Neuron 83, 1213–1226.

      Badura, A. and De Zeeuw, C.I. (2017). Cerebellar granule cells: dense, rich and evolving representations. Current Biology 27, R415–R418.

      Barak, O., Rigotti, M., and Fusi, S. (2013). The sparseness of mixed selectivity neurons controls the generalization–discrimination trade-off. Journal of Neuroscience 33, 3844–3856.

      Bell, C.C., Han, V., and Sawtell, N.B. (2008). Cerebellum-like structures and their implications for cerebellar function. Annual Review of Neuroscience 31, 1–24.

      Billings, G., Piasini, E., Lőrincz, A., Nusser, Z., and Silver, R.A. (2014). Network structure within the cerebellar input layer enables lossless sparse encoding. Neuron 83, 960–974.

      Bordelon, B., Canatar, A., and Pehlevan, C. (2020). Spectrum dependent learning curves in kernel regression and wide neural networks. International Conference on Machine Learning 1024–1034.

      Brown, I.E. and Bower, J.M. (2001). Congruence of mossy fiber and climbing fiber tactile projections in the lateral hemispheres of the rat cerebellum. Journal of Comparative Neurology 429, 59–70.

      Brunel, N., Hakim, V., Isope, P., Nadal, J.P., and Barbour, B. (2004). Optimal information storage and the distribution of synaptic weights: perceptron versus Purkinje cell. Neuron 43, 745–757.

      Canatar, A., Bordelon, B., and Pehlevan, C. (2021). Spectral bias and task-model alignment explain generalization in kernel regression and infinitely wide neural networks. Nature Communications 12, 1–12.

      Cayco-Gajic, N.A., Clopath, C., and Silver, R.A. (2017). Sparse synaptic connectivity is required for decorrelation and pattern separation in feedforward networks. Nature Communications 8, 1–11.

      Chadderton, P., Margrie, T.W., and Häusser, M. (2004). Integration of quanta in cerebellar granule cells during sensory processing. Nature 428, 856–860.

      Churchland, M.M., et al. (2010). Stimulus onset quenches neural variability: a widespread cortical phenomenon. Nature Neuroscience 13, 369–378.

      Farris, S.M. (2011). Are mushroom bodies cerebellum-like structures? Arthropod Structure & Development 40, 368–379.

      Garwicz, M., Jörntell, H., and Ekerot, C.F. (1998). Cutaneous receptive fields and topography of mossy fibres and climbing fibres projecting to cat cerebellar C3 zone. The Journal of Physiology 512 (Pt 1), 277–293.

      Gilbert, M. and Chris Miall, R. (2022). How and Why the Cerebellum Recodes Input Signals: An Alternative to Machine Learning. The Neuroscientist 28, 206–221.

      Giovannucci, A., et al. (2017). Cerebellar granule cells acquire a widespread predictive feedback signal during motor learning. Nature Neuroscience 20, 727–734.

      Huang, C.C., et al. (2013). Convergence of pontine and proprioceptive streams onto multimodal cerebellar granule cells. eLife 2, e00400.

      Ishikawa, T., Shimuta, M., and Häusser, M. (2015). Multimodal sensory integration in single cerebellar granule cells in vivo. eLife 4, e12916.

      Jacot, A., Gabriel, F., and Hongler, C. (2018). Neural tangent kernel: Convergence and generalization in neural networks. Advances in Neural Information Processing Systems 31.

      Jörntell, H. and Ekerot, C.F. (2002). Reciprocal Bidirectional Plasticity of Parallel Fiber Receptive Fields in Cerebellar Purkinje Cells and Their Afferent Interneurons. Neuron 34, 797–806.

      Jörntell, H. and Ekerot, C.F. (2006). Properties of Somatosensory Synaptic Integration in Cerebellar Granule Cells In Vivo. Journal of Neuroscience 26, 11786–11797.

      Knogler, L.D., Markov, D.A., Dragomir, E.I., Štih, V., and Portugues, R. (2017). Sensorimotor representations in cerebellar granule cells in larval zebrafish are dense, spatially organized, and non-temporally patterned. Current Biology 27, 1288–1302.

      Litwin-Kumar, A., Harris, K.D., Axel, R., Sompolinsky, H., and Abbott, L.F. (2017). Optimal degrees of synaptic connectivity. Neuron 93, 1153–1164.

      Marr, D. (1969). A theory of cerebellar cortex. Journal of Physiology 202, 437–470.

      Nguyen, T.M., et al. (2022). Structured cerebellar connectivity supports resilient pattern separation. Nature 1–7.

      Saarinen, A., Linne, M.L., and Yli-Harja, O. (2008). Stochastic Differential Equation Model for Cerebellar Granule Cell Excitability. PLOS Computational Biology 4, e1000004.

      Simon, J.B., Dickens, M., and DeWeese, M.R. (2021). A theory of the inductive bias and generalization of kernel regression and wide neural networks. arXiv: 2110.03922.

      Sollich, P. (1998). Learning curves for Gaussian processes. Advances in Neural Information Processing Systems 11.

      Spanne, A. and Jörntell, H. (2013). Processing of Multi-dimensional Sensorimotor Information in the Spinal and Cerebellar Neuronal Circuitry: A New Hypothesis. PLOS Computational Biology 9, e1002979.

      Spanne, A. and Jörntell, H. (2015). Questioning the role of sparse coding in the brain. Trends in Neurosciences 38, 417–427.

      van Kan, P.L., Gibson, A.R., and Houk, J.C. (1993). Movement-related inputs to intermediate cerebellum of the monkey. Journal of Neurophysiology 69, 74–94.

      Wagner, M.J., Kim, T.H., Savall, J., Schnitzer, M.J., and Luo, L. (2017). Cerebellar granule cells encode the expectation of reward. Nature 544, 96–100.

      Wagner, M.J., et al. (2019). Shared cortex-cerebellum dynamics in the execution and learning of a motor task. Cell 177, 669–682.e24.

      Wolpert, D.M., Miall, R.C., and Kawato, M. (1998). Internal models in the cerebellum. Trends in Cognitive Sciences 2, 338–347.

      Yu, B.M., et al. (2009). Gaussian-process factor analysis for low-dimensional single-trial analysis of neural population activity. Journal of Neurophysiology 102, 614–635.

    1. Author Response

      Reviewer #1 (Public Review):

      In this manuscript, Huang et al., assess cognitive flexibility in rats trained on an animal model of anorexia nervosa known as activity-based anorexia (ABA). For the first time, they do this in a way that is fully automated and free from experimenter interference, as apparently experimenter interference can affect both the development of ABA as well as the effect on behaviour. They show that animals that are more cognitively flexible (i.e. animals that had received reversal training) were better able to resist weight loss upon exposure to ABA, whereas animals exposed to ABA first show poorer cognitive flexibility (reversal performance).

      Strengths:

      • The development of a fully-automated, experimenter-free behavioural assessment paradigm that is capable of identifying individual rats and therefore tracking their performance.

      • The bidirectional nature of the study - i.e. the fact that animals were tested for cognitive flexibility both before and after exposure to ABA, so that direction of causality could be established.

      • The analyses are rigorous and the sample sizes sufficient.

      • The use of touchscreens increases the translational potential of the findings.

      Weaknesses

      • Some descriptions of methods and results are confusing or insufficiently detailed.

      We have been through all methods and results to include additional details as requested by this reviewer below.

      It seems to me that performance on the pairwise discrimination task cannot be directly (statistically) compared to performance on reversal (as in Figure 4E), as these are tapping into fundamentally different cognitive processes (discrimination versus reversal learning). I think comparing groups on each assessment is valid, however.

      We agree that discrimination and reversal are different cognitive processes, and statistical comparisons between these two components of the task were only made when examining the speed of learning in the validation of the novel testing system. Moreover, our inclusion of the pink and purple bars on graphs such as Figure 4C & 4E represent “main effects of ABA exposure”, regardless of learning phase (PD or reversal) rather than, as you describe, comparing PD to R1. Perhaps this comparison wasn’t clear, so we have amended the text to say ‘main effect of ABA exposure p=.0017’ rather than just “exposure”.

      Not necessarily a 'weakness' but I would have loved to see some assessment of the alterations in neural mechanisms underlying these effects, and/or some different behavioural assessments in addition to those used here. In particular, the authors mention in the discussion that this manipulation can affect cholinergic functioning in the dorsal striatum. We (Bradfield et al., Neuron, 2013) and a number of others have now demonstrated that cholinergic dysfunction in the dorsomedial striatum impairs a different kind of reversal learning that is based on alterations in outcome identity and thus relies on a different cognitive process (i.e. 'state' rather than 'reward' prediction error). It would be interesting perhaps in the future to see if the ABA manipulation also alters performance on this alternative 'cognitive flexibility' task.

      This is an excellent suggestion and we have already begun exploring this in other ongoing work in the laboratory. Due to ‘compulsive’ wheel running being a hallmark of ABA, we are interested in determining if this also translates to a goal-directed action impairment using the well-established outcome-specific devaluation task. Perhaps with ABA it may be more relevant to investigate outcome-reversals rather than stimulus-reversals, and if this is the case, it would further support the use of the ABA model for investigating cognitive dysfunction relevant to AN. We have included an additional section in the discussion text relating to our hypotheses regarding outcome-specific reversal learning in the ABA model.

      Nevertheless, I certainly think the manuscript provides a solid appraisal of cognitive flexibility using more traditional tasks, and that the authors have achieved their aims. I think the work here will be of importance, certainly to other researchers using the ABA model, but perhaps also of translational importance in the future, as the causal relationship between ABA and cognitive inflexibility is near impossible to establish using human studies, but here evidence points strongly towards this being the case.

      Reviewer #2 (Public Review):

      Huang and colleagues present data from experiments assessing the role of cognitive inflexibility in the vulnerability to weight loss in the activity-based anorexia paradigm in rats. The experiments employ a novel in-home cage touchscreen system. The home cage touch screen system allows reduced testing time and increased throughput compared with the more widely used systems resulting in the ability to assess ABA following testing cognitive flexibility in relatively young female rats. The data demonstrate that, contrary to expectations, cognitive inflexibility does not predispose to greater ABA weight loss, but instead, rats that performed better in the reversal learning task lost more weight in the ABA paradigm. Prior ABA exposure resulted in poorer learning of the task and reversal. An additional experiment demonstrated that rats that had been trained in reversal learning resisted weight loss in the ABA paradigm. The findings are important and are clearly presented. They have implications for anorexia nervosa both in terms of potentially identifying those at risk also in understanding the high rates of relapse.

      Thanks for a great summary of the manuscript.

      Reviewer #3 (Public Review):

      Activity-based anorexia (ABA), which combines access to a running wheel and restricted access to food, is a most common paradigm used to study anorexic behavior in rodents. And yet, the field has been plagued by persistent questions about its validity as a model of anorexia nervosa (AN) in humans. This group's previous studies supported the idea that the ABA paradigm captures cognitive inflexibility seen in AN. Here they describe a fully automated touchscreen cognitive testing system for rats that makes it possible to ask whether cognitive inflexibility predisposes individuals to severe weight loss in the ABA paradigm. They observed that cognitive inflexibility was predictive of resistance to weight loss in the ABA, the opposite of what was predicted. They also reported reciprocal effects of ABA and cognitive testing on subsequent performance in the other paradigm. Prior exposure to the ABA decreased subsequent cognitive performance, while prior exposure to the cognitive task promoted resistance to the ABA. Based on these findings, the authors argue that the ABA model can be used to identify novel therapeutic targets for AN.

      The strength of this manuscript is primarily as a methods paper describing a novel automated cognitive behavioral testing system that obviates the need for experimentalist handling and single housing, which can interfere with behavioral testing, and accelerates learning on the task. Together, these features make it feasible to perform longitudinal studies to ask whether cognitive performance is predictive of behavior in a second paradigm during adolescence, a peak period of vulnerability for many psychiatric disorders. The authors also used machine learning tools to identify specific behaviors during the cognitive task that predicted later susceptibility to the ABA paradigm. While the benefits of this system are clear, the rigor and reproducibility of experiments using this paradigm would be enhanced if the authors provided clear guidelines about which parameters and analyses are most useful. In their absence, the large amount of data generated can promote p-hacking.

      The authors use their automated behavioral testing paradigm to ask whether cognitive inflexibility is a cause or consequence of susceptibility to ABA, an issue that cannot be addressed in AN. They provide compelling evidence that there are reciprocal effects of the two behavioral paradigms, but do not perform the controls needed to evaluate the significance of these observations. For example, the learning task involves sucrose consumption and food restriction, conditions that can independently affect susceptibility to the ABA. Similarly, the ABA paradigm involves exercise and restricted access to food, which can both affect learning.

      In the Discussion, the authors hypothesize that the ABA paradigm produces cognitive inflexibility and argue that uncovering the underlying mechanism can be used to identify new therapeutic targets for AN. The rationale for their claim of translational relevance is undermined by the fact that the biggest effect of the ABA paradigm is seen in the pair discrimination task, and not reversal learning. This pattern does not fit clinical observations in AN.

      In summary, the significance of this manuscript lies in the development of a new system to test cognitive function in rats that can be combined with other paradigms to explore questions of causality. While the authors clearly demonstrate that cognitive flexibility does not promote susceptibility to ABA, the experiments presented do not provide a compelling case that their model captures important features of the pathophysiology of AN.

      We thank the reviewer for this detailed review and note that we have now both explicitly defined the most useful parameters for analyses from the novel touchscreen system and removed some comparisons that could be considered superfluous. We argue that the additional information provided by the machine learning analyses is, at this stage, exploratory, and that rather than revealing independent descriptions of behavioural change in ABA exposed versus naïve rats, this information will aid in the generation of hypotheses to be tested in future studies. Therefore, the figures pertaining to these analyses have now been provided as supplements to Figures 3 & 4 (Figure 3-figure supplement 3; Figure 4-figure supplements 3&4). We have also clarified our intention to explore possible behavioural differences using this technique in the methods and discussion.

      We have also completed the essential control experiment, defined in the “essential revisions” section of this review, whereby we show only moderate impairments in reversal learning following a matched period of food restriction without rapid weight loss, suggesting that the substantial impairment seen following ABA exposure was not due to food restriction alone (see updated Figure 4 and supplements).

      However, we do not agree with this reviewer “that the biggest effect of the ABA paradigm is seen in the pair discrimination task” and point to the outcomes of both reciprocal experiments.

      In the first experiment, rats that went on to be susceptible or resistant to ABA did not differ on pairwise discrimination learning but specifically on performance at the reversal of reward contingencies (Figure 3B & E). Although this result was not in the hypothesised direction, this suggests that reversal learning specifically, and not pairwise discrimination, can differentiate those rats that go on to be susceptible to weight loss. We have included additional discussion in the text related to this finding (see lines 490-497).

      In the second experiment, it is clear from the number of ABA exposed rats that were unable to learn the reversal component, even after being able to learn pairwise discrimination, that flexible learning is more impaired by ABA. While it is true that ABA exposed rats that were successful in learning the reversal task were slower to learn the pairwise discrimination component than naïve rats (Figure 4E), this was not related to their ability to learn the reversal task overall – with equivalent learning rates in pairwise discrimination to ABA exposed rats that failed to learn the reversal component (Figure 4G-I). The absence of significant differences between ABA exposed and naïve animals in Figure 4F relates to the fact that a large proportion of ABA exposed animals never reached performance criterion in the reversal phase of the task and therefore data from these animals could not be included in the figure. This is where the number of trials completed within each session becomes important for interpretation (i.e. Figure 4-figure supplement 1M-O), whereby ABA exposure caused impaired responding specifically within the reversal phase of the task. The results text has been updated to better reflect this critical point.

      Overall, this suggests that the impairment in cognitive flexibility caused by ABA exposure was related both to an associative learning impairment (slower to learn PD than naïve animals) and an impairment in the integration of new and existing learning (failure to learn R1 in a large proportion of animals).

    1. one must conclude that community is always in/with time, always unfinished,

      Pauline van Mourik Broekman: And community is also always in/with space. In that respect, it seems so important to recognise how hard editors of ‘living books’ actually find it to encourage the reuse/appropriation/disappropriation offered up, and quite how much (material, socialised) time and care it takes to coax – and perform – this activity sensitively, on- and offline, with all the nuances you’ve described (and which run counter to the ‘social’ as the metricised communicating human being is now supposed to perform – and seek – it, and whose conditions of ‘communication’ Jodi Dean has done a lot to theorise).

      My PhD research on early Soviet life made me realise it is just really hard to conceive of the experience of true convulsive collectivity (a loss of individuality that I realise may be different, but that I hope might also be compared to the forms of subjectivation inherent in disappropriation?). And how creativity, let alone ‘authorship’, might be experienced within that. Do we (and I am thinking here especially of scholarly workers) come anything close to Walter Benjamin’s experience, from 1927, of how “Each thought, each day, each life lies here [in Moscow] as on a laboratory table. … No organism, no organisation, can escape this process.” Sensations which are also documented in Richard Stites’ Revolutionary Dreams: Utopian Vision and Experimental Life in the Russian Revolution, Oxford: Oxford University Press, 1989; and similarly, in Kristin Ross’s works on the Paris commune (Ross, 2008, 2015). The Soviet concept of the ‘social condenser’ is fascinating in this respect in that it places architecture, and space/s, right in the centre of psychosocial subjectivation, as a potentially intensifying, opening or collectivising force in social movement and change (as some have commented, these might importantly be separated into ‘planned’ and ‘accidental’ social condensers, meaning those which are forward-looking and intentional, or retroactively recognised for their capacities).

      If, as Teju Cole so memorably described, we have achieved the sort of collective spectacular alienation wherein we can witness ‘death in the browser tab’ while sitting still in front of a computer and toggling between that and other media ‘content’ (The New York Times Magazine, 2015, and online: https://www.nytimes.com/2015/05/24/magazine/death-in-the-browser-tab.html), how are we to expand living books’ writerly ‘space’ such that the tabs which living books’ readers/writers painstakingly write into might truly act as social condensers, in line with the more fervent hopes and dreams of ‘radical’ open access? As we sit at those computers, writing, our bodies slumped in chairs and our eyes tired and glazed, should we, can we, seek an experience of elated social dissolution the likes of which I’ve in recent times only seen described by authors contemplating the psychological experience of riots (e.g. Hannah Black, 2022; Tobi Haslett, 2021; Adrian Wohlleben, 2021)? It is a vain imagining, probably, but I can’t help but wonder how we might try and think of these phenomena together, or at least as potentially related? To me it seems inevitably to point to the fact that we cannot conceive of digital materials outside of the spaces in which they are engaged with. I’ve found Mark Nowak’s Social Poetics (2020) and June Jordan’s Poetry for the People (1995) some of the more helpful sources to think this relationship through (though I realise there are countless others). It also seems telling that they are to a lesser or greater extent centred in interpretations of communal pedagogy.

    1. Author Response

      Reviewer #1 (Public Review):

      Strengths

      This paper is well situated theoretically within the habit learning/OCD literature. Daily training in a motor-learning task, delivered via smartphone, was innovative, ecologically valid and more likely to assay habitual behaviors specifically. Daily training is also more similar to studies with non-humans, making a better link with that literature. The use of a sequential-learning task (cf. tasks that require a single response) is also more ecologically valid. The in-laboratory tests (after the 1 month of training) allowed the researchers to test if the OCD group preferred familiar, but more difficult, sequences over newer, simpler sequences.

      The authors achieved their aims in that two groups of participants (patients with OCD and controls) engaged with the task over the course of 30 days. The repeated nature of the task meant that 'overtraining' was almost certainly established, and automaticity was demonstrated. This allowed the authors to test their hypotheses about habit learning. The results are supportive of the authors' conclusions.

      We truly appreciate the positive assessment of referee 1, particularly the consideration that our study is theoretically strong and that ‘the results are supportive of the authors' conclusions’. This is an important external endorsement of our conclusions, contrasting somewhat with the views of referee 2.

      Weaknesses

      The sample size was relatively small. Some potentially interesting individual differences within the OCD group could have been examined more thoroughly with a bigger sample (e.g., preference for familiar sequences). A larger sample may have allowed the statistical testing of any effects due to medication status.

      The authors were not able to test one criterion of habits, namely resistance to devaluation, due to the nature of the task.

      We agree with the reviewer that the proof of principle established in our study opens new avenues for research into the psychological and behavioral determinants of the heterogeneity of this clinical population. However, considering the study timeline and the pandemic constraints, a bigger sample was not possible. Our sample can indeed be considered small if one compares it with current online studies, which do not require in-person/laboratory testing, thus being much easier to recruit and conduct. However, given the nature of our protocol (with 2 demanding test phases, 1-month engagement per participant and the inclusion of OCD patients without comorbidities only) and the fact that this study also involved laboratory testing, we consider our sample size reasonable and comparable to other laboratory studies (typically comprising on average between 30-50 participants in each group).

      This article is likely to be impactful -- the delivery of a task across 30 days to a patient group is innovative and represents a new approach for the study of habit learning that is superior to an in-laboratory approach.

      An interesting aspect of this manuscript is that it prompts a comparison with previous studies of goal-directed/habitual responding in OCD that used devaluation protocols, and which may have had their effects due to deficits in goal-directed behavior and not enhanced habit learning per se.

      Thank you for acknowledging the impact of our study, in particular the unique ability of our task to interrogate the habit system.

      Reviewer #2 (Public Review):

      In this study, the researchers employed a recently developed smartphone application to provide 30 days of training on action sequences to both OCD patients and healthy volunteers. The study tested learning and automaticity-related measures and investigated the effects of several factors on these measures. Upon training completion, the researchers conducted two preference tests comparing learned and unlearned action sequences under different conditions. While the study provides some interesting findings, I have a few substantial concerns:

      1) Throughout the entire paper, the authors' interpretations and claims revolve around the domain of habits and goal-directed behavior, despite the methods and evidence clearly focusing on motor sequence learning/procedural learning/skill learning. There is no evidence to support this framing and interpretation and thus I find them overreaching and hyperbolic, and I think they should be avoided. Although skills and habits share many characteristics, they are meaningfully distinguishable and should not be conflated or mixed up. Furthermore, if anything, the evidence in this study suggests that participants attained procedural learning, but these actions did not become habitual, as they remained deliberate actions that were not chosen to be performed when they were not in line with participants' current goals.

      We acknowledge that the research on habit learning is a topic of current controversy, especially when it comes to how to induce and measure habits in humans. Therefore, within this context, referee 2’s criticism could be expected. Across distinct fields of research, different methodologies have been used to measure habits, which represent relatively stereotyped and autonomous behavioral sequences enacted in response to a specific stimulus without consideration, at the time of initiation of the sequence, of the value of the outcome or any representation of the relationship that exists between the response and the outcome. Hence these are stimulus-bound responses which may or may not require the implementation of a skill during subsequent performance. Behavioral neuroscientists define habits similarly, as stimulus-response associations which are independent of reward or outcome, and use devaluation or contingency degradation strategies to probe habits (Dickinson and Weiskrantz, 1985; Tricomi et al., 2009). Others conceptualize habits as a form of procedural memory, along with skills, and use motor sequence learning paradigms to investigate and dissect different components of habit learning such as action selection, execution and consolidation (Abrahamse et al., 2013; Doyon et al., 2003; Squire et al., 1993). It is also generally agreed that the autonomous nature of habits and the fluid proficiency of skills are both usually achieved with many hours of training or practice, respectively (Haith and Krakauer, 2018).

      We consider that Balleine and Dezfouli (2019) made an excellent attempt to bring all these different criteria within a single framework, which we have followed. We also consider that our discussion in fact followed a rather cautious approach to interpretation solely in terms of goal-directed versus habitual control.

      Referee 2 does not actually specify criteria by which they define habits and skills, except for asserting that skilled behavior is goal-directed, without mentioning what the actual goal of the implementation of such skill is in the present study: the fulfillment of a habit? We assume that their definition of habit hinges on the effects of devaluation as a single criterion of habit, which, according to Balleine and Dezfouli (2019), is only 1 of their 4 listed criteria. We carefully addressed this specific criterion in our manuscript: “We were not, however, able to test the fourth criterion, of resistance to devaluation. Therefore, we are unable to firmly conclude that the action sequences are habits rather than, for example, goal-directed skills. Regardless of whether the trained action sequences can be defined as habits or goal-directed motor skills, it has to be considered…”. Therefore, we took due care in our conclusions concerning habits and thus found the referee’s comment misleading and unfair.

      We note that our trained motor sequences did in fact fulfil the other 3 criteria listed by Balleine and Dezfouli (2019), unlike many studies employing only devaluation (e.g. Tricomi et al 2009; Gillan et al 2011). Moreover, we cited a recent study using very similar methodology where the devaluation test was applied and shown to support the habit hypothesis (Gera et al., 2022).

      Whether the initiation of the trained motor sequences in experiment 3 (arbitration) is underpinned by an action-outcome association (or not) has no bearing on whether those sequences were under stimulus-response control after training (experiment 1). Transitions between habitual and goal-directed control over behavior are quite well established in the experimental literature, especially when choice opportunities become available (Bouton, 2021; Frölich et al., 2023) or when a new goal-directed schema is recruited to fulfill a habit (Fouyssac et al., 2022). This switching between habits and goal-directed responding may reflect the coordination of these systems in producing effective behavior in the real world.

      • Fouyssac M, Peña-Oliver Y, Puaud M, Lim NTY, Giuliano C, Everitt BJ, Belin D. (2021). Negative Urgency Exacerbates Relapse to Cocaine Seeking After Abstinence. Biological Psychiatry. doi: 10.1016/j.biopsych.2021.10.009

      • Frölich S, Esmeyer M, Endrass T, Smolka MN and Kiebel SJ (2023) Interaction between habits as action sequences and goal-directed behavior under time pressure. Front. Neurosci. 16:996957. doi: 10.3389/fnins.2022.996957

      • Bouton ME. 2021. Context, attention, and the switch between habit and goal-direction in behavior. Learn Behav 49:349– 362. doi:10.3758/s13420-021-00488-z

      2) Some methodological aspects need more detail and clarification.

      3) There are concerns regarding some of the analyses, which require addressing.

      We thank referee 2 for their detailed review of the methods and analyses of our study and for the helpful feedback, which clearly helps improve our manuscript. We will clarify the methodological aspects in detail and conduct the suggested analysis. Please see below our answers to the specific points raised.

      Introduction:

      4) It is stated that "extensive training of sequential actions would more rapidly engage the 'habit system' as compared to single-action instrumental learning". In an attempt to describe the rationale for this statement the authors describe the concept of action chunking, its benefits and relevance to habits but there is no explanation for why sequential actions would engage the habit system more rapidly than a single-action. Clarifying this would be helpful.

      We agree that there is no evidence that action sequences become habitual more readily than single actions, although action sequences clearly allow ‘chunking’ and thus likely engage neural networks, including the putamen, which are implicated in habit learning as well as skill. In our revised manuscript we will instead state: “we have recently postulated that extensive training of sequential actions could be a means for rapidly engaging the ‘habit system’ (Robbins et al., 2019)”

      5) In the Hypothesis section the authors state: “we expected that OCD patients... show enhanced habit attainment through a greater preference for performing familiar app sequences when given the choice to select any other, easier sequence”. I find it particularly difficult to interpret preference for familiar sequences as enhanced habit attainment.

      We agree that choice of the familiar response sequence should not be a necessary criterion for habitual control although choice for a familiar sequence is, in fact, not inconsistent with this hypothesis. In a recent study, Zmigrod et al (2022) found that 'aversion to novelty' was a relevant factor in the subjective measurement of habitual tendencies. It should also be noted that this preference was present in patients with OCD. If one assumes instead, like the referee, that the familiar sequence is goal-directed, then it contravenes the well-known 'egodystonia' of OCD which suggests that such tendencies are not goal-directed.

      To clarify our hypothesis, we will amend the sentence to the following: “Finally, we expected that OCD patients would generally report greater habits, as well as attribute higher intrinsic value to the familiar app sequences manifested by a greater preference for performing them when given the choice to select any other, easier sequence”.

      A few notes on the task description and other task components:

      6) It would be useful to give more details on the task. This includes more details on the time/condition of the gradual removal of visual and auditory stimuli and also on the within practice dynamic structure (i.e., different levels appear in the video).

      These details will be included in the revised manuscript. Thank you for pointing out the need for further clarification of the task design.

      7) Some more information on engagement-related exclusion criteria would be useful (what happened if participants did not use the app for more than one day, how many times were allowed to skip a day etc.).

      This additional information will be added to the revised manuscript. If participants omitted to train for more than 2 days, the researcher would send a reminder to the participant to request to catch up. If the participant would not react accordingly and a third day would be skipped, then the researcher would call to understand the reasons for the lack of engagement and gauge motivation. The participant would be excluded if more than 5 sequential days of training were missed. Only 2 participants were excluded given their lack of engagement.
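      The exclusion rule stated in this response (exclusion when more than 5 sequential training days are missed) can be sketched as follows. This is an illustration only, not the authors' code: the function names, the 30-day window and the representation of training days as day numbers are all assumptions made for the example.

```python
def longest_missed_run(trained_days, total_days=30):
    """Length of the longest run of consecutive days without training.

    trained_days: iterable of day numbers (1-based) on which the
    participant trained. Hypothetical representation, for illustration.
    """
    trained = set(trained_days)
    longest = current = 0
    for day in range(1, total_days + 1):
        if day in trained:
            current = 0          # a training day ends the current gap
        else:
            current += 1         # extend the current gap
            longest = max(longest, current)
    return longest


def is_excluded(trained_days, total_days=30, max_missed=5):
    """True when more than max_missed sequential days were missed."""
    return longest_missed_run(trained_days, total_days) > max_missed
```

      Under these assumptions, a participant who skips days 11 to 16 (six days in a row) would be excluded, whereas one who skips days 11 to 15 would not.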

      8) According to the (very useful) video demonstrating the task and the paper describing the task in detail (Banca et al., 2020), the task seems to include other relevant components that were not mentioned in this paper. I refer to the daily speed test, the daily random switch test, and daily ratings of each sequence's enjoyment and confidence of knowledge.

      If these components were not included in this procedure, then the deviations from the procedure described in the video and Banca et al. (2020) should be explicitly mentioned. If these components were included, at least some of them may be relevant, at least in part, to automaticity, habitual action control, formulation of participants' enjoyment from the app etc. I think these components should be mentioned and analyzed (or at least provide an explanation for why it has been decided not to analyze them).

      This is also true for the reward removal (extinction) from the 21st day onwards which is potentially of particular relevance for the research questions.

      The task procedure was indeed the same as detailed in Banca et al., 2020. We did not include these extra components in this current manuscript for reasons of succinctness and because the manuscript was already rather longer than a common research article, given that we present three different, though highly inter-dependent, experiments in order to answer key interrelated questions in an optimal manner. However, since referee 2 considers this additional analysis to be important, we will be happy to include it in the supplementary material of the revised manuscript.

      Training engagement analysis:

      9) I find referring to the number of trials including successful and unsuccessful trials as representing participants "commitment to training" (e.g. in Figure legend 2b) potentially inadequate. Given that participants need at least 20 successful trials to complete each practice, more errors would lead to more trials. Therefore, I think this measure may mostly represent weaker performance (of the OCD patients as shown in Figure 2b). Therefore, I find the number of performed practice runs, as used in Figure 2a (which should be perfectly aligned with the number of successful trials), a "clean" and proper measure of engagement/commitment to training.

      We acknowledge the referee’s concern on this matter and agree to replace the y-axis variable of Figure 2b with the number of performed practices (thus aligning with Figure 2a). This amendment will remove any potential effect of weaker performance on the engagement measurement and will provide clearer results.

      10) Also, to provide stronger support for the claim about different diurnal training patterns (as presented in Figure 2c and the text) between patients and healthy individuals, it would be beneficial to conduct a statistical test comparing the two distributions. If the results of this test are not significant, I suggest emphasizing that this is a descriptive finding.

      We will conduct the statistical test and report accordingly.

      Learning results:

      11) When describing the Learning results (p10) I think it would be useful to provide the descriptive stats for the MT0 parameter (as done above for the other two parameters).

      Thank you for pointing this out. The descriptive stats for MT0 will be added to the revised version of the manuscript.

      12) Sensitivity of sequence duration and IKI consistency (C) to reward:

      I think it is important to add details on how incorrect trials were handled when calculating ∆MT (or C) and ∆R, specifically in cases where the trial preceding a successful trial was unsuccessful. If incorrect trials were simply ignored, this may not adequately represent trial-by-trial changes, particularly when testing the effect of a trial's outcome on performance change in the next trial.

      This is an important question. Our analysis protocol was designed to ensure that incorrect trials do not contaminate or confound the results. To estimate the trial-to-trial difference in ∆MT (or C) and ∆R, we exclusively included pairs of contiguous trials where participants achieved correct performance and received feedback scores for both trials. For example, if a participant made a performance error on trial 23, we did not include ∆R or ∆MT estimates for the pairs of trials 23-22 and 24-23. Instead of excluding incorrect trials from our analyses, we retained them in our time series but assigned them a NaN (not a number) value in Matlab. As a result, ∆R and ∆MT were not defined for those two pairs of trials. The same applies to C. This approach ensured that our analyses are not confounded by incremental or decremental feedback scores between noncontiguous trials. In the past, when assessing the timing of correct actions during skilled sequence performance, we also considered only events that were preceded and followed by correct actions. This prevented effects such as post-error slowing from contaminating our results (Herrojo Ruiz et al., 2009, 2019). Therefore, we do not believe that any further reanalysis is required.

      • Ruiz MH, Jabusch HC, Altenmüller E. Detecting wrong notes in advance: neuronal correlates of error monitoring in pianists. Cerebral cortex. 2009 Nov 1;19(11):2625-39.

      • Bury G, García-Huéscar M, Bhattacharya J, Ruiz MH. Cardiac afferent activity modulates early neural signature of error detection during skilled performance. NeuroImage. 2019 Oct 1;199:704-17.
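      The pair-exclusion rule described in the response to point 12 can be illustrated with a short sketch. The authors report using NaN values in Matlab; the Python version below is only an assumed equivalent, with hypothetical names and made-up movement times, and is not their analysis code.

```python
import math


def paired_deltas(values, correct):
    """Trial-to-trial differences, defined only for contiguous correct pairs.

    Incorrect trials stay in the series but are set to NaN, so any
    difference that involves them is NaN as well (undefined).
    """
    masked = [v if ok else math.nan for v, ok in zip(values, correct)]
    return [masked[i] - masked[i - 1] for i in range(1, len(masked))]


# Hypothetical movement times (seconds); the third trial is an error.
mt = [2.0, 1.8, 1.9, 1.5, 1.4]
correct = [True, True, False, True, True]
delta_mt = paired_deltas(mt, correct)
```

      In this toy series the two differences that touch the error trial are undefined, while the differences between the first two and the last two trials are kept, mirroring the trial 22-23-24 example in the response.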

      13) I have a serious concern with respect to how the sensitivity of sequence duration to reward is framed and analyzed. Since reward is proportional to performance, a reduction in reward essentially indicates a trial with poor performance, and thus even regression to the mean (along with a floor effect in performance [asymptote]) could explain the observed effects. It is possible that even occasional poor performance could lead to a participant demonstrating this effect, potentially regardless of the reward. Accordingly, the reduced improvement in performance following a reward decrease as a function of training length described in Figure 5b legend may reflect training-induced increased performance that leaves less room for improvement after poor trials, which are no longer as poor as before. To address this concern, controlling for performance (e.g., by taking into consideration the baseline MT for the previous trial) may be helpful. If the authors can conduct such an analysis and still show the observed effect, it would establish the validity of their findings.

      Thank you for raising this point. Figure 5b illustrates two distinct effects of reward changes on behavioral adaptation, which are expected based on previous research.

      I. Practice effects: Firstly, we observe that as participants progress across bins of practice, the degree of improvement in behavior (reflected by faster movement time, MT) following a decrease in reward (∆R−) diminishes, consistent with our expectations based on previous work. Conversely, we found that ∆MT does not change across bins of practices following an increase in reward (∆R+). We appreciate the reviewer's suggestion regarding controlling for the reference movement time (MT) in the previous trial when examining the practice effect in the p(∆T|∆R−) and p(∆T|∆R+) distributions. In the revised manuscript, we will conduct the proposed control analysis to better understand whether the sensitivity of MT to score decrements changes across practice when normalising MT to the reference level on each trial. But see below for a preliminary control analysis.

      II. Asymmetry of the effect of ∆R− and ∆R+ on performance: Figure 5b also depicts the distinct impact of score increments and decrements on behavioural changes. When aggregating data across practice bins, we consistently observed that the centre of the p(∆T|∆R−) distribution was smaller (more negative) than that of p(∆T|∆R+). This suggests that participants exhibited a greater acceleration following a drop in scores compared to a relative score increase, and this effect persisted throughout the practice sessions. Importantly, this enhanced sensitivity to losses or negative feedback (or relative drops in scores) aligns with previous research findings (Galea et al., 2015; Pekny et al., 2014; van Mastrigt et al., 2020).

      We have conducted a preliminary control analysis to exclude the potential impact that reference movement time (MT) values could have on our analysis. We have assessed the asymmetry between behavioural responses to ∆R− and ∆R+ using the following analysis: We estimated the proportion of trials in which participants exhibited speed-up (∆T < 0) or slow-down (∆T > 0) behaviour following ∆R− and ∆R+ across different practice bins (bins 1 to 4). By discretising the series of behavioural changes (∆T) into binary values (+1 for slowing down, -1 for speeding up), we can assess the type of changes (speed-up, slow-down) without the absolute ∆T or T values contributing to our results. We obtained several key findings:

      • Consistent with expectations (sanity check), participants exhibited more instances of speeding up than slowing down across all reward conditions.

      • Participants demonstrated a higher frequency of speeding up following ∆R− compared to ∆R+, and this asymmetry persisted throughout the practice sessions (greater proportion of -1 events than +1 events). In the p(∆T|∆R+) distribution, 53% of events were speed-up events for the first bin of practices, and 55% for the last bin. In the p(∆T|∆R−) distribution, 63% of events were speed-up events in each bin of practices, with this proportion exhibiting no change over time.

      • Accordingly, the asymmetry of reward changes on behavioural adaptations, as revealed by this analysis, remained consistent across the practice bins.

      Thus, these preliminary findings provide an initial response to referee 2 and offer valuable insights into the asymmetrical effects of positive/negative reward changes on behavioural adaptations. We plan to include these results in the revised manuscript, as well as the full control analysis suggested by the referee. We will further expand upon their interpretation and implications.
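      The sign-based control analysis described above (discretising ∆T into -1 for speed-up and +1 for slow-down, then taking proportions separately after ∆R− and ∆R+) could be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, the inputs are assumed to be already aligned so that each ∆T is the behavioural change following the corresponding ∆R, and the numbers are invented.

```python
import math


def speedup_proportion(delta_t, delta_r, reward_sign):
    """Proportion of speed-up events (delta_t < 0) after reward changes
    of the requested sign (+1 for increases, -1 for decreases).

    Undefined (NaN) or zero changes are skipped.
    """
    signs = []
    for dt, dr in zip(delta_t, delta_r):
        if math.isnan(dt) or math.isnan(dr) or dt == 0 or dr == 0:
            continue
        if (dr > 0) == (reward_sign > 0):
            # Discretise: -1 for speeding up, +1 for slowing down.
            signs.append(-1 if dt < 0 else 1)
    if not signs:
        return math.nan
    return signs.count(-1) / len(signs)


# Invented example data for one practice bin.
delta_r = [-5, -3, 4, 2, -1, math.nan]
delta_t = [-0.2, -0.1, 0.1, -0.05, 0.3, -0.1]
p_minus = speedup_proportion(delta_t, delta_r, reward_sign=-1)
p_plus = speedup_proportion(delta_t, delta_r, reward_sign=+1)
```

      Because only the sign of ∆T enters the calculation, the absolute movement time on the reference trial cannot drive a difference between the two proportions, which is the point of the control.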

      14) Another way to support the claim of reward change directionality effects on performance (rather than performance on performance), at least to some extent, would be to analyze the data from the last 10 days of the training, during which no rewards were given (pretending for analysis purposes that the reward was calculated and presented to participants). If the effect persists, it is less likely that the effect in question can be attributed to the reward dynamics.

      The reviewer’s concern is addressed in the previous question. Also, this analysis would not be possible because our Gaussian fit analyses use the time series of continuous reward scores, in which ∆R− or ∆R+ are embedded. These events cannot be analyzed once reward feedback is removed because we do not have behavioral events following ∆R− or ∆R+ anymore.

      15) This concern is also relevant and should be considered with respect to the sensitivity of IKI consistency (C) to reward. While the relationship between previous reward/performance and future performance in terms of C is of a different structure, the similar potential confounding effects could still be present.

      We will conduct this analysis for the revised manuscript, similarly to the control analysis suggested by referee 2 on MT. Our preliminary control analysis, as explained above, suggests that the fundamental asymmetry in the effect of ∆R− and ∆R+ on behavioral changes persists when excluding the impact of reference performance values in our Gaussian fit analysis.

      16) Another related question (which is also of general interest) is whether the preferred app sequence (as indicated by the participants for Phase B) was consistently the one that yielded more reward? Was the continuous sequence the preferred one? This might tell something about the effectiveness of the reward in the task.

      We have now conducted this analysis. There is in fact no evidence to conclude that the continuously rewarded sequence was the preferred one. The result shows that 54.5% of HV and 29% of the OCD sample considered the continuous sequence to be their preferred one. Of note, this preference may not necessarily be linked to the trial-by-trial reward sensitivity analysis. The latter assesses how learning may be affected by reward. The overall preference may be influenced by many other factors, such as the aesthetic appeal of particular combinations of finger movements.

      Regarding both experiments 2 and 3:

      17) The change in context in experiments 2 and 3 is substantial and includes many different components. These changes should be mentioned in more detail in the Results section before describing the results of experiments 2 and 3.

      Following the referee’s advice, we will move these details (currently written in the Methods section) to the Results section, when we introduce Phase B and before describing the results of experiments 2 and 3.

      Experiment 2:

      18) In Experiment 2, the authors sometimes refer to the "explicit preference task" as testing for habitual and goal-seeking sequences. However, I do not think there is any justification for interpreting it as such. The other framings used by the authors - testing whether trained action sequences gain intrinsic/rewarding properties or value, and preference for familiar versus novel action sequences - are more suitable and justified. In support of the point I raised here, assigning intrinsic rewarding properties to the learned sequences and thereby preferring these sequences can be conceptually aligned with goal-directed behavior just as much as it could be with habit.

      We clearly defined the theoretical framing of experiment 2 as a test of whether trained action sequences gain intrinsic value and we are pleased to hear that the referee agrees with this framing. If the referee is referring to the paragraph below (in the Discussion), we actually do acknowledge within this paragraph that a preference for the trained sequences can either be conceptually aligned with a habit OR a goal-directed behavior.

      “On the other hand, we are describing here two potential sources of evidence in favor of enhanced habit formation in OCD. First, OCD patients show a bias towards the previously trained, apparently disadvantageous, action sequences. In terms of the discussion above, this could possibly be reinterpreted as a narrowing of goals in OCD (Robbins et al., 2019) underlying compulsive behavior, in favor of its intrinsic outcomes”

      This narrowing of goals model of OCD refers to a hypothetically transitional stage of compulsion development driven by behavior having an abnormally strong, goal-directed nature, typically linked to specific values and concerns.

      If the referee is referring to the penultimate sentence of the hypothesis section, this has been amended in response to Q5. We cannot find any other possible instances in this manuscript stating that experiment 2 is a test of habitual or goal-directed behavior.

      Experiment 3:

      19) Similar to Experiment 2, I find the framing of arbitration between goal-directed/habitual behavior in Experiment 3 inadequate and unjustified. The results of the experiment suggest that participants were primarily goal-directed and there is no evidence to support the idea that this reevaluation led participants to switch from habitual to goal-directed behavior.

      Also, given the explicit choice of the sequence to perform participants had to make prior to performing it, it is reasonable to assume that this experiment mainly tested bias towards familiar sequence/stimulus and/or towards intrinsic reward associated with the sequence in value-based decision making.

      This comment is aligned with (and follows) the referee’s criticism of experiment 1 not achieving automatic and habitual actions. We have addressed this matter above, in response 1 to Referee 2.

      Mobile-app performance effect on symptomatology: exploratory analyses:

      20) Maybe it would be worth testing if the patients with improved symptomatology (that contribute some of their symptom improvement to the app) also chose to play more during the training stage.

      We have conducted an analysis to address this relevant question. There is no correlation between the YBOCS score change and the total number of practices, meaning that the patients whose symptomatology improved post-training did not necessarily choose to play the app more during the training stage (rs = 0.25, p = 0.15). Additionally, we have statistically compared the improvers (patients with reduced YBOCS scores post-training) and the non-improvers (patients with unchanged or increased YBOCS scores post-training) in their number of completed app practices during the training phase, and no differences were observed (U = 169, p = 0.19).
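      For readers unfamiliar with the two statistics reported here, the sketch below shows how a Spearman rank correlation (rs) and a Mann-Whitney U statistic are computed from ranks. It is an illustration only: the helper names are hypothetical, no p-values are computed, the convention of returning the smaller of the two U values is an assumption (the response does not say which convention was used), and no study data are reproduced.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def mann_whitney_u(a, b):
    """Smaller of the two U statistics for independent samples a and b."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[: len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    return min(u_a, len(a) * len(b) - u_a)
```

      In this setting, `spearman_rho` would take each patient's YBOCS change and total number of practices, and `mann_whitney_u` would take the practice counts of improvers and non-improvers as its two samples.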

      Discussion:

      21) Based on my earlier comments highlighting the inadequacy and mis-framing of the work in terms of habit and goal-directed behavior, I suggest that the discussion section be substantially revised to reflect these concerns.

      We do not agree that the work is either "inadequate or mis-framed" and will not therefore be substantially revising the Discussion. We will however clarify further the interpretation we have made and make explicit the alternative viewpoint of the referee. For example, we will retitle experiment 3 as “Re-evaluation of the learned action sequence: possible test of goal/habit arbitration” to acknowledge the referee’s viewpoint as well as our own interpretation.

      22) In the sentence "Nevertheless, OCD patients disadvantageously preferred the previously trained/familiar action sequence under certain conditions" the term "disadvantageously" is not necessarily accurate. While there was potentially more effort required, considering the possible presence of intrinsic reward and chunking, this preference may not necessarily be disadvantageous. Therefore, a more cautious and accurate phrasing that better reflects the associated results would be useful.

      We recognize that the term "disadvantageously" may be semantically ambiguous for some readers and therefore we will remove it.

      Materials and Methods:

      23) The authors mention: "The novel sequence (in condition 3) was a 6-move sequence of similar complexity and difficulty as the app sequences, but only learned on the day, before starting this task (therefore, not overtrained)." - for the sake of completeness, more details on the pre-training done on that day would be useful.

      Details of the learning procedure of the novel sequence (in condition 3, experiment 3) will be provided in the methods of the revised version of the manuscript.

      Minor comments:

      24) In the section discussing the sensitivity of sequence duration to reward, the authors state that they only analyzed continuous reward trials because "a larger number of trials in each subsample were available to fit the Gaussian distributions, due to feedback being provided on all trials." However, feedback was also provided on all trials in the variable reward condition, even though the reward was not necessarily aligned with participants' performance. Therefore, it may be beneficial to rephrase this statement for clarity.

      We will follow this referee’s advice and will rephrase the sentence for clarity.

      25) With regard to experiment 2 (Preference for familiar versus novel action sequences) in the following statement "A positive correlation between COHS and the app sequence choice (Pearson r = 0.36, p = 0.005) further showed that those participants with greater habitual tendencies had a greater propensity to prefer the trained app sequence under this condition." I find the use of the word "further" here potentially misleading.

      The word "further" will be removed.

    2. Reviewer #2 (Public Review):

      In this study, the researchers employed a recently developed smartphone application to provide 30 days of training on action sequences to both OCD patients and healthy volunteers. The study tested learning and automaticity-related measures and investigated the effects of several factors on these measures. Upon training completion, the researchers conducted two preference tests comparing learned and unlearned action sequences under different conditions. While the study provides some interesting findings, I have a few substantial concerns:

      1. Throughout the entire paper, the authors' interpretations and claims revolve around the domain of habits and goal-directed behavior, despite the methods and evidence clearly focusing on motor sequence learning/procedural learning/skill learning. There is no evidence to support this framing and interpretation and thus I find them overreaching and hyperbolic, and I think they should be avoided. Although skills and habits share many characteristics, they are meaningfully distinguishable and should not be conflated or mixed up. Furthermore, if anything, the evidence in this study suggests that participants attained procedural learning, but these actions did not become habitual, as they remained deliberate actions that were not chosen to be performed when they were not in line with participants' current goals.

      2. Some methodological aspects need more detail and clarification.

      3. There are concerns regarding some of the analyses, which require addressing.

      Please see details below, ordered by the paper sections.

      Introduction:

      It is stated that "extensive training of sequential actions would more rapidly engage the 'habit system' as compared to single-action instrumental learning". In an attempt to describe the rationale for this statement the authors describe the concept of action chunking, its benefits and relevance to habits but there is no explanation for why sequential actions would engage the habit system more rapidly than a single-action. Clarifying this would be helpful.

      In the Hypothesis section the authors state: "we expected that OCD patients... show enhanced habit attainment through a greater preference for performing familiar app sequences when given the choice to select any other, easier sequence." I find it particularly difficult to interpret preference for familiar sequences as enhanced habit attainment.

      A few notes on the task description and other task components:

      It would be useful to give more details on the task. This includes more details on the time/condition of the gradual removal of visual and auditory stimuli and also on the within-practice dynamic structure (i.e., different levels appear in the video).

      Some more information on engagement-related exclusion criteria would be useful (what happened if participants did not use the app for more than one day, how many times participants were allowed to skip a day, etc.).

      According to the (very useful) video demonstrating the task and the paper describing the task in detail (Banca et al., 2020), the task seems to include other relevant components that were not mentioned in this paper. I refer to the daily speed test, the daily random switch test, and daily ratings of each sequence's enjoyment and confidence of knowledge.

      If these components were not included in this procedure, then the deviations from the procedure described in the video and Banca et al. (2020) should be explicitly mentioned. If these components were included, at least some of them may be relevant, at least in part, to automaticity, habitual action control, formulation of participants' enjoyment from the app etc. I think these components should be mentioned and analyzed (or at least an explanation should be provided for why it has been decided not to analyze them).

      This is also true for the reward removal (extinction) from the 21st day onwards, which is potentially of particular relevance for the research questions.

      Training engagement analysis:

      I find referring to the number of trials including successful and unsuccessful trials as representing participants "commitment to training" (e.g. in Figure legend 2b) potentially inadequate. Given that participants need at least 20 successful trials to complete each practice, more errors would lead to more trials. Therefore, I think this measure may mostly represent weaker performance (of the OCD patients as shown in Figure 2b). Therefore, I find the number of performed practice runs, as used in Figure 2a (which should be perfectly aligned with the number of successful trials), a "clean" and proper measure of engagement/commitment to training.

      Also, to provide stronger support for the claim about different diurnal training patterns (as presented in Figure 2c and the text) between patients and healthy individuals, it would be beneficial to conduct a statistical test comparing the two distributions. If the results of this test are not significant, I suggest emphasizing that this is a descriptive finding.
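
      To illustrate the kind of test suggested above, here is a minimal sketch of a two-sample permutation test on the Kolmogorov-Smirnov statistic, written with the Python standard library only. The start times are hypothetical and are not data from the study; the function names are illustrative. Because hour of day is a circular variable, a circular two-sample test may be more appropriate in practice; this sketch treats the hours as linear for simplicity.

```python
import random


def ks_statistic(a, b):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    for x in sorted(set(a + b)):
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Permutation p-value for the KS statistic under exchangeable group labels."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical practice start times (hour of day); NOT data from the study.
patients = [7, 8, 8, 9, 21, 22, 22, 23, 23, 23]
controls = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

d, p = permutation_p_value(patients, controls)
print(f"KS statistic = {d:.2f}, permutation p = {p:.4f}")
```

      If such a test is not significant, the diurnal pattern would be best reported as a descriptive observation, as suggested above.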

      Learning results:

      When describing the Learning results (p10) I think it would be useful to provide the descriptive stats for the MT0 parameter (as done above for the other two parameters).

      Sensitivity of sequence duration and IKI consistency (C) to reward:

      I think it is important to add details on how incorrect trials were handled when calculating ∆MT (or C) and ∆R, specifically in cases where the trial preceding a successful trial was unsuccessful. If incorrect trials were simply ignored, this may not adequately represent trial-by-trial changes, particularly when testing the effect of a trial's outcome on performance change in the next trial.

      I have a serious concern with respect to how the sensitivity of sequence duration to reward is framed and analyzed. Since reward is proportional to performance, a reduction in reward essentially indicates a trial with poor performance, and thus even regression to the mean (along with a floor effect in performance [asymptote]) could explain the observed effects. It is possible that even occasional poor performance could lead to a participant demonstrating this effect, potentially regardless of the reward. Accordingly, the reduced improvement in performance following a reward decrease as a function of training length described in the Figure 5b legend may reflect training-induced increased performance that leaves less room for improvement after poor trials, which are no longer as poor as before. To address this concern, controlling for performance (e.g., by taking into consideration the baseline MT for the previous trial) may be helpful. If the authors can conduct such an analysis and still show the observed effect, it would establish the validity of their findings.

      Another way to support the claim of reward change directionality effects on performance (rather than performance on performance), at least to some extent, would be to analyze the data from the last 10 days of the training, during which no rewards were given (pretending for analysis purposes that the reward was calculated and presented to participants). If the effect persists, it is less likely that the effect in question can be attributed to the reward dynamics.

      This concern is also relevant and should be considered with respect to the sensitivity of IKI consistency (C) to reward. While the relationship between previous reward/performance and future performance in terms of C is of a different structure, similar potential confounding effects could still be present.
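
      The regression-to-the-mean concern can be made concrete with a small simulation. The sketch below is not the authors' analysis: it assumes movement time (MT) is a fixed skill level plus independent noise, that reward is a monotonic function of MT, and that ∆R is the change from trial n-1 to n while ∆MT is the change from trial n to n+1. All parameter values are hypothetical. Reward has no causal effect in this simulation, yet performance still appears to improve after a reward decrease.

```python
import random


def simulate(n_trials=20000, seed=1):
    """Return mean next-trial change in MT after a reward decrease vs. increase."""
    rng = random.Random(seed)
    # MT = constant skill level + independent noise; reward plays no causal role.
    mt = [2.0 + rng.gauss(0, 0.2) for _ in range(n_trials)]
    # Assumed mapping: a faster trial earns a higher reward.
    reward = [-t for t in mt]

    after_decrease, after_increase = [], []
    for n in range(1, n_trials - 1):
        d_reward = reward[n] - reward[n - 1]
        d_mt_next = mt[n + 1] - mt[n]
        if d_reward < 0:
            after_decrease.append(d_mt_next)
        else:
            after_increase.append(d_mt_next)

    def mean(xs):
        return sum(xs) / len(xs)

    return mean(after_decrease), mean(after_increase)


dec, inc = simulate()
print(f"mean dMT after reward decrease: {dec:+.3f} s")
print(f"mean dMT after reward increase: {inc:+.3f} s")
```

      Under these assumptions the mean ∆MT is negative (faster) after a reward decrease and positive (slower) after a reward increase, purely because an unusually slow trial tends to be followed by a more typical one. This is why conditioning on the previous trial's MT, or repeating the analysis on the unrewarded days, would help separate a reward effect from a performance-on-performance effect.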

      Another related question (which is also of general interest) is whether the preferred app sequence (as indicated by the participants for Phase B) was consistently the one that yielded more reward? Was the continuous sequence the preferred one? This might tell something about the effectiveness of the reward in the task.

      Regarding both experiments 2 and 3:

      The change in context in experiments 2 and 3 is substantial and includes many different components. These changes should be mentioned in more detail in the Results section before describing the results of experiments 2 and 3.

      Experiment 2:

      In Experiment 2, the authors sometimes refer to the "explicit preference task" as testing for habitual and goal-seeking sequences. However, I do not think there is any justification for interpreting it as such. The other framings used by the authors - testing whether trained action sequences gain intrinsic/rewarding properties or value, and preference for familiar versus novel action sequences - are more suitable and justified. In support of the point I raised here, assigning intrinsic rewarding properties to the learned sequences and thereby preferring these sequences can be conceptually aligned with goal-directed behavior just as much as it could be with habit.

      Experiment 3:

      Similar to Experiment 2, I find the framing of arbitration between goal-directed/habitual behavior in Experiment 3 inadequate and unjustified. The results of the experiment suggest that participants were primarily goal-directed and there is no evidence to support the idea that this re-evaluation led participants to switch from habitual to goal-directed behavior.

      Also, given that participants had to explicitly choose which sequence to perform before performing it, it is reasonable to assume that this experiment mainly tested bias towards familiar sequence/stimulus and/or towards intrinsic reward associated with the sequence in value-based decision making.

      Mobile-app performance effect on symptomatology: exploratory analyses:

      Maybe it would be worth testing if the patients with improved symptomatology (who attribute some of their symptom improvement to the app) also chose to play more during the training stage.

      Discussion:

      Based on my earlier comments highlighting the inadequacy and mis-framing of the work in terms of habit and goal-directed behavior, I suggest that the discussion section be substantially revised to reflect these concerns.

      In the sentence "Nevertheless, OCD patients disadvantageously preferred the previously trained/familiar action sequence under certain conditions" the term "disadvantageously" is not necessarily accurate. While there was potentially more effort required, considering the possible presence of intrinsic reward and chunking, this preference may not necessarily be disadvantageous. Therefore, a more cautious and accurate phrasing that better reflects the associated results would be useful.

      Materials and Methods:

      The authors mention: "The novel sequence (in condition 3) was a 6-move sequence of similar complexity and difficulty as the app sequences, but only learned on the day, before starting this task (therefore, not overtrained)." - for the sake of completeness, more details on the pre-training done on that day would be useful.

      Minor comments:

      In the section discussing the sensitivity of sequence duration to reward, the authors state that they only analyzed continuous reward trials because "a larger number of trials in each subsample were available to fit the Gaussian distributions, due to feedback being provided on all trials." However, feedback was also provided on all trials in the variable reward condition, even though the reward was not necessarily aligned with participants' performance. Therefore, it may be beneficial to rephrase this statement for clarity.

      With regard to experiment 2 (Preference for familiar versus novel action sequences) in the following statement "A positive correlation between COHS and the app sequence choice (Pearson r = 0.36, p = 0.005) further showed that those participants with greater habitual tendencies had a greater propensity to prefer the trained app sequence under this condition." I find the use of the word "further" here potentially misleading.

    1. Reviewer #2 (Public Review):

      Olszyński et al. claim that they identified a "new-type" ultrasonic vocalization around 44 kHz that occurs in response to prolonged fear conditioning (using foot-shocks of relatively high intensity, i.e. 1 mA) in rats. Typically, negative 22-kHz calls and positive 50-kHz calls are distinguished in rats, commonly by using a frequency threshold of 30 or 32 kHz. Olszyński et al. now observed so-called "44-kHz" calls in a substantial number of subjects exposed to 10 tone-shock pairings, yet call emission rate was low (according to Fig. 1G around 15%, according to the result text around 7.5%). They also performed playback experiments and concluded that "the responses to 44-kHz aversive calls presented from the speaker were either similar to 22-kHz vocalizations or in-between responses to 22-kHz and 50-kHz playbacks".

      Strengths: Detailed spectrographic analysis of a substantial data set of ultrasonic vocalizations recorded during prolonged fear conditioning, combined with playback experiments.

      Weaknesses: I see a number of major weaknesses.

      While the descriptive approach applied is useful, the findings have only focused importance and scope, given the low prevalence of "44 kHz" calls and limited attempts made to systematically manipulate factors that lead to their emission. In fact, the data presented appear to be derived from reanalyses of previously conducted studies in most cases and the main claims are only partially supported. While reading the manuscript, I got the impression that the data presented here are linked to two or three previously published studies (Olszyński et al., 2020, 2021, 2023). This is important to emphasize for two reasons: 1) It is often difficult (if not impossible) to link the reported data to the different experiments conducted before (and the individual experimental conditions therein). While reanalyzing previously collected data can lead to important insight, it is important to describe in a clear and transparent manner what data were obtained in what experiment (and more specifically, in what exact experimental condition) to allow appropriate interpretation of the data. For example, it is said that in the "trace fear conditioning experiment" both single- and group-housed rats were included, yet I was not able to tell what data were obtained in single- versus group-housed rats. This may sound like a side aspect, however, in my view this is not a side aspect given the fact that ultrasonic vocalizations are used for communication and communication is affected by the social housing conditions. 2) In at least two of the previously published manuscripts (Olszyński et al., 2021, 2023), emission of ultrasonic vocalizations was analyzed (Figure S1 in Olszyński et al., 2021, and Fig. 1 in Olszyński et al., 2023). This includes detailed spectrographic analyses covering the frequency range between 20 and 100 kHz, i.e. including the frequency range, where the "new-type" ultrasonic vocalization, now named "44 kHz" call, occurs, as reflected in the examples provided in Fig. 
1 of Olszyński et al. (2023). In the materials and methods there, it was said: "USV were assigned to one of three categories: 50-kHz (mean peak frequency, MPF >32 kHz), short 22-kHz (MPF of 18-32 kHz, <0.3 s duration), long 22-kHz (MPF of 18-32 kHz, >0.3 s duration)". Does that mean that the "44 kHz" calls were previously included in the count for 50-kHz calls? Or were 44 kHz calls (intentionally?) left out? What does that mean for the interpretation of the previously published data? What does that mean for the current data set? In my view, there is a lack of transparency here.

      Moreover, whether the newly identified call type is indeed novel is questionable, as also mentioned by the authors in their discussion section. While they wrote in the introduction that "high-pitch (>32 kHz), long and monotonous ultrasonic vocalizations have not yet been described", they wrote in the discussion that "long (or not that long (Biały et al., 2019)), frequency-stable high-pitch vocalizations have been reported before (e.g. Sales, 1979; Shimoju et al., 2020), notably as caused by intense cholinergic stimulation (Brudzynski and Bihari, 1990) or higher shock-dose fear conditioning (Wöhr et al., 2005)" (and I wish to add that to my knowledge this list provided by the authors is incomplete). Therefore, I believe, the strong claims made in abstract ("we are the first to describe a new-type..."), introduction ("have not yet been described"), and results ("new calls") are not justified.

      In general, the manuscript is not well written/ not well organized, the description of the methods is insufficient, and it is often difficult (if not impossible) to link the reported data to the experiments/ experimental conditions described in the materials and methods section. For example, I miss a clear presentation of basic information: 1) How many rats emitted "44 kHz" calls (in total, per experiment, and importantly, also per experimental condition, i.e. single- versus group-housed)? 2) Out of the ones emitting "44 kHz" calls, what was the prevalence of "44 kHz" calls (relative to 22- and 50-kHz calls, e.g. shown as percentage)? 3) How did this ratio differ between experiments and experimental conditions? 4) Was there a link to freezing? Freezing was apparently analyzed before (Olszyński et al., 2021, 2023) and it would be important to see whether there is a correlation between "44-kHz" calls and freezing. Moreover, it would be important to know what behavior the rats are displaying while such "44-kHz" calls are emitted? (Note: Even not all 22-kHz calls are synced to freezing.) All this could help to substantiate the currently highly speculative claims made in the discussion section ("frequency increases with an increase in arousal" and "it could be argued that our prolonged fear conditioning increased the arousal of the rats with no change in the valence of the aversive stimuli"). Such more detailed analyses are also important to rule out the possibility that the "new-type" ultrasonic vocalization, the so-called "44 kHz" call, is simply associated with movement/ thorax compression.

      The figures currently included are purely descriptive in most cases - and many of them are just examples of individual rats (e.g. majority of Fig. 1, all of Fig. 2 to my understanding, with the exception of the time course, which in case of D is only a subset of rats ("only rats that emitted 44-kHz calls in at least seven ITI are plotted" - is there any rationale for this criterion?)), or, in fact, just representative spectrograms of calls (all of Fig. 3, with the exception of G, all of Fig. 4). Moreover, the differences between Fig. 5 and Fig. 6 are not clear to me. It seems Fig. 5B is included three times - what is the benefit of including the same figure three times? A systematic comparison of experimental conditions is limited to Fig. 7 and Fig. 8, the figures depicting the playback results (which led to the conclusion that "the responses to 44-kHz aversive calls presented from the speaker were either similar to 22-kHz vocalizations or in-between responses to 22-kHz and 50-kHz playbacks", although it remains unclear to me why differences were seen b e f o r e the experimental manipulation, i.e. the different playback types in Fig. 8B).

      Related to that, I miss a clear presentation of relevant methodological aspects: 1) Why were some rats single-housed but not the others? 2) Is the experimental design of the playback study not confounded? It is said that "one group (n = 13) heard 50-kHz appetitive vocalization playback while the other (n = 16) 22-kHz and 44-kHz aversive calls". How can one compare "44 kHz" calls to 22- and 50-kHz calls when "44 kHz" calls are presented together with 22-kHz calls but not 50-kHz calls? What about carry-over effects? Hearing one type of call most likely affects the response to the other type of call. It appears likely that rats are a bit more anxious after hearing aversive 22-kHz calls, for example. Therefore, it would not be very surprising to see that the response to "44 kHz" calls is more similar to 22-kHz calls than 50-kHz calls. Of note, in case of the other playback experiment it is just said that rats "received appetitive and aversive ultrasonic vocalization playback" but it remains unclear whether "44 kHz" calls are seen as appetitive or aversive. Later it says that "rats were presented with two 10-s-long playback sets of either 22-kHz or 44-kHz calls, followed by one 50-kHz modulated call 10-s set and another two playback sets of either 44-kHz or 22-kHz calls not previously heard" (and I wonder what data set was included in the figures and how - pooled?). Again, I am worried about carry-over effects here. This does not seem to be an experimental design that allows one to compare the response to the three main call types in an unbiased manner. Of note, what exactly is meant by "control rats" in the context of fear conditioning is also not clear to me. One can think of many different controls in a fear conditioning experiment. More concrete information is needed.

    1. Reviewer #2 (Public Review):

      Theta-nested gamma oscillations (TNGO) play an important role in hippocampal memory and cognitive processes and are disrupted in pathology. Deep brain stimulation has been shown to affect memory encoding. To investigate the effect of pulsed CA1 neurostimulation on hippocampal TNGO the authors coupled a physiologically realistic model of the hippocampus comprising EC, DG, CA1, and CA3 subfields with an abstract theta oscillator model of the medial septum (MS). Pathology was modeled as weakened theta input from the MS to EC simulating MS neurodegeneration known to occur in Alzheimer's disease. The authors show that if the input from the MS to EC is strong (the healthy state) the model autonomously generates TNGO in all hippocampal subfields while a single neurostimulation pulse has the effect of resetting the TNGO phase. When the MS input strength is weaker the network is quiescent but the authors find that a single CA1 neurostimulation pulse can switch it into the persistent TNGO state, provided the neurostimulation pulse is applied at the peak of the EC theta. If the MS theta oscillator model is supplemented by an additional phase-reset mechanism a single CA1 neurostimulation pulse applied at the trough of EC theta also produces the same effect. If the MS input to EC is weaker still, only a short burst of TNGO is generated by a single neurostimulation pulse. The authors investigate the physiological origin of this burst and find it results from an interplay of CAN and M currents in the CA1 excitatory cells. In this case, the authors find that TNGO can only be rescued by a theta frequency train of CA1 pulses applied at the peak of the EC theta or again at either the peak or trough if the MS oscillator model is supplemented by the phase-reset mechanism.

      The main strength of this model is its use of a fairly physiologically detailed model of the hippocampus. The cells are single-compartment models but do include multiple ion channels and are spatially arranged in accordance with the hippocampal structure. This allows the understanding of how ion channels (possibly modifiable by pharmacological agents) interact with system-level oscillations and neurostimulation. The model also includes all the main hippocampal subfields. The other strength is its attention to an important topic, which may be relevant for dementia treatment or prevention, which few modeling studies have addressed.

      The work has several weaknesses. First, while investigations of hippocampal neurostimulation are important there are few experimental studies from which one could judge the validity of the model findings. All its findings are therefore predictions. It would be much more convincing to first show the model is able to reproduce some measured empirical neurostimulation effect before proceeding to make predictions. Second, the model is very specific. Or if its behavior is to be considered general it has not been explained why. For example, the model shows bistability between quiescence and TNGO, however what aspect of the model underlies this, be it some particular network structure or particular ion channel, for example, is not addressed. Similarly for the various phase reset behaviors that are found. We may wonder whether a different hippocampal model of TNGO, of which there are many published (for example [1-6]) would show the same effect under neurostimulation. This seems very unlikely and indeed the quiescent state itself shown by this model seems quite artificial. Some indication that particular ion channels, CAN and M are relevant is briefly provided and the work would be much improved by examining this aspect in more detail. In summary, the work would benefit from an intuitive analysis of the basic model ingredients underlying its neurostimulation response properties. Third, while the model is fairly realistic, considerable important factors are not included and in fact, there are much more detailed hippocampal models out there (for example [5,6]). In particular, it includes only excitatory cells and a single type of inhibitory cell. This is particularly important since there are many models and experimental studies where specific cell types, for example, OLM and VIP cells, are strongly implicated in TNGO. 
Other missing ingredients that one may think might have a strong impact on model response to neurostimulation (in particular stimulation trains) include the well-known short-term plasticity between different hippocampal cell types and active dendritic properties. Fourth, the MS model seems somewhat unsupported. It is modeled as a set of coupled oscillators that synchronize. However, there is also a phase reset mechanism included. This mechanism is important because it underlies several of the phase reset behaviors shown by the full model. However, it is not derived from experimental phase response curves of septal neurons of which there is no direct measurement. The work would benefit from the use of a more biologically validated MS model.

      [1] Hyafil A, Giraud AL, Fontolan L, Gutkin B. Neural cross-frequency coupling: connecting architectures, mechanisms, and functions. Trends in neurosciences. 2015 Nov 1;38(11):725-40.

      [2] Tort AB, Rotstein HG, Dugladze T, Gloveli T, Kopell NJ. On the formation of gamma-coherent cell assemblies by oriens lacunosum-moleculare interneurons in the hippocampus. Proceedings of the National Academy of Sciences. 2007 Aug 14;104(33):13490-5.

      [3] Neymotin SA, Lazarewicz MT, Sherif M, Contreras D, Finkel LH, Lytton WW. Ketamine disrupts theta modulation of gamma in a computer model of hippocampus. Journal of Neuroscience. 2011 Aug 10;31(32):11733-43.

      [4] Ponzi A, Dura-Bernal S, Migliore M. Theta-gamma phase-amplitude coupling in a hippocampal CA1 microcircuit. PLOS Computational Biology. 2023 Mar 23;19(3):e1010942.

      [5] Bezaire MJ, Raikov I, Burk K, Vyas D, Soltesz I. Interneuronal mechanisms of hippocampal theta oscillations in a full-scale model of the rodent CA1 circuit. Elife. 2016 Dec 23;5:e18566.

      [6] Chatzikalymniou AP, Gumus M, Skinner FK. Linking minimal and detailed models of CA1 microcircuits reveals how theta rhythms emerge and their frequencies controlled. Hippocampus. 2021 Sep;31(9):982-1002.

    1. Author Response

      eLife assessment

      This study assesses homeostatic plasticity mechanisms driven by inhibitory GABAergic synapses in cultured cortical neurons. The authors report that up- or down-regulation of GABAergic synaptic strength, rather than excitatory glutamatergic synaptic strength, is critical for homeostatic regulation of neuronal firing rates. The reviewers noted that the findings are potentially important, but they also raised questions. In particular, the evidence supporting the findings is currently incomplete and demonstration of independent regulation of mEPSCs and mIPSCs is a necessary experiment to support the major claims of the study.

      We appreciate the detailed, thoughtful assessment of our paper by the reviewers and editors and will submit a revised version in the future that addresses the reviewers’ comments as detailed below in response to each concern. We will include a more open discussion of alternative possibilities. Further, we will repeat the optogenetic experiments assessing AMPAergic scaling in our mouse cortical cultures in order to demonstrate independent regulation of mEPSCs and mIPSCs as suggested.

      Reviewer #1 (Public Review):

      In the manuscript titled "GABAergic synaptic scaling is triggered by changes in spiking activity rather than transmitter receptor activation," the authors present an investigation of the role of GABAergic synaptic scaling in the maintenance of spike rates in networks of cultured neurons. Their main findings suggest that GABAergic scaling exhibits features consistent with a key homeostatic mechanism that contributes to the stability of neuronal firing rates. Their data demonstrate that GABAergic scaling is multiplicative and emerges when postsynaptic spike rates are altered. Finally, their data suggest that, in contrast to their prior data on glutamatergic scaling, GABAergic scaling is driven by spike rates. The authors set the paper up as an argument that GABAergic scaling, rather than glutamatergic scaling, serves as the critical homeostatic mechanism for spike rate regulation.

      While the paper is ambitious in its rhetorical scope and certainly presents intriguing findings, there are several serious concerns that need to be addressed to substantiate the interpretations of the data. For example, the CTZ data do not support the interpretations and conclusions drawn by the authors. Summarily, the authors argue that GABAergic scaling is measuring spiking (at the time scale of the homeostatic response, which they suggest is a key feature of a homeostat) yet their data in figure 5B show more convincingly that CTZ does not influence spiking levels - only one out of four time points is marginally significant (also, I suspect that the bootstrapping method mentioned in lines 454-459 was conducted as a pairwise comparison of distributions. There is no mention of multiple comparisons corrections, and I have to assume that the significance at 3h would disappear with correction).
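
      As an illustration of the multiple-comparisons point, the sketch below applies a Holm-Bonferroni step-down adjustment to four p-values, one per time point. The p-values are hypothetical placeholders and are not taken from the manuscript; the function name is illustrative. It shows how a single marginally significant comparison among four can lose significance after correction.

```python
def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, k in enumerate(order):
        # Multiply the rank-th smallest p-value by the number of remaining tests.
        candidate = min(1.0, (m - rank) * p_values[k])
        # Enforce monotonicity of the adjusted values.
        running_max = max(running_max, candidate)
        adjusted[k] = running_max
    return adjusted


# Hypothetical raw p-values for four time points; NOT values from the manuscript.
raw = [0.20, 0.03, 0.15, 0.60]
adjusted = holm_adjust(raw)
for p_raw, p_adj in zip(raw, adjusted):
    print(f"raw p = {p_raw:.2f} -> Holm-adjusted p = {p_adj:.2f}")
```

      With these placeholder values, the only raw p-value below 0.05 (0.03) becomes 0.12 after adjustment.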

      We certainly understand the criticism here (similar to reviewer 2’s third point). In our resubmission we will do a better job discussing these complications, which we now summarize. First, we are presenting our entire dataset to be as transparent as possible. Unlike most synaptic scaling studies (including our own) that apply drugs to alter activity and assess mPSC amplitude at the final time point, here we are actually showing CTZ’s effect on spiking activity within the culture over time. This is critical because it has informed us of the drug’s true effect on spiking, the variability that is associated with these perturbations, and the ability and timing of the cultured network to homeostatically recover initial levels. This was important because it revealed that the drugs do not always influence activity in the way we assume, and this provides greater context to our results. Second, we are showing all of our data, and presenting it using estimation statistics which go beyond the dichotomy of a simple p value yes or no (Ho J, Tumkaya T, Aryal S, Choi H, Claridge-Chang A. 2019. Moving beyond P values: data analysis with estimation graphics. Nat Methods 16: 565-66). Estimation statistics have become a more standard statistical approach in the last 15 years and is the preferred method for the Society for Neuroscience’s eNeuro Journal. This method shows the effect size and the confidence interval of the distribution. For the 3 hr time point in Fig. 5B the CTZ/ethanol vs. ethanol data points exhibit very little overlap and the effect size demonstrates a near doubling of spike frequency, and the confidence interval shows a clear separation from 0. This was a pairwise comparison as we compared values at each time point after the addition of ethanol or ethanol/CTZ. Third, the plots illustrate an upward trend in spike frequency at 1 and 6 hrs, but that there is also clear variability. 
It is important to note that while these recordings help us to understand effects on spiking across the cultured network, they cannot directly speak to spiking activity in the principal neurons that we target. This complication along with the variability inherent in these cultures could make simple comparisons difficult to interpret. Regardless, we do see some increase in spiking with CTZ and we clearly see increases in mIPSC amplitude, thus providing some support for the idea that spiking could be a critical player in terms of GABAergic scaling, particularly when put in the context of our other findings. However, it is important to recognize that something other than total spike rate may contribute to GABAergic scaling, such as the pattern of spiking that produces a particular calcium transient, and this will be discussed in the resubmission.

      Then, the fact that TTX applied on top of CTZ drives an increase in mIPSC amplitude is interpreted as a conclusive demonstration that GABAergic scaling is sensing spiking. It is inevitable, however, that TTX will also severely reduce AMPAR activation - a very plausible alternative explanation is that the augmentation of AMPAR activation caused by CTZ is not sufficient to overcome the dramatic impact of TTX. Altogether, these data do not provide substantial evidence for the conclusion drawn by the authors.

      We understand this point when considering the CTZ/TTX experiments by themselves. However, spiking appears to be a more straightforward trigger when the CTZ/TTX results are coupled with the prevention of GABAergic downscaling by optogenetic restoration of spiking in the presence of AMPAR antagonists. Further, an important point here is that our results with TTX vs. TTX + CTZ are different for GABAergic scaling (no difference) and AMPAergic scaling (CTZ diminished upward scaling) suggesting different triggers for the two forms of scaling. We will make this more clear in our resubmission.

      Specific points:

      • The logic of the basis for the argument is somewhat flawed: A homeostat does not require a multiplicative mechanism, nor does it even need to be synaptic. Membrane excitability is a locus of homeostatic regulation of firing, for example. In addition, synapse-specific modulation can also be homeostatic. The only requirement of the homeostat is that its deployment subserves the stabilization of a biological parameter (e.g., firing rate).

      We agree with the reviewer and should not have suggested that this was a necessary requirement for a spike rate homeostat. What we should have said was that historically this definition has been attributed to AMPAergic scaling, which is thought to be a spike rate homeostat. We will correct this in the resubmission.

      • Line 63 parenthetically references an important, but contradictory study as a brief "however". Given the tone of the writing, it would be more balanced to give this study at least a full sentence of exposition.

      Agreed, we will do this.

      • The authors state (line 11) that expression of a hyperpolarizing conductance did not trigger scaling. More recent work ('Homeostatic synaptic scaling establishes the specificity of an associative memory') does this via expression of DREADDs and finds robust scaling.

      The purpose of citing this study was to argue that the spike rate homeostat hypothesis doesn’t make sense for AMPAergic scaling based on a study that hyperpolarized an individual cell while leaving the rest of the network unaltered and therefore leaving network activity and neurotransmission largely normal. In this case scaling was not triggered, suggesting reduced spike rate within an individual cell was insufficient to trigger scaling. The study that the reviewer refers to hyperpolarizes a majority of cells in the network and therefore will also alter neurotransmission throughout the network, which does not separate the importance of spiking and receptor activation as in the above-mentioned study. We will make this point more clearly in the resubmission.

      • Supplemental figure 1 looks largely linear to me? Out of curiosity, wouldn't you expect the left end to be aberrant because scaling up should theoretically increase the strength of some synapses that would have been previously below threshold for detection?

      We agree that the scaling ratio plot is largely linear. To be clear, the linearity of the ratio plot was interesting but our main point here was that this line had a positive slope, meaning ratios (CNQX mPSC amplitudes/control mPSC amplitudes) got bigger for the larger CNQX-treated mPSCs. Alternatively, a multiplicative relationship where mPSCs are all increased by a single factor (e.g. 2X) would be a flat line with 0 slope at the multiplicative value (e.g. 2). In terms of the left side of the plot, we do see values that rise abruptly from 1 - this is partially obstructed by the Y axis in this figure and we will adjust this. This left part of the plot is likely due to CNQX-induced increases in the amplitudes of minis that were below our detection threshold of 5 pA. Therefore, minis that were 4 pA could now be 5 pA after CNQX treatment, and these are then divided by the smallest control mPSCs, which are 5 pA (ratio of 1). We will try to do a better job describing this in the resubmission.
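
      The distinction between a flat and a positively sloped ratio plot can be sketched as follows. This is a minimal illustration with made-up amplitudes and hypothetical helper names (`ratio_plot`, `slope`); it assumes equal numbers of rank-ordered events in both conditions and ignores the detection-threshold effect discussed above.

```python
import statistics


def ratio_plot(control, treated):
    """Pair rank-ordered amplitudes and return (treated amplitude, ratio) points.

    Assumes both lists hold the same number of events.
    """
    c, t = sorted(control), sorted(treated)
    return [(tv, tv / cv) for cv, tv in zip(c, t)]


def slope(points):
    """Least-squares slope of ratio versus treated amplitude."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Hypothetical control amplitudes in pA (NOT data from the study).
control = [5.0, 6.0, 8.0, 10.0, 15.0, 20.0, 30.0, 40.0]

# Purely multiplicative scaling: every event doubled.
uniform = [2.0 * a for a in control]

# Non-uniform scaling: larger events are scaled by a larger factor.
nonuniform = [a * (1.0 + a / 40.0) for a in control]

uniform_points = ratio_plot(control, uniform)
nonuniform_points = ratio_plot(control, nonuniform)

print("uniform slope:", slope(uniform_points))
print("non-uniform slope:", slope(nonuniform_points))
```

      In this toy case the uniformly scaled events give a constant ratio (a flat line at the scaling factor), whereas the non-uniformly scaled events give ratios that grow with amplitude, which is the pattern the positive slope is meant to capture.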

      Given that figure 2B also shows warping at the tail ends of similar distributions, how is this to be interpreted?

      The left side of the ratio plot shows evidence consistent with the idea that mIPSCs are dropping into the noise after CNQX treatment (similar to above argument), while most of the distribution suggests mIPSCs are reduced to 50% by CNQX treatment. On the right side of the ratio plot the values appear to mostly increase. We are not sure why this is happening, but it looks like some mIPSCs are not purely multiplicative at 0.5, particularly in TTX. It is also important to point out that this is a relatively small percent of the total population and the biggest mPSCs can vary to a great degree from one cell to the next. We will discuss this in the resubmission.

      • The readability of the figures is poor. Some of them have inconsistent boundary boxes, bizarre axes, text that appears skewed as if the figures were quickly thrown together and stretched to fit.

      We will address these issues in the resubmission.

      • I'm concerned about the optogenetic restoration of activity experiment. Cortical pyramidal neuron mean firing rates are log normally distributed and span multiple orders of magnitude. The stimulation experiments can only address the total firing at a network-level - given that a network level "mean" is meaningless in a lognormal distribution, how are we to think about the effect of this manipulation when it comes to individual neurons homeostatically stabilizing their own activities? In essence, the argument is made at the single-neuron level, but the experiment is conducted with a network-level resolution.

      As described above, we do not have the capacity to know what the actual firing rate of a particular neuron was before and after introducing a drug and so we cannot absolutely say that we have restored the original firing rates of neurons. However, there is reason to believe that this is achieved to some extent. Our optogenetic stimulation is only 50-100 ms long activating a subset of neurons. This is sufficient to provide a synaptic barrage that then triggers a full blown network burst where the majority of spikes occur, but this is after the light is off. In other words, the optogenetic light pulse only initiates what becomes a normal network burst that fortunately allows the individual cells to express their relatively normal (pre-drug) activity pattern. In our previous study we show that this is the case for individual units - the spiking of an individual unit during a burst is similar before and after CNQX/optostim (see Figure 4b and Suppl. Fig 4 in Fong et al. 2015 Nat. Comm.). We are not claiming that we have restored spiking to exactly the pre-drug state, but bring it back toward those levels and we see this is associated with a return of the mIPSC amplitude to near control levels. We will include a description of this in the resubmission.
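
      The reviewer's concern about network-level means can be made concrete with a small simulation. The sketch below draws hypothetical firing rates from a lognormal distribution (the parameters, sample size, and seed are arbitrary assumptions, not estimates from these cultures) and compares the mean with the median and with the share of total spiking contributed by the most active tenth of units.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical firing rates (Hz) for 10,000 units; mu and sigma are
# arbitrary illustrative choices, not fitted to any recording.
rates = [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)]

mean_rate = statistics.fmean(rates)
median_rate = statistics.median(rates)

# Fraction of all spikes produced by the 10% most active units.
top_decile = sorted(rates, reverse=True)[: len(rates) // 10]
top_share = sum(top_decile) / sum(rates)

print(f"mean = {mean_rate:.2f} Hz, median = {median_rate:.2f} Hz")
print(f"top 10% of units contribute {top_share:.0%} of total spiking")
```

      In a skewed population like this, the mean sits well above the typical unit and a minority of units dominates the total, so matching a network-wide spike count does not by itself show that individual neurons have returned to their own pre-drug rates.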

      • Line 198-99: multiplicativity is not a requirement of a homeostatic mechanism.

      • Line 264-265 - again, neither multiplicativity nor synaptic mechanisms are fundamentally any more necessary for a homeostatic locus than anything else that can modulate firing rate via negative feedback.

      Agreed, see above discussion of homeostat requirement. Will adjust these statements in our resubmission.

      • 277: do you mean AMPAR?

      We were not clear enough here. We actually do mean GABAR. The idea is that CTZ increases network activity and thus increases both AMPAergic and GABAergic transmission. We will clarify this in the resubmission.

      • Example: Figure 1A is frustratingly unreadable. The axes on the raster insets are microscopic, the arrows are strangely large, and it seems unnecessary to fill so much real estate with 4 rasters. Only one is necessary to show the concept of a network burst. The effect of time+CNQX on the frequency of burst is shown in B and C.

      • Example: Figure 2 appears warped and hastily assembled. Statistical indications are shown within and outside of bounding boxes. Axes are not aligned. Labels are not aligned. Font sizes are not equal on equivalent axes.

      We will adjust these issues in the resubmission.

      • The discussion should include mention of the limitations and/or constraints of drawing general conclusions from cell culture.

      We agree and will adjust the discussion. Also, this is why we cited studies that argue GABAergic neurons have a particularly important role in homeostatic regulation of firing following sensory deprivations in vivo.

      • The discussion should include mention of the role of developmental age in the expression of specific mechanisms. It is highly likely that what is studied at ~P14 is specific to early postnatal development.

      We will discuss caveats of cortical cultures at DIV 14-20.

      It is essential to ensure that the data presented in the paper adequately supports the conclusions drawn. A more cautious approach in interpreting the results may lead to a stronger argument and a more robust understanding of the underlying mechanisms at play.

      Agreed.

      Reviewer #2 (Public Review):

      Synaptic scaling has long been proposed as a homeostatic mechanism for the regulation of the activity of individual neurons and networks. The question of whether homeostasis is controlled by neuronal spiking or by the activation of specific receptor populations in individual synapses has remained open. In a previous work, the Wenner group had shown that upscaling of glutamatergic transmission is triggered by direct blockade of glutamate receptors rather than by the concomitant reduction in firing rate (Nat Comm 2015). In this manuscript they investigate the mechanisms regulating scaling of GABA-mediated responses in cortical cell cultures using whole-cell recordings to detect GABAergic currents and multielectrode arrays to monitor global firing activity, and find that spiking plays a fundamental role in scaling.

      Initially, the authors show that chronic blockade (24 h) of glutamatergic transmission by CNQX first reduces spontaneous spiking (at 2 h), but later (24 h) firing grows back towards higher frequencies, suggesting a compensatory mechanism. Then it is shown that either chronic CNQX treatment or TTX causes a reduction in the amplitude of GABAergic mIPSCs. Effects of CNQX on IPSCs are then reverted by replacing spontaneous network firing by chronic optogenetic stimulation of the entire culture, also indicating that GABAergic transmission is homeostatically regulated by global firing. Enhancing glutamatergic transmission with CTZ increases mIPSC amplitude, while addition of TTX in the presence of CTZ causes the opposite effect. Finally, increasing spiking activity using bicuculline also increases mIPSC amplitude, and the authors conclude that spiking activity rather than neurotransmission controls homeostatic GABA scaling. The manuscript shows interesting properties in the regulation of global GABAergic transmission and highlights the important role of spiking activity in triggering GABA scaling. However, it is strongly recommended to address some caveats in order to better support the conclusions presented in the manuscript.

      Major points:

      1) The reason why CNQX does not completely eliminate spiking is unclear (Fig. 1). What is the circuit mechanism by which spiking continues, although at lower frequency, in the absence of AMPA-mediated transmission, and what is the mechanism by which spiking frequency grows back after 24h (still in the absence of AMPA transmission)?

      Is it possible that NMDA-mediated transmission takes over and triggers a different type of network plasticity?

      The bursting in AMPAR blockade is due to the remaining NMDA receptor mediated transmission. We showed this in our previous study in Suppl. Figure 2 and 6 of Fong et al., 2015 Nat. Comm. Our ability to optically induce normal looking bursts of spikes was also dependent on NMDAR activation. Further, in Dr Fong’s PhD dissertation it was shown that the bursting activity was abolished when AMPA and NMDA receptors were both blocked. There are likely many factors that contribute to the recovery of activity, and certainly one of them is likely to be the weakening of inhibitory GABAergic currents. These points will be discussed in the resubmission.

      2) A possible activation of NMDARs should be considered. One would think that experiments involving chronic glutamatergic blockade could have been conducted in the presence of NMDAR blockers. Why was this not the case?

      Unfortunately, it was not possible to optogenetically restore normal bursting in the presence of NMDAR blockade (even when AMPAergic transmission was intact), as NMDARs appeared to be critical for the optical restoration of the normal duration of the burst (see Suppl. Figure 6 Fong et al., 2015 Nat. Comm). The reviewer raises an excellent point about a possible NMDAR contribution to altered synaptic strength, however. It is likely that NMDAR signaling is reduced in the presence of CNQX since burst frequency was reduced along with AMPAR-mediated depolarizations. We cannot rule out the possibility that NMDAR signaling could contribute to the alterations in GABAergic mIPSCs and will discuss this in the resubmission. However, previous work suggests that a 24/48 hour block of NMDARs (APV) did not trigger AMPAergic scaling in cortical or hippocampal cultures (see Figure 1 Turrigiano et al., 1998 Nature and Suppl. Figure 4 Sutton et al., 2006 Cell). Moreover, our previous study showed that restoring NMDAergic transmission optogenetically, at least to some point, had no influence on AMPAergic scaling (Fong et al., 2015, Nat. Comm.). Regardless, we cannot rule out a role for NMDAergic transmission in GABAergic scaling and this discussion will be included in the resubmission.

      Also, experiments with global ChR2 stimulation with coincident pre and postsynaptic firing might also activate NMDARs and result in additional effects that should be taken into consideration for the global scaling mechanism.

      To be clear, our optical stimulation was turned off before the vast majority of spiking that occurred in the bursts, which played out in a relatively natural manner (see lower panel of Figure 3B optogenetic stimulation – short duration only at onset of burst – we will make this clearer in resubmission). Therefore, we were unlikely to trigger significant synchronous activation that does not normally occur in network bursts.

      3) Cultures exposed to CTZ to enhance AMPA receptors generated variable results (Fig. 5), somewhat increasing spiking activity in a non-significant manner but, at the same time, strengthening mIPSC amplitude. This result seems to suggest that spiking might be involved in GABAergic scaling, but it does not seem to prove it. Then, addition of TTX that blocked spiking reduced mIPSC amplitude. It was concluded here that the ability of CTZ to enhance GABAergic currents was primarily due to spiking, rather than the increase in AMPA-mediated currents. However, in addition to blocking action potentials, TTX would also prevent activation of AMPARs in the presence of CTZ due to the lack of glutamatergic release. Therefore, under these conditions, an effect of glutamatergic activation on GABAergic scaling cannot be ruled out.

      These concerns were very similar to reviewer 1’s first comments. We will address these issues in the resubmission, but to briefly repeat our responses: We are going a step beyond most scaling studies by assessing MEA-wide firing rate, but this still provides an incomplete picture of the particular cells that we target for patch recordings in terms of their firing before and after a drug. Further, we see considerable variability in effect on firing rate from culture to culture, which we will better recognize in the resubmission. Finally, while the CTZ results are not conclusive, taken together with the optogenetic results we think our results are most consistent with the idea that GABAergic scaling is a strong candidate as a spike rate homeostat.

      4) The sample size is not mentioned in any figure. How many cells/culture dishes were used in each condition?

      The individual dots represent either individual cells for mIPSC amplitude or individual cultures in MEA experiments. Number of cultures for figures were: Figure 2 – con = 10, TTX = 3, CNQX = 6, Figure 4 – CNQX = 4, con = 10, CNQX/photostim = 6, Figure 5 – ethanol = 3, CTZ = 3, CTZ + TTX =3, Figure 6 – con = 10, bicuculline = 4. We will include the number of cultures for mIPSC amplitude experiments in the figure legends upon resubmission.

      5) Cortical cultures may typically contain about 5-10% GABAergic interneurons and 90-95 % pyramidal cells. One would think that scaling mechanisms occurring in pyramidal cells and interneurons could be distinct, with different impact on the network. Although for whole-cell recordings the authors selected pyramidal looking cells, which might bias recordings towards excitatory neurons, naked eye selection of recording cells is quite difficult in primary cultures. Some of the variability in mIPSC amplitude values (Fig. 2A for example) might be attributed to the cell type? One could use cultures where interneurons are fluorescently labeled to obtain an accurate representation. The issue of the possible differential effects of scaling in pyramidal cells vs. interneurons and the consequences in the network should be discussed.

      We will include this discussion in the resubmission. Briefly, we chose large cells, which will be predominantly glutamatergic neurons as suggested by the reviewer. Ultimately, even among glutamatergic principal cells there may be variability in the response to drug application. All of these issues could contribute to variability and we will expand our description of the variability in our results, including that based on cellular heterogeneity.

      Reviewer #3 (Public Review):

      This paper concerns whether scaling (or homeostatic synaptic plasticity; HSP) occurs similarly at GABA and Glu synapses and comes to the surprising conclusion that these are regulated separately. This is surprising because these were thought to be co-regulated during HSP and in fact, the major mechanisms thought to underlie downscaling (TTX or CNQX driven), retinoic acid and TNF, have been shown to regulate both GABARs and AMPARs directly. (As a side note, it is unclear that the manipulations used in Joseph and Turrigiano represent HSP, and so might not be relevant). Thus the main result, that GABA HSP is dissociable from Glu HSP, is novel and exciting. This suggests either different mechanisms underlie the two processes, or that under certain conditions, another mechanism is engaged that scales one type of synapse and not the other.

      However, strong claims require strong evidence, and the results presented here only address GABA HSP, relying on previous work from this lab on Glu HSP (Fong, et al., 2015). But the previous experiments were done in rat cultures, while these experiments are done in mice and at somewhat different ages (DIV). Even identical culture systems can drift over time (possibly due to changes in the components of B27 or other media and supplements). Therefore it is necessary to demonstrate in the same system the dissociation. To be convincing, they need to show the mEPSCs for Fig 4, clearly showing the dissociation. Doing the same for Fig 5 would be great, but I think Fig 4 is the key.

      We understand the concern of the reviewer as we do see significant variability within our cultures and they were plated in different places, by different people, in different species (rat vs mouse). Therefore, in the resubmission to strengthen the conclusions we will repeat our optogenetic studies restoring activity in the presence of AMPAergic blockade in our mouse cortical cultures and measuring AMPA mEPSCs to assess scaling.

      The paper also suggests that only receptor function or spiking could control HSP, and therefore if it is not receptor function then it must be spiking. This seems like a false dichotomy; there are of course other options. Details in the data may suggest that spiking is not the (or the only) homeostat, as TTX and CNQX cause identical changes in mIPSC amplitude but have different effects on spiking. Further, in Fig 5, CTZ had a minimal effect on spiking but a large effect on mIPSCs. Similar issues appear in Fig 6, where the induction of increased spiking is highly variable, with many cells showing control levels or lower spiking rates. Yet the synaptic changes are robust, across all cells. Overall, this is not persuasive that spiking is necessarily the homeostat for GABA synapses.

      Together our results argue against AMPAR or GABAR activation as a trigger for GABAergic scaling and that this is different than our results for AMPAergic scaling. These points alone are important to recognize. While changes in spiking do not perfectly follow the changes in GABAergic scaling they do always trend in the right direction. As mentioned above, total spiking activity is only one measure of spiking. It is possible that these drugs alter the pattern of spiking that translates into an altered calcium transient that is important for triggering the plasticity. Again, it is important to note that we are going a step beyond most homeostatic plasticity studies that add a drug and simply assume it is having an effect on spiking (e.g. CNQX was initially thought to completely abolish spiking, but clearly does not). Based on the variability that we observe and the nature of our MEA recordings we cannot precisely determine how the total activity or pattern of activity changes with drug application in the specific cells that we target for whole cell recordings. However, we believe our results are more consistent with our proposal that GABAergic scaling is a strong candidate as a spike rate homeostat. Regardless, in the resubmission we will include a broader discussion about these possibilities, and the reality that there could be multiple homeostatic mechanisms that act to recover spiking activity.

      The paper also suggests that the timing of the GABA changes coincides with the spiking changes, but while they have the time course of the spiking changes and recovery, they only have the 24h time point for synaptic changes. It is impossible to conclude how the time courses align without more data.

      We can only say that by the 24 hour CNQX time point, when overall spiking is recovered, that GABAergic scaling has already occurred. We will state this more clearly in the resubmission.

    2. Reviewer #1 (Public Review):

      In the manuscript titled "GABAergic synaptic scaling is triggered by changes in spiking activity rather than transmitter receptor activation," the authors present an investigation of the role of GABAergic synaptic scaling in the maintenance of spike rates in networks of cultured neurons. Their main findings suggest that GABAergic scaling exhibits features consistent with a key homeostatic mechanism that contributes to the stability of neuronal firing rates. Their data demonstrate that GABAergic scaling is multiplicative and emerges when postsynaptic spike rates are altered. Finally, their data suggest that, in contrast to their prior data on glutamatergic scaling, GABAergic scaling is driven by spike rates. The authors set the paper up as an argument that GABAergic scaling, rather than glutamatergic scaling, serves as the critical homeostatic mechanism for spike rate regulation.

      While the paper is ambitious in its rhetorical scope and certainly presents intriguing findings, there are several serious concerns that need to be addressed to substantiate the interpretations of the data. For example, the CTZ data do not support the interpretations and conclusions drawn by the authors. Summarily, the authors argue that GABAergic scaling is measuring spiking (at the time scale of the homeostatic response, which they suggest is a key feature of a homeostat) yet their data in figure 5B show more convincingly that CTZ does not influence spiking levels - only one out of four time points is marginally significant (also, I suspect that the bootstrapping method mentioned in line 454-459 was conducted as a pairwise comparison of distributions. There is no mention of multiple comparisons corrections, and I have to assume that the significance at 3h would disappear with correction). Then, the fact that TTX applied on top of CTZ drives an increase in mIPSC amplitude is interpreted as a conclusive demonstration that GABAergic scaling is sensing spiking. It is inevitable, however, that TTX will also severely reduce AMPAR activation - a very plausible alternative explanation is that the augmentation of AMPAR activation caused by CTZ is not sufficient to overcome the dramatic impact of TTX. Altogether, these data do not provide substantial evidence for the conclusion drawn by the authors.

      Specific points:

      - The logic of the basis for the argument is somewhat flawed: A homeostat does not require a multiplicative mechanism, nor does it even need to be synaptic. Membrane excitability is a locus of homeostatic regulation of firing, for example. In addition, synapse-specific modulation can also be homeostatic. The only requirement of the homeostat is that its deployment subserves the stabilization of a biological parameter (e.g., firing rate).
      - Line 63 parenthetically references an important, but contradictory study as a brief "however". Given the tone of the writing, it would be more balanced to give this study at least a full sentence of exposition.
      - The authors state (line 11) that expression of a hyperpolarizing conductance did not trigger scaling. More recent work ('Homeostatic synaptic scaling establishes the specificity of an associative memory') does this via expression of DREADDs and finds robust scaling.
      - Supplemental figure 1 looks largely linear to me? Out of curiosity, wouldn't you expect the left end to be aberrant because scaling up should theoretically increase the strength of some synapses that would have been previously below threshold for detection? Given that figure 2B also shows warping at the tail ends of similar distributions, how is this to be interpreted?
      - The readability of the figures is poor. Some of them have inconsistent boundary boxes, bizarre axes, text that appears skewed as if the figures were quickly thrown together and stretched to fit.
      - I'm concerned about the optogenetic restoration of activity experiment. Cortical pyramidal neuron mean firing rates are log normally distributed and span multiple orders of magnitude. The stimulation experiments can only address the total firing at a network-level - given that a network level "mean" is meaningless in a lognormal distribution, how are we to think about the effect of this manipulation when it comes to individual neurons homeostatically stabilizing their own activities? In essence, the argument is made at the single-neuron level, but the experiment is conducted with a network-level resolution.
      - Line 198-99: multiplicativity is not a requirement of a homeostatic mechanism.
      - Line 264-265 - again, neither multiplicativity nor synaptic mechanisms are fundamentally any more necessary for a homeostatic locus than anything else that can modulate firing rate via negative feedback.
      - 277: do you mean AMPAR?
      - Example: Figure 1A is frustratingly unreadable. The axes on the raster insets are microscopic, the arrows are strangely large, and it seems unnecessary to fill so much real estate with 4 rasters. Only one is necessary to show the concept of a network burst. The effect of time+CNQX on the frequency of burst is shown in B and C.
      - Example: Figure 2 appears warped and hastily assembled. Statistical indications are shown within and outside of bounding boxes. Axes are not aligned. Labels are not aligned. Font sizes are not equal on equivalent axes.
      - The discussion should include mention of the limitations and/or constraints of drawing general conclusions from cell culture.
      - The discussion should include mention of the role of developmental age in the expression of specific mechanisms. It is highly likely that what is studied at ~P14 is specific to early postnatal development.

      It is essential to ensure that the data presented in the paper adequately supports the conclusions drawn. A more cautious approach in interpreting the results may lead to a stronger argument and a more robust understanding of the underlying mechanisms at play.

    1. Reviewer #1 (Public Review):

      In this paper, Scholz and colleagues introduce a new paradigm aimed to bridge the gap between two domains that rely on hierarchical processing: language and memory. They find that, generally in line with their hypotheses, hierarchical processing is associated with activation in hippocampus (especially anterior), medial prefrontal cortex (mPFC), posterior superior temporal sulcus (pSTS), and inferior frontal gyrus (IFG). They also report that these effects in IFG are particularly strong late in the task, once participants have had a lot of experience and processing is presumably more automatic.

      This work has many strengths. The goal to bridge these literatures by developing a new task is commendable. I appreciate also that the authors separately validated their new task behaviorally by comparing it to another accepted as tapping hierarchical processing. I also liked that the authors were transparent about their hypotheses, and certain analyses like the grid coding one that was planned but did not work out. I do however have a number of concerns about the interpretations of the findings, such as whether some patterns are ambiguous as to the true underlying effects. I also have a number of clarification questions. All concerns are described below.

      1. Broadly, I would like to see the authors provide more information and logic on why hierarchical processing should be associated with a big reduction in univariate activation between P1 and P2. Why would this signify item-in-context binding? How does this relate to existing work using other methods (e.g., like animal studies, which seem to make predictions more about representational structures)?

      2. There are many differences between what kind of information participants are processing between Position 1 and Position 2 for the HIER but not ITER conditions, and these may not be related to the hierarchical structure specifically. Related to but I think distinct from some of the limitations mentioned in the Discussion is the fact that in the HIER condition, what is happening cognitively between Position 1 and Position 2 items is more distinct (attending to color for position 1, and shape for position 2), whereas the two positions are equivalent in the ITER condition. This is a bit different from the authors' intended manipulation of hierarchy, because it involves a specific dimension. A stronger design might have been to flip the dimensions with respect to position specifically, to make shape sometimes important for position 1, and color for position 2 (perhaps by counterbalancing across subjects, so half would see the current P1=color and P2=shape rules, and the other half P1=shape and P2=color rules). Another important difference between color and shape is that color is a simple binary distinction that participants can make based on their preexisting knowledge of red versus green, and to which they can assign a verbal label, whereas the shape distinction was something novel they acquired during the experiment, has no real-world validity or meaning, and would presumably rely more on visuospatial processing. The shape dimension was also much more variable, I believe. I should say that I do find comfort in a few things - (1) that behavior on this task is correlated with another one that also indexes hierarchy processing, and (2) that the results show regional specificity in a pattern at least not easily explained by this distinction. 
However, I do think future work will be needed to ask whether it is hierarchy processing per se or rather something to do with the particular cognitive states engaged during each phase in this particular task that is eliciting activation in this set of regions. It would strengthen the paper to discuss this issue directly so readers are alerted to the caveat.

      3. I did not understand what data went into creating the schematic in Figure 2E. First, I think this depiction of a gradient might be easily misinterpreted because it seems to imply that the authors have a higher resolution analysis than they actually do. I believe the data were just analyzed in three subregions of hippocampus - head, body, and tail. Variability within each subregion (as seems to be implied by certain parts of a region being more grey and others more red/orange), is not something that could be assessed in this analysis. For example, why does the medial part of the head seem to be more "unspecific" whereas lateral regions look more HIER Pos1 specific? This type of depiction would only make sense in my mind if the authors had performed something like a voxelwise analysis to determine where specifically the interaction "peaks." I would recommend this visualization be cut or significantly changed to do away with the gradient.

      4. I believe the authors have not reported enough information for us to know that hippocampus involvement indeed does not change with experience. It is interesting that hippocampus in the task x experience ROI analysis shows, if anything, bigger differentiation between the two tasks (numerically) for the late trials. This seems to go against the authors' hypothesis, and a lot of existing data, that hippocampus is preferentially involved in early (vs. late) learning. Given that the key signature in this region, though, is that it differentiates between position 1 and position 2 in HIER but not ITER, and doesn't show a big difference in magnitude across the two tasks, it makes me wonder whether the task x experience interaction collapsing across the two positions makes sense for this region. Did the authors consider a similar task x experience interaction within hippocampus, but additionally considering position? I think there are multiple ways to look at this question (e.g., either looking for a task x experience x position interaction, a task x experience within position 1, a task x position interaction separately in early vs. late portions of the task, or even a position x experience interaction only within the HIER task), and I'm sure the authors would be in a better place to decide on a specific path forward. The same logic might go for mPFC, which shows an interaction but no main effect of task. This relates to claims in the discussion as well, such as that "hippocampus was equally active in early and late trials," but given this analysis is collapsing across the dimension hippocampus (and mPFC) seem to be sensitive to (position), it seems like this could be masking an underlying effect in which hippocampus/mPFC might still be differentially involved early vs. late (i.e., they might show the task x position interaction preferentially during some task phases).

      5. For the IFG regions, the task x experience interaction seems to be driven mainly by change (decrease in activation) for the ITER, rather than change in the HIER. The authors are at times careful to talk about this as "sustained" activity in IFG, which I appreciated, but other times talk about a "relative increase." I am not sure how I feel about that. I see the compelling evidence that there are task differences by experience, and that there is reduction for ITER that is interestingly not present for HIER, but I think I am still feeling uncomfortable with the term "increase" or even "relative increase" for HIER. For example, couldn't it simply be that the ITER task is requiring less processing with experience, whereas the HIER does not (perhaps because it requires more processing to begin with)? i.e., we do not know whether the reduction for ITER is simply a neural signal thing (i.e., activations diminish over time/experience) or a cognitive thing, specific to the ITER task. I think the authors are wanting to interpret the reductions as the former, but perhaps it would be more powerful to demonstrate if there was a baseline task that also showed reductions but for which not much would be expected in the way of cognitive change. Can the authors provide more justification for their choice of terminology (through either more logic or analyses), or if not, simply talk about it as sustained activity for HIER-which is especially interesting in the face of reductions for the ITER task?

      6. Please define what is meant by the term "automaticity" in the introduction. A clearer definition of the concept would make the paper generally easier to follow, and it would also help foreshadow the hypotheses about mPFC activity in the introduction. To this end, it could be useful to elaborate on how learning takes place in this task, how it could foster increasing automaticity, and how automaticity maps onto behaviour (e.g., is it RT decrease alone, which happens for both conditions in this task?) and the brain regions discussed.

      7. There was no association between brain and behavior, which the authors interpret as a positive (as therefore task difficulty differences could not explain the effects). However in light of these null findings, it is on the flip side hard to know whether this neural engagement carries any behavioral significance. It seems to me as though the authors' framework makes predictions about brain-behavior correlations that were not tested in the manuscript. For example, I believe the authors asked whether behavior overall was correlated with activation. However, wouldn't the automaticity in IFG explanation for example predict that more engagement or an increase in engagement from early to late should be associated with e.g., faster RTs-not necessarily a relationship overall?

      8. On p. 8, it is stated that "In the hippocampus, this effect is driven by higher betas for the presentation of the first object (H1 > I1) and lower betas for the second object (H2 < I2) when comparing across tasks." Can the authors confirm whether the pairwise comparisons following up on the interaction here are significant, or rather if they are referring to a numerical difference in the betas? It looked like the same (numerically) would be true for mPFC; is there a reason why the same information is not included for the mPFC ROI? Also, might the authors provide more speculation as to why one might see both enhanced and reduced activation for P1 and P2, respectively?

      9. I was expecting some discussion of how hippocampus does not seem to show preferential involvement early, given that its potential role being restricted to early in learning (i.e., during acquisition only) was one of the primary motivators for using this task. As noted in my above comment (#4), I am not quite sure that I think there is evidence that the hippocampal role remains constant over this task, given the analyses provided (i.e., that they did not look at the position effect for early vs. late). However upon further analysis if it does seem to be more stable, and/or if it even increases over experience, the authors might want to talk about that in the Discussion.

      10. The fact that the hierarchies in this paradigm unfolded over time makes them distinct on some level from the hierarchies present in the VRT task that was used to validate the HIER task's hierarchical processing demands. For example, there might be additional computations required to process these temporally ordered structures, support online maintenance, and so on. It may be worth considering this aspect of the task, and whether/to what extent the results could be related to it, in the paper.

      11. I also have many methodological and analytic clarification questions, which I detail in the recommendations for authors.

    1. It may already be clear that ethical conflict in psychological research is unavoidable. Because there is little, if any, psychological research that is completely risk free, there will almost always be conflict between risks and benefits. Research that is beneficial to one group (e.g., the scientific community) can be harmful to another (e.g., the research participants), creating especially difficult trade-offs. We have also seen that being completely truthful with research participants can make it difficult or impossible to conduct scientifically valid studies on important questions.   Of course, many ethical conflicts are fairly easy to resolve. Nearly everyone would agree that deceiving research participants and then subjecting them to physical harm would not be justified by filling a small gap in the research literature. But many ethical conflicts are not easy to resolve, and competent and well-meaning researchers can disagree about how to resolve them. Consider, for example, an actual study on “personal space” conducted in a public men’s room (Middlemist, Knowles, & Matter, 1976). The researchers secretly observed their participants to see whether it took them longer to begin urinating when there was another man (a confederate of the researchers) at a nearby urinal. While some critics found this to be an unjustified assault on human dignity (Koocher, 1977), the researchers had carefully considered the ethical conflicts, resolved them as best they could, and concluded that the benefits of the research outweighed the risks (Middlemist, Knowles, & Matter, 1977). For example, they had interviewed some preliminary participants and found that none of them was bothered by the fact that they had been observed.   The point here is that although it may not be possible to eliminate ethical conflict completely, it is possible to deal with it in responsible and constructive ways. 
In general, this means thoroughly and carefully thinking through the ethical issues that are raised, minimizing the risks, and weighing the risks against the benefits. It also means being able to explain one’s ethical decisions to others, seeking feedback on them, and ultimately taking responsibility for them.

      It would be beneficial to say a bit more about what has been learned from unethical studies. For example, we run tests on rats, and that's not completely ethical, right? So are there any studies that weren't ethical but that we learned a lot from, which we could add to the conversation? Was there a benefit to deceiving participants? I think an example of this could help readers analyze whether there's a reason some researchers fight ethics boards to do studies that may not be entirely ethical. You could also add that most of the time there is a way to get rid of an unethical part of a study. For example, in the study by Lahaut, was there a need to visit people's houses multiple times, or could they have just offered an incentive?

    1. Background Reproducibility of data analysis workflow is a key issue in the field of bioinformatics. Recent computing technologies, such as virtualization, have made it possible to reproduce workflow execution with ease. However, the reproducibility of results is not well discussed; that is, there is no standard way to verify whether the biological interpretation of reproduced results is the same. Therefore, it still remains a challenge to automatically evaluate the reproducibility of results. Results We propose a new metric, a reproducibility scale of workflow execution results, to evaluate the reproducibility of results. This metric is based on the idea of evaluating the reproducibility of results using biological feature values (e.g., number of reads, mapping rate, and variant frequency) representing their biological interpretation. We also implemented a prototype system that automatically evaluates the reproducibility of results using the proposed metric. To demonstrate our approach, we conducted an experiment using workflows used by researchers in real research projects and the use cases that are frequently encountered in the field of bioinformatics. Conclusions Our approach enables automatic evaluation of the reproducibility of results using a fine-grained scale. By introducing our approach, it is possible to evolve from a binary view of whether the results are superficially identical or not to a more graduated view. We believe that our approach will contribute to more informed discussion on reproducibility in bioinformatics.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giad031 ) , which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer Stian Soiland-Reyes

      Hi, I am Stian Soiland-Reyes https://orcid.org/0000-0001-9842-9718 and have pledged the Open Peer Review Oath https://doi.org/10.12688/f1000research.5686.2:

      * Principle 1: I will sign my name to my review
      * Principle 2: I will review with integrity
      * Principle 3: I will treat the review as a discourse with you; in particular, I will provide constructive criticism
      * Principle 4: I will be an ambassador for the practice of open science.

      This review is licensed under a Creative Commons Attribution 4.0 International License.

      This article presents a method for comparing reproducibility of computational workflow runs captured as RO-Crates, by calculating a set of genomics metrics ("features") and adding these to the crate's metadata. Overall I find this a valuable contribution and worthy of publication with GigaScience, primarily as a way for users of workflow systems CWL, Nextflow, Cromwell or Snakemake to ensure reproducibility, but also for workflow engine developers who may want to build on this methodology to improve their provenance support. In general the method proposed is sound, however it does have some limitations and inherent assumptions that are not highlighted sufficiently in the current manuscript, particularly concerning the selection of features and the reproducibility of the metrics calculation itself. I have detailed this with some points below that I would like the authors to clarify in a minor revision.

      Note: the questions below from the GigaScience Reviewer Guidelines mainly relate to data, but here I also interpret them for the software described.

      Q1: Is the rationale for collecting and analyzing the data well defined? The authors' workflow executions https://doi.org/10.5281/zenodo.7098337 are based on three third-party bioinformatics workflows. Although they are not particularly "large-scale", they are representative best-practice pipelines in this field (data sizes from 200 MB to 6 GB) and also fairly representative for scalable workflow systems (Nextflow, CWL and WDL) used by bioinformaticians.

      Q2: Is it clear how data was collected and curated? It is not explicit in the text why these particular workflows were selected, beyond being realistic pipelines used in research. I would suggest something like "these workflows have been selected as fairly representative and mature current best-practice for sequencing pipelines, implemented in different but typical workflow systems, and have similar set of genomics features that we can assess for provenance comparison." The workflows have each been cited, but I would appreciate some consistency so that each workflow is cited both by its closest journal article and as their original download sources (e.g. GitHub).

      Q3: Is it clear - and was a statement provided - on how data and analyses tools used in the study can be accessed? Yes, full availability statements have been provided both for data and software, archived on Zenodo for longevity.

      Q4: Are accession numbers given or links provided for data that, as a standard, should be submitted to a community approved public repository? Yes, the tools have been added to https://bio.tools/ -- I don't think it's necessary to further register the data outputs with accession numbers. RRIDs for tools can be considered at a later stage, perhaps only for Sapporo.

      Q5: Is the data and software available in the public domain under a Creative Commons license? Yes, the software and dataset are open source under Apache License, version 2.0. The dataset https://doi.org/10.5281/zenodo.7098337 embeds existing workflows and data, however this is OK as included resources such as the rnaseq Nextflow workflow have compatible licenses (MIT) or are also Apache-licensed. The manuscript has software citations for two of the workflows, but this is missing for the CWL workflow, which is only cited by manuscript (33) (also missing DOI). It is unclear if any of the workflows are registered in https://workflowhub.eu/ but that should primarily be done by their upstream authors. The RO-Crates in https://doi.org/10.5281/zenodo.7098337 don't include any licensing and attribution for the embedded workflows, and their metadata files misleadingly declare the crate license as CC0 public domain. While CC0 is appropriate for examples and the metadata file itself, the embedded MIT/Apache workflows from third parties can't legally be relicensed in this way and should have their original licenses declared. See https://www.researchobject.org/ro-crate/1.1/contextual-entities.html#licensing-access-control-and-copyright I understand these RO-Crates are generated automatically by Sapporo, which does not directly understand licensing, and for documenting the test runs with Sapporo, I think these should not be modified post-execution. Pending further license support by Sapporo, perhaps a manual outer RO-Crate that aggregates these (e.g. adding a direct top-level ro-crate-metadata.json to the Zenodo entry) can provide more correct metadata as well as workflow citations. The authors could add to Discussion some consideration on (lack of) propagation of such metadata for auto-generated crates as part of workflow run provenance.
For instance, if a workflow run was initiated from a Workflow Crate https://w3id.org/workflowhub/workflow-ro-crate/ at WorkflowHub, its license, attributions and descriptions could be carried forward to the final Workflow Run Crate provenance together with the Sapporo-calculated features.

      Q6: Are the data sound and well controlled? Yes, the data is sound. The testing on Mac gives null results, but the authors explain the workflows failed to execute there due to architectural differences, which is flagged as a valid concern for reproducibility. It may be worth further investigating whether this is due to misconfiguration on that particular test machine, in which case these columns should be removed.

      Q7: Is the interpretation (Analysis and Discussion) well balanced and supported by the data? The authors' discussion has some implicit assumptions that should be made more clear, together with their implications:

      * The Tonkaz tool assumes the workflow execution has already extracted the features and added them to the RO-Crate
      * This assumes the right features have been correctly extracted by each execution
      * Feature extraction also depends on bioinformatics tools that are subject to change/updates
      * Newer versions of Sapporo-service, and in particular any non-Sapporo executors also making Workflow Run Crates, may have a different feature selection
      * Being able to fairly compare two workflow runs therefore depends on careful control of the Sapporo executor versions so that they have consistent feature selection
      * This means the proposed reproducibility metric has a potential reproducibility challenge itself

      This is not to say that the approach is bad, as the feature extraction is using predictable measures such as counting sequences, rather than heuristics. This means Future Work should point out the need for guidelines on what kind of features should be selected, to ensure they are consistent and reproducible. The set of features also depends on the type of data and class of analysis. As a minimum, the RO-Crate should therefore include provenance of that feature extraction, noting the Sapporo version, and ideally the version of the tools used for that. The authors may want to consider if feature extraction should be a separate workflow (e.g. in CWL), that itself can be subject to the same reproducibility preservation measures, and therefore also can be performed post-execution as part of Tonkaz' comparison or as a curation activity when storing Workflow Run Crates.

      Q8: Are the methods appropriate, well described, and include sufficient details and supporting information to allow others to evaluate and replicate the work? Yes, it was very easy to replicate the Tonkaz analysis of the workflow run crate that is already provided, as it is provided also as a Docker container. The Docker container is provided as part of GitHub releases, and so is not at risk of Docker Hub's automatic deletion. I have not tried installing my own Sapporo service to re-execute the workflow, but detailed installation and run details are provided in the README of both Tonkaz https://github.com/sapporo-wes/tonkaz#readme and sapporo-service https://github.com/sapporo-wes/sapporo/blob/main/docs/GettingStarted.md

      Q9: What are the strengths and weaknesses of the methods? The method provided is strong compared to naive checksum-based comparison of workflow outputs, which has been pointed out as a challenge by previous work. The advantage of the feature extraction is that the statistics can be compared directly and any discrepancies can be displayed to the user at a digestible high level. The disadvantage is that this depends wholly on the selection of features, which must be done carefully to cover the purpose of the particular workflow and its type of data. For instance, a workflow that generates diagrams of sequence alignments could not be sufficiently tested in the suggested approach, as analyzing the diagram for correctness would require tools that may not even exist. Perhaps feature extraction should be a part of the workflow itself, so it can self-determine what is important for its analysis? The current approach also is quite sensitive to output data filenames, so changes in filename would mean features are not compared, even where such files are equivalent. This should be made more explicit in the manuscript, for instance workflows should ensure they don't include timestamps or random identifiers in their filenames. Further work could have a deeper understanding of the workflow structure to compare outputs based on their corresponding FormalParameter in the RO-Crate.
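
      To make the contrast in Q9 concrete, here is a minimal sketch (not the Tonkaz implementation) of comparing two runs by checksum and by feature values, matching outputs on filename. The function names, the status labels ("identical", "similar", "different", "unmatched"), the tolerance, and the toy data are all assumptions made for illustration; only the feature names totalReads and mappingRate echo the kind of features the manuscript describes.

```python
import hashlib
import math


def checksum(data: bytes) -> str:
    """SHA-256 digest of a file's content."""
    return hashlib.sha256(data).hexdigest()


def features_close(fa: dict, fb: dict, rel_tol: float) -> bool:
    """True if both outputs report the same features with close values."""
    if fa.keys() != fb.keys():
        return False
    return all(math.isclose(fa[k], fb[k], rel_tol=rel_tol) for k in fa)


def compare_runs(run_a: dict, run_b: dict, rel_tol: float = 0.01) -> dict:
    """Label each output file; files are paired purely by filename."""
    result = {}
    for name in sorted(set(run_a) | set(run_b)):
        if name not in run_a or name not in run_b:
            # A renamed but equivalent file is never compared.
            result[name] = "unmatched"
            continue
        a, b = run_a[name], run_b[name]
        if checksum(a["bytes"]) == checksum(b["bytes"]):
            result[name] = "identical"
        elif features_close(a["features"], b["features"], rel_tol):
            result[name] = "similar"
        else:
            result[name] = "different"
    return result


# Hypothetical outputs of two executions of the same workflow.
run_a = {
    "aligned.bam": {
        "bytes": b"tool-1.9\nREADS",
        "features": {"totalReads": 1000, "mappingRate": 0.95},
    },
    "report_20230101.txt": {
        "bytes": b"summary",
        "features": {"totalReads": 1000},
    },
}
run_b = {
    "aligned.bam": {
        "bytes": b"tool-1.10\nREADS",  # header differs, so checksum differs
        "features": {"totalReads": 1000, "mappingRate": 0.95},
    },
    "report_20230102.txt": {  # same content, timestamped filename
        "bytes": b"summary",
        "features": {"totalReads": 1000},
    },
}

comparison = compare_runs(run_a, run_b)
for filename, status in comparison.items():
    print(filename, status)
```

      In this toy case the BAM file fails a checksum comparison but is still judged "similar" on its features, while the two reports are never compared because their filenames embed a date, which is the sensitivity raised above.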

      Q10: Have the authors followed best-practices in reporting standards? Yes, the details provided are at a sufficient detail level, and the authors have re-used the RO-Crate data packaging. The RO-Crates created by Sapporo-service add several terms for the metrics, which are declared on the @context according to RO-Crate specs https://www.researchobject.org/ro-crate/1.1/appendix/jsonld.html#extending-ro-crate However the terms point to GitHub "raw" pages, which are not particularly stable, and may change depending on sapporo versions and GitHub's repository behaviour. I recommend changing the ad-hoc terms to PIDs such as a namespace under https://w3id.org/ or https://purl.org/ so that these terms can be stable semantic artefacts, e.g. submitting them to https://github.com/ResearchObject/ro-terms to register https://w3id.org/ro/terms/sapporo#WorkflowAttachment that can be used instead of https://raw.githubusercontent.com/sapporo-wes/sapporo-service/main/sapporo/ro-terms.csv#WorkflowAttachment or alternatively https://w3id.org/sapporo#WorkflowAttachment could be set up to redirect to the ro-terms.csv on GitHub. (discussed with the authors at ELIXIR Biohackathon) In doing so you should separate into two namespaces, the general Sapporo terms like "sha512", and the particular genomics feature sets including "totalReads" (e.g. https://w3id.org/datafeatures/genomics#WorkflowAttachment) as the second are a) not Sapporo-specific b) domain-specific. RO-Crate is developing Workflow Run profiles https://www.researchobject.org/workflow-run-crate/profiles/, although these have not been released at time of my review they are now stable, so the authors may want to check https://www.researchobject.org/workflow-run-crate/profiles/workflow_run_crate to ensure "FormalParameter" are declared correctly in the generated RO-Crate as separate entities, linked from the "File" using "exampleOfWork".
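
      As an illustration of the namespace split suggested in Q10, the following sketch builds an extended JSON-LD @context in which general Sapporo terms and genomics feature terms resolve under two separate persistent namespaces. The two namespace URLs are the examples proposed in this review and are assumptions here (they may not be registered), and the helper name build_context is hypothetical; only the RO-Crate 1.1 context URL is taken from the RO-Crate specification.

```python
import json

# Official RO-Crate 1.1 JSON-LD context.
RO_CRATE_CONTEXT = "https://w3id.org/ro/crate/1.1/context"

# Namespaces as proposed in the review; assumed, possibly not yet registered.
SAPPORO_NS = "https://w3id.org/ro/terms/sapporo#"
GENOMICS_NS = "https://w3id.org/datafeatures/genomics#"


def build_context(sapporo_terms, genomics_terms):
    """Return an @context array: the base context plus ad-hoc term mappings."""
    extra = {term: SAPPORO_NS + term for term in sapporo_terms}
    extra.update({term: GENOMICS_NS + term for term in genomics_terms})
    return [RO_CRATE_CONTEXT, extra]


context = build_context(
    sapporo_terms=["WorkflowAttachment", "sha512"],
    genomics_terms=["totalReads"],
)
print(json.dumps({"@context": context}, indent=2))
```

      Keeping the mapping in one helper means a later move to registered identifiers would only change the two namespace constants, not every generated crate template.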

      Q11: Can the writing, organization, tables and figures be improved? The language and readability of this article is generally very good. Light copy-editing may improve some of the sentences, e.g. reducing the use of "Thus" phrases.

      Q12: When revisions are requested. See suggestions from above for minor revisions:

      * Make explicit why these 3 workflows were selected (see Q2)
      * Make pipeline software citations consistent in manuscript (see Q2, Q5)
      * Avoid declaring CC0 within generated RO-Crate -- move this to only apply to the ro-crate-metadata.json
      * Add an outer RO-Crate metadata file to Zenodo deposit to carry the correct licenses and pipeline licenses for each of rnaseq_1st.zip, trimming.zip etc.
      * Improve discussion to better reflect limitations of the features and its own reproducibility issues (see Q7, Q9)
      * Consider improvements to the RO-Crate context (see Q10) - this may just be noted as Future Work in the manuscript rather than regenerating the crates

      In addition:

      * p2: Add citation for claim on file checksums different depending on software versions etc., for instance https://doi.org/10.1145/3186266
      * p3. "We converted Sapporo's provenance into RO-Crate" -- re-cite (20) as this is the paragraph explaining what it is.
      * p10. Citations 7, 8 are missing authors
      * p10. Citation 15 is now published, replace with https://doi.org/10.1145/3486897
      * p0. Citations 28, 33 are missing DOI

      Q13: Are there any ethical or competing interests issues you would like to raise? No, the third-party pipelines selected for reproducibility testing are already published and are here represented fairly, and only used as executable methods (as intended by their original authors), which I would say do not need ethical approval.

    1. Background Integration of data from multiple domains can greatly enhance the quality and applicability of knowledge generated in analysis workflows. However, working with health data is challenging, requiring careful preparation in order to support meaningful interpretation and robust results. Ontologies encapsulate relationships between variables that can enrich the semantic content of health datasets to enhance interpretability and inform downstream analyses. Findings We developed an R package for electronic Health Data preparation ‘eHDPrep’, demonstrated upon a multi-modal colorectal cancer dataset (n=661 patients, n=155 variables; Colo-661). eHDPrep offers user-friendly methods for quality control, including internal consistency checking and redundancy removal with information-theoretic variable merging. Semantic enrichment functionality is provided, enabling generation of new informative ‘meta-variables’ according to ontological common ancestry between variables, demonstrated with SNOMED CT and the Gene Ontology in the current study. eHDPrep also facilitates numerical encoding, variable extraction from free-text, completeness analysis and user review of modifications to the dataset. Conclusion eHDPrep provides effective tools to assess and enhance data quality, laying the foundation for robust performance and interpretability in downstream analyses. Application to a multi-modal colorectal cancer dataset resulted in improved data quality, structuring, and robust encoding, as well as enhanced semantic information. We make eHDPrep available as an R package from CRAN [[URL will go here]].

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giad030 ), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer Janna Hastings

      The manuscript describes a toolkit for the automated semantic enrichment and quality control of electronic health data using ontologies. This is a much needed utility that will add value to electronic data sharing and re-use for many different purposes including the development of machine learning for medical applications and personalised medicine. Overall the manuscript is well written and the functionality offered by the toolkit is well thought out and motivated. The internal consistency checks and the use of ontology-based information content to semantically aggregate variables into more informative meta-variables are particularly welcome functions.

      However, I recommend that the description of the tool functionality be clarified in some points, and the evaluation could be strengthened.

      Page 6-7, internal consistency:

      1. How should the user specify semantic dependencies between variable pairs? Would it not be helpful to use a standard format for this specification to enable interoperability and re-use of such specifications?

      2. Should the specification of semantic relationships between variables not be linked to the knowledge from the ontologies? Ontologies are able to represent many different types of logical relationships between classes, which make them ideal for then serving as a standard and interoperable format for specifying this type of constraint. Rules are another promising standard approach for logic-based knowledge representation.

      Page 11, figure 4 a: I think it would be informative for evaluating the operation of the tool if the heatmap of variable missingness after application of the tool could also be illustrated beside the current Fig 4a.

      Page 13, ontology preparation: The paragraph describes what the authors have done to prepare ontologies for use with the tool. Is this preparation procedure also necessary for users to follow when they use the eHDPrep tool? How can alternative ontologies be incorporated (which may be useful for other domains)?

      Evaluation: The biggest shortcoming of the presented manuscript is that the evaluation is limited to the application of the tool to one dataset and subsequent manual evaluation of the outcome by one group, the study authors.

      The results as presented are positive, but there is a significant risk that the tool performs well on this task, as assessed by these study authors, but then fails to generalise to other tasks and datasets that future users might wish to use it with. To mitigate this risk, it would be optimal if somewhat more independent methods could be found for evaluating the performance of the different aspects of the tool. One approach could be a rigorous comparison of this tool's performance against the performance of other tools that have similar functionality, e.g. comparison of the semantic aggregation function with other tools that find and recommend MICAs. An alternative approach might be to apply the tool to an additional dataset for which a group outside of the study authors would be prepared to provide an independent evaluation.

    1. Background Eukaryotic gene expression is controlled by cis-regulatory elements (CREs), including promoters and enhancers, which are bound by transcription factors (TFs). Differential expression of TFs and their binding affinity at putative CREs determine tissue- and developmental-specific transcriptional activity. Consolidating genomic data sets can offer further insights into the accessibility of CREs, TF activity, and, thus, gene regulation. However, the integration and analysis of multi-modal data sets are hampered by considerable technical challenges. While methods for highlighting differential TF activity from combined chromatin state data (e.g., ChIP-seq, ATAC-seq, or DNase-seq) and RNA-seq data exist, they do not offer convenient usability, have limited support for large-scale data processing, and provide only minimal functionality for visually interpreting results. Results We developed TF-Prioritizer, an automated pipeline that prioritizes condition-specific TFs from multi-modal data and generates an interactive web report. We demonstrated its potential by identifying known TFs along with their target genes, as well as previously unreported TFs active in lactating mouse mammary glands. Additionally, we studied a variety of ENCODE data sets for cell lines K562 and MCF-7, including twelve histone modification ChIP-seq as well as ATAC-seq and DNase-seq datasets, where we observe and discuss assay-specific differences. Conclusion TF-Prioritizer accepts ATAC-seq, DNase-seq, or ChIP-seq and RNA-seq data as input and identifies TFs with differential activity, thus offering an understanding of genome-wide gene regulation, potential pathogenesis, and therapeutic targets in biomedical research.

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giad026 ), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer Kaixuan Luo

      This paper develops a novel pipeline, TF-Prioritizer, to prioritize condition-specific TFs through integrative analysis of histone modification (HM) ChIP-seq and RNA-seq data. The pipeline integrates multiple computational tools: it calculates TF binding site affinities and links candidate binding sites to genes using TRAP and TEPIC. It uses DYNAMITE, a sparse logistic regression classifier, to infer TFs related to differential gene expression between conditions. It computes an aggregated score, the "TF-TG score", to score TFs from multiple types of evidence, and obtains a prioritized list of TFs from all histone modifications using a discounted cumulative gain ranking approach. It also provides additional functionality and a web interface to visualize the results.

      Overall, the pipeline could be very useful for biologists with a user-friendly web application to automate the entire process from data preprocessing to statistical analysis and obtain interactive reports to gain novel biological insights. However, more systematic evaluations are needed to demonstrate the benefits of this pipeline.

      Major comments:

      1. In the computation of an aggregated score "TF-TG score", it uses a multiplicative function to combine differential expression (absolute log2FC), TF-Gene scores computed from TEPIC, and the total coefficients computed from DYNAMITE. One concern about this approach is that it may miss some TFs with support from only one or two types of evidence. In Fig 5, we see diffTF identifies a lot more TFs than TF-Prioritizer. I don't think we can conclude that diffTF is less specific than TF-Prioritizer simply based on the number of TFs prioritized. Could some of the TFs identified only by diffTF be important but missed by TF-Prioritizer? I would like to see a more detailed analysis comparing the lists of TFs identified by diffTF and TF-Prioritizer. Other evidence or metrics in addition to the number of prioritized TFs would be helpful to evaluate the plausibility of the prioritized lists of TFs.
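      The concern about multiplicative aggregation can be illustrated with a toy calculation. The sketch below is not the TF-Prioritizer formula; the TF names, the three evidence values, and their scaling to [0, 1] are all hypothetical, and the mean is shown only as one possible contrast.

```python
from math import prod


def multiplicative_score(evidence):
    """Product of evidence values: a single zero removes the TF entirely."""
    return prod(evidence)


def additive_score(evidence):
    """Mean of evidence values: partial support still contributes."""
    return sum(evidence) / len(evidence)


# Hypothetical evidence triples, each value scaled to [0, 1]:
# (|log2FC|, TF-gene score, regression coefficient)
tfs = {
    "TF_A": (0.6, 0.5, 0.4),  # moderate support from all three sources
    "TF_B": (0.9, 0.9, 0.0),  # strong support from two sources, none from the third
}

for name, ev in tfs.items():
    print(name, round(multiplicative_score(ev), 3), round(additive_score(ev), 3))
```

      Under the product, the hypothetical TF_B drops to zero despite strong support from two sources, whereas under the mean it scores higher than TF_A. Which behaviour is preferable depends on how much each evidence type is trusted.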

      2. It is hard to interpret and evaluate the contribution of the evidence for prioritized TFs. Figure 6b is helpful, but it is unclear how the users would be able to evaluate the contribution of the components. Does the software run each of the combination separately and outputs a list of prioritized TFs under each combination?

      3. The TEPIC2 paper has already developed a very comprehensive pipeline, including TF affinity calculation by TRAP and computation of TF gene scores by TEPIC, as well as logistic regression to identify TFs between conditions by DYNAMITE, and it is already well parallelized. The authors should clearly list the novel contributions from this work. It would be helpful to have a table comparing the functionalities and technical features between TF-Prioritizer and TEPIC2.

      4. The software takes histone modification ChIP-seq and RNA-seq data as input. It would significantly broaden the usability of the software if it supported DNase-seq and/or ATAC-seq, which are widely used. If this software can take ATAC-seq or DNase-seq data as input, it is important to include those data types and provide some examples to illustrate the usage and performance.

      5. The software combines multiple histone modification ChIP-seq datasets using a discounted cumulative gain ranking approach. However, different types of histone modifications have different epigenomic functions and different combinations indicate different chromatin states. Some TFs may be only enriched in a small subset of histone modifications (already discussed by the authors) and may be missed by the simple discounted cumulative gain ranking approach. The authors should provide prioritized TFs from each histone modification ChIP-seq dataset, and evaluate which TFs were prioritized by all the combined datasets, and which TFs by only one dataset. Also, some ChIP-seq datasets may be of poor quality. Does the software provide other options to rank the TFs from different epigenomic datasets? e.g. set different weights for different epigenomic datasets, etc.
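      To make the weighting suggestion concrete, here is a minimal sketch of a discounted-gain rank aggregation with optional per-dataset weights. The gain and discount definitions are one common choice and are an assumption, not necessarily the formula used by TF-Prioritizer; the TF names and the example rankings are hypothetical.

```python
from collections import defaultdict
from math import log2


def dcg_aggregate(rankings, weights=None):
    """Combine per-dataset TF rankings into one list (best first).

    rankings: dict mapping dataset name -> list of TFs, best first.
    weights:  optional dict mapping dataset name -> weight (default 1.0).
    """
    weights = weights or {}
    scores = defaultdict(float)
    for dataset, ranked in rankings.items():
        w = weights.get(dataset, 1.0)
        n = len(ranked)
        for pos, tf in enumerate(ranked, start=1):
            gain = (n - pos + 1) / n               # rank-based gain in (0, 1]
            scores[tf] += w * gain / log2(pos + 1)  # logarithmic position discount
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda tf: (-scores[tf], tf))


# Hypothetical rankings from three histone-mark datasets.
rankings = {
    "H3K27ac": ["TF1", "TF2", "TF3"],
    "H3K4me3": ["TF2", "TF1", "TF3"],
    "H3K9me3": ["TF4", "TF3", "TF2"],
}

print(dcg_aggregate(rankings))
print(dcg_aggregate(rankings, weights={"H3K9me3": 3.0}))
```

      In this toy example TF4 is top-ranked in only one dataset and is outranked overall by TFs that appear in several datasets; up-weighting that dataset moves TF4 to the top. Reporting the per-dataset lists alongside the combined list would make such cases visible.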

      6. The authors conducted co-occurrence analysis based on the overlap of peaks. It is unclear whether the method calculates a statistical measure (e.g. a p-value) for the significance of co-occurrence. Also, since the TRAP model generates a quantitative measure of TF binding affinity, I am curious to see whether the quantitative TF binding affinities are also correlated for those co-occurring binding sites.
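      As an illustration of the kind of statistic meant here, the following sketch computes a one-sided hypergeometric p-value for the overlap between two TFs' binding regions. It assumes that both TFs are scored on the same fixed set of candidate regions and that regions are exchangeable, which is a simplification for genomic data; the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n regions are drawn from N, of which K are 'successes'.

    N: total candidate regions, K: regions bound by TF A,
    n: regions bound by TF B, k: regions bound by both.
    """
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical counts: 1000 candidate regions, TF A binds 100, TF B binds 80,
# and 30 regions are bound by both (about 8 would be expected by chance).
p_value = hypergeom_sf(30, N=1000, K=100, n=80)
print(p_value)
```

      A permutation-based test that preserves region length and chromosomal distribution would be a more conservative alternative when the exchangeability assumption is doubtful.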

      Minor comments:

      1. In Figure 1, it would be helpful to highlight which steps were already implemented in existing tools (and label the tools used), and which steps are novel in this study.

      2. H3K4me3 data seems to be missing in the L10 time point. How does the method handle missing data?

      3. It is unclear how the Pol2 ChIP-seq data was used in this study. Was it included in the model or only in the downstream analysis?

      4. It is hard to interpret the browser tracks of the TF predictions ("Predicted xxx") in Figures 3 and 4. Please add more details about those tracks.

      5. For Figure 6, the authors should provide more details to help readers understand this figure, especially panel b. The figure legend is too short.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1: Major comments: The key point of the manuscript is to provide resources for the plant community. The motivation for selecting these specific promoters, how they were obtained and cloned, what they are in detail and how they will be made publicly available is all clearly described. The infection experiments presented in it are an added bonus and a proof of concept of the applicability of the system.

      Thank you very much.

      Minor comments: The promoter sequences will probably be included in the AddGene submission; however, it might be helpful to also deposit the promoter sequences at e.g. GenBank.

      Indeed, we have sent all sequence files to AddGene and they will be available for download there. We will look into transferring them to GenBank as well. We have not done this before, but are generally always supportive of maintaining data in open repositories.

      Line 133: "There are few exceptions to this rule...". It would probably be helpful to list/mark these exceptions in Table 1.

      We agree. We have now marked them in the table, and included the sentence “There are a few exceptions to this rule (marked with a * in the ‘Bases’ column in table 2), where we used a defined stretch of DNA that has previously been described to complement a mutant” in lines 135-137.

      Line 138: "A overhangs". In the GreenGate system, A-modules (promoters) are flanked by A- (5') and B- (3') overhangs (applies to line 144, too). Also, the B-overhang listed here (TTGT) is the reverse complement, which might be confusing for readers.

      A very good point. We have modified these lines to “standard four base pair GreenGate promoter module overhangs (5´-ACCT and TTGT-3´) were added via primers during amplification of the promoter sequences (see Supplementary Table 1 for a list of primer sequences. Note that TTGT is the complementary sequence of the A-to-B-module overhang, as this is added via the reverse primer)” in lines 141-144.
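      The distinction between the complement and the reverse complement of an overhang, which is the source of the possible confusion, can be checked with a few lines of code. This is only an illustrative sketch; the helper names are hypothetical, and the only sequence used is the TTGT overhang quoted above.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Base-by-base complement, read in the same left-to-right order."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    """Complement read in the opposite direction (the other strand, 5' to 3')."""
    return complement(seq)[::-1]


print(complement("TTGT"))          # AACA
print(reverse_complement("TTGT"))  # ACAA
```

      Stating explicitly which strand and which direction an overhang is written in avoids this ambiguity.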

      Line 149 ff.: How many lines have been established per promoter tested? Did they all yield a similar expression pattern?

      This is indeed a very important point which was somehow lost along the way during manuscript preparation, after being moved around between the results and methods sections. We have put it back in lines 162-165 as “We recovered several independent transgenic lines for the PEP1 and 2, PEPR1 and 2, as well as BIK1 and RBOHD reporters. Out of those, a minimum of three (RBOHD) and up to seven (PEPR2) independent lines showed fluorescence, and out of those, all individual lines for each reporter showed the same expression patterns.”

      Line 163: As someone not familiar with microscopy of Arabidopsis roots, I'm wondering how the authors can be sure that the tissue in question is the vasculature. Is this obvious for experts in the field?

      Of course, we can’t give a totally objective answer here, but we believe that by including the transmitted light image next to the fluorescence image, it is indeed visible that the fluorescence is limited to the center of the root, not the complete circumference. At the same time, it is important to note that all images are stereomicroscopic images, not confocal images. Thus, it is indeed not possible to, e.g., conclude if pericycle cells are included or excluded in the region with expression. So, while it is, we believe, safe to assume that it is vascular cells, we can’t determine which cell types in the vascular cylinder are expressing the reporters. This would require confocal imaging, which would increase the resolution, but at the expense of a good overview, which we think is more valuable for such a proof-of-principle.

      Discussion: Is there by any chance prior (cell-resolution) knowledge about the expression behaviour of any of the investigated promoters? E. g. by in-situ hybridizations? If so, do the expression patterns match?

      No, the expression of these reporters in direct response to fungal infection has so far only been studied by transcriptomics.

      The presentation and quality of the images need to be improved. Scale bars are missing in all confocal images. In Figures 3 and 4, the names of the genes examined could be labeled on the images, which would make it easier for readers. In addition, key information such as the inoculum and sampling time point after fungal inoculation should be described in the legend or the main text.

      We have added the scale bars and gene names into the images. We agree that the gene names make it easier for the reader. Further, we have added the inoculum and sampling time to the legend.

      More importantly, a "mock" inoculation or "before fungal inoculation" should be performed to reveal the expression changes of the marker genes after fungal inoculation.

      This information was provided in the text and via the supplemental figures, but I assume we didn’t make it clear that these results and images were indeed specific control/mock experiments, and not some ‘general’ expression analysis. We have now tried to make this clearer, specifically in lines 192-194.

      Lines 172-174, the pictures are too small to see these details. The same for BIK1 (line 187).

      We have split up figure 3 into two separate figures (figures 3 and 4), to allow for them to be displayed larger, so that more details can be observed. Of course, it would also be helpful to do some confocal microscopy on specific regions of interest of these stereomicroscopic images to obtain high-resolution images of these regions, but, unfortunately, we did not reach this point in this project, before our team was disbanded, and we therefore only have the overview images to get a general idea of the responsiveness of the different reporters.

      Line 174-176, which results are these referring to? The same for line 200-203.

      We assume that this was not clear because we previously failed to make it clear that the control supplementary figures are from experimental controls/mock. We have reworded both paragraphs to, hopefully, explain it a bit better, and included the number of the supplementary figure that each refers to. It’s now in lines 212-215 and 237-242.

      This study provides a valuable collection of vectors/constructs for investigation of transcriptional dynamics of plant immunity genes and should attract broad interest of the plant immunity field.

      Thank you very much.

      The current study by Calabria et al., entitled "pGG-PIP: A GreenGate (GG) entry vector collection with Plant Immune system Promoters (PIP)," reported the development of a set of GreenGate-compatible entry plasmids that contain promoter sequences of a series of immunity-related genes. This tool enables live-cell observation of immune responses at a cellular resolution. Being compatible with many other GreenGate tools, it opens up a door toward simultaneous visualization of different but overlapping immune pathways and ultimately describes the 4D dynamics of plant immunity. It is more than expected that these constructs will be used by a wide range of researchers and contribute to the ultimate understanding of plant innate immunity.

      Thank you very much.

      It is exciting that the authors observed the marker expression by a fluorescent stereomicroscope. This allows for non-destructive observation of response over time, keeping the system gnotobiotic. However, it was partly disappointing that the authors did not take full advantage of this. It would have been much nicer if the authors had observed the infection process over time, such that one could tell when and where the response starts, and whether local and systemic reactions occur simultaneously or instead require local-to-systemic signal transduction. They indeed seem to have done such time-course observation (line 378) but did not provide the results. I am curious to know what the authors could have found from those experiments. It would also be a strong selling point of this method and is therefore highly encouraged.

      We absolutely agree that this temporal data would be valuable and interesting. So far, we always imaged the colonization sites in the root tips from the first day when they become visible, until the day when the entire root was colonized/dying. However, we only recorded the infection sites directly, and did not image the entire plants, and local as well as systemic responses. This is, of course, something that we would have liked to do, and planned to do in the future, but, so far, we have not gotten to that point. We also attempted to use the images of the infection sites that we have recorded over time to obtain information about disease progression, e.g., colonization speed of the fungus, but this data is not (yet) at a point where we feel confident that we have enough information to draw solid conclusions. So, while we absolutely agree that this kind of whole-plant imaging with both high spatial and temporal resolution must be the aim, at this point, unfortunately, we simply are not at that place yet.

      Immune responses are not always induction of expression but sometimes reduction. Some genes up-regulated in the first phase will also be down-regulated afterward in order to go back to the initial non-responding state. During such down-regulation, the expression of a fluorescence marker gene might not accurately reflect the real expression levels, because the translated proteins might stay longer even while its transcription is suppressed. To address this point, it is suggested that the authors observe the marker lines in the presence of a translation inhibitor, such as cycloheximide, and quantitatively analyze the dynamics of protein degradation when no new protein is synthesized.

      This is indeed an excellent point. Unfortunately, we have to first say that due to funding issues we are currently unable to do this experiment. However, we did include two things in the revised manuscript: First, we have put in a note that this is indeed a caveat of the system that must be acknowledged (lines 334-337). Second, we have included some information from a different study, which at least addresses this point to some degree. We have imaged the transcriptional response of the WRKY11 transcription factor in response to colonization by Fo5176, and in this case, we not only see a local upregulation next to the colonization site, but we see a complete switch in expression pattern. As part of this switch, WRKY11, which was expressed in all root tissues and cells in uninfected control experiments, is switched off in all tissues and cells except the vascular cells close to the infection site. So here, we indeed have a downregulation of the reporter. In these experiments, signal from the fluorescent WRKY11 reporter disappears from the cells within a day. As we imaged once per day, we can, unfortunately, not get more specific than this one-day window. The day before colonization of the tip, signal is seen in all tissues; one day later, if/when the vasculature is colonized in the tip, there is no weak/residual fluorescence left in the cells of the outer tissues. So we can at least state that we would probably also detect downregulation of expression, despite the protein lifetime. Importantly, all our imaging is done on a regular stereomicroscope, and thus, camera sensitivity is moderate. I could imagine that we may be able to detect some residual fluorescence with ultra-sensitive cameras at a spinning disc, or a sensitive detector at a laser-scanning microscope, but we have not tested this. We have added this information in lines 337-347. I apologize that we can’t add more information than this.

      It is remarkable that the authors managed to clone 75 promoter sequences. However, whether all promoters work as expected was not clearly assessed in the present study. Did the authors only transform plants with PEP1, PEP2, PEPR1, and PEPR2 marker constructs? How would they know that the other promoters also work appropriately? In terms of providing these constructs to the research community, it is needed to disclose to which extent the expression has been validated in planta and which promoter has not been assessed.

      This is indeed important information. We have not used the promoters in mutant complementation assays, and have added this caveat in lines 348-350.

    1. Reviewer #1 (Public Review):

      This paper provides valuable (and impressive) data on the geometry of cerebellar foliation among 56 species of mammals and gives novel insights into the evolution of cerebellar foliation and its relationship with the anatomy of the cerebrum. Thus far, the majority of the research on brain folding focuses on the cerebral cortex with little research on the cerebellum. The results from Heuer et al confirm that the evolution of the cerebellum and cerebrum follows a concerted fashion across mammals. Moreover, they suggest that both the cerebrum and cerebellum folding are explained by a similar mechanistic process.

      1. Although I found the introduction well written, I think it lacks some information or needs to develop more on some ideas (e.g., differences between the cerebellum and cerebral cortex, and folding patterns of both structures). For example, after stating that "Many aspects of the organization of the cerebellum and cerebrum are, however, very different" (1st paragraph), I think the authors need to develop more on what these differences are. Perhaps just rearranging some of the text/paragraphs will help make it better for a broad audience (e.g., authors could move the next paragraph up, i.e., "While the cx is unique to mammals (...)").

      2. Given that the authors compare the folding patterns between the cerebrum and cerebellum, another point that could be mentioned in the introduction is the fact that the cerebellum is convoluted in every mammalian species (and non-mammalian spp as well) while the cerebrum tends to be convoluted in species with larger brains. Why is that so? Do we know about it (check Van Essen et al., 2018)? I think this is an important point to raise in the introduction and to bring it back into the discussion with the results.

      3. In the results, first paragraph, what do the authors mean by the volume of the medial cerebellum? This needs clarification.

      4. In the results: When the authors mention 'frequency of cerebellar folding', do they mean the degree of folding in the cerebellum? At least in non-mammalian species, many studies have tried to compare the 'degree or frequency of folding' in the cerebellum by different proxies/measurements (see Iwaniuk et al., 2006; Yopak et al., 2007; Lisney et al., 2007; Yopak et al., 2016; Cunha et al., 2022). Perhaps change the phrase in the second paragraph of the results to: "There are no comparative analyses of the frequency of cerebellar folding in mammals, to our knowledge".

      5. Sultan and Braitenberg (1993) measured cerebella that were sagittally sectioned (instead of coronal), right? Do you think this difference in the plane of the section could be one of the reasons explaining different results on folial width between studies? Why does the foliation index calculated by Sultan and Braitenberg (1993) not provide information about folding frequency?

      6. Another point that needs to be clarified is the log transformation of the data. Did the authors use log-transformed data for all types of analyses done in the study? Write this information in the material and methods.

      7. The discussion needs to be expanded. The focus of the paper is on the folding pattern of the cerebellum (among different mammalian species) and its relationship with the anatomy of the cerebrum. Therefore, the discussion on this topic needs to be better developed, in my opinion (especially given the interesting results of this paper). For example, with the findings of this study, what can we say about how the folding of the cerebellum is determined across mammals? The authors found that the folial width, folial perimeter, and thickness of the molecular layer increase at a relatively slow rate across the species studied. Does this mean that these parameters have little influence on the cerebellar folding pattern? What mostly defines the folding patterns of the cerebellum given the results? Is it the interaction between section length and area? Can the authors explain why size does not seem to be a "limiting factor" for the folding of the cerebellum (for example, even relatively small cerebella are folded)? Is that because the 'white matter' core of the cerebellum is relatively small (thus more stress on it)?

      8. One caveat or point to be raised is the fact that the authors use the median of the variables measured for the whole cerebellum (e.g., median width and median perimeter across all folia). Although the cerebellum is highly uniform in its gross internal morphology and circuitry's organization across most vertebrates, there is evidence showing that the cerebellum may be organized in different functional modules. In that way, different regions or folia of the cerebellum would have different olivo-cortico-nuclear circuitries, forming, each one, a single cerebellar zone. Although it is not completely clear how these modules/zones are organized within the cerebellum, I think the authors could acknowledge this at the end of their discussion, and raise potential ideas for future studies (e.g., analyse folding of the cerebellum within the brain structure - vermis vs lateral cerebellum, for example). I think this would be a good way to emphasize the importance of the results of this study and what are the main questions remaining to be answered. For example, the expansion of the lateral cerebellum in mammals is suggested to be linked with the evolution of vocal learning in different clades (see Smaers et al., 2018). An interesting question would be to understand how foliation within the lateral cerebellum varies across mammalian clades and whether this has something to do with the cellular composition or any other aspect of the microanatomy as well as the evolution of different cognitive skills in mammals.

    1. Considerate

      My reflections here build on Lino Pertile’s 2010 essay, ‘L’inferno, il lager, la poesia’. Pertile notes the profound correspondence between the opening poem of the book (OC I, 139) and this chapter. He points out how the main theme of Levi’s book, the dehumanising experience in the Lager, based on the annihilation of people’s identity, is expressed in the poem and resurfaces explicitly again in the chapter dedicated to Dante’s Ulysses. The key term revealing the correspondence of themes and intentions is ‘Considerate [consider]’, used twice in Levi’s poem (‘Consider if this is a man | … | Consider if this is a woman’) and rooted in the memory of Dante’s famous tercet where Ulysses addresses his crew as they sail towards the horizon of their last journey beyond the pillars of Hercules: ‘Considerate la vostra semenza: | fatti non foste a viver come bruti, | ma per seguir virtute e canoscenza’ (Inf. 26, 118-20 and OC I, 228).

      There are many other correspondences between the chapter of Ulysses and the opening poem, besides the ‘Considerate’, and they are profound and filtered through the theme of memory, an eminently Dantean theme: the urgency to fix in the memory itself what is or will be necessary to tell, or the urgency to express and recount what is deposited in memory. Indeed, for Levi, the memory of each individual person contains that person’s humanity.

      Memory is immediately activated as Primo and Jean exit the underground gas tank (‘He [Jean] climbed out and I followed him, blinking in the brightness of the day. It was warm [tiepido] outside; the sun drew a faint smell of paint and tar from the greasy earth that made me think of [mi ricordava] a summer beach of my childhood'). Temporarily escaping hell by means of a ladder (a sort of Dantesque ‘natural burella’), it is the tiepido sun and a characteristic smell that evoke the childhood memory and that at the same time the reader cannot avoid connecting to the tiepide case of the initial poem (‘You who live safe | in your heated houses [tiepide case]’ [my emphasis]). It is then around the memory ‘of our homes, of Strasbourg and Turin, of the books we had read, of what we had studied, of our mothers’ that another theme in the chapter coalesces, the theme of friendship (‘He and I had been friends for a week’), a theme that had already emerged in a more general connotation in the opening poem (‘visi amici’). Warmth, friendship (visi amici…Jean), the kitchens as destination for Primo and Jean’s walk (the walk from the tank with the empty pot is ‘the ever welcomed opportunity of getting near the kitchens’, not for that hot food [cibo caldo] evoked in the poem, but for the soup of the camp, an alienating incarnation of Dantesque ‘pane altrui’ whose various names are dissonant). During the respite of the one hour walk from the tank to the kitchens, the intermittent memory of Dante’s canto emerges as if from an underground consciousness, the memory of Inferno as a partial and imperfect mirror of the human condition in the Lager, Ulysses as poetic memory, a sudden epiphany of a semenza, a seed, of humanity that the Lager is made to suppress, and Primo’s wondering in the face of this sudden internal revelation of still possessing an intact humanity. 
      Primo’s memory of his home resurfaces as if springing from the memory of Dante’s text: the ‘montagna bruna’ of Purgatory is reflected in the memory of ‘my mountains, which would appear in the evening dusk [nel bruno della sera] when I returned from Milan to Turin!' But the real, familiar landscape is too heartbreaking a memory of ‘sweet things cruelly distant’, one of those hurtful thoughts, ‘things one thinks but does not say’. There is an epiphanic memory then, the poetic memory that surfaces during the walk and that reveals to Primo that he still is a man, a memory to which he clings despite the sense of his own audacity (‘us two, who dare to talk about these things with the soup poles on our shoulders’); there is also a more intimate memory, equally pulsating with life and humanity - but dangerous, because it makes Primo vulnerable to despair, threatening his own survival in the camp.

      The urgent need to remember Dante’s verses in this chapter develops the theme of memory, which has been central from the opening poem. In Levi’s poem, though, memory is perceived from a different angle: the readers (who live safe…) must honour that memory and transmit it as an imperative testimony of what happened in the concentration camp from generation to generation, testifying to the suffering of the man and the woman ‘considered’ in the poem. This is a memory to be carved in one’s heart, which must accompany those who receive it in every action and in every moment of each day like a prayer. Not coincidentally the poem follows the text of the most fundamental prayer of Judaism, the Shemà Israel, which is read twice a day, a memory to be passed on to one’s own children, a responsibility which is a sign of one’s humanity. The commandment to remember of the opening poem (‘I consign these words to you. | Carve them into your hearts') issues a potential curse to the reader, threatening the destruction of what most fundamentally characterises their humanity - home, health, children: ‘Or may your house fall down, | May illness make you helpless, | And your children turn their eyes from you’. Finally, Primo’s act of remembering during the walk to the kitchens is submerged by the Babelic soup (‘Kraut und Rüben…cavoli e rape…Choux et navets…Kàposzta és répak…Until the sea again closed – over us’) and yet the memory of it becomes part of his testimony in such a central chapter of the book written after surviving the Shoah. If the memory of Dante’s verses contributed to Primo’s faith in his own humanity and his psychological and physical survival in the camp, he then accomplishes the commandment of memory and his responsibility as a man through his own writing.

      CS

    2. non lasciarmi pensare alle mie montagne

      Very often, when we think about ‘Il canto di Ulisse’, we tend to recall only the most famous pages in which Levi tries to remember Dante’s canto. The depth and sense of urgency of the Ulyssean passages are so overwhelming and passionate that they may distract us from other elements in the chapter. However, if we go back to the text and read it closely, we cannot avoid noticing that, after a brief opening in which Levi introduces Pikolo and narrates how he came to be Pikolo’s ‘fortunate’ chaperone to collect the soup for the day, ‘Il canto di Ulisse’ also dwells quite significantly on a moment of domestic memories. While going to the kitchens, Levi writes: ‘Si vedevano i Carpazi coperti di neve. Respirai l’aria fresca, mi sentivo insolitamente leggero’. This is the first moment in the chapter in which Levi refers to the mountains as something that revitalises him and makes him feel fresh and light, both physically and mentally.

      This moment foreshadows another, also in this chapter, when Levi goes back to his mountains, those close to Turin, and compares them to the mountain that the protagonist of Dante’s canto, Ulysses, encounters just before his shipwreck with his companions:

      ... Quando mi apparve una montagna, bruna

      Per la distanza, e parvemi alta tanto

      Che mai veduta non ne avevo alcuna.

      Sì, sì, ‘alta tanto’, non ‘molto alta’, proposizione consecutiva. E le montagne, quando si vedono di lontano... le montagne... oh Pikolo, Pikolo, di’ qualcosa, parla, non lasciarmi pensare alle mie montagne, che comparivano nel bruno della sera quando tornavo in treno da Milano a Torino! Basta, bisogna proseguire, queste sono cose che si pensano ma non si dicono. Pikolo attende e mi guarda. Darei la zuppa di oggi per saper saldare ‘non ne avevo alcuna’ col finale.

      The significance of the mountains in Levi’s narration is confirmed in this passage. For him, the mountains represent his experience of belonging, his youthful years, and his work as a chemist – the job he was doing when he commuted by train from Turin to Milan. At the same time, Levi’s own memories of the mountains intertwine and overlap with another mountain, Dante’s Mount Purgatory. Here, a deep and perhaps not fully conscious intertextual game starts to emerge and to characterise Levi’s writing. The lines that Levi does not remember are these (compare, on the Dante page):

      Noi ci allegrammo, e tosto tornò in pianto,

      ché de la nova terra un turbo nacque,

      e percosse del legno il primo canto.

      For Dante’s Ulysses, Mount Purgatory signifies the final moment of his adventure and his desire for knowledge. The marvel and enthusiasm that Ulysses and his company feel when they see the mountain is suddenly transformed into its contrary. From the mountain, a storm originates that will destroy the ship and swallow its crew: ‘Tre volte il fe’ girar con tutte l’acque, | Alla quarta levar la poppa in suso | E la prora ire in giù, come altrui piacque’. Dante’s Mount Purgatory, so majestic and spectacular, represents the end of any desire for knowledge that aims to find new answers to and interpretations of human existence in the world without God’s word.

      Going back to Levi’s text, we find that, instead, in a kind of reverse overlapping between his image and that of Ulysses, the image of the mountain of Purgatory suggests to Levi a very different set of thoughts that, although seemingly and similarly overwhelming, opens up new interpretations: ‘altro ancora, qualcosa di gigantesco che io stesso ho visto ora soltanto, nell’intuizione di un attimo, forse il perché del nostro destino, del nostro essere oggi qui’. For a moment, it is almost as if Levi, a new Dantean Ulysses in a new Inferno, stands in front of Mount Purgatory and forgets the terzine and the shipwreck. Maybe Levi cannot or does not want to remember those terzine because the mountain in Purgatory represents something very different for him than for Dante’s Ulysses. Levi’s view of the mountain does not lead to a moment of recognition of sin, as it does in Dante’s Ulysses. For him, the mountain, like his mountain range, is the gateway to knowledge, enrichment, and illumination and to a world that lies beyond the imposed limits of traditional, constricting, and distorted views and that awaits discovery (‘qualcosa di gigantesco che io stesso ho visto ora soltanto’). Something about and beyond the Lager.

      To better understand how the mountains are central in ‘Il canto di Ulisse’, we have to remember that Levi’s view of the mountains strongly depends on his anti-Fascism, which he expressed particularly vigorously in two moments of his life: during his months in the Resistance, just before he was captured and sent to Fossoli, and, even more intensely, during the adventures of his youth, when he was a free young man who enjoyed climbing the mountains surrounding Turin. As Alberto Papuzzi has suggested, ‘le radici del suo rapporto con la montagna sono ben piantate in quella stagione più lontana: radici intellettuali di cittadino che cercava sulla montagna, nella montagna, suggestioni e risposte che non trovava nella vita, o meglio nell’atmosfera ispessita di quella vita torinese, senza passato e senza futuro’ (OC III, 426-27). Indeed, reports Papuzzi, Levi confirms that:

      Avevo anche provato a quel tempo a scrivere un racconto di montagna […]. C’era tutta l’epica della montagna, e la metafisica dell’alpinismo. La montagna come chiave di tutto. Volevo rappresentare la sensazione che si prova quando si sale avendo di fronte la linea della montagna che chiude l’orizzonte: tu sali, non vedi che questa linea, non vedi altro, poi improvvisamente la valichi, ti trovi dall’altra parte, e in pochi secondi vedi un mondo nuovo, sei in un mondo nuovo. Ecco, avevo cercato di esprimere questo: il valico.

      The heart of that epic story made its way into the chapter ‘Ferro’ in Il sistema periodico. The discovery of this (brave) new world, ‘mondo nuovo’, is an integral part and a direct achievement of Levi’s experience in the mountains. The mountains open a new understanding and a new perspective on the world.

      Something that escapes common understanding is revealed through the experience of the mountains, both in Levi’s memories of his youth and in his literary recounting of Auschwitz. Reciting Dante in ‘Il canto di Ulisse’ is therefore not only an intertextual exercise for Levi. Only by inserting Levi’s literary references in the complexity of his own experience – before, during, and after Auschwitz – can we fully capture the depth of his reflections. Levi mentally and metaphorically brought to Auschwitz not only Dante but also his ‘metafisica dell’alpinismo’. Together, they contributed to his attempt to come to terms with that reality.

    1. Author Response:

      The following is the authors' response to the original reviews.

      Reviewer #1 (Public Review):

      […] Overall, the authors build a convincing case for TEs being an important source of regulatory information. I don't have any issues with the analysis, but I am concerned about the sweeping claims made in the title. Once you get rid of eQTLs that could be altered by either SNPs or TIPs and include only those insertions that show strong evidence of selection, the number of genes is reduced to only 30. And even in those cases, the observed linkage is just that, not definitive evidence for the involvement of TEs. Although clearly beyond the scope of this analysis, transgenic constructs with the TEs present or removed, or even segregating families, would have been far more convincing. 

      We note that the referee thinks that we "built a convincing case for TEs being an important source of regulatory information". This is what we wanted to convey in the title, where we were careful not to claim that TEs are the most important contributor to gene expression variability in rice populations. However, we agree with the referee that the title could be improved to better describe the results presented. We have therefore changed the title to "Transposons are an important contributor to gene expression variability under selection in rice populations".

      With respect to demonstrating causality by removing or introducing the TEs, this is indeed work we plan to do but that, as stated by the referee, is beyond the scope of this analysis.

      The fact that many of the eQTL-TIPs were relatively old is interesting because it suggests that selection in domesticated rice was on pre-existing variation rather than new insertions. This may strengthen the argument because those older insertions are less likely to be purged due to negative effects on gene expression. Given that the sequence of these TEs is likely to have diverged from others in the same family, it would have been interesting to see if selection in favor of a regulatory function had caused these particular insertions to move away from more typical examples of the family. 

      The TIP-eQTLs belong to different classes, superfamilies and families, and the number of TIP-eQTLs of the same family is too small to deduce sequence commonalities (4.6 TIP-eQTLs/family in indica and 3.6 TIP-eQTLs/family in japonica). On the other hand, the effect of TIPs on expression can be positive or negative (we actually show that it is often negative). In the latter case, a plausible scenario would be that the insertion inactivates a promoter element, and in this case it would be the insertion itself, and not the actual sequence of the TE, that would be selected.

      Also, previous work done in our lab has shown that TEs can amplify and mobilize transcription factor binding sites that are bound by the TF even when they are not close to a gene and therefore probably not directly affecting gene expression (Hénaff et al., 2014, The Plant Journal). In that case, the sequence of the eQTL TEs and those that are far away from genes will not necessarily differ.

      Reviewer #2 (Public Review):

      In this manuscript, Castanera et al. investigated how transposable elements (TEs) altered gene expression in rice and how these changes were selected during the domestication of rice. Using GWAS, the authors found many TE polymorphisms in the proximity of genes to be correlated to distinct gene expression patterns between O. sativa ssp. japonica and O. sativa ssp. indica and between two different growing conditions (wet and drought). Thereby, the authors found some evidence of positive selection on some TE polymorphisms that could have contributed to the evolution of the different rice subspecies. These findings are underlined by some examples, which illustrate how changes in the expression of some specific genes could have been advantageous under different conditions. In this work, the authors manage to show that TEs should not be ignored when investigating the domestication of rice as they could have played an important role in contributing to the genetic diversity that was selected. However, this study stops short of identifying causations as the used method, GWAS, can only identify promising correlations. Nevertheless, this study contributes interesting insights into the role TEs played during the evolution of rice and will be of interest to a broader audience interested in the role TEs played during the evolution of plants in general.

      We agree with the referee that the results presented do not allow conclusions on causality, and we have been careful not to claim that they do in the manuscript. We plan to add or remove TEs using CRISPR/Cas9 approaches to address this, but, in line with referee 1's comment, we think this is beyond the scope of this analysis.

      ---------- 

      Reviewer #1 (Recommendations For The Authors): 

      Everything that I need to say is provided in the public portion of my review. 

      Reviewer #2 (Recommendations For The Authors): 

      Major concerns:

      1. The authors compare the proportion of the variance explained by the most significant TIP and SNP on the observed eQTLs associated with TIPs and SNPs. Thereby the authors conclude that TIPs explain more variance than SNPs. If I am not mistaken the GWAS was run separately for TIPs and SNPs, however, I am wondering if running the GWAS on the combined TIP and SNP dataset might be the better way to compare the variance explained by TIPs and SNPs on gene expression differences. It would be nice to see if these results also hold true if a TIP and SNP combined dataset is used as the most significant marker in a GWAS might not be the causal mutation but might just be linked to the causal mutation. Further in the TIP dataset, the number of markers is only 45k and in the SNP dataset, it is 1 000k, which could bias the GWAS toward finding markers that explain more of the variation in the dataset with fewer markers.

      We addressed the reviewer's concern using two complementary approaches, whose results are described in the text (lines 119-121) and in the new Figure 1-figure supplement 1.

      First, we addressed the concern regarding the independent GWAS for TIPs and SNPs vs a combined strategy. For this, we built new japonica/indica genotype matrices combining the TIP and SNP matrices and ran eQTL mapping again. Using the same strategy (association + FDR adjustment), we found 100% of the previous TIP-eQTLs and 99% of the previous SNP-eQTLs. We repeated the same analysis (proportion of expression variance), and the results were mostly the same (Figure 1-figure supplement 1A).

      Second, we addressed the two concerns (combined genotypes and different numbers of TIP and SNP markers) using a single approach. SNP matrices were LD-pruned using r2 = 0.9 and then subsampled to the exact number of TIPs (Indica = 30,396, Japonica = 25,168). We verified that these SNPs covered the 12 rice chromosomes well. SNP and TIP genotypes were then merged into a single matrix, and eQTL mapping was repeated for each of the subspecies and conditions using the same parameters as in the previous version of the manuscript. 100% of the previously reported TIP-eQTL associations were found using this new approach. Nevertheless, we found a substantial drop in sensitivity for the SNP-eQTLs (only 15-20% of the previous associations were detected), possibly due to the strong reduction in the number of SNPs (> 95%), which results in a much lower number of markers at < 5 kb from genes. We repeated the analysis of Figure 1D and observed very similar results (Figure 1-figure supplement 1D). A large proportion of TIP-eQTL associations do not coincide with SNP-eQTLs (74% in indica, 83% in japonica), indicating that TIP-eQTL mapping is complementary to SNP-eQTL mapping as it uncovers additional associations (note that in this case the overlap between TIP-eQTLs and SNP-eQTLs is lower than in the previous analysis due to the lower sensitivity of SNP-eQTL mapping using fewer markers). In the cases where both a TIP and a SNP coincide as eQTL, TIPs explained slightly more variance than SNPs in both indica and japonica (in 54% of the cases TIP variance > SNP variance).
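
      The marker-balancing step described in this response can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function name, the dictionary-based genotype representation and the fixed random seed are all hypothetical, and LD pruning is assumed to have been done beforehand with a dedicated tool.

```python
import random


def subsample_and_merge(snp_genotypes, tip_genotypes, seed=0):
    """Subsample SNP markers to the number of TIP markers, then merge both sets.

    Both inputs map a marker ID to a list of per-accession genotypes.
    The SNP input is assumed to be LD-pruned already (not done here).
    """
    n_tips = len(tip_genotypes)
    if len(snp_genotypes) < n_tips:
        raise ValueError("fewer SNPs than TIPs; cannot subsample")
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    kept = rng.sample(sorted(snp_genotypes), n_tips)
    # Tag each marker with its type so SNP and TIP IDs cannot collide.
    merged = {("SNP", marker): snp_genotypes[marker] for marker in kept}
    merged.update({("TIP", marker): g for marker, g in tip_genotypes.items()})
    return merged


# Toy data (hypothetical marker names and genotypes).
snps = {f"snp{i}": [0, 1, 0] for i in range(10)}
tips = {"tip1": [1, 0, 0], "tip2": [0, 0, 1]}
merged = subsample_and_merge(snps, tips)
```

      After merging, both marker types contribute the same number of rows, so a difference in detected associations can no longer be attributed simply to marker count.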

      2. Line 146 to 152: in this section, the authors describe overlaps between TIP-eQTLs in two different growth conditions, however, in the text it is not mentioned if the TIPs have the same effect on gene expression in the two conditions or if the gene expression is up-regulated in one condition but down-regulated in the other. This information would be interesting to have here, especially as the authors go on to say that only a small number of TIP-eQTLs are stress-specific. The same comment also goes for the eQTL overlap described on lines 167 to 170. 

      We checked the effect type (positive or negative) of TIP-eQTLs in both scenarios (associations shared between wet/dry conditions, and associations shared between subspecies). In both cases, 100 % of the shared TIP-eQTLs have the same effect type in the two conditions or subspecies. We have updated the text accordingly (Lines 55-157 and Lines 179-181)

      3. Lines 192 to 196: the authors mention that the frequency of non-eQTL-TIPs was at the same frequency in indica and japonica, which is in contrast to eQTL-TIPs. However, on line 132 it is mentioned that eQTL-TIPs were overrepresented in 1 kb regions upstream of genes. Hence, is the pattern of the frequency of non-eQTL-TIPs being at the same frequency in indica and japonica also observed in the 1 kb regions upstream of genes and/or if the distribution of non-eQTL-TIPs is matched to one of the eQTL-TIPs? Or is this pattern driven by non-eQTL-TIPs far away from genes?

      We checked the frequencies of TIPs within 1 kb upstream of genes and found that the general pattern is maintained, with the frequencies of non-eQTL TIPs being more correlated than those of TIP-eQTLs. We have included this information (lines 204-206) and added a new supplementary file (Figure 2-figure supplement 2).

      4. In the discussion, the authors could briefly discuss how linked selection affecting TIPs could contribute to the observed results. After reading the second example in the result section where one of the example TIPs (TIP_50059) is found on the Hap B which contains "some additional structural differences" (line 290), I was left wondering how much of the increase in TIP frequency can be attributed to genetic hitchhiking? And how much of the results could be caused by linked selection, especially when considering that structural variations are not included in the GWAS analyses. 

      We agree with the referee that some of the TIP eQTLs described here might not be the actual cause of expression variability (e.g., a TIP linked to the causal mutation), although we cannot know the exact fraction. This is stated in several places in the results and discussion sections. However, the fact that TIPs tend to explain more variance than SNPs and that TIP eQTLs, but not SNP eQTLs, tend to concentrate in the upstream proximal region of genes where most transcription regulatory sequences are located (Figure 1), suggests that TIP eQTLs could be causal more frequently than SNP eQTLs. We revised the text to ensure that we convey this message appropriately.

      Minor comments: 

      • Lines 80 to 83: the description of the rice phylogeny should be moved to the introduction. 

      Done (Lines 68-72)

      • Line 177 to 186: It was unclear to me if the authors checked whether the ancestral rice population lacked the TIPs described in this section as recently inserted in the indica and japonica ssp. It would be nice to add this information to this section.

      Thanks to the referee's comment we noticed an imprecision in the text. The approximate 1/3 of subspecies-specific TIP-eQTLs refers to the TIPs at 3% MAF (i.e., some of these insertions could be present at > 3% in indica, but at < 3% MAF in japonica). We now indicate only the TIPs that are truly specific to one of the two subspecies (frequency is zero in the other) and looked for their presence in rufipogon:

      59 insertions are indica-specific. Of those, 33 are present in rufipogon.

      21 insertions are japonica-specific. Of those, 5 are present in rufipogon.

      We have incorporated this information in the manuscript (Lines 185-189). The species-specific TIPs are also available in the Supplementary File 3.

      • Line 353: "have two of more TIPs" should be "two or more" 

      Done (Line 369)

      • Figure 1D: Using a square layout instead of a rectangle layout for the plot will make it easier to interpret. 

      Done.

    1. wide variety of methods for any given project.

      Having a variety of methods to get to a solution is extremely important, as a lot of people can come at a problem from different angles and solve it differently. However, this can breed confusion as to which way is the "right" way and which way is the "wrong" way. It also may seem like their way might not work, but we should do a thorough examination from their side to see why they think it might work and maybe square that up with the harsh reality.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers


      Manuscript number: RC-2023-01932

      Corresponding author(s): Dennis KAPPEI

      We would like to thank all reviewers for their recognition of our approach and the quality of our work as well as their constructive criticism.

      Reviewer #1

      Reviewer #1: The manuscript by Yong et. al. describes a comparison of various chromatin immunoprecipitation-mass spectrometric (ChIP-MS) methods targeting human telomeres in a variety of systems. By comparing antibody-based methods, crosslinkers, dCas9 and sgRNA targeted methods, KO cells and various controls, they provide a useful perspective for readers interested in similar experiments to explore protein-DNA interactions in a locus-specific manner.

      Response: We would like to thank the reviewer for the feedback and the appreciation of our work.

      Reviewer #1: While interesting, I found it somewhat difficult to extract a clear comparison of the methods from the text. It was also difficult to compare as data and findings from each method was discussed in its own context. Perhaps it is not in their interest to single out a specific method and it is indeed true that there are caveats with each of the methods.

      Response: Across our manuscript we have established one single workflow, for which we present some technical comparisons (e.g. using single or double cross-linking in Fig. 2a/b), technical recommendations such as the use of loss-of-function controls (e.g. Fig. 1c vs. Fig. 2a and Extended Data Fig. 3g vs. 3i) and an application to unique loci using dCas9 (Fig. 3f). Based on the suggestions below, we believe that we can communicate our approach more clearly.

      Reviewer #1: I think the manuscript would be of interest but I believe that there are remaining questions that need to be addressed before publication. In particular, I found it difficult to reconcile the discrepancy in protein IDs between most experiments vs. the WT/KO experiment in Fig 2. The authors make a big deal about the importance of the KO control but I think the fewer proteins identified there may be experiment-specific and not general to the KO system. I ask that this be investigated more carefully by the authors in their revisions.

      Response: We thank the reviewer for highlighting this point. We do not think that the ChIP-MS comparison between U2OS WT and ZBTB48 KO clones (Fig. 2a) has experiment-specific caveats. Instead the KO controls as well as the dTAGV-1 degron system for MYB ChIP-MS (Extended Data Fig. 3) reveal antibody-specific off-targets, which are indeed false-positives. Please see below for further details.

      Reviewer #1: Ln 57: What is "standard double cross-linking ChIP reactions" in this context? Is it the two different crosslinkers? The two proteins? The reciprocal IPs of one protein, and blotting for another? It's not clear here or from Extended Fig 1A. Upon further reading, it seems to pertain to the two crosslinkers - if so, the authors should briefly describe their workflow to help readers.

      Response: As the reviewer correctly concludes, we indeed intended to highlight the use of two separate crosslinkers (formaldehyde/FA and DSP). This combination is important as illustrated in the side-by-side comparison of Fig. 2a and Fig. 2d. Here, we performed ZBTB48 ChIP-MS in five U2OS WT and five U2OS ZBTB48 KO clones. While the bait protein ZBTB48 was abundantly enriched in both experiments, in the samples that were fixed with formaldehyde only we lose about half of the telomeric proteins that are known to directly bind to telomeric DNA independent of ZBTB48 and all of their interaction partners. For instance, while the FA+DSP reaction in Fig. 2a enriched all six shelterin complex members, the FA only reaction in Fig. 2d only enriches TERF2. These data suggest that the use of a second cross-linker helps to stabilise protein complexes on chromatin fragments. This is a critical message of our manuscript as ChIP-MS only truly lives up to its name if we can enrich proteins that genuinely sit on the same chromatin fragment without protein interactions to the bait protein. We will expand on this in both the text and our schematics in Fig. 1a and 3a to make this clearer for the readers.

      Reviewer #1: Ln 95: It is surprising and quite unclear to me why it is that the WT ZBTB48 U2OS pulldown in Fig 1B shows 83 hits for the WT vs Ig control experiment but 27 hits for the WT vs KO condition in Fig 2A. The two WT experiments have the same design and reagents, shouldn't they be as close as technical replicates and provide very similar hits?

      The authors seem to make the claim that most of the 'extra' proteins in WT vs Ig are abundant and false positives, but if this is so, shouldn't they bind non-specifically to the beads and be enriched equally in Ig control and ZBTB48 WT IPs?

      Response: We again thank the reviewer for raising this point and the need to explain in more detail why we interpret the difference between 83 hits (anti-ZBTB48 antibody vs. IgG; Fig. 1c) and 27 hits (anti-ZBTB48 antibody used in both U2OS WT and ZBTB48 KO cells; Fig. 2a) primarily as false-positives. The KO controls in Fig. 2a allow us to keep the ZBTB48 antibody as a constant variable while instead comparing the presence (WT) or absence (KO) of the bait protein. Hence, proteins that were enriched in the IgG comparison in Fig. 1c but that are lost in the WT vs. KO comparison in Fig. 2a are likely directly (or indirectly) recognised by the ZBTB48 antibody, akin to off-targets of this particular reagent. In a Western blot this would be equivalent to seeing multiple bands at different molecular weights with only the band belonging to the protein-of-interest disappearing in KO cells. To illustrate this we would like to refer to Extended Data Fig. 2, in which we have replotted the exact same data from Fig. 2a and additionally highlighted the proteins that were enriched in the IgG comparison in Fig. 1c. 46 proteins (in pink) are indeed quantified in the WT vs. KO comparison, but these proteins are found below the cut-offs (and most of them with very poor fold changes and p-values). In contrast to the other several hundred proteins common between both experiments that can be considered common background non-specifically bound to the protein G beads, these 46 proteins represent antibody-specific false-positives.
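
      The classification logic in this response can be expressed as a simple set operation. The sketch below is an illustration only: the function name is hypothetical, and the OFF_TARGET_* and BACKGROUND_* entries are placeholder names rather than proteins reported in the manuscript.

```python
def antibody_specific_candidates(igg_hits, lof_hits, lof_quantified):
    """Return proteins that pass the cut-offs against IgG, are still
    quantified in the loss-of-function comparison, but no longer pass
    its cut-offs (candidate antibody-specific false positives)."""
    return (set(igg_hits) & set(lof_quantified)) - set(lof_hits)


# Toy data; OFF_TARGET_* and BACKGROUND_* are placeholder names.
igg_hits = {"ZBTB48", "TERF2", "OFF_TARGET_1", "OFF_TARGET_2"}
lof_hits = {"ZBTB48", "TERF2"}
lof_quantified = {"ZBTB48", "TERF2", "OFF_TARGET_1", "BACKGROUND_1"}

candidates = antibody_specific_candidates(igg_hits, lof_hits, lof_quantified)
```

      In this toy example only OFF_TARGET_1 is returned: OFF_TARGET_2 was not quantified in the second experiment, so nothing can be concluded about it, and BACKGROUND_1 was never a hit against IgG.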

      The above consideration is not unique to ChIP-MS as illustrated by the Western blot example. We also do not claim novelty on the experimental logic, e.g. pre-CRISPR in 2006 Selbach and Mann demonstrated the usefulness of RNAi controls in immunoprecipitations (IPs) (PMID: 17072306). However, our data suggest that ChIP-MS is particularly vulnerable to this type of false positive given that the approach requires (double-)cross-linking to sufficiently stabilise true-positives on the same chromatin fragment.

      To supplement the WT vs. ZBTB48 KO comparison, we had included a second experiment in the manuscript that illustrates the same point even more dramatically. First, KO controls are very clean in principle, but they might come with their own caveats if e.g. the expression levels between WT and KO samples differ greatly. This might create the situation that the reviewer hinted at, i.e. differential expression of abundant proteins that would stick to the beads in proportion to their expression levels, resulting in “fold enrichments”. The resulting false positives could e.g. be controlled by matched expression proteomes. For ZBTB48 we have previously measured this (PMID: 28500257) and demonstrated that only a small number of genes are differentially expressed (~10) and hence we can interpret the WT vs. ZBTB48 KO comparison quite cleanly. However, for other classes of proteins such as transcription factors that regulate a large number of genes, E3 ligases etc. this might present a more serious concern. Therefore, we extended our loss-of-function comparison to such a transcription factor, MYB, by using the dTAGV-1 degron system. Importantly, the MYB antibody has been used in previous work for ChIP-seq applications (e.g. PMID: 25394790). Here, instead of the 186 hits in the MYB vs. IgG comparison, we only detect 9 hits when using the same MYB antibody in control-treated and dTAGV-1-treated cells (upon 30 min of treatment only). Again, similar to the WT vs. ZBTB48 KO comparison, 180 proteins are quantified in the DMSO vs. dTAGV-1 comparison, but these proteins fall below the cut-offs (Extended Data Fig. 3g vs. 3i). We believe that this quite drastically illustrates how vulnerable ChIP-MS data is to large numbers of false-positives. This is not only a technical consideration as such datasets are frequently used in downstream pathway/gene set enrichment analyses etc. Such large false discovery rates would obviously lead to error-carry-forward and additional (unintended) misinterpretations. We will carefully expand our textual description across the manuscript to make these points much clearer. In addition, we will move the previous Extended Data Fig. 3 into the main manuscript to more clearly highlight this important point.

      Reviewer #1: Volcano plots in Figs 1, 2, and Suppl. Tables etc: Are the plotted points the mean of 5 replicates? Was each run normalized between the replicates in each group, for e.g. by median normalization of the log2 MS intensities? This does not appear to be the case upon inspection of the Suppl Tables. Given the variability in pulldown efficiency, gel digest and peptide recovery, this would certainly be necessary.

      Response: All volcano plots are indeed based on 4-5 biological replicates (most stringently in the WT vs. KO comparisons in Fig. 2, each based on 5 independent WT and 5 independent ZBTB48 KO single cell clones). The x-axis of each volcano plot represents the ratio of mean MS1-based intensities between both experimental conditions in log2 scale. However, precisely to account for the variation that the reviewer highlighted we did not base our analysis on raw MS1 intensities but we used the MaxLFQ algorithm (PMID: 24942700) as part of the MaxQuant analysis software (PMID: 19029910) for genuine label-free quantitation across experimental conditions and replicates. In this context, we would also like to refer to a related comment by reviewer #2 based on which we will now add concordance information for each replicate (heatmaps for Pearson correlations and PCA plots). We will improve this both in the text and methods section accordingly.
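
      As a rough illustration of the x-axis quantity described here, the log2 ratio of mean intensities for one protein could be computed as below. This sketch assumes the intensities have already been normalised (for example by MaxLFQ, which is not reimplemented here); the function name and the toy values are hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(target_intensities, control_intensities):
    """log2 of the ratio of mean (already normalised) intensities between
    the target condition and the control condition for one protein."""
    return math.log2(mean(target_intensities) / mean(control_intensities))


# Toy values for one protein across five replicates per condition.
target = [8, 8, 8, 8, 8]
control = [2, 2, 2, 2, 2]
fold_change = log2_fold_change(target, control)  # 2.0, i.e. a 4-fold enrichment
```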

      Reviewer #1: Ln 125: The authors make the claim that the ChIP-MS experiments are inherently noisy, with examples from WT cells, dTAG system and IgG controls. This is likely the case, yet their experiments with WT vs KO cells do not identify as many proteins overall. I find this inconsistency somewhat unclear and does not seem to match the claim of ChIP-MS experiments and crosslinking adding to non-specificity. Can the authors add the total number of identified proteins in each volcano plot, for easier reference?

      Response: The number of identified proteins does not vary greatly between matched IgG and loss-of-function comparisons and for instance the single cross-linking (FA only) experiment in Fig. 2c has the largest number of quantified proteins among all ZBTB48 IPs. But we will of course add the requested information to all plots.

      Reviewer #1: I think the manuscript is of interest as it provides important benchmarks for ChIP-proteomics experiments. I believe that there are remaining questions that need to be addressed before publication. In particular, I found it difficult to reconcile the discrepancy in protein IDs between most experiments vs. the WT/KO experiment in Fig 2. The authors make a big deal about the importance of the KO control but I think the fewer proteins identified there may be experiment-specific and not general to the KO system. I ask that this be investigated more carefully by the authors in their revisions.

      Response: We would like to thank the reviewer for recognising our work as a source for important benchmarks for ChIP-MS experiments. We hope that with a more detailed description and discussion the highlighted aspects will be more clearly communicated. We originally conceived our manuscript as a short report and now realised that some of the information became too condensed and might therefore benefit from more extensive explanations.

      Reviewer #2

      Reviewer #2: Summary: In this manuscript, Yong and colleagues have introduced an optimized technique for studying actors on chromatin in specific regions with a localized approach thanks to revisited ChIP-mass spectrometry (MS) with label-free quantitation (LFQ). The authors exhibited the utility of their approach by demonstrating its effectiveness at telomeres from cell culture (human U2OS cells) to tissue samples (liver, mouse embryonic stem cells). As a proof of concept, this technique was tested by the authors with proteins from the shelterin complex specific to telomeres (TERF2 and ZBTB48), transcription factors (MYB), and through dCas9-driven locus-specific enrichment. Notably, the authors created a U2OS dCas9-GFP clone and then introduced sgRNAs to target either telomeric DNA (sgTELO) or an unrelated control (sgGAL4). The cells expressing sgTELO exhibited a significant localization of telomeres and an enriched amount of telomeric DNA in ChIP with dCas9. They also found the proteins previously identified as known to be enriched at telomeres (for example, the 6 shelterin members).

      Moreover, the authors illustrated the importance of double crosslinking (formaldehyde (FA) and dithiobis(succinimidyl propionate) (DSP)) in ChIP-MS. Their data also demonstrated that ChIP-MS is inclined towards false-positives, possibly owing to its inherent cross-linking. However, by utilizing loss-of-function conditions specific to the bait, this can be tightly managed.

      • Can you show the concordance between biological replicates for each ChIP with LFQ? (heatmap of Pearson correlation and PCA plot). This will confirm the robustness of the use of LFQ.

      Response: We will add the requested concordance data for all volcano plots both in the form of heatmaps of Pearson correlation and PCA plots. Across our datasets, the replicates from the same experimental condition clearly cluster with each other and replicates have high concordance values of >0.9. As expected replicates for the target/bait samples have slightly higher concordance values compared to the negative controls (IgG or loss-of-function samples). We thank the reviewer for this suggestion as the new Extended Data panel will strengthen the illustration of our robust LFQ data.
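      As an illustration of the replicate concordance check described above, the following is a minimal sketch of a pairwise Pearson correlation matrix across replicates, written with the Python standard library only. The sample names and log2 LFQ intensities are hypothetical values made up for illustration; they are not taken from the datasets discussed here, and the PCA part of the request is not covered.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(samples):
    """Nested dict of pairwise Pearson correlations between all samples."""
    names = list(samples)
    return {a: {b: pearson(samples[a], samples[b]) for b in names} for a in names}

# Hypothetical log2 LFQ intensities for five proteins (illustrative values only).
samples = {
    "bait_rep1": [30.1, 27.4, 25.0, 22.3, 21.8],
    "bait_rep2": [29.8, 27.9, 24.6, 22.9, 21.5],
    "ctrl_rep1": [22.0, 21.5, 24.8, 22.1, 21.9],
    "ctrl_rep2": [21.7, 22.3, 24.5, 22.6, 21.4],
}

matrix = correlation_matrix(samples)
for name, row in matrix.items():
    # One row of the matrix per sample; these values are what a heatmap would display.
    print(name, " ".join(f"{row[other]:.2f}" for other in samples))
```

      In such a matrix, replicates of the same condition would be expected to show the highest pairwise values, which is the pattern a heatmap makes visible at a glance.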

      Reviewer #2: You say that your technique is "a simple, robust ChIP-MS workflow based on comparably low input quantities" (line 139). What would be really interesting for a technical paper would be: a schematic and a table illustrating the differences between your method and the previously published methods (amount of material, timeline,...) to really highlight the novelty in your optimized techniques.

      Response: We will add a comparison table with previous publications using ChIP-MS and for reference include some complementary approaches as requested by reviewer #3. On this note, we would like to stress that we are not “only” intending to use less material and to have an easy-to-adopt protocol. A cornerstone of our manuscript is to apply rigorous expectations to ChIP-MS experiments, in particular the ability to enrich proteins that independently bind to the same chromatin fragments as the bait protein (regardless of whether this is an endogenous protein or an exogenous, targeted bait such as dCas9). Otherwise, such experiments risk being regular protein IPs under cross-linking conditions, which as illustrated by our loss-of-function comparisons are prone to yield particularly large fractions of false-positives.

      Reviewer #2: It would be interesting to perform the dCas9 ChIP experiment in telomeric regions with and without LFQ. Since the novelty lies in this parameter, at no time does the paper show that LFQ really allows to have as many or more proteins identified but in a simpler way and with less material. A table allowing to compare with and without LFQ would be interesting.

      Response: We do not fully understand what the suggestion “without LFQ” refers to exactly. We assume that this reviewer is suggesting the use of a quantitative mass spectrometry approach other than LFQ, e.g. SILAC labelling, TMT labelling etc. Please note that we do not claim that LFQ quantification is per se superior to the various quantification methods that had been developed and widely used across the proteomics community, especially before instrument setups and analysis pipelines were stable enough for label-free quantification (a name that is strongly owed to this historic order of development). However, a central goal of our workflow is to make robust and rigorous ChIP-MS accessible to the myriad of laboratories that use ChIP-qPCR/-seq and that may not be extensively specialised in mass spectrometry. Both metabolic and isobaric labelling not only come at a higher cost but also present an experimental hurdle to non-specialists compared to performing biological replicates without any labelling, essentially the same way as for any ChIP-qPCR etc. experiment. We will further elaborate on these points in the manuscript to more clearly convey these notions.

      In general, with the right effort different quantitative methods should and will likely yield qualitatively similar results. However, comparisons between LFQ approaches (MaxLFQ, iBAQ,…) and labelling approaches (SILAC, TMT, iTRAQ) have already been better explored and verbalised elsewhere (e.g. PMID: 31814417 & 29535314). Therefore, we believe that this will add relatively little value to our manuscript.

      Reviewer #2: Put a sentence to explain "label free quantification". For a reader who is not at all familiar with this technique, it would be interesting to explain it and to quote the advantages compared to PLEX.

      Response: Thanks for highlighting this. In line with the point above as well as a similar comment by reviewer #1 we will improve this both in the main text and manuscript to clearly explain the terminology, the MaxLFQ algorithm (PMID: 24942700) used and to highlight the advantages compared to labelling approaches.

      Reviewer #2: what does the ranking on the right of each volcano plot represent (figure 1 b-e, figure 2a,d,e for example)? top of the most enriched proteins in the mentioned categories? Not very clear when we look on the volcano plot. it must be specified in the legend.

      Response: The numbering in these panels is meant to link protein names to the data points on the volcano plots. The order of hits is ranked based on strongest fold enrichment, i.e. from right to center. We will clarify this in the figure legends.

      Reviewer #2: General assessment/Advance: The authors explain in their article that the ChIP exploiting the sequence specificity of nuclease-dead Cas9 (dCas9) to target specific chromatin loci by directly enriching for dCas9 was already published. Here, the novelty of this study lies in the use of LFQ mass spectrometry to optimize the technique and make it easier to handle. Some comparisons with previous papers or data generated by the lab will be interesting to really show the improvement and the advantage to use LFQ and therefore, to highlight better the novelty of the study.

      Response: We thank the reviewer for this assessment and as mentioned above we will include such a comparison table. dCas9 has been used previously in a ChIP-MS approach termed CAPTURE (PMID: 28841410). While this is clearly a landmark paper that illustrated the dCas9 enrichment concept across multiple omics applications (i.e. not limited to proteomics), in their application to telomeres the authors enriched only 3 out of the 6 shelterin proteins with quite moderate fold enrichments (POT1: 0.99, TERF2: 2.13, TERF2IP: 1.06; in log2 scale). Based on this alone, POT1 and TERF2IP would not have qualified for our cut-off criteria. In addition, while the authors had performed three replicates, detection is only reported in 1-2 out of 3 replicates. While it is difficult to reconstruct statistical values based on the publicly accessible data, it is therefore unlikely that even these 3 proteins would have robustly been considered hits in our datasets. Similarly, using recombinant dCas9 with a sgRNA targeting telomeres that was in vitro reconstituted with sonicated chromatin extracts from 500 million HeLa cells (CLASP; PMID: 29507191), the authors identified only up to 3 shelterin subunits (TERF2, TERF2IP and TPP1/ACD) based on only 1 unique peptide each. For comparison, in our dCas9 ChIP-MS dataset all 6 shelterin subunits are identified with 9-19 unique peptides, contributing to our robust quantification. Even when considering cell line-specific differences (HeLa cells have shorter telomeres and hence provide less biochemical material for enrichment per cell), these comparisons illustrate that prior attempts struggled to robustly replicate even the most abundant telomeric complex members.
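      To make the log2-scale comparison above concrete, here is a minimal sketch that converts the three quoted CAPTURE enrichments to linear fold changes and applies a fold-enrichment threshold. The threshold of log2 >= 2 (4-fold) is an assumption chosen for illustration only, since the exact cut-off criteria are not stated in this response; a statistical criterion, which a real hit definition would also need, is left out.

```python
# log2 fold enrichments for shelterin proteins as quoted above for the CAPTURE dataset.
capture_log2_fold = {"POT1": 0.99, "TERF2": 2.13, "TERF2IP": 1.06}

# Assumption: hypothetical cut-off of log2 fold enrichment >= 2 (i.e. 4-fold).
# The actual cut-off criteria are not given in this text.
LOG2_FOLD_CUTOFF = 2.0

def passes_fold_cutoff(log2_fold, cutoff=LOG2_FOLD_CUTOFF):
    """True if the enrichment meets the (hypothetical) fold-enrichment threshold."""
    return log2_fold >= cutoff

passing = [p for p, v in capture_log2_fold.items() if passes_fold_cutoff(v)]

for protein, log2_fold in capture_log2_fold.items():
    linear_fold = 2 ** log2_fold  # convert from log2 scale to linear fold change
    status = "passes" if protein in passing else "fails"
    print(f"{protein}: log2 = {log2_fold:.2f}, linear = {linear_fold:.2f}x, {status}")
```

      Under this assumed threshold only TERF2 would pass, in line with the statement that POT1 and TERF2IP, each at roughly 2-fold on a linear scale, would not have qualified.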

      Based on these findings, others had suggested that dCas9 “might exclude some relevant proteins from telomeres in vivo” (PMID: 32152500), implying that dCas9 ChIP-MS might inherently not be feasible including at repetitive regions such as telomeres. Therefore, we believe that our dCas9 ChIP-MS data is a proof-of-concept that the method has the genuine ability to robustly enrich key proteins at individual loci. In concordance with the comment above we will include a comparison table with previous papers and expand on these points in the discussion.

      Reviewer #2: By presenting this technical paper, the authors allow laboratories across different fields to use this technique to gain insights into protein enrichment in specific chromatin regions such as the promoter of a gene of interest or a particular open region in ATACseq in an easier way and with less material. This paper holds value in enabling researchers to answer many pertinent questions in various fields.

      Response: We again thank the reviewer for this encouraging assessment and we do indeed hope that this manuscript makes a contribution to a much wider use of ChIP-MS approaches as a promising complement to existing genome-wide epigenetics analyses.

      Reviewer #3

      Reviewer #3: Strengths of the study:

      The study is well-structured and provides a robust workflow for the application of ChIP-MS to investigate chromatin composition in various contexts.

      The use of telomeres as a model locus for testing the developed ChIP-MS approach is appropriate due to its well-characterized protein composition.

      The comparison of WT vs KO lines for ZBTB48 is a rigorous way to control for false-positives, providing more confidence in the results.

      The direct comparison of double vs only FA-crosslinking provides valuable insights into the benefit of additional protein-protein crosslinking in ChIP-MS workflows.

      Response: We thank the reviewer for this assessment and we agree that the above are several of the key features of our manuscript.

      Reviewer #3: Areas for improvement: The novelty of the method is more than questionable as both ChIP-MS coupled to LFQ and dCas9 usage for locus-specific proteomics have been previously reported. The fact that the authors directly pulldown dCas9 instead of using a dCas9-fused biotin ligase and subsequent streptavidin pulldown is only a very minor change to previous methods (not even improvement). It would be more accurate for the authors to present their study as an optimization and rigorous validation of existing techniques rather than a novel approach.

      Response: While we appreciate where the reviewer is coming from, it occurs to us that most of the reviewer’s comments equate ChIP approaches with other complementary methods, in particular proximity labelling. The latter is indeed a powerful experimental strategy and in fact we are ourselves avid users. As highlighted to reviewer #1 as well, our manuscript was originally conceived as a shorter report and based on the feedback we will now expand our discussion to more broadly incorporate related approaches.

      However, we would like to stress that dCas9 ChIP-MS and dCas9-biotin ligase fusions are not the same thing and this is not a minor tweak to an existing protocol. While both approaches have converging aims – to identify proteins that associate with individual genomic loci – the experimental workflows differ fundamentally. Biotin ligases use a “tag and run” approach by promiscuously leaving a biotin tag on encountered proteins. Subsequently, cellular proteins are extracted and in fact proteins can even be denatured prior to enrichment with streptavidin beads. While this is an in vivo workflow that (depending on the biotin ligase used) may provide sensitivity advantages, it does not retain complex information. The latter is inherently part of ChIP workflows due to the use of cross-linkers. One obvious future application would be to maintain (= not to reverse as we have done here) the crosslink during the mass spectrometry sample preparation in order to read out cross-linked peptides to gain insights into interactions and structural features. We will now more clearly incorporate such notions into our discussion.

      In addition, we would like to stress that while this reviewer focuses primarily on the dCas9 aspect of our manuscript, we believe that our general ChIP-MS workflow including the combination with label-free quantitation is useful and important already by itself as e.g. recognised by both reviewers #1 and #2.

      Reviewer #3: The authors should more thoroughly discuss previous works using ChIP-MS and dCas9 for locus-specific proteomics. This would give readers a better understanding of how the current work builds on and improves these earlier methods. For a paper that aims on presenting an optimized ChIP-MS workflow it is crucial to showcase in which use cases it outperforms previously published methods.

      E.g., compare locus-specific dCas9 ChIP-MS to CasID (doi.org/10.1080/19491034.2016.1239000) and C-Berst (doi.org/10.1038/s41592- 018-0006-2); how does your method perform in comparison to these?

      Response: Again, while we will now incorporate more extensive comparisons with previous ChIP-MS publications (and the few prior manuscripts that included dCas9) as well as related techniques, we would like to stress that dCas9 ChIP-MS is not the same approach as CasID and C-BERST, which rely on dCas9 fusions to BirA* and APEX2, respectively. dCas9-APEX2 strategies were also published by two additional groups as CASPEX (back-to-back with the C-BERST manuscript; PMID: 29735997) and CAPLOCUS (PMID: 30805613). All of these methods target specific loci with dCas9 and promiscuously biotinylate proteins that are in proximity to the dCas9-biotin ligase fusion protein. As described above, while the application of the BioID principle (PMID: 22412018) to chromatin regions has converging aims with the dCas9 ChIP-MS part of our manuscript, they do not test the same thing. ChIP carries chromatin complexes through the entire workflow while the CasID approaches are independent of that. This is the same scenario as if we were to compare IP-MS reactions (such as the ChIP-MS reactions presented here for endogenous proteins) and BioID-type experiments for proximity partners of the same bait proteins.

      Reviewer #3: Compare likewise the described protein interactomes to previously published interactomes.

      Response: We will add comparisons in form of Venn diagrams with previously published interactomes. However, we would like to stress that a key aspect of our manuscript is the smaller yet rigorous hit lists based on e.g. loss-of-function controls, higher stringencies and specificity. Simply comparing final interactomes remains reductionist relative to the importance of other variables such as experimental design, number of replicates, data analysis etc.

      Reviewer #3: The authors use sgGAL4 as a control for the telomeric targeting of dCas9. The IF results (Fig3b) show that sgGAL4 barely localizes to the nucleus with very faint signals. It would be helpful to use a control with homogenous nuclear localization of dCas9 to further strengthen the author's conclusions.

      Response: dCas9-EGFP in the presence of sgGAL4 localises diffusely to the nucleus as expected. We have here used a very common non-targeting sgRNA control that was originally used for imaging purposes (PMID: 24360272) and has since been used in a variety of studies (e.g. PMID: 26082495, 32540968, 28427715) including a previous dCas9 ChIP-MS attempt (PMID: 28841410). In addition to the diffuse nuclear, non-telomeric localisation, we provide complementary validation of clean enrichment of telomeric DNA specifically in the sgTELO samples. Therefore, we do not see how other non-targeting sgRNAs would provide better controls or improve our data.

      Reviewer #3: The extrapolation of results from the use of telomeres as a proof-of-concept to other loci is not a given considering the highly repetitive structure of telomeric DNA. The authors should either be more cautious about generalizing the results to other loci or demonstrate that their method can also capture locus-specific interactomes at non-repetitive regions.

      Response: We agree that the adoption of any locus-specific approach to single genomic loci is a steep additional hurdle and warrants rigorous data on well characterised loci with very clear positive controls. We will expand on these challenges in our discussion. However, we would like to stress that we did not make any such statement in our original manuscript apart from simply referring to our telomeric experiment as proof-of-concept evidence that locus-specific approaches are feasible by ChIP.

      Reviewer #3: What are concrete biological insights from this optimized ChIP-MS workflow that previous methods failed to show?

      Response: We explicitly used telomeres as an extensively studied locus with clear positive controls that at the same time allows us to evaluate likely false positives. As such the intention of the manuscript was not to yield concrete biological insights but to develop a new methodological workflow.

      As also highlighted in a response to reviewer #2, based on other prior attempts to enrich telomeres in ChIP-like approaches with dCas9 (PMID: 28841410 & 29507191), it had been suggested that dCas9 “might exclude some relevant proteins from telomeres in vivo” (PMID: 32152500), implying that dCas9 ChIP-MS might inherently not be feasible including at repetitive regions such as telomeres. Therefore, recapitulating the set of well-described telomeric proteins was no trivial feat and our ChIP-MS workflow (both targeted and applied to individual proteins) represents a well-validated method to systematically interrogate changes in chromatin composition in the future. As one example at telomeres, this may include chromatin changes upon the induction of telomeric fusions or general DNA damage.

      Reviewer #3: For instance, the authors could compare their mouse and human TERF2 interactomes and discuss similarities and differences between both species.

      Response: We thank the reviewer for this suggestion, but the comparison between mouse and human TERF2 interactomes is not suitable across the datasets that we generated. U2OS is a human osteosarcoma cell line that relies on the Alternative Lengthening of Telomeres (ALT) pathway while our mouse data is based on embryonic stem cells (mESCs) and mouse liver tissue. Even the latter, in contrast to adult human tissue, expresses telomerase. We can certainly still pinpoint (as already done in our original manuscript) individual differences among known factors, e.g. the fact that proteins such as NR2C2 are more abundantly found at ALT telomeres (PMID: 19135898, 23229897, 25723166) vs. the detection of the CST complex as telomerase terminator (PMID: 22763445) in the mouse samples. However, the TERF2 datasets contain hundreds of proteins as “hits” above our cut-offs and a key message of our manuscript is that the majority of them are likely false positives. Here, differences are likely extending to expression differences between U2OS cells, mESCs and liver samples. So while appealing in theory, this cross data set comparison would remain rather superficial and error prone at this point. As a biology focused follow-up study, this would need to be rigorously conceived based on an appropriate choice of human and murine cell line models. In addition, this would likely require the generation of FKBP12-TERF2 knock-in fusion clones to allow for rapid depletion of TERF2 for a clean loss-of-function control since sustained loss of TERF2 leads to chromosomal fusions and eventually cell death in most cell types.

      Reviewer #3: The authors should also describe which interaction partners are novel and try to validate some of these using orthogonal methods.

      Response: We will now highlight more explicitly two proteins, POGZ and UBTF, that are most robustly and reproducibly enriched on telomeric chromatin across datasets, including the U2OS WT vs. ZBTB48 KO comparison (Fig. 2a). However, we would like to abstain from a molecular characterization at this point. As mentioned above, the discovery of novel telomeric proteins is not the focus of this manuscript, which is primarily dedicated to method development. In addition, these types of validations in methods papers are often limited to a few assays (e.g. can 1 or 2 proteins be enriched by ChIP? Do you see some localisation by IF? etc.). However, our research group has a history of publishing in-depth mechanistic papers on the characterisation of novel telomeric proteins (e.g. PMID: 23685356, 28500257, 20639181, doi.org/10.1101/2022.11.30.518500). Therefore, a genuine validation of such factors would require functional insights and clearly warrants independent follow-up work.

      Reviewer #3: Human Terf2 ChIP-MS (Fig1A) seems to be much more specific than the mouse counterpart (Fig1D) (32 TERF2 interactors out of 176 hits in human vs 12 TERF2 interactors out of 500 hits in mouse). Could the authors explain this notable difference?

      Response: As alluded to above, Fig. 1A and 1D cannot be directly compared, starting with the difference in complexity in the input material – cell line vs. tissue. For comparison, the Terf2 ChIP-MS data from mouse embryonic stem cells tallies up to 19 out of 169 hits, which is much closer to the U2OS results. Again, we deem the majority of hits from the TERF2 ChIP-MS data to be false-positives and the more complex input material from mouse livers likely accounts for the difference in these numbers.

      Reviewer #3: The authors used much higher cell numbers than previously published ChIP-MS experiments; while this is understandable for dCas9-based pulldowns, the cell number is expected to be down-scalable for the other IPs (TERF2, ZBTB48, MYB). Since this work primarily describes an optimized ChIP-MS workflow, the authors should show that they can reasonably downscale to at least 15 million cells per replicate; one way of achieving this could be through digesting on the beads and not in-gel.

      Response: As we will illustrate in the comparison table that was also requested by reviewer 2, our approach does not use higher cell numbers than previous ChIP-MS approaches – quite the contrary. In addition, we would like to highlight that while we state 50 million cells in Fig. 1a, we only inject 50% of our samples for MS analysis to retain a back-up sample in case of technical issues with the instruments. In other words, our workflow is already effectively based on 25 million cells and thereby pretty close to the requested 15 million cells while simultaneously requiring substantially fewer reagents.

      Importantly, our examples are based on rather lowly expressed bait proteins such as ZBTB48 (not detected within DDA-based proteomes of ~10,000 proteins in U2OS cells). While the workflow can be applied across proteins, exact input numbers might vary depending on the bait protein, e.g. histones and their modifications would likely require less input for the same absolute sample enrichment. For instance, PMID 25990348 and 25755260 performed ChIP-MS on common histone modifications but still used 300-800 million cells per replicate. Considering that we worked on substantially less abundant proteins, we here present a workflow with comparably low input samples.

      Reviewer #3: It is not clear from the text or figure what the authors are trying to show in Fig2c. They should either explain this further or take the figure out.

      Response: We are trying to illustrate the following: As in any IP reaction the bait protein is the most enriched protein with very high relative intensities, e.g. TERF2 in the TERF2 ChIP-MS data. Direct protein interaction partners – here the other shelterin members – follow at about 1 order of magnitude lower signal intensities. In contrast, proteins that are enriched via an interaction with the same DNA molecule (i.e. that do not physically interact with the bait protein) such as NR2C2, HMBOX1 and ZBTB48 further trail by at least 1 more order of magnitude. This information is not easily visualised within the volcano plots and is mainly “buried” within the Supplementary Tables. However, the relative intensities displayed in Fig. 2c clearly illustrate the dynamic range challenge that ChIP-MS poses for proteins that independently bind to the same chromatin fragment. We have now modified our text to make this point clearer.

      Reviewer #3: Was there any benefit in using a Q Exactive HF vs timsTOF flex?

      Response: Yes, measuring the same samples (e.g. the 50% backup mentioned above) on both instruments enriches more telomeric proteins/shelterin proteins in e.g. the dCas9 ChIP-MS data set on the timsTOF fleX. However, given the difference in age of these instruments/technologies between a Q Exactive HF and a timsTOF fleX (in the context of these experiments the equivalent of a timsTOF Pro 2), this is not a fair comparison beyond concluding that a more recent instrument like the timsTOF fleX achieves better coverage and is more sensitive with otherwise comparable measurement parameters. As we did not have the opportunity to run matched samples on e.g. an Exploris 480, we would not want to make claims across vendors. As stated in the discussion, we expect that even newer generations of mass spectrometers, such as the very recently released Orbitrap Astral or timsTOF Ultra, would further improve the sensitivity and/or allow the amount of input material to be reduced. Therefore, the main conclusion is that improvements in the mass spec generations improve proteomics data quality and our samples are no exception, i.e. this is not specifically pertinent to our approach.

      Reviewer #3: How did the authors analyze the PTM data? This is not described in the methods section. In addition, it would be important to validate the novel PTMs described for NR2C2.

      Response: We apologise for the oversight and we will add the description of PTMs as variable modifications during our MaxQuant search in the methods section. The originally deposited datasets already include this and we had simply missed this in our methods text.

      While we are not 100% sure that we understand the request for validation correctly, we would like to point out that the PTMs on NR2C2 have been previously reported in several high-throughput datasets and for S19 in functional work on NR2C2 (PMID: 16887930). However, the relevance in our data set is as follows: While the PTMs on TERF2 as the bait protein could occur both on telomere-bound TERF2 as well as on nucleoplasmic TERF2, NR2C2 is only enriched in the TERF2 ChIP-MS reactions due to its direct interaction with telomeric DNA. The co-detection of its modifications therefore implies that at least some of the telomere-bound NR2C2 carries these modifications. We showcase this example as an additional angle of how such ChIP-MS datasets can be analysed.

      While the robust, MS2-based detection of these modified peptides in our data set and several other publicly available datasets provides strong evidence that these modifications are genuine, further functional validation would involve rather labour-intensive experiments and resource generation (e.g. phospho-site specific antibodies). We hope that the reviewer agrees with us that this would require an independent follow-up study and that this goes beyond the scope of our current manuscript.

      Reviewer #3: For this kind of methods paper one would expect to see the shearing results of the ChIP-MS experiments since variations in DNA shearing can impact the detection of false-positives in the ChIP-MS experiments

      Response: We will include agarose gel pictures of our sonicates, which we indeed routinely quality controlled prior to ChIP experiments as stated in our methods description.

      Reviewer #3: Overall, the current state of the manuscript neither provides direct evidence that the "optimized" ChIP-MS workflow is better in certain aspects/use cases than previously published methods nor does it provide novel biological insights. At the current state it even cannot be considered as a validation of previously published methods since it does not discuss them.

      Response: We politely disagree with this conclusion. Again, as mentioned above we are under the impression that this reviewer somehow equates our entire manuscript to a comparison with dCas9-biotin ligase fusions.

      Instead, we here provide a workflow for ChIP-MS that incorporates label-free quantification as the experimentally easiest, most intuitive quantification method for non-mass spectrometry experts. This offers a particularly low barrier to entry aimed at making ChIP-MS more widely accessible as a complement to commonly used ChIP-seq applications. Furthermore, we showcase that as a gold standard ChIP-MS – to truly live up to its name – should have the ability to enrich proteins independently binding to the same chromatin fragment. We demonstrated that double cross-linking is critical for these assays and in return illustrate how rigorous loss-of-function controls (both KOs and degron systems) can mitigate prevalent false-positives that are exacerbated due to the cross-linking. Finally, we applied this workflow to different types of endogenous proteins (transcription factors, telomeric proteins) in cell lines and tissue and extend our work to dCas9 ChIP-MS as a targeted method.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements

      We thank the Reviewers for their detailed and constructive comments. As we describe below, we have now amended the manuscript to address their concerns and suggestions.

      2. Point-by-point description of the revisions

      Reviewer #1

      In the first paragraph the reviewer states that our study is well presented and convincing, but that it seems “an incremental advance to the previous ones, which properly accounted for PLK4 symmetry breaking and are based on similar assumptions”.

      We apologise for not explaining properly why our work is an important advance on these previous studies. Although both previous models can account for some aspects of PLK4 symmetry breaking, they both have significant issues. For example, Takao et al. perform no analysis of the robustness of their model, and from the small number of simulations shown it is clear that some very odd behaviours emerge, e.g. the oscillation of the dominant PLK4 site around the 6 compartments (Figure 3C, Example 3) and the bizarre manner in which PLK4 overexpression drives the formation of multiple PLK4 peaks (Figure 4B, first two examples). The authors do not comment on, analyse, or explain these strange phenomena. This model also relies on STIL being added to the system only after PLK4 has already broken symmetry; this is not plausible in rapidly dividing systems such as the fly embryo where Ana2/STIL levels remain constant through multiple rounds of centriole duplication (Steinacker et al., JCB, 2022). The Leda et al. model predicts that inhibiting PLK4 kinase activity will deplete PLK4 from the centriole, but it is now clear that PLK4 accumulates at centrioles when its kinase activity is inhibited (e.g. Yamamoto and Kitagawa, Nat. Comms., 2019). Moreover, this model supposes no spatial relationship between PLK4-binding compartments; this has important implications for the system’s behaviour (see point 1 in our response to Reviewer #2), and is biologically highly implausible. Thus, neither of the previous models can properly account for several important aspects of PLK4 symmetry breaking.

      Moreover, the two previous studies are not based on similar assumptions. It is only through our analysis that we discover that the underlying biological process driving symmetry breaking in both previous models can be described in the same terms: with short-range activation and long-range inhibition causing diffusion-driven instability. This crucial conclusion was not obvious from, nor claimed by, either of the previous publications. We believe this is an important step in model development for these systems.

      The reviewer raises a number of minor concerns, the first of which is a previous study from Chau et al. (Cell, 2012), which studies how two component systems break symmetry. Differential diffusion is not essential for symmetry breaking in some of the models considered by Chau et al., and so they wonder if it is really essential in our system.

      We thank the reviewer for pointing us to this study. It can be proven mathematically that differential diffusion is essential for symmetry breaking in the Turing-type framework. In the systems studied by Chau et al., symmetry can be broken without differential diffusion if one of the two components can be depleted from the cytoplasm. Such cytoplasmic depletion does not occur in traditional Turing-type systems, and it is almost certainly not occurring during PLK4 symmetry breaking: e.g. FRAP experiments show that PLK4 continuously turns over at centrioles (Cizmecioglu et al., JCB, 2010; Yamamoto and Kitagawa, Nat. Comms., 2019). We discuss this point (p8, para.3).
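      The role of differential diffusion in a Turing-type system can be illustrated with a textbook linear stability calculation. The sketch below is not the model from the manuscript: it assumes a generic two-component activator-inhibitor system linearised around a uniform steady state, and the Jacobian entries and diffusion coefficients are hypothetical values chosen only so that the steady state is stable without diffusion.

```python
import cmath

def max_growth_rate(jacobian, d_u, d_v, k_squared):
    """Largest real part of the eigenvalues of J - k^2 * diag(d_u, d_v)."""
    (a, b), (c, d) = jacobian
    m11 = a - d_u * k_squared
    m22 = d - d_v * k_squared
    trace = m11 + m22
    det = m11 * m22 - b * c
    root = cmath.sqrt(trace * trace / 4 - det)
    return (trace / 2 + root).real

# Hypothetical Jacobian: u self-activates and activates v; v inhibits u and decays.
# trace = -0.5 < 0 and det = 0.5 > 0, so the uniform state is stable without diffusion.
jacobian = ((1.0, -1.0), (2.0, -1.5))

k_squared_grid = [i * 0.01 for i in range(0, 301)]

# Equal diffusion: every spatial mode decays, so symmetry is not broken.
equal = max(max_growth_rate(jacobian, 1.0, 1.0, q) for q in k_squared_grid)

# Inhibitor diffuses 20x faster than the activator: a band of modes grows.
unequal = max(max_growth_rate(jacobian, 1.0, 20.0, q) for q in k_squared_grid)

print(f"equal diffusion, max growth rate:   {equal:.3f}")
print(f"unequal diffusion, max growth rate: {unequal:.3f}")
```

      With equal diffusion coefficients, diffusion only shifts both eigenvalues of the Jacobian downwards, so a state that is stable without diffusion stays stable at every wavelength. This is the sense in which differential diffusion is required in this framework.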

      __The reviewer states that it is unclear which term in equations (3-4) and (5-6) correspond to the self-activation and activation/inhibition of the other component that are indicated in the schematic summary of the models shown in Figure 1C. __As we now clarify, in general it is not always possible to pinpoint a single term in an equation that corresponds to activation/inhibition. Mathematically, a positive feedback of one species on another means that the corresponding partial derivative of the reaction term is positive, and a negative feedback means that it is negative. Hence, activation and inhibition can change depending on the values of these derivatives during the dynamics, as these inequalities may be achieved with complex expressions that extend beyond the usual proportional relationships. We have amended the manuscript to make this clearer (p10, para.2).
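      As an illustration of the sign conventions described above, the following sketch writes a generic two-species reaction-diffusion system and the derivative conditions that define each feedback. The symbols f, g, D_A and D_I are assumed here for illustration and are not necessarily the notation used in the manuscript's equations (3-6).

```latex
\begin{align}
  \frac{\partial A}{\partial t} &= f(A, I) + D_A \nabla^2 A, &
  \frac{\partial I}{\partial t} &= g(A, I) + D_I \nabla^2 I, \\
  \text{$A$ activates itself:}\quad &\frac{\partial f}{\partial A} > 0, &
  \text{$I$ inhibits itself:}\quad &\frac{\partial g}{\partial I} < 0, \\
  \text{$A$ activates $I$:}\quad &\frac{\partial g}{\partial A} > 0, &
  \text{$I$ inhibits $A$:}\quad &\frac{\partial f}{\partial I} < 0.
\end{align}
```

      Because these derivatives are evaluated at the current values of A and I, the sign of a feedback can depend on the state of the system and need not correspond to any single term in f or g.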

      The reviewer pointed out an error in the arrows in Figure 2 (we believe this is actually Figure 4). We thank the reviewer for pointing this out and have now corrected this mistake.

      Reviewer #2

      Major Comments:

      __ 1. The reviewer points out that in all models of PLK4 symmetry breaking the overexpression of PLK4 should be able to generate multiple PLK4 peaks (as, experimentally, PLK4 overexpression can generate up to 6 procentrioles around the mother centriole). The Reviewer suggests that the two previous models can do this, but we only show examples where PLK4 overexpression generates two peaks, and the reviewer questions whether this is a general limitation that would invalidate our approach. __We are grateful to the reviewer for pointing this out, and we now expand our analysis and discussion of this important issue (p13-15). It is indeed possible to produce more peaks in our model using different parameters—e.g. decreasing diffusivity leads to thinner peaks, allowing more peaks to form (Figure 3B, Figure 5B). Importantly, however, when diffusion is decreased, the region of the parameter space in which only a single peak will form inevitably becomes smaller—as diffusion can no longer efficiently suppress the formation of additional peaks around the rest of the centriole surface. Hence, in both our original models we struggled to find a parameter regime in which PLK4 robustly formed a single peak, but also formed >3 peaks when PLK4 was overexpressed. As we now discuss in detail, we believe that this is a general problem, as any model of PLK4 symmetry breaking must involve information being communicated around the centriole surface. We now show that a possible solution to this problem is to postulate that increasing PLK4 levels leads to a decrease in PLK4 diffusivity (Figure 3C, Figure 5C)—a biologically plausible possibility (p15, para.2).

      In addition, it is not correct to say that the previous formulations of these models do not have this problem (or, in the case of Leda et al., the model actually has a related problem). This problem must apply to the Takao et al. model, as it also relies on information travelling around the centriole surface. This problem is far from obvious, however, because Takao et al. do not analyse the robustness of their model. This problem does not apply to the Leda et al. model, but this is because their model supposes no spatial relationship between the individual compartments and instead assumes that communication between compartments is instantaneous. This allows their system to overcome this communication problem and so robustly form a single peak at low PLK4 concentrations, while forming multiple peaks at high concentrations (as shown in Figure 6B). However, this requires that diffusion is sufficiently fast that concentration gradients are negligible between centriolar compartments, but not so fast that the relevant species are diluted in the much larger cytoplasm. It seems implausible that both of these effects may be achieved with a single diffusion rate in the real-world physical system.

      __ 2. The reviewer points out that in our modelling any multiple PLK4 peaks formed will tend to be evenly spaced around the centriole surface whereas, in their original formulations, the two previous models predict that any multiple ‘winning’ PLK4 compartments will not have any preferential spatial location with respect to each other. They ask that we address this difference and justify why we think our prediction is a better representation of PLK4 symmetry breaking. __Although it is not obvious, neither of the previous models makes clear predictions about the spacing of multiple PLK4 peaks. As described above, Leda et al. assume no spatial relationship between PLK4-binding compartments, so relative peak-spacing cannot be assessed. Moreover, from the limited analysis shown, it is not clear that Takao et al. predict random spacing. The authors show only two simulations of PLK4 overexpression (Figure 4B, first two simulations) and the behaviour of PLK4 is very odd: the initial noise in the system fades away before PLK4 levels rapidly and near-simultaneously rise at multiple, reasonably well-spaced, peaks, before fading away to low levels—even after STIL addition. At the end of the simulation the “winning” compartments contain very low levels of PLK4 (often lower than the noise initially introduced into the system), but these compartments are reasonably (simulation 1) or very (simulation 2) evenly spaced.

      Nevertheless, the reviewer is correct that the even spacing of multiple peaks is a feature of our model. Unfortunately, it is not possible to compare this prediction to reality because the spacing of multiple PLK4 peaks in cells overexpressing PLK4 has not been quantified yet. Thus, one has to interpret published images, some of which support equal spacing while others do not (e.g. Kleylein-Sohn et al, Dev. Cell, 2007). Moreover, this analysis is likely to be complicated because CEP152 can form incomplete rings. This can be appreciated in Figure 2C in Hatch et al., (JCB, 2010) where the extra centrioles induced by PLK4 overexpression do not appear to be evenly spaced around the centriole, but are quite evenly spaced around the partial CEP152 ring. Therefore, equal spacing of peaks in ideal conditions is a feature predicted by our model that still needs to be fully explored experimentally. We believe that part of the power and value of our model is to suggest such hypotheses. We now discuss this important point (p26, para.2).

      __ 3. The reviewer questions our attempt to discretise our continuum model (where we convert the continuous centriole surface to a series of discrete compartments on the centriole surface and show that symmetry breaking can still occur). They note that we only show one example (9 compartments), they ask for more information about how the discretisation was done, and they question the independence of the compartments as PLK4 appears to accumulate in compartments adjacent to the dominant compartment. __We apologise for the lack of clarity here. We now state that our models can break symmetry provided that there are at least two compartments, and we now include simulations showing that this happens for 2 – 10 compartments (Figure S2). The discrete model is a finite-difference discretisation of the continuum model (described in Appendix V). We also now clarify that the compartments are ‘independent’ in the sense that all chemical reactions only occur between components that are within the same compartment. The compartments are still spatially linked via a discretized diffusion (as would likely be the case at the centriole), which explains the observed relationship between neighbouring compartments.
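      The coupling between compartments described above can be sketched as a finite-difference Laplacian on a ring. This is a minimal illustration only: it does not reproduce the scheme in Appendix V, it omits the reaction terms, and the grid spacing, time step, and function names are assumptions chosen for the example.

```python
def ring_laplacian(u, dx):
    """Second-difference Laplacian on a periodic ring of compartments."""
    n = len(u)
    return [(u[(i - 1) % n] + u[(i + 1) % n] - 2.0 * u[i]) / dx**2
            for i in range(n)]


def diffuse(u, diffusivity, dx, dt, steps):
    """Explicit Euler steps of pure diffusion between neighbouring compartments."""
    for _ in range(steps):
        lap = ring_laplacian(u, dx)
        u = [ui + dt * diffusivity * li for ui, li in zip(u, lap)]
    return u


n = 9                     # hypothetical number of compartments
dx = 1.0 / n              # assumed spacing on a ring of unit circumference
dt = 0.2 * dx**2          # explicit scheme needs diffusivity * dt / dx**2 <= 0.5
start = [1.0] + [0.0] * (n - 1)   # all material starts in one compartment
final = diffuse(start, 1.0, dx, dt, 2000)
```

      In this sketch the only exchange between compartments is through their two neighbours, so material placed in one compartment spreads first to the adjacent ones, which is the kind of neighbour relationship noted in the response.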

      __ 4. The reviewer asks whether all the parameter values that satisfy the mathematical constraints we calculate for our models will break symmetry. If so, they suggest we are using a circular argument when demonstrating that the models break symmetry as we use parameter values chosen specifically to satisfy these constraints. __In Turing-systems, one can mathematically calculate parameter constraints that allow symmetry breaking. As we now clarify, all parameters that satisfy these constraints can break symmetry, while any parameters outside these constraints cannot break symmetry. Thus, it was never our intention to claim something new or surprising when we illustrated the symmetry-breaking properties of our models (Figures 2 and 4, and associated parameter space analysis in Figures 3 and 5), so we apologise that our intention on this point was unclear. Rather, these Figures illustrate the detailed behaviour of each system under different conditions—something that is not possible to intuit from the equations alone.
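      For readers unfamiliar with such constraints, the sketch below checks the standard textbook conditions for diffusion-driven instability in a two-species system, given the partial derivatives of the reaction terms at the uniform steady state. These are the generic conditions, not the specific constraints derived in the manuscript, and the derivative values are hypothetical; only the diffusivity ratio of 2500 is taken from the response. On a finite ring, an admissible wavelength must additionally fit around the circumference, which this check ignores.

```python
def turing_unstable(fa, fi, ga, gi, d_a, d_i):
    """Textbook test for diffusion-driven instability of a uniform steady state.

    fa, fi, ga, gi are the partial derivatives of the two reaction terms with
    respect to A and I; d_a and d_i are the two diffusivities.
    """
    trace = fa + gi
    det = fa * gi - fi * ga
    mixed = d_i * fa + d_a * gi
    # The uniform state must be stable when diffusion is absent...
    stable_without_diffusion = trace < 0 and det > 0
    # ...and become unstable to some spatial mode once diffusion is included.
    unstable_with_diffusion = mixed > 0 and mixed**2 > 4 * d_a * d_i * det
    return stable_without_diffusion and unstable_with_diffusion


# Hypothetical derivatives for an activator-inhibitor pair.
example = dict(fa=1.0, fi=-2.0, ga=3.0, gi=-4.0)
with_ratio_2500 = turing_unstable(**example, d_a=1.0, d_i=2500.0)
with_equal_diffusion = turing_unstable(**example, d_a=1.0, d_i=1.0)
```

      With these example values the check passes for a diffusivity ratio of 2500 and fails when the two diffusivities are equal, which illustrates why differential diffusion is required in this framework.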

      5. The Reviewer requests more information about how we chose the particular parameter values we use to illustrate each model and asks that we convince readers that other sets of values that satisfy the derived mathematical requirements would result in the same qualitative outcomes. As described in point 4 above, and as we now state more clearly, it is a mathematical fact that parameter values that satisfy the derived mathematical requirements can break symmetry. We now discuss our reasons for choosing specific parameters in more detail (see point 6, below).

      __ 6. The Reviewer asks whether the dimensionless parameters we use in our models have any biological relevance, and requests a biological interpretation of all of them. They also request that we relate the diffusivity ratios of the activator and inhibitor species to the experimental observations made by Yamamoto and Kitagawa. __Relating our dimensionless parameters to biologically-relevant dimensional parameters is a complex issue. For example, one can see from equations (5) and (6) that simultaneously doubling (A), (I), and (a), and decreasing (b) by a factor of 4 leaves the system unchanged. Since the concentrations of A and I are unknown at the centriole surface, this means that it is not possible to determine the dimensional values of the rate of production of I (a) and its rate of conversion to A (b). This limitation is the root of the mathematical fact that FRAP experiments can reveal “off” rates but not “on” rates. Moreover, to convert the rate of loss of A (c) and I (d) into dimensional parameters it is necessary to know the timescale of symmetry-breaking. This is unknown, but was assumed to be on the order of hours in the previous models. This corresponds to a degradation/loss rate on the order of minutes with our current choice of parameters, which is consistent with FRAP data (e.g. Yamamoto and Kitagawa, Nat. Comms., 2019). Regarding the diffusivity ratio, the effective diffusion in our model depends on both the bulk diffusion and the binding/unbinding/degradation rates – a complexity also noted by Yamamoto and Kitagawa. This makes it very difficult to relate the “effective” surface diffusivity to the bulk diffusivity. We are currently investigating the form of this dependency, but this is a complex mathematical problem that is beyond the scope of this manuscript.
      These issues are difficult to discuss succinctly, so we now simply state that we chose specific parameter values based, in part, on the values and ratios used in the previous modelling papers (p10, para.2; p17, para.2).

      Unfortunately, we could not find any experimental measurements of diffusivity in the Yamamoto and Kitagawa paper, as the Reviewer suggests. We now clarify, however, that the ratio we use in both models (2500) is chosen to be between the effective diffusivity ratios (as the previous models used binding/unbinding rates rather than diffusivity) used by Takao et al. (10000) and Leda et al. (200). We also include a phase diagram showing how varying the diffusivity of both factors influences symmetry breaking in both models (Figure 3B, Figure 5B), and we state that we have chosen all remaining parameter values to reflect the parameter values in the original models, when adjusted to the same timescale.

      __ 7. The Reviewer asks for more information about how we normalised time in our simulations and whether the time in different simulations is comparable. __We now clarify that the simulations run for a single unit of dimensionless time (so they can be compared), and that the reaction/diffusion parameters in the system are sufficiently large by comparison with unity that all simulations achieve steady state within a unit of time (p11, para.2).

      __ 8. The Reviewer asks whether concentrations of the two species can be compared between simulations, and also questions our description of the activator being uniformly accumulated in Figure 4D, rather than uniformly depleted. __We clarify that concentrations can be compared within a model, but not between models. This is because the dimensional values depend on the dimensional reaction rates, which differ between the models. This is not just a theoretical limitation; experimental fluorescence signals are typically compared in relative arbitrary units so the absolute values of different systems cannot be easily compared for the same reason. We agree with the reviewer that it is better to describe Figure 4D as showing uniform depletion of the activator, and we have adjusted the legend accordingly.

      The reviewer makes a number of minor points that are not numbered.

      __The reviewer asks for clarification of what we mean by “robustness”: does this refer to the ability to produce the same result in multiple simulations, or to the ability to produce the same result when parameter values are varied? If the latter, then the reviewer suggests our models are not very robust. __We apologise for this confusion and now more clearly define what we mean by robust (p13, para.2). As we discuss in point 1 of our response to this Reviewer, our initial models are indeed not very robust at producing a single PLK4 peak over a range of PLK4 concentrations. We now discuss why this lack of robustness is likely to be intrinsic to any PLK4 symmetry breaking system, and how robustness in all such models can be improved by allowing diffusivity to vary with PLK4 expression levels (p13-p15).

      __The Reviewer points out that the original models introduce a noise term at every iteration, whereas we only introduce an initial noise term; they ask us to discuss this difference. __We have run simulations introducing a noise term at every iteration and find that this makes negligible difference (Reviewer Figure 1, attached to the end of this letter). We do not take this approach, however, as this would significantly complicate the mathematical analysis that we perform (the additional noise term turns the system of PDEs into a system of SDEs which do not fit the Turing framework as readily). We now mention this in Appendix V.

      The Reviewer states that the reaction schemes are unnecessarily repeated in Figures 1, 2 and 4. We would like to keep these schematics, as in Figure 1 we show a generic scheme (illustrating the two possible Turing-type reaction diffusion systems) whereas in Figures 2 and 4 we show specific reaction regimes (specifying the relevant species) that we test in each model. We feel this information will be useful to readers in this visual format.

      The Reviewer states that it is confusing that we refer to the specific reaction parameters (k11 and k12) that need to be swapped to convert the Leda et al. model to the Takao et al. model, as this information will not mean anything to readers who are not familiar with the models. We agree and have now removed this information.

      The Reviewer suggests several textual amendments and/or corrections. We thank the reviewer for spotting these and have amended them all accordingly.

      __Finally, the Reviewer states in their significance summary that although our key conclusions are convincing, they are not new as Takao et al. describe their model as analogous to a “reaction-diffusion system (also known as a Turing model)”. __We were aware that Takao et al. make this statement, but this does not invalidate the novelty or significance of our work. This is because although Takao et al. described their model as being analogous to a “Turing model”, it is not actually a reaction-diffusion system, and it does not exhibit the property of long-range inhibition that is central to all Turing-systems to produce a single PLK4 peak. Instead, they use lateral inhibition (in which the influence of the inhibiting species does not extend beyond the neighbouring compartments) to reduce the number of potential PLK4 binding sites from ~12 to ~6. A single winning site is subsequently selected when STIL is added to the system, with additional positive feedback (not involving reaction-diffusion) ensuring that the compartment with most PLK4 becomes the dominant site. Their analysis of the reaction-diffusion version of their system is limited to a single supplementary figure (Figure S2D), and they do not perform or refer to any of the relevant mathematical analyses of their model that make these well-studied systems such powerful tools. We believe that the model presented here is simple enough to draw the attention of the applied mathematics community while robust and complete enough to provide a mechanistic explanation of many interesting features and suggest new possible phenomena. We now discuss these points (p22, para.1).

      Reviewer #3

      __The Reviewer found our manuscript well-written, and judged it of interest to centriole duplication enthusiasts. __We interpret this to mean that the Reviewer did not think it of more general interest. This seems a harsh assessment, as the precise one-for-one duplication of centrioles is generally considered to be one of the great mysteries of cell biology. It is now widely appreciated that robustly breaking PLK4 symmetry to form a single PLK4 peak is crucial to this process. Thus, our discovery that this process can be described using a well-studied mathematical framework that has already been applied to a vast range of biological processes is potentially of significance even to non-centriole enthusiasts.

      The Reviewer made a number of specific comments:

      Figure 1. The Reviewer felt the graphic in Figure 1A could be improved by combining it with Figure 1B, and noted that the centrioles look strange. We thank the reviewer for these suggestions and we have now rearranged this Figure. We also now clarify that the schematic depicts Drosophila centrioles, which are simpler than human centrioles.

      __Figure 2. The Reviewer suggests that to make the system depicted in Figure 2A fit as a Type I Turing system we have to assume that (I) must dissociate from the centriole or be degraded at higher rates than (I) converts (A) to (I). They suggest this assumption is implicit in the model and they request further explanation. __The reviewer is correct that, in Model 1, the degradation/dissociation of (I) is the root of its self-inhibition. However, we do not need to make any assumption about the relationship between the rate at which (A) converts to (I) (b), and the dissociation/degradation rate of (I) (d) for this system to work (as the Reviewer implies). This is because, whatever these rates are, the system will approach a steady-state where the production and degradation terms balance, and it is the stability/instability of this state that determines whether the system can break symmetry. Since the degradation rate of (I) (the loss term in equation 4) increases more rapidly than its production rate (the production term in equation 4) as (I) increases, this results in a stable (i.e. self-inhibiting) system regardless of the parameter values. We have rewritten the sections explaining these equations to try to make these points more clearly and to point readers to Appendix II where we explain the form of the equations.

      __The Reviewer asks if in Model 1 it is realistic to assume no turnover or loss of PLK4 (A), and will the system still work if this is altered? __This is a good point. In Model 1, we set c=0 as this makes the analysis significantly simpler, enabling us to display the mathematical predictions alongside the numerical simulation. We have now added the (c,d) phase diagram to show the effect of varying these parameters on the symmetry breaking properties of the system (Figure 3D). We find that the value of c has a relatively weak effect on the symmetry breaking properties of the model since it does not affect the function of (A) as an activator.

      __The Reviewer asks if our 1D model would work in 2D, and notes the PLK4 peaks in our models are broad, likely limiting the number of peaks formed. They also note that in our Model 1 it is the unphosphorylated form of PLK4 that accumulates in the peak, which seems unlikely as it is widely believed that PLK4 must be active to phosphorylate STIL to promote its interactions with SAS6 and CPAP. __From a mathematical perspective, modelling our system in 2D would produce very similar results. Symmetry breaking is driven by long-range inhibition/short-range activation, and these behaviours will work analogously in 2D. As discussed in our response to Reviewer #2 (point 1), the broad peaks do indeed limit the number of centrioles that can form, but by altering the parameters we can generate more peaks that are less broad (Figures 3 and 5). The Reviewer is correct that Model 1 (based on Takao et al.) predicts that non-phosphorylated PLK4 (A) accumulates in the peak. This is also true of the original Takao et al. model, although this was not highlighted or commented on by the authors. We now expand our discussion of this point (p25-p26).

      The Reviewer asks if our models can form multiple peaks at higher PLK4 levels. This is again related to Reviewer #2, point 1, and we now show that this is indeed possible under the appropriate parameter regime (Figure 3C and Figure 5C).

      The Reviewer asks for more description of how lateral diffusion works in our system. For example, do we consider that not every molecule of (I) will diffuse laterally (as some will be lost to the cytoplasm), or that the probability of a molecule leaving the surface will increase as distance/time increases. We apologise for our lack of clarity. We now state that the proportion of molecules not rebinding to the surface is accounted for in the reaction components of all our models (p7, para.1). In reality, and as we now state, the relationship between this loss and the diffusion rates (and their relation to distance/time, for example) is complicated. We are investigating this relationship in more detail, but this is beyond the scope of the current paper.

      The Reviewer asks if symmetry breaking might eventually occur if the system in which we reduce the kinase activity of PLK4 (Figure 2D) were given more time. They also ask whether reducing PLK4 levels by half would lead to a failure in site-selection. The kinase inhibited scenario we show here will not break symmetry over any period of time; this can be proven mathematically, and is verified in the numerical simulations (Figure 3A and 5A, bottom left regions of graphs), which we now state more clearly are always run for a long enough period to reach a steady-state (p11, para.2). The effect of reducing PLK4 levels in our models is analysed in the phase diagrams shown in Figure 3 and 5 (and analysed in more detail in Figure S1), where it can be seen that there are multiple PLK4 concentrations that can be halved without a failure in site selection (although, see also our response to Reviewer #2, point 1).

      The Reviewer pointed out some errors in our presentation of Figure 3, (and suggested some improvements in presentation in a point further below) and also asked for more information about the parameters used to generate the data in Figures 2B-D and 4B-D. We thank the Reviewer for these suggestions and have made these changes and provided the additional information requested (e.g. marking the specific parameters used in our simulations on the phase diagrams shown in Figure 3 and Figure 5 with coloured dots).

      The Reviewer points out that when PLK4 levels and activity are both high no centrioles are produced in Model 2, whereas 1 centriole is produced in Model 1—neither of which are consistent with experimental observation. We now show an expanded parameter space (new Figures 3A and 5A) where it can be seen that this is not a problem for Model 1. For Model 2, the region of high kinase levels and activity (dark blue, top right, Figure 5A) corresponds to the uniform accumulation of the activator species. Thus, while there are no peaks, this region might produce multiple centrioles, as it is equivalent to a compartmental model in which all of the compartments are occupied. We now discuss this point (p19, para.1).

      __The Reviewer questions how the biology fits a Type II Turing system, pointing out that current data suggests that active PLK4 turns over more rapidly at centrioles, whereas in the Type II model we describe (based on the Leda et al. model) it is the phosphorylation state of STIL that determines which species of PLK4:STIL turns over rapidly. They also question the logic of the Model 2 Type II circuit (Figure 3A), questioning how A could drive the dephosphorylation of STIL to promote the production of I. __We agree that current data is more consistent with phosphorylated species of PLK4 turning over more rapidly at centrioles, but this is not what Leda et al. proposed, and so this is not what we implemented in trying to reformulate their model (although this is effectively the change we make that turns the Leda et al. model into the Takao et al. model). As to the second point, the Reviewer has correctly spotted a problem with our model that arises because the direction of the arrows linking (A) and (I) was inadvertently flipped in Figure 4A. This mistake has been corrected, and we now explain more clearly how the biology of this system fits a Type II Turing system in the legend.

      __The Reviewer points out that although we can convert the Leda et al. Model (Model 2) to the Takao et al. Model (Model 1) simply by changing the identity of the (A) and (I) species, the underlying assumption of the Takao et al. model (that non-phosphorylated PLK4 promotes its own accumulation) was not an inherent assumption of the Leda et al. model. __We apologise for this confusion. As we now clarify (p20, para.1), the Reviewer is correct that when we make mathematical changes to the Leda et al. model we must also assume changes in the underlying biology (so that non-phosphorylated species of PLK4 are now slow diffusing, rather than non-phosphorylated species of STIL, as originally proposed). As the Reviewer points out, current data suggests that non-phosphorylated species of PLK4 do turn over more slowly, although it is not clear why; for example, liquid-liquid phase separation driving the formation of PLK4 condensates has been postulated, but is far from proven. This remains an interesting problem that will be further probed mathematically and experimentally.

    1. Author Response:

      The following is the authors' response to the original reviews.

      We thank both reviewers for their comments, which have suggested changes that have improved the manuscript.

      Reviewer #1 (Public Review): 

      […] A weakness in the methodology is the link to tissue tension and conclusions about tissue mechanics. Methods that directly affect tissue tension and a more thorough and systematic application of laser ablation experiments would be needed to profoundly investigate mechanosensation and consequential effects on tissue tension by the various genetic perturbations.

      Response: In revision, we have added some additional experiments that examine altered tension.

      While the in-silico analysis of competing for F-actin binding sites for βH-Spec and myosin appears logical and supports the authors' claims, no point mutation or truncations were used to test these results in vivo.

      In its current structure the manuscript's strength, the genetic perturbations, is compromised by missing clear assessments of knockdown efficiencies early in the manuscript and other controls such as the actual effect on myosin by ROCK overactivation. 

      Response: In revision, we reorganized the manuscript and figures to document the knockdown efficiency earlier in the manuscript, and have added additional figure panels illustrating the effects of altered tension on myosin levels.

      Reviewer #2 (Public Review):

      […] The authors suggest that Ajuba is required for the effect of beta-heavy spectrin. However, it is still formally possible that this could be a parallel pathway that is being masked by the strong phenotype of Ajuba RNAi flies. 

      Response: While it is formally true that the genetic requirement for Jub could reflect a role in parallel to, rather than downstream of, spectrins, our conclusion that spectrins act through Jub is based not only on the genetic requirement for Jub, but also on the influence of spectrins on junctional tension and Jub localization, which indicate that spectrins influence Jub activity in a manner consistent with their affecting the Hippo pathway through Jub.

      One of the major points of the manuscript is the observation that alpha- and beta-heavy-spectrin are potentially working independently and not as part of a spectrin tetramer. This is mostly dependent on the observation that alpha- and beta-heavy-spectrin appear to have non-overlapping localizations at the membrane and the fact that alpha- and beta-heavy-spectrin localize at the membrane seemingly independently. It is not entirely obvious that a potential lack of colocalization and the fact that protein localization at the membrane is not affected when the other partner is absent is sufficient to argue that alpha- and beta-heavy-spectrin do not form a complex. Moreover, it is possible that the spectrin complexes are only formed in specific conditions (e.g. by modulating tissue tension). 

      Response: Our results argue that alpha- and beta-heavy-spectrin do not form a detectable complex in the wing disc under the conditions examined, and thus that they act independently in this context. However, we agree that it is possible that they could function together in other contexts, eg in other tissues or under different conditions, and we have revised the text in the Discussion to note this.

      If indeed spectrins function independently, would it not be expected to see additive effects when both spectrins are depleted? 

      Response: Not necessarily, since both alpha- and beta-heavy-spectrin act through Jub, and there may be a limit as to how much Yki activity can be increased by Jub; eg, the increases in wing size induced by spectrin RNAi are similar to the increases in wing size observed with constitutive recruitment of Jub through alpha-catenin mutation (Alegot et al 2019).

      Related to the two previous points, the fact that the authors suggest that both alpha- and beta-heavy-spectrin regulate Hippo signaling via Ajuba would be consistent with the necessity of an alpha- and beta-heavy-spectrin complex being formed. How would the authors explain that both spectrins require Ajuba function but work independently? 

      Response: The different spectrins both affect Jub because they both affect cytoskeletal tension, but our results suggest that they act in different ways to affect tension. We have made some revisions to the Discussion section to try to make this clearer.

      Another major point of the manuscript is the potential competition between beta-heavy-spectrin and myosin for F-actin binding. The authors suggest that there is a mutual antagonism between the two proteins regarding apical F-actin. However, this has not been formally assessed. Moreover, despite the arguments put forward in the discussion, it seems hard to justify a competition for F-actin when beta-heavy-spectrin seems to be unable to compete with myosin. Myosin can displace beta-heavy-spectrin from F-actin but the reciprocal effect seems unlikely given the in vitro data. 

      Response: We show in vivo, in vitro, and in silico data that are all consistent with the inference that beta-heavy-spectrin and myosin compete for binding to F-actin. As the reviewer notes, and as we discuss, the in vitro competition experiments were limited because, for technical reasons, we were unable to increase the protein concentrations further. We also note that our in vitro experiments used an active form of myosin, which binds F-actin much more strongly than inactive myosin.

      Reviewer #1 (Recommendations For The Authors):

      While the flow of experiments is logical in general, I see major problems regarding the structure of the manuscript and essential controls: 

      • It is very confusing to have samples (kst-CRISPRa) in figures 1-3 that were not introduced in the text until the second-last paragraph of the results. I would suggest introducing this elegant overexpression experiment early in the manuscript as it fits well in the scope of these experiments or alternatively (if the authors prefer) make a new figure containing all the data regarding the overexpression in the end. 

      Response: We have now moved these results to a new figure (new Fig 7) that is described later in the text.

      • At the beginning of the manuscript, essential controls regarding the knockdown efficiency are missing in the main figure. Many of the key experiments are based on KD and as a reader, I want to assess their efficiency. Only in Figure 4, at the end of the manuscript, KST and α-Spec KD efficiency is revealed - this should be shown earlier and quantified properly. While reading the manuscript in its current form, the doubt remains that differences e.g. in α-Spec and KST KD can be explained by varying knockdown efficiencies as their levels can't be assessed. 

      Response: We have now moved these results to a new supplemental figure (Fig 1-supplement 1) that is cited earlier in the text.

      • On a similar line, in Figure 5 where myosin activity is perturbed, induction or repression of myosin activity is only suggested but not formally shown. The authors have to demonstrate that this is indeed the case by showing the myosin signal, ideally accompanied by measurement of tissue tension. 

      Response: This was not included because we and others have assessed these manipulations in earlier publications. However, as requested we have now added a supplemental figure (Fig 6 supplement 1) showing myosin levels in these genotypes.

      • On p. 7, the authors claim that "The epistasis of jub to kst suggests that βH-Spec regulates wing size through its tension-dependent regulation of Jub." While the authors show that KST KD increases myosin and junctional Jub, and that the wing overgrowth phenotype of KST KD depends on Jub, the tension-dependency was not demonstrated. To make that claim, the tension profile should be perturbed e.g. by overexpression of rok, myosin mutants (as the authors do in Fig 5) and the effect on Jub should be analyzed. Induction of tension in these conditions should be measured by laser ablation or a suitable alternative method. It might well be that the induction of Jub in KST KD is not via tension but an alternative mechanism such as the release of steric hindrance, interaction competition, etc. Also: Does KD of Jub affect spectrin localization? 

      Response: The effect of tension on Jub, and the effects of the myosin activity changes we employed on tension, have been analyzed in prior publications (eg Rauskolb et al 2014). To further address the issue raised by the reviewer here as to whether Kst affects Jub and wing growth via tension, we have also now added an additional experiment (Fig 3 supplement 1) in which we decreased tension in a βH-Spec RNAi wing disc by simultaneously expressing RNAi targeting Rok. The results show that the wing growth and Jub accumulation associated with βH-Spec RNAi are suppressed by Rok RNAi, consistent with our conclusion that these effects are mediated via cytoskeletal tension.

      As KD of Jub alters the pattern of myosin accumulation in wing discs (Rauskolb et al 2019) it could be expected to have a complementary influence on βH-Spec localization, but we have not examined this.

      • The authors make a very strong point in saying "The influence of βH-Spec on junctional tension is thus a direct consequence of its competition with myosin for overlapping binding sites on F-actin." While the authors provide some in vitro and in silico evidence, it was for example not possible to outcompete myosin by increasing levels of KST CH1-CH2 domains in vitro (for possible reasons the authors discuss). More importantly, the hypothesis that competition for actin binding is the definite cause of the antagonizing effect was not tested in vivo. Overexpression of a mutant version of KST that is unable to bind F-actin, or that has an increased affinity (etc) for actin was not tested. Such an experiment would be very valuable to enrich this manuscript but at least, claims like that have to be less bold and need to be written in a more speculative language. 

      Response: We consider creating and analyzing mutant forms of Kst in vivo to be beyond the scope of this manuscript, but as suggested we have now modified the text highlighted by the Reviewer to be more cautious.

      Further points: 

      • Why does the thickness of the wing disc epithelium change due to KST and α Spec KD, the authors should introduce this experiment better and draw a proper conclusion. Is there any relocalization of myosin along the apical-basal axis? Can the authors speculate about the differences between KST and α Spec KD? 

      Response: The epithelium thickness changes with α-Spec KD, but does not change with Kst KD. We think the explanation is provided by work from the Pan lab (done mainly in pupal eyes), which reported decreased cortical tension and increased apical area when α-Spec is lost. The interpretation in essence is that with the loss of attachment of F-actin to membranes along the lateral sides of the cells, the sides of the cells are "softer" and the cells expand laterally and thus also (by conservation of volume) shorten apical-basally. This is somewhat speculative, and it's not a focus of our study, but we have added some text to try to explain this better. Myosin along the apical-basal axis was not visibly altered, but it is harder to analyze as it is very weak compared to junctional myosin.

      • Given the authors' observation of differences in the relative localization of KST and α Spec (Figure 4), proper quantification of KST, α Spec and myosin levels along the apical-basal cell axis would be important. This would also ease data interpretation. 

      Response: We have now added a higher resolution image and also a line scan of Kst, α-Spec and Myo in a new supplemental figure (Fig 6 supplement 1).

      • KD of α Spec seems to induce myosin activity more, causes a bigger reduction of wing thickness, a stronger induction of Jub, and a similar effect on wing size. What led the authors to focus on KST rather than α Spec regarding the detailed analysis of myosin competition? 

      Response: Our observations identify a competition between Kst and myosin, but we have no indication that α-Spec competes with myosin. (It's conceivable that β-Spec might also compete with myosin in some contexts, but wing discs would not be a good place to examine this because the localization profiles of β-Spec and Myosin are so different).

      • A big criticism regarding the figures is the bad color choice which makes it difficult to decipher the fluorescent signals. Likewise, the labels are difficult to read with the present coloring. They should really be changed. 

      Response: We have now changed the single color images to gray scale (for multi-color images we retain RGB coloring).

      A minor point: 

      • To make the manuscript more accessible for researchers outside the Drosophila field, I'd suggest adding explanatory labels for Drosophila-specific terms such as hyperactive myosin for sqhEE, a scheme to show where UAS-dcr2 is active, explain the purpose of Rfp expression as a control for tissue specificity, etc. 

      Response: We have added some explanations to the text to try to make this clearer.

      Reviewer #2 (Recommendations For The Authors):

      Major points: 

      In lines 99-101, the authors mention that Deng et al., 2015 report that the depletion of spectrins leads to an increase in pMLC, with no associated changes in the colocalization of myosin and F-actin. It is more accurate to mention that Deng et al. suggest that the levels of a GFP-tagged rescue construct of MLC (Sqh) are unchanged in alpha-spectrin mutants, although this was not formally quantified. Moreover, there was not a formal assessment of colocalization between MLC and F-actin, but rather a suggestion that F-actin levels are unaffected by the alpha-spectrin mutation. Finally, Deng et al. mostly analyzed alpha-spectrin so it remains possible that the new results shown by the authors are compatible with the initial observations from Deng and colleagues. 

      Response: As suggested, we revised the text to note that Deng et al., 2015 specifically examined Sqh:GFP. While we agree that our focus is more on Kst and Deng et al focused on α-Spec, we also examined α-Spec, and as described our results examining Myosin and Jub differ from what was reported by Deng et al 2015.

      As mentioned above, it is still possible that spectrins and Ajuba are working in parallel and Ajuba is not necessarily downstream of spectrins. The strong phenotype of Ajuba RNAi flies in adult wings could mask the effect of spectrins. Are the results similar in other settings, such as in the absence of Dicer2? Also, can Ajuba RNAi phenotypes be modified by overexpression of spectrins? This would provide further evidence of a link to Ajuba function. 

      Response: While formally it is true that the genetic requirement for Jub could reflect a role in parallel to, rather than downstream of, spectrins, our conclusion that spectrins act through Jub is based not only on the genetic requirement for Jub, but also on the influence of spectrins on junctional tension and Jub localization, which indicate that spectrins influence Jub activity in a manner consistent with their affecting the Hippo pathway through Jub.

      We would not expect over-expression of spectrins in a jub RNAi background to further reduce Hippo signaling, and as the jub RNAi phenotype is much stronger than the Kst over-expression phenotype even if there were an effect it would likely be difficult to detect.

      Regarding the potential independent functions of spectrins, it would be interesting to determine if alpha- and beta-heavy-spectrin can still interact at the level of the AJ despite the fact that their distributions appear to be partly non-overlapping. Would it be possible to assess this using PLA? If an interaction is not detected via PLA, it would be more convincing that spectrins are functioning independently. 

      Response: We have now performed this experiment, and no significant signal was detected by PLA. As a control, we used identical antibodies (GFP and α-Spec) to conduct PLA on α-Spec and β-Spec, and we did detect signal by PLA. These results (included in a revised Figure 4) further support the conclusion that α-Spec and βH-Spec are not physically associated in wing discs.

      Related to this point, if the spectrins work independently, it is reasonable to assume that they could display additive effects. Is this the case? If alpha- and beta-heavy-spectrin are simultaneously depleted are the phenotypes more severe than either depletion alone? 

      Response: We disagree here. Since both alpha- and beta-heavy-spectrin act through tension and Jub, there is likely a limit as to how much Yki activity can be increased by this pathway. For example, the increases in wing size induced by spectrin RNAi are similar to the increases in wing size observed with constitutive recruitment of Jub through alpha-catenin mutation (Alegot et al 2019), which may thus represent the maximum increase that can be induced through this pathway (as there are multiple, independent factors that regulate Hippo signaling).

      Authors should modulate membrane tension and assess if this affects the localization of alpha- and beta-heavy-spectrin and, specifically, their colocalization, as their interaction could be regulated. 

      Response: As reported, we do see effects of tension on βH-Spec localization. We would not expect significant effects of membrane tension on α-Spec localization, but we consider analysis of this outside the scope of this manuscript.

      In lines 185-187, the authors mention that beta-spectrin depletion does not affect beta-heavy-spectrin localization. Interestingly, Figure 4E appears to show that the levels of Kst-YFP appear to be lower in the beta-spectrin-depleted tissue. The localization of beta-heavy-spectrin is not necessarily affected but the overall levels could be. 

      Response: Indeed the levels appear slightly lower, but elucidating the reason for this will require further experiments that are beyond the scope of this manuscript (we suspect it is because cytoskeletal tension increases in β-Spec-depleted tissue as it does in α-Spec-depleted tissue, which based on our observations should decrease levels of Kst near junctions). The key point of these experiments was to show that α-Spec localization does not require βH-Spec, but does require β-Spec, which supports our conclusion that in wing discs α-Spec forms a complex with β-Spec but not with βH-Spec.

      In lines 200-203, the authors state that beta-heavy-spectrin and myosin colocalize extensively at the apical region. However, this colocalization is not as clear as stated. Do the authors have alternative data that suggests that the two proteins are indeed colocalizing? Would it be possible to perform PLA to detect a potential colocalization? 

      Response: Unfortunately we do not have antibodies against both proteins that work well enough for PLA. However, we quantified the co-localization by analysis of Pearson's correlation coefficient, as reported in the manuscript. We also added an additional higher magnification image, and a line scan, in a supplemental figure (Fig. 6 supplement 1).
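
      For readers unfamiliar with the co-localization measure named in this response, the following is a minimal sketch of Pearson's correlation coefficient applied to paired pixel intensities from two channels. It is an illustration only: the function name `pearson` and the example intensity values are hypothetical, and it is not the analysis pipeline or data used in the manuscript.

```python
import math


def pearson(channel_a, channel_b):
    """Pearson's correlation coefficient for paired pixel intensities.

    channel_a and channel_b are equal-length sequences holding the
    intensity of the same pixels in two channels (hypothetical inputs,
    e.g. a flattened region of interest from each image).
    """
    if len(channel_a) != len(channel_b):
        raise ValueError("channels must contain the same number of pixels")
    n = len(channel_a)
    if n < 2:
        raise ValueError("at least two pixels are required")

    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n

    # Sum of products of deviations from the mean (unnormalised covariance).
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    # Sums of squared deviations (unnormalised variances).
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)

    if var_a == 0 or var_b == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Made-up intensities: the second channel is exactly twice the first,
# so the two channels are perfectly correlated.
example_a = [10, 20, 30, 40]
example_b = [20, 40, 60, 80]
r = pearson(example_a, example_b)
```

      Values close to 1 indicate that the two signals rise and fall together across pixels, values near 0 indicate no linear relationship, and negative values indicate that one signal tends to be high where the other is low.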

      Authors should try to assess and quantify colocalization with F-actin for both beta-heavy-spectrin and myosin in wild-type conditions and when the levels (and/or activity) for each of them are modulated. 

      Response: We have added quantification of the co-localization of βH-Spec with F-actin and of myosin with F-actin to the revised manuscript.

      Minor points: 

      In lines 122-124, the authors should clarify the relevance of the observation that alpha-spectrin knockdown affects the thickness of the wing disc epithelium. 

      Response: We have added some text to try to elaborate on this.

      In the intro, it is perhaps necessary to mention that there are conflicting reports regarding the role of spectrins in the regulation of cell proliferation, at least in the follicular epithelium. For instance, Ng et al., 2016 argued that spectrins do not regulate cell proliferation in FECs. 

      Response: Rather than wading into a detailed discussion of issues that are peripheral to this study, we modified the text in the Introduction to avoid implying that spectrins control cell proliferation in the ovary.

      In Figures 1, 2, 3, and 4 (and respective supplements), it is encouraged that, wherever appropriate, the authors mark the different compartments or the relevant boundary using dashed lines, to more clearly indicate the regions to compare. 

      Response: We have now done this.

      In Figure 2, supplement 1 panels C and D should have an indication of the genotype for clarity. 

      Response: We have now added this.

      In lines 362-367, the authors suggest that other actin-binding proteins are likely to influence the role of beta-heavy-spectrin. Have the authors tested the role of spectrin interactors such as Ankyrin and Adducin?

      Response: No, we have not examined this.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2023-01939

      Corresponding authors: Jiro Toshima, Junko Y. Toshima

      1. General Statements

      We are grateful for the reviewers’ evaluation of our study. In the new manuscript, we have addressed all of the points raised by the two reviewers (altered or added text is indicated in red). Reviewer #1 pointed out that the definition of "Vps21 activity" is unclear throughout the manuscript. In this study we have developed a novel biochemical method capable of detecting Vps21p activity with high sensitivity (Fig. 2) and used this method to measure Vps21p activity, which is now clearly stated in the new manuscript. Reviewer #1 also pointed out that we had not clearly explained the difference between the two Vps21p-residing structures, small endosome-like puncta and the aberrant large structure. To clearly distinguish them, we have added data showing the size distribution of Vps21p-residing structures (Fig. S2) to the new manuscript. Regarding comment #2, we think that the reviewer may have misunderstood the data (please see the response to this comment below). Reviewer #2 did not request any additional experiments but gave us many helpful comments to improve the manuscript. In the new manuscript, we have revised all the places that the reviewer pointed out.

      Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      (Reviewers’ comments are in italics)

      *Summary:*

      *In the present study Nagano et al. identify an overlapping function of clathrin adaptors in the activation of the yeast Vps21 Rab GTPase. This activation is regulated in a concerted manner by two TGN cargo adaptors, AP-1 and GGA1/2. The basis of this study is derived from the previous work of Nagano et al., 2019, where the authors reported that Ent3p and Ent5p are important for the formation of the Vps21p-positive endosome. By utilizing a synthetic genetic approach, the authors observed that disruption/loss of the AP-1 complex (apl4 mutant), Ent3p, Ent5p or Pik1 decreased the fluorescence intensity of GFP-Vps21p and increased the number of Vps21p puncta. They found that these effects for AP-1 disruption are additive, that is, each makes a distinct contribution, at least in ent3∆/ent5∆ mutant cells. They next examined the role of factors required for TGN localization of Ent3p/5p and AP-1 in Vps21p activation. The authors reported that GGA1/2, Pik1p and the Ypt31/32 Rab GTPases make modest contributions to targeting of AP-1 and Ent3/5 to the TGN. The observation that GFP-Vps21 accumulates next to vacuolar compartments in pik1-1 ent3∆ mutants, similar to ent3∆ ent5∆ apl4∆ mutants, led the authors to conclude that both PI(4)P-dependent and PI(4)P-independent Ent3p recruitment to the TGN play a crucial role in Vps21p activation. Further, they found that compared to the pik1-1 ypt31ts mutant (41%), the activity of Vps21p was severely reduced (14%) in the pik1-1 ypt31ts gga1∆ gga2∆ mutant, pointing towards redundancy among these factors in Vps21p activation. Finally, using a class E Vps mutant, the authors found a fall in the endosomal population of GFP-Vps9p to ~29% in the ent3∆ ent5∆ mutant, which was further reduced to 0% in the ent3∆ ent5∆ apl4∆ mutant. Collectively, this study suggests a differential role of the TGN adaptors AP-1 and GGA in early endosome formation. Ent3p/5p and AP-1 are proposed to activate Vps21p by localizing Vps9p on endosomes and thus facilitating its transport, whereas GGAs act redundantly along with Pik1p and Ypt31/32 in regulating TGN localization of Ent3p/5p and AP-1.*

      Major comments:

      *There is a considerable amount of data that address the roles of AP-1, Ent3, Ent5, Gga1/2, and Pik1 in targeting of Vps21 and related trafficking pathway components to the TGN/endosome. The experiments are essentially genetic epistasis tests that compare the fluorescence patterns of GFP-Vps21 in a sophisticated set of strains. The genetic data are interpreted in terms of spatiotemporal dynamics of Vps21: proportion Vps21GTP on a compartment and number of GFP-Vps21 positive compartments. Being genetic in nature, the data are open to wide interpretations in terms of molecular mechanisms that target candidate proteins Vps21p and Vps9 to the TGN/endosome. The authors' presentation (Fig. 7) is based on well controlled experiments and is logical, but key questions regarding Vps9 trafficking as it relates to Vps21 endosome formation are not resolved.*

      Response:

      In this study, in addition to comparison of the fluorescence patterns of GFP-tagged yeast Rab5 (Vps21p), we have developed a novel biochemical method capable of detecting the amount of active Vps21p with high sensitivity. The amount of active Vps21p obtained by this method correlated well with the results obtained by imaging analysis, and we think this approach significantly increased the reliability of our results.

      Using this new biochemical method and fluorescence imaging analysis, we have clarified the overall regulatory mechanisms of Vps21p by vesicle transport from the TGN. In particular, we believe that this is an important study that links the activation of Vps21p that mediates endosome formation with numerous previous studies involving vesicle transport from the TGN to the endosome.

      Comment #1(a)

        • *Throughout their study the authors conflate measurements of GFP-Vps21 puncta intensity and number of Vps21p puncta as readouts of Vps21 "activity". Figure 7 exemplifies this especially: "Vps21p Activity: 100%; Vps21p Activity: 45%; Vps21p Activity: 10%".*

      *a) Would the authors please explicitly define how they use "activity" in the manuscript?*

      Response:

      We thank the reviewer for pointing out our error. As the reviewer noted, we had used the word “activity” when explaining results obtained from the fluorescence intensity and the number of Vps21p puncta (lines 312-315 in the new manuscript). We have therefore revised the sentence “~ a decreased PI(4)P level reduces Vps21p activity and thus inhibits fusion of Vps21p compartments.” to “~a decreased PI(4)P level seems to inhibit fusion of Vps21p compartments.” (lines 314-315).

      In other parts of the manuscript, we have used the word “activity” only when explaining results obtained by measuring the amount of active Vps21p with the biochemical method (Fig. 2). The “Vps21p Activity” values depicted in Fig. 7A-C are also based on the results of the biochemical assay, and we have therefore added explanatory sentences to the Discussion section (lines 432-433, 447) and the figure legend (lines 996-998) in the new manuscript.

      Comment #1(b)

      *b) The amounts of Vps21-GTP were measured for the ent3∆ ent5∆ and ent3∆ ent5∆ apl4∆ mutants (Fig. 2). Other mutant backgrounds should be analyzed in order to address the specific requirements of gga1/2, pik1 and ypt31/32 genes and to challenge the assumption that aspects of GFP-Vps21 localization correlate with the proportion of Vps21GTP.*

      Response:

      We agree with the reviewer’s comment that it is crucial to confirm that aspects of GFP-Vps21 localization correlate with the proportion of Vps21GTP. In the previous manuscript, we had already measured the amount of active Vps21p (the GTP-bound form of Vps21p) in the pik1-1 and pik1-1 ent3∆ mutants and shown that it decreases to ~62% in the pik1-1 mutant and to ~22% in the pik1-1 ent3∆ mutant relative to wild-type cells (Fig. 4E). The relative amount of the GTP-bound form of Vps21p in these mutants correlated well with the results obtained by imaging analyses of GFP-Vps21p (Fig. 4B and C). To make this clearer, we have added the text “and the amounts of active Vps21p in these mutants correlate well with the results obtained by imaging analyses of GFP-Vps21p (Fig. 4B, C, and H).” in lines 326-327. We have also demonstrated that the amount of active Vps21p correlated with the fluorescence intensity of GFP-Vps21p at puncta in the pik1-1 ypt31ts and the pik1-1 ypt31ts gga1∆2∆ mutants (Figs 4F-J, S4E), and explained this in lines 334-341.

      Comment #1(c)

      *c) Regarding the measurements of fluorescence intensity of GFP-Vps21 puncta, how were distinct puncta identified, particularly in the large clusters of puncta shown in Figs. 1D, 3A, 4F, 5A, 5C?*

      Response:

      As the reviewer pointed out, in the previous manuscript we did not clearly explain how we distinguished the two Vps21p-residing structures, small endosome-like puncta and the aberrant large structure. To clearly distinguish them, in the new manuscript we examined the size and number of these structures and present the data in Fig. S2. This result revealed that the ent3∆5∆ apl4∆ mutant contains a single large Vps21p-residing structure with a size of >100 pixels and many small Vps21p-residing puncta with a size of ~50 pixels. To explain this, we have added sentences in lines 235-239. Because Fig. 5A and 5C do not show the localization of Vps21p, we have not added an explanation for them.

      Comment #2

      • *In the representative micrographs shown in Fig. 1A (Vph1-mCH), 1B (Hse1-tdTom), 1D (Sec7-mCH) and 5A, why do only (roughly) half of the cells in each micrograph express the tagged organelle marker protein? Shouldn't all of the cells? What is especially concerning is that the appearance of GFP-Vps9 in cells that express Sec7-mCH is different than in cells that do not. Specifically, there are fewer GFP-Vps9 puncta in expressing cells and GFP-Vps9 appears to be largely cytosolic in these cells. Have the authors noted the same?*

      Response:

      In Fig. 1, we expressed mCherry/tdTomato-tagged protein only in wild-type cells (Fig. 1A and B) or in ent3∆5∆ mutants (Fig. 1D) to distinguish the mutant cells from the wild-type cells, as described in the Results section (lines 156-159) and figure legends. As explained in the text, by labeling only wild-type or mutant cells, we precisely evaluated the differences in the localization of GFP-Vps21p by comparing mutant cells directly alongside wild-type cells.

      In Fig. 5A, we expressed Sec7-mCH only in the ent3∆5∆ mutants to distinguish them from wild-type cells (the upper panels) or the ent3∆5∆ apl4∆ mutants (the lower panels), as described in the figure legend. Therefore, the reviewer’s comment that “the appearance of GFP-Vps9 in cells that express Sec7-mCH is different than in cells that do not. Specifically, there are fewer GFP-Vps9 puncta in expressing cells and GFP-Vps9 appears to be largely cytosolic in these cells.” is exactly what we wanted to show in this figure. To show this more clearly, we labeled cells with “WT” or “mutant” in these micrographs (Fig. 1A, 1B, 1D, and 5A).

      Comment #3

      • *Figure 4A: How were the proportional contributions of each factor to the TGN localization of Ent3/5, AP-1 determined? What do the percentiles indicate?*

      Response:

      As described in the Results section (lines 293-297), we have shown that deletion of the GGA1 and GGA2 genes significantly decreased the localization of Ent3-GFP at the TGN to ~33% of wild-type cells, without changing the localization of Ent5-GFP and Apl2-GFP (Fig. S3A, B). Based on these results, the contribution of Gga1/2p to the localization of Ent3p, Ent5p, or AP-1 was evaluated to be 37%, 0%, or 0%, respectively (Fig. 4A). To make this clearer, we have added the sentence “~ and thus, we evaluated the contribution of Gga1p/2p to the localization of Ent3p, Ent5p, or AP-1 to be 37%, 0%, or 0%, respectively (Fig. 4A)” in lines 296-297. Similarly, we have determined the contribution of PI(4)P by assessing the localization of Ent3p, Ent5p and Apl2p at the TGN in the pik1-1 mutant (Fig. S3C and D), as described in lines 297-305. Regarding Rab11s (Ypt31p/32p), we have evaluated the contribution based on the data in our previous study, as described in lines 305-309.

      Comment #4

      • *In the model presented in Figure 7, the authors proposed that AP-1 is required to target Vps9 from the late TGN to the early TGN. The best characterized function of AP-1 is to concentrate integral membrane proteins to form the inner layer of a clathrin coated vesicle. Vps9 is a soluble protein that fractionates with cytosolic proteins (Burd et al., 1996). Despite measuring intensity and localizing Vps9p with different endosomal markers (Fig. 6), the basis of membrane recruitment of Vps9 by TGN clathrin adaptors is unclear. How do the authors envision AP-1 to function in targeting of Vps9, a soluble protein, between compartments?*

      Response:

      Like many other Rab-GEFs (e.g., Sec2p, the GEF for Sec4p, or Mon1p/Ccz1p, the GEF for Rab7), we think that Vps9p transiently localizes to the donor organelle to activate Rab proteins and load them on the transport vesicle. We have previously demonstrated that Arf1p, a Golgi-resident GTPase, plays an important role in the recruitment of Vps9p to the Golgi (Nagano et al., Comm. Biol., 2019). In this study we have shown that deletion of AP-1 in the ent3∆5∆ mutant increases the localization of Vps9p at the TGN (Fig. 5A and B). These results suggest that AP-1, like Ent3p/5p (Nagano et al., Comm. Biol., 2019), is dispensable for the recruitment of Vps9p to the TGN but required for the transport of Vps9p from the TGN to endosomes.

      In a recent study, Casler et al. proposed that AP-1 maintains Golgi-resident proteins by mediating an intra-Golgi recycling pathway (Casler et al., JCB, 2021). Based on this model, we speculated that AP-1 also functions to maintain Vps9p in the TGN by recycling it from the late TGN to the early TGN, and discussed this in the second paragraph of the Discussion section (lines 434-454 in the new manuscript). However, as reviewer #2 pointed out (please see comment #6 of reviewer #2), Casler et al. proposed a role for AP-1 in transport from the TGN back to earlier Golgi compartments but did not discuss compartmentalization within the TGN. We have therefore modified the sentence in the Discussion from “~ the role of AP-1 that recycles Vps9p back to the early TGN might become apparent” to “~ the role of AP-1 that recycles Vps9p back to the earlier Golgi compartment might become apparent” (lines 444-445).

      Minor Comment:

      • *The interchangeable terminology used to refer to Rab GTPases throughout the manuscript made it exceptionally difficult for me to focus on the presentation of the experiments. Vps21 and Rab5 are used interchangeably, but this study investigated Vps21, not Rab5. Vps21 does not even appear in the title or abstract. Similarly, Ypt31/32 is used interchangeably with Rab11, but this study investigated Ypt31/32, not Rab11. The accurate names of the yeast proteins should be used. A discussion regarding significance of the yeast proteins for understanding mammalian Rab5 and Rab11 belongs in the Discussion.*

      Response:

      In accordance with the reviewer’s suggestion, we have replaced Rab5 with yeast Rab5 or Vps21p. We have also replaced Rab11 with yeast Rab11 or Ypt31p/32p.

      Reviewer #1 (Significance (Required)):

      *General assessment: In general, this is a well-executed and controlled study. The major strengths are the large quantity of data from complementary experiments that provide a rationale for the proposed mechanistic model (Fig. 7). The major weaknesses lie with the genetic approach, which does not lend itself to the mechanistic interpretations that the authors propose, and the narrow scope of the work such that the study will be of interest to a small group of colleagues. The audience will likely include researchers who use yeast to investigate protein sorting in the endo-lysosome network of organelles and colleagues who investigate signaling by Rab GTPases.*

      Response:

      We cannot agree with the reviewer’s comment that “the narrow scope of the work such that the study will be of interest to a small group of colleagues”, because the regulation of endosome formation by Rab5 is one of the major topics in the field of membrane traffic, and many mechanisms remain to be elucidated. Moreover, the model we have proposed in this study is applicable not only to yeast but also to higher organisms, as discussed in the last paragraph of the Discussion section. The endolysosomal pathway is important for the regulation of a wide variety of crucial cellular processes, including mitosis, antigen presentation, cell migration, cholesterol uptake, and many intracellular signaling cascades. Our work thus also has implications for development, immunity, and oncogenesis. We believe that the studies described in our paper represent an advance in our understanding of the cell biology of endocytic trafficking and would therefore be of interest to researchers in other fields, as well as in the membrane traffic field.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      (Reviewers’ comments are in italics)

      *Summary: *

      *The manuscript by Nagano et al. describes the results of extensive analysis on the roles of clathrin adaptors for activation of Rab5 during TGN-to-endosome traffic in budding yeast. They examined the localization and activation status of Vps21, a major Rab5 member in yeast, in a variety of mutants and showed that AP-1 had a cooperative role with Epsin-related Ent3/5 in transport of Vps9 (Rab5 GEF) to endosomes. GGAs, PI4 kinase Pik1, and Ypt31/32 (Rab11) had partially overlapping functions in recruitment of AP-1 and Ent3/5 to TGN.*

      *It is an indeed extensive study but the interpretation of the results is complicated and somewhat speculative. It is most probably because the differences between mutants are partial (even though the authors tried to show statistics) and the logics to lead conclusions are not always compelling. To be honest, I had a hard time to follow rationales to justify arguments. The conclusions the authors make, that is, multiple clathrin adaptors cooperate in the TGN-to-endosome traffic, are reasonable, but I have several questions as follows, which I would like the authors to address. *

      Comment #1

        • The description about Vps21 fluorescence is often quite confusing. When the authors say fluorescence intensity, is it the total intensity of a whole cell or the average fluorescence intensity of individual puncta? For example, in Fig. 1D, it doesn't look to me at all that the GFP intensity of ent3/ent5 is lower than WT. How did the authors obtain the data of Fig. 1E? If the authors measured the fluorescence of individual puncta, how did they do it?

      Response:

      We agree that the previous manuscript did not sufficiently explain how we measured Vps21p fluorescence intensity. In this study, we measured the total fluorescence intensity of each single GFP-Vps21p punctate structure, subtracted the cytoplasmic fluorescence background, and reported the result as the fluorescence intensity of the Vps21p compartment (the aberrant large GFP-Vps21p structures (Fig. 3A) were excluded). The graphs of GFP-Vps21p fluorescence intensity show the mean of three values, one from each of three independent experiments, where each value is the average of 50 puncta. To clarify where and how Vps21p fluorescence was measured, we have revised the text (lines 160-161, 163, 166, 177, 179) and added explanatory sentences to “Materials and Methods” (lines 542-546) in the new manuscript.
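
      The quantification described above can be sketched roughly as follows. This is an illustrative reading, not the authors' actual analysis script: the function names, the choice to subtract background as punctum area times mean cytoplasmic intensity per pixel, and the use of the standard deviation across experiments are all assumptions made for the example.

```python
from statistics import mean, stdev


def punctum_intensity(integrated_signal, area_px, cytoplasm_mean_per_px):
    """Background-corrected intensity of one punctum.

    Assumption: the cytoplasmic background is removed by subtracting
    (punctum area in pixels) x (mean cytoplasmic intensity per pixel).
    """
    return integrated_signal - area_px * cytoplasm_mean_per_px


def summarize(experiments):
    """Collapse replicate experiments into one mean and one spread.

    `experiments` is a list with one entry per independent experiment;
    each entry is a list of corrected punctum intensities (50 per
    experiment in the response above). Each experiment is averaged
    first, so the spread reflects variation between experiments rather
    than between individual puncta.
    """
    per_experiment = [mean(puncta) for puncta in experiments]
    return mean(per_experiment), stdev(per_experiment)
```

      Averaging within each experiment before averaging across experiments keeps the sample size at three, which is what the response describes, instead of treating all 150 puncta as independent measurements.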

      Regarding Fig. 1D and E, since the fluorescence intensity of GFP-Vps21p in the cytosol was increased in the ent3Δ5Δ mutant (Fig. 1D), the fluorescence intensity in the mutant may not have appeared lower than that in the wild-type cell. To show the decrease in the fluorescence intensity of individual Vps21p puncta in the mutant cells more clearly, we have added a higher magnification view of GFP-Vps21p puncta to Fig. 1D in the new manuscript.

      Comment #2

      • Related to the previous question, how the images were taken is very important. In the legend to Fig.1, there is no description about the image analysis. Are they epifluorescence images or confocal images, and if the latter, are they ones of 2D confocal images or maximum intensity projections of Z stacks as mentioned in the legend to Fig. 3A? It matters very much.

      Response:

      We appreciate the reviewer’s helpful suggestion. In Fig. 1, we used epifluorescence images to analyze the fluorescence intensity and number of GFP-Vps21p puncta, because Vps21p puncta are highly mobile (please see also the response to comment #9). In accordance with the reviewer’s suggestion, we have added a description of the imaging method to the legend of Fig. 1 (lines 831-832, 837 and 843).

      Comment #3

      • It is also confusing when the authors say increase or decrease of fluorescence. Is it the intensity or the number of puncta? Please clarify which the authors intend to mention whenever relevant. There are many places that bother readers.

      Response:

      We appreciate the reviewer’s helpful suggestion. In accordance with the reviewer’s suggestion, we have revised the manuscript (lines 274 and 316).

      Comment #4

      • The method the authors developed to estimate the activation states of Vps21 is intriguing. It may provide important information without direct measurements of the GTP-binding activity. However, the results should be carefully interpreted because this kind of tricky experiments may not reflect the exact biochemical statuses in the cell. For example, I am concerned about whether release of GTP or spontaneous GTPase activity during the preparation processes is ignored.

      Response:

      As the reviewer pointed out, we cannot rule out the possibility that the GTP-bound status changes during the preparation process. However, this problem also occurs in the conventional pull-down assay, which assesses the amount of the GTP-bound form of Rab proteins. To confirm that the activity of Vps21p assessed by this method reflects the in vivo activation level, we have demonstrated that the level of active Vps21p correlates with in vivo phenotypes that indicate a defect in endosomal fusion, such as the fluorescence intensity of GFP-Vps21p at the endosome and the number of GFP-Vps21p puncta. In the new manuscript, we have added sentences explaining this (lines 221-222).

      Comment #5

      • In Discussion (p. 20, line 410), the authors describe that "Gga2p is localized predominantly at the Tlg2-residing compartment," but this is wrong. In the BioRxiv paper (2022), the authors showed that "Gga2p appears around the Sec7p-subcompartment and disappears at a similar time as Sec7p." I understand that, to explain the roles of GGAs in endosomal transport, it is reasonable to assume their presence in the Tlg2 compartment (and I agree on that), but the above description is wrong and must be corrected.

      Response:

      We appreciate the reviewer’s helpful suggestion. As the reviewer described, we have recently demonstrated that Gga2p localization well overlapped with the Tlg2p-residing TGN sub-compartment that is structurally distinct from the Sec7p-residing sub-compartment (Toshima et al., BioRxiv, 2022). Thus, in accordance with reviewer's suggestion, we have changed this sentence to “Interestingly, Gga2p appears to reside at the Tlg2p sub-compartment, which is distinct from the Sec7p sub-compartment.” in the new manuscript (lines 427-428).

      Comment #6

      • Hypothesizing the role of AP-1 in the recycling from the late TGN to the early TGN is new. Glick's group proposed its role in transport from the TGN back to earlier compartment (Golgi) but did not discuss compartmentalization within the TGN. The authors' speculation is a fancy idea, but I am afraid there is no direct evidence for that.

      Response:

      We appreciate the reviewer’s appropriate and helpful suggestion. As the reviewer pointed out, Glick's group proposed a role for AP-1 in transport from the TGN back to earlier Golgi compartments but did not discuss compartmentalization within the TGN (Casler et al., 2021, JCB). We have therefore modified the sentence in the Discussion section from “~ the role of AP-1 that recycles Vps9p back to the early TGN might become apparent.” to “~ the role of AP-1 that recycles Vps9p back to the earlier Golgi compartment might become apparent.” (lines 444-445).

      Comment #7

      • The role of Ypt31/32 (Rab11) is also puzzling to me. It could be an indirect effect, which might be due to the complex network of GTPases as proposed by Chris Fromme (2014). Am I correct?

      Response:

      As the reviewer pointed out, Fromme’s group has shown that Ypt31/32 forms complex networks with several GTPases and their GEFs (McDonold and Fromme, 2014, Dev Cell; Thomas and Fromme, 2016, JCB; Thomas et al., 2019, Dev Cell), in which Ypt31/32 promotes the activation of Arf1p via its GEF Sec7p. We have previously shown that Arf1p plays an important role in the recruitment of Vps9p to the Golgi (Nagano et al., Comm. Biol., 2019). These findings suggest that disruption of Ypt31p/32p may affect the localization of Vps9p through reduced activity of Arf1p. However, the arf1Δ and ypt31ts mutants have different effects on Vps9p localization: in the arf1Δ mutant, the recruitment of Vps9p to the TGN is impaired, whereas in the ypt31ts mutant, Vps9p localization at the TGN is increased (Nagano et al., 2019, Comm Biol.). Thus, the role of Ypt31/32 in Vps9p localization appears to be independent of Arf1p activity. In the new manuscript, we have added a brief discussion of this (lines 466-473).

      Comment #8

      • In the legend to Fig. 3D, the authors state that the red arrowheads indicate 50 nm vesicles and black arrowheads indicate vesicle clusters. However, the electron micrograph clearly shows that their morphologies are different. Red ones, which I estimate to be a little larger than 50 nm, often appear to have dense material inside, while those in black are even larger (probably around 200 nm) and do not look like a cluster of the same type of vesicles (I do not even think that such large structures should be called vesicles). How do the authors explain these differences?

      Response:

      In the previous manuscript, the explanation of the electron microscopy analysis was insufficient. In the new manuscript, to clearly distinguish the two Vps21p-residing structures observed in the ent3Δ5Δ apl4Δ mutant by fluorescence microscopy (Fig. 3A), namely small endosome-like puncta and an aberrant large structure, we examined the size and number of these structures and show the data in Fig. S2. This result revealed that the ent3Δ5Δ apl4Δ mutant contains a single aberrant large aggregate with a size of >100 pixel adjacent to the vacuole and endosome-like structures with a size of

      Comment #9

      • In Fig. 4F, the authors show different sets of images, Focal plane and Z projection. What is the purpose to do it? The results with Z projection should be more informative. Why do the authors use only Focal plane data for the analysis in panel G?

      Response:

      We measured the fluorescence intensity or number of individual GFP-Vps21p puncta using single focal plane images (Figs. 1C, 1E, 3I, and 4B), because the small Vps21p-residing puncta are highly mobile and the same endosome often appears in multiple planes of a Z-stack taken with a conventional epifluorescence microscope. In contrast, we analyzed the aberrant large aggregate using Z-projection images (Figs. 3B, S3G) because this structure is relatively stable, has low motility, and is not observed unless it is in the focal plane. In Fig. 4F, since both the small puncta and the large aggregate are analyzed, we have shown both the focal plane image and the Z-projection image. In the new manuscript, we have added a description of the imaging method to each figure legend or the text (lines 230-232, 332-334).

      Reviewer #2 (Significance (Required)):

      *It is a complicated story but I find most of the conclusions reasonable. It provides important knowledge to the understanding on the Rab5 GTPase regulation in trafficking from the TGN. *

      Response:

      We are very grateful for this reviewer’s favorable evaluation of our studies.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      1. General Statements [optional]

      We would like to thank all reviewers for their constructive feedback and for raising specific points that have helped to improve our manuscript. We accept that the initial submission did not include some quantitative aspects of the observed effects. These are now included together with all the suggested experiments from the reviewers with the use of additional mutants and appropriate protein markers. We believe that the manuscript offers a conceptual advance and a molecular mechanism for the effects of caffeine on cell cycle progression of eukaryotic cells and is of interest to geneticists working on cell cycle, cancer and biogerontology.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      Summary:

      In the manuscript “The AMPK-TORC1 signaling axis regulates caffeine-mediated DNA damage checkpoint override and cell cycle effects in fission yeast,” the authors studied the role of genes that are potentially involved in the caffeine-mediated override of a cell cycle arrest caused by activation of the DNA damage checkpoint. The methylxanthine substance caffeine has been known to override the DNA damage checkpoint arrest and enhance sensitivity to DNA damaging agents. While caffeine was reported to target the ATM ortholog Rad3, the authors previously reported that caffeine targets TORC1 (Rallis et al, Aging Cell, 2013). Inhibition of TORC1, like caffeine, was also reported to override DNA damage checkpoint signaling. Therefore, in the present study, the authors compared the effects of caffeine and torin1 (a potent inhibitor for TORC1 and TORC2) on cell cycle arrest caused by phleomycin, a DNA damaging agent, using various gene deletion S. pombe mutants.

      The authors concluded that they identified a novel role of Ssp1 (calcium/calmodulin-dependent protein kinase) and Ssp2 (catalytic subunit of AMP-activated kinase) in the cell cycle effects caused by caffeine, based on the following findings; (1) the caffeine-mediated DNA damage checkpoint override requires Ssp1 and Ssp2; (2) Ssp1 and Ssp2 are required for caffeine-induced hypersensitivity against phleomycin; (3) under normal growth conditions, caffeine leads to a sustained increase of the septation index in a Ssp2-dependent manner; (4) Caffeine activates Ssp2 and partially inhibits TORC1.

      Major comments:

      I do not think that many of the authors’ claims are supported by the results of the present study. The corresponding parts are detailed below.

      1. The conclusion of the first paragraph in the Results (top in page 6; Our findings indicate that caffeine and torin1 indirectly and directly inhibit TORC1 activity respectively.) is not supported by the data in Figure 1. The result that caffeine, but not torin1, requires Ssp1 and Ssp2 to override the phleomycin-induced cell cycle arrest does not necessarily indicate that caffeine indirectly inhibits TORC1 via Ssp1 and Ssp2. Rather, the authors should mention that this conclusion is based on the authors’ previous reports by citing them (e.g., Rallis et al, Sci Rep, 2017). To add to Figure 1, an additional experiment using a constitutively active AMPK mutant, a temperature-sensitive TORC1 mutant, and a srk1 deletion mutant will help the authors claim their original conclusion as one possibility.

      Torin1 inhibits TORC1 and 2 leading to G2 cell cycle arrest following accelerated mitosis. In contrast, caffeine has been reported to enhance the inhibitory effect of rapamycin on TORC1 signaling but does not inhibit growth. It has not been reported that TORC1 is a direct target of rapamycin. We previously demonstrated that caffeine induces Srk1 in a Sty1 dependent manner (Alao et al., 2014). Furthermore, Ssp1 plays a role in regulating Srk1/ Cdc25 activity. It is therefore possible, that Ssp1 influences the ability of caffeine to promote mitotic progression as part of the stress response while also affecting TORC1 activity via Ssp2. As ssp2∆ cells have higher intrinsic TORC1 activity, this could also attenuate the effect of caffeine on mitosis.

      We have modified the first paragraph of the results section to address the reviewer’s concerns.

      We have previously reported that Srk1 modulates the ability of caffeine to drive cells into mitosis (Alao et al., 2014).

      1. The conclusion of the second paragraph in the Results (lower-middle in page 6; Our results indicate that caffeine induces the activation of Ssp2.) is not based on the results of Figure 2. Figure 2 simply illustrates that both caffeine and torin1 cause hypersensitivity to phleomycin dependent on Ssp1 and Ssp2.

      We appreciate the reviewer’s contention and have modified the text.

      1. The conclusion of the fourth paragraph in the Results (middle in page 7) is not clearly supported by the result, due to an insufficient data analysis. As the cell length and the progress through mitosis are the key assay parameters in Figure 3, the average cell length should be shown next to each micrograph of Figure 3A and 3B. In Figure 3C, a mitotic index and the average cell length should be shown next to each micrograph. A statistical analysis is necessary for the authors to compare the measurements and to claim as the headline (Caffeine exacerbates the ssp1Δ phenotype under environmental stress conditions), as the effect of caffeine was not evident.

      We have conducted additional experiments to measure cell length and modified the figure to include this data. We believe our observation that caffeine alone induces increased cell length in ssp1 mutants, confirms a role for the Ssp1 protein in modulating the effects of caffeine. We previously showed that Caffeine activates Srk1 which in turn inhibits Cdc25 activity similar to other environmental stresses (Alao et al., 2014). Ssp1 negatively regulates Srk1 following exposure to stress. In contrast, caffeine advances mitosis in wt cells and thus does not result in increased cell length. We also demonstrate that caffeine greatly enhances cell length in ssp1 mutants exposed to heat stress in marked contrast to rapamycin and torin1. These findings indicate that Ssp1 mediates the effect of caffeine on mitosis.

      1. In the middle of page 8, the statement “Accordingly, the effect of caffeine and torin1 on DNA damage sensitivity was attenuated in gsk3Δ mutants (Figure 5C and 5D).” is not supported by the corresponding results. Rather, Figure 5C and 5D look almost the same.

      We agree with this and other reviewers that demonstrating enhanced sensitivity to caffeine is problematic. Nonetheless, our cell cycle data clearly indicate a differential role for Gsk3 in mediating the cell cycle effects of caffeine and torin1. In terms of DNA damage sensitivity, we have reproducibly observed a lower degree of DNA damage sensitivity in gsk3 mutants relative to wt cells. Hence, while caffeine is less effective at enhancing DNA damage sensitivity relative to torin1 in wt cells; we observed that caffeine and torin1 increase DNA damage sensitivity to a similar degree in gsk3 mutants.

      1. The description and the conclusion of the last paragraph in the Results (bottom in page 8 – page 9) are not supported by the results of Figure 6, due to an insufficient data analysis. The extent of phosphorylation must be quantified as a ratio of the phosphorylated species (e.g., pSsp2) to all species of the protein (e.g., Ssp2).

      We have carefully repeated our experiments under various conditions. Our results clearly indicate caffeine induced Ssp2 phosphorylation. These observations have not been reported previously.
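
      The quantification the reviewer asks for can be illustrated with a short sketch. It is not taken from the manuscript: the function names and the idea of expressing each condition as a fold change over the untreated sample are assumptions made for the example, and the inputs stand for background-corrected band densities measured from the blot.

```python
def phospho_ratio(phospho_signal, total_signal):
    """Ratio of the phospho-specific band (e.g. pSsp2) to the total
    protein band (e.g. Ssp2) for one lane."""
    if total_signal <= 0:
        raise ValueError("total protein signal must be positive")
    return phospho_signal / total_signal


def fold_change(treated_ratio, untreated_ratio):
    """Phosphorylation of a treated sample relative to the untreated
    control run on the same blot."""
    if untreated_ratio <= 0:
        raise ValueError("control ratio must be positive")
    return treated_ratio / untreated_ratio
```

      Normalizing to total protein in the same lane separates a change in phosphorylation from a change in protein abundance, which is the distinction the reviewer's comment turns on.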

      From Figure 6, the authors claim that caffeine (10 mM) partially inhibits TORC1 signaling. However, the authors previously showed that the same concentration of caffeine inhibited phosphorylation of ribosome S6 kinase as strongly as rapamycin, the potent TOR inhibitor (Rallis et al, Aging Cell, 2013). The authors are advised to assess phosphorylation of S6 kinase again in the present study and compare to the results of the present results in Figure 6, because addition of that data may allow the authors to discuss that caffeine affects TORC1 downstream pathways at different intensities.

      While rapamycin is a strong inhibitor of TORC1 in budding yeast, this is not the case in fission yeast. Our previous assessments of p-S6 levels and polysomal profiles as well as cell-cycle progression kinetics have shown this (Rallis et al, Aging Cell, 2013). In addition, gene expression analysis from our previous studies have shown that caffeine treatment results in a gene expression profile similar to that of cells in nitrogen starvation (TORC1 inhibition).

      We have now used an Sck1-HA strain to further enhance our study and address the reviewer’s concerns. Previous studies have shown that 100 ng/mL rapamycin does not affect Sck1 phosphorylation. We demonstrate that, in contrast to rapamycin (100 ng/mL), 10 mM caffeine affects Sck1-HA expression and/or phosphorylation. This effect was also observed with 5 µM torin1, albeit to a greater degree.

      Also, immunoblotting of the same proteins looks somehow different from panel to panel (e.g., pSsp2 in panel A and D; Actin in panel A, C, and D). Therefore, the blotting result before clipping had better be shown as a supplementary material.

      We repeated the blots where necessary and used Ponceau S as a loading control. The original blots can be made available to all.

      Minor comments:

      1. (Figure 1) The septation index of the phleomycin-treated cells (without any further additional drugs) should be shown, as a baseline.

      We have included data for untreated cultures and phleomycin-only treated cultures.
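
      For readers less familiar with this readout, the septation index is conventionally the percentage of counted cells that show a septum at a given time point. A minimal sketch follows; the function name and the example counts in the usage note are illustrative only and are not data from the study.

```python
def septation_index(septated_cells, total_cells):
    """Percentage of counted cells that display a septum."""
    if total_cells <= 0:
        raise ValueError("at least one cell must be counted")
    if not 0 <= septated_cells <= total_cells:
        raise ValueError("septated count must lie between 0 and the total")
    return 100.0 * septated_cells / total_cells
```

      For example, counting 30 septated cells among 200 scored cells gives an index of 15%. Plotting this value for untreated and phleomycin-only cultures at each time point provides the baseline the reviewer requested.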

      1. (Figure 1D, Optional) As a ppk18Δ cek1Δ double deletion mutant is reported, the authors are advised to add and test that mutant in this experiment.

      We have added the related data for the ppk18Δ cek1Δ double mutant.

      1. (Figure 2) The authors need to clarify the number of cell bodies spotted (e.g., in the Figure legend).

      We have modified the figure legend accordingly.

      1. (Figure 3) The different number of cells in micrographs may give an (wrong) impression on the cell proliferation rate. Therefore, it is advisable to use the micrographs in which the similar number of cells are shown for conditions with the similar cell proliferation rates.

      We have included data to show the cell lengths under different conditions. We find that different conditions greatly affect proliferation rates. For instance, cells do not proliferate in the presence of torin1. We initially sought to investigate if caffeine induces a phenotype in ssp1 mutants by virtue of its interaction with the DNA damage response. The micrographs were included as representative examples and have been now complemented with cell length data.

      1. (Figure 4B) ssp2Δ, not spp2Δ.

      The figure legend has been edited.

      1. (Figure 4) The septation index of the none-treated cells should be shown as a baseline.

      We have included baseline data for untreated wt cells in Figure 1. We have no reason to suspect that any of the mutants would give different results over the time investigated.

      1. (Figure 6B, 6E) What do the black arrows indicate? Figure Legend does not seem to explain them.

      The legend has been modified to indicate what the arrows refer to.

      1. (Figure 6C) Indicate which part of the Maf1-PK blot corresponds to the phosphorylated species, because Maf1-PK is probed with an anti-V5 (not a phosphorylation-specific) antibody.

      These experiments have been carefully repeated under different conditions and the figure is now modified accordingly.

      1. (Figure 6D) gsk3Δssp1Δ, not gs3Δssp1Δ.

      We have deleted this figure and have now replaced it with data we believe is more appropriate.

      Reviewer #1 (Significance):

      As caffeine is implicated in protective effects against diseases including cancer and improved responses to clinical therapies, the topic of the present study is of interest and importance to the broad audience.

      In the present study, the most significant finding is that caffeine- and torin1-induced hypersensitivity to phleomycin is dependent on Ssp1 and Ssp2 (Figure 2). This result may be important in chemotherapy against cancers. On the other hand, caffeine is known to activate AMPK (e.g., Jensen Am J Physiol Endocrinol, 2007). Besides, as detailed in the Major comments, many of the major conclusions are not supported by the present results. Therefore, based on my field of expertise (cell cycle, cell proliferation, and TOR signaling), I conclude that the present study hardly extends the knowledge in the field of "the cell biology of caffeine."

      We thank the reviewer for their helpful comments. We accept the constructive criticisms and have carried out extensive additional experiments to provide further evidence for the roles of Ssp2 and TORC1 in mediating the cell cycle effects of caffeine. We stress that caffeine has previously been proposed to exert its effects via inhibition of Rad3 activity. Our previous work showed that caffeine did not inhibit Rad3-mediated checkpoint signaling. As later studies suggested that caffeine inhibits TORC1 activity, the major goal was to investigate whether caffeine is an indirect inhibitor of TORC1 via Ssp2, which is activated by several stresses. It has never been demonstrated that caffeine signals via Ssp2. This study provides the first evidence that caffeine modulates cell cycle progression by at least partially signaling via Ssp2 and TORC1. After nearly 30 years, it is vital that its precise activity, in particular in enhancing DNA damage sensitivity, is properly characterized. Such work would open the way for additional studies on how caffeine affects cell physiology. For instance, we show that caffeine at 10 mM is more effective at inhibiting Sck1 activity than rapamycin at 100 ng/mL. In contrast, rapamycin at this concentration is more effective at inhibiting Maf1 activity. Hence, further studies are needed on how exactly the combination of caffeine and rapamycin influences ageing and other TORC1-regulated processes.

      Reviewer #2 (Evidence, reproducibility and clarity):

      Summary: In this paper, Alao and Rallis analyze the role of AMPK and TORC1 pathways, and the respective crosstalk, in regulating cell cycle progression in the presence of DNA damage in S. pombe. The authors show, almost exclusively through chemo-genetic epistasis assays, that caffeine inhibits TORC1 indirectly activating AMPK, in contrast to the specific ATP-competitive TORC1 inhibitor torin1. Specifically, it is shown that in the absence of a functional AMPK pathway caffeine is unable to revert the TORC1-inhibition-dependent override of cell-cycle arrest caused by the DNA-damaging agent phleomycin, henceforth partially suppressing the growth inhibition caused by the co-treatment.

      Major comments: The overall story of the paper is convincing. However, the choice of an almost exclusively chemo-genetic approach, lack of controls in some experiments and some discrepancy in data presentation suggest that the manuscript undergoes revision before the authors claim that their conclusions are fully supported by the results. In detail:

      In Figure 1, graphs of septation indexes are presented separately for each strain. This presentation prevents the reader from clearly comparing the differences of septation caused by genetic background rather than the treatment, i.e. the septation happening by treatment with torin1. I feel it would be better to group the results by drug rather than by strain/mutant. If the results are presented this way because the experiments on different strains were run separately, I further suggest that they are re-run so to always include at least the wt in every run.

      We have included data for untreated and phleomycin only treated wt cells as a reference. Additionally, all experiments were repeated at least 2 times. We have used this assay for over 10 years and have found it to be reproducible and reliable. We are not able to include wt cells in every run as this would be beyond the manpower capacity and time constraints involved. It is also likely that torin1 activity is influenced by the ssp1/ 2 backgrounds due to increased basal TORC1 activity as previously reported. The main goal was to illustrate that caffeine differs from a direct inhibitor such as torin1.

      Furthermore, torin1 inhibits both TORC1 and TORC2 and thus cannot be directly compared to caffeine. We do show, however, in this and other figures that, in contrast to torin1 and rapamycin, caffeine signals via targets upstream of TORC1. We can therefore deduce that it functions in a manner similar to other environmental and nutrient stresses, which require the Ssp1- and Sty1-regulated pathways to advance mitosis and other processes such as autophagy induction.

      In Figure 2C-D, an inconsistency is observable between the phleo+caffeine sensitivity of ssp1Δ and ssp2Δ, the latter retaining a higher sensitivity. Provided that this is not only due to this specific replicate, how would the authors explain such a difference and fit it into their conclusion of a "cascade" signaling with Ssp1 acting upstream of Ssp2?

      We agree that analyzing the different interacting pathways involved is complex. For instance, Ssp1 is required for suppressing Srk1 following Sty1 activation independently of its effects on Ssp2 and TORC1. Furthermore, basal TORC1 activity is higher in Ssp2 mutants, as previously reported. It is likely that Ssp1 exerts a more definitive role as it is required to directly reactivate Cdc25 activity following exposure to stress. In contrast, Ssp2 activation eventually results in increased Cdc25 activity via inhibition of PP2A (Figure 8). These experiments are thus intended to complement those in Figure 1, but the DNA damaging effects of caffeine must also be taken into account.

      In Figure 2I, a huge discrepancy is observable compared to panel 2A in terms of phleo+caffeine (no ATP) sensitivity of wt cells. Here, cells seem to cope well with the phleomycin treatment even if co-treated with caffeine. This renders the main finding of the panel (the effect of phleo+caffeine+ATP) rather uninterpretable.

      We have noted that relevant assays, at least in fission yeast, are influenced by the culture vessels (e.g., plastic type/ glass) as well as the vessel volume (probably due to different aeration, oxygen availability that affects growth and metabolism parameters). We have corrected figure 1a. In terms of ATP, these experiments are highly reproducible even if the exact mechanism remains unclear.

      In Figure 3A, the simple observation of elongation is sometimes hard to assess, for example in the ATP-caused suppression of the effect of torin1, as also acknowledged by the authors in the text. I feel it would be really necessary to quantify such results on an adequate number of cells.

      We have reproducibly observed this uncharacterized effect of ATP. We have analysed the cell length in additional experiments to show that ATP influences average cell length under these conditions. It is important to note that the effects of phleomycin are pleiotropic. For instance, it likely induces cell cycle arrest at various cell cycle phases as well as in early and late G2. Additionally, it may influence other cellular processes such as DNA or compete with drug targets such as TORC1 which is influenced by ATP.

      In Figure 3B,C wt is missing to compare the results in the presence of the same treatments. I understand the focus on Ssp1, but the authors should show the same treatments on wt cells. Similarly, it would be better to show the drug treatments in panel C also at 30°C. For the same reasons as in the previous point, quantifications would greatly enhance the credibility of the claims here.

      Previous work by other investigators has shown that wt cells proliferate normally under these conditions. We also show in figure 1 that cell proliferation is not affected under normal cycling conditions in these assays. We have added cell length data that convincingly prove that Ssp1 is required to mediate the mitotic effects of caffeine. It appears that caffeine induces a cell cycle delay that requires Ssp1 to suppress Srk1-mediated Cdc25 inhibition. Furthermore, recent studies have demonstrated that rapamycin (which targets TORC1 downstream of Ssp1) allows cell proliferation at higher temperatures in S. pombe.

      A major point is the almost complete absence of molecular data. Except for Figure 6, the data do not include a detection of the relative activation of the relevant pathways. Figure 6 could hardly fill this gap, since the samples therein analyzed are not the ones utilized in most of the other figures, but simple, single time-point treatment with a single drug. The authors usually refer in the text to previous knowledge about how a treatment influences a pathway. However, they should show it here in their experimental conditions.

      We have performed extensive additional experiments including those suggested by the reviewer. These experiments conclusively show that caffeine induces Ssp2 phosphorylation in an Ssp1-dependent manner. We also demonstrate that caffeine attenuates TORC1 signaling. Together with the cell cycle data, our findings strongly suggest that caffeine indirectly inhibits TORC1 signaling in a manner analogous to other environmental stresses. We also note that the inhibitory effect of caffeine on TORC1 has been demonstrated in several studies. We have provided further evidence for this and have, for the first time, demonstrated that caffeine affects Ssp2.

      Minor comments:

      • A different grouping of the experiments/panels would help the reader. For example, Fig. 2I would fit better together with Fig. 3A, to match the composition of the various chapters of the results.

      We have performed additional experiments as suggested by the other reviewers. We believe the data is now easier to understand.

      Torin 1 is sometimes referred to with a capital T or with a lowercase t, especially in the Figures. I suggest standardizing the nomenclature.

      We have edited the text.

      In the results, the authors state that "ATP may increase TORC1 activity or act as a competitive inhibitor towards both compounds.". It's a little bit odd to refer to ATP as a competitive inhibitor of drugs. It would rather be ATP, the physiological agonist, outcompeting two compounds which are working as ATP-competitive inhibitors.

      We have modified the text accordingly.

      Reviewer #2 (Significance):

      The interplay between TORC1 and AMPK is of great interest in the cell signaling field, basically in every model organism.

      The paper provides a conceptual advance in the field showing a genetic interaction between the two pathways using a model organism which has probably been overlooked so far, which is a pity because S. pombe is the best organism to study G2/M cell cycle/size regulation. The story would be of interest especially for an audience working in cell signaling in microorganisms, but not so much (at least at this stage) for the community working on aging, disease and chemo-/radio-sensitization, contrary to what the authors claim. Furthermore, for the above-mentioned reasons, I feel like the authors are a little bit overshooting when claiming (for example in the abstract and in the discussion), that their work provides a clear understanding of the mechanism.

      As requested by Review Commons, I specify that my expertise is on TORC1/AMPK/PKA pathways, on their crosstalk and their regulation by metabolic intermediates.

      We believe that the additional requested experiments have adequately improved the manuscript and support our presented mechanistic model.

      Caffeine is of interest in cancer biology and the biogerontology field, as shown by recent reports on metabolic phenotyping, liver function testing, induction of autophagy and interplay with HIF-1, just to mention a few.

      Reviewer #3 (Evidence, reproducibility and clarity):

      Summary

      This manuscript examines the genetic requirements for checkpoint override by caffeine in the fission yeast model organism. The main outcome is to show that checkpoint override, which has previously been linked to the downregulation of TORC1, is dependent on the AMPK pathway (Ssp1/Ssp2). Additional analysis of downstream factors and the cross-talking Sty1 pathway implicates Greatwall kinases and Igo1 (PP2A inhibitor - endosulfine analogue) although the pleiotropic nature of these pathways and the rather blunt endpoints of septation index and phleomycin sensitivity makes robust data interpretation difficult.

      Major comments

      For clarity the manuscript would benefit from some restructuring. In particular it would help the reader if the diagram presented in figure 7 was presented first as this would help orientate the reader with the pathways. The mammalian equivalents should be indicated.

      Figure 8 (previously figure 7) summarizes our findings schematically. We believe that it works well at the end as a conclusion to the work and the discussion. Wherever appropriate we have mentioned the mammalian equivalent (e.g., for Rad3).

      For scientific accuracy and clarity the manuscript requires significant attention. For example in the abstract where Rad3 is introduced it is not made clear that this is the fission yeast gene. It would be better to introduce ATR at this point? Another example in the abstract: 'Deletion of ssp1 and ssp2 suppresses...' should read 'Deletion of ssp1 or ssp2 suppresses...' as the two genes are not deleted in the same strain. I would recommend that the authors carefully revise the manuscript paying close attention to each statement. For example on page 4: 'Downstream of TORC1, caffeine failed to accelerate ppk18D but not igo1D and partially overrode DNA damage checkpoint signalling'. It is unclear what the authors mean by accelerate. I assume they mean accelerate cell cycle progression, but there is no direct analysis of cell cycle kinetics in the results. Similarly on page 5: '... ppk18D mutant displayed slower cell cycle kinetics than wild type cells exposed to phleomycin and caffeine or torin1 (Figure 1D)'. However, the figure shows no cell cycle kinetic analysis.

      We have modified the wording of the abstract according to the reviewer’s suggestions.

      We refer to accelerated progression into mitosis and have edited the text where appropriate. Depending on the type of DNA damage, S. pombe cells transiently or permanently arrest cell cycle progression. It is well known that caffeine overrides these cell cycle DNA damage checkpoints. We previously proved that this was not due to Rad3 inhibition. Additionally, TORC1 (which controls the timing of mitosis) inhibition overrides checkpoint signaling. Our aim was to investigate if caffeine mimics this effect, at least partially, via activation of Ssp2. We have demonstrated this is the case, although the basal state of the various mutants can complicate the data analysis in terms of cell cycle progression. Following exposure to phleomycin, the septation index peaks 60 minutes after exposure to caffeine. In ppk18 mutants this peak was delayed by 30 minutes. Thus, wt and ppk18 mutants proceed through mitosis and cytokinesis at different rates (as determined by measuring the septation index).

      The authors appear to make the assumption that 'Inhibition of DNA damage signalling by caffeine and torin1 enhanced phleomycin sensitivity...' (page 6) but then clearly go on to show that the mutants used are sensitive for other unknown reasons. To make this link it would be necessary to artificially impose a G2 delay and show how much and in which circumstances this reverses the effect on sensitivity of caffeine/torin1. The authors should thus be very clear that they cannot equate sensitivity to 'checkpoint over-ride' and adjust their wording and assumptions accordingly. Assumptions on epistasis need to use the same assay and not equate between assays. As an example F1C and F2D do not equate as phleo+caffeine would be expected to be sensitised above phleo+torin1. This is not commented on in the text. Also on page 7 '... ATP also suppressed the ability of torin1 to override DNA damage checkpoint signalling albeit to a lesser degree (Figure 2I).' However, this figure only shows sensitivity, not septation index.

      We accept that these results can be difficult to interpret. Firstly, caffeine appears to modulate cell cycle progression by various means. We previously demonstrated that it stabilizes Cdc25 independently of checkpoint signaling. However, it also activates Ssp2, which subsequently affects Cdc25 activity via PP2A. Its effect on mitosis can thus differ depending on the context. For instance, igo1 mutants already have high PP2A activity, which would affect the subsequent effect of caffeine on Cdc25 activity. Ssp2, on the other hand, appears to regulate cell fate according to the nutritional state. Its sensing of nutritional cues is not limited to ATP/AMP levels as it also regulates the response to amino acid quality (e.g., glutamate versus torin1).

      We have carried out additional experiments on the effect of ATP. While it did affect progression into mitosis, the results were complicated and have not been shown. Instead, we have provided additional data to show that it affects cell length, which is an indicator of the time spent in G2. In other words, longer cells spend more time in G2 prior to septation.

      We also suspect that caffeine is itself a DNA damaging agent as previously reported in the early 1970s. More recent studies have also indicated a role for Rad3 and DNA repair proteins for tolerance to caffeine. In fact, TORC1 itself has been reported to be required for DNA damage repair. Thus, TORC1 inhibition could potentially enhance DNA damage sensitivity independently of mitotic progression as shown in some of our experiments.

      While we have clearly identified a role for Ssp2 in mediating the cell cycle effects of caffeine, we accept that these findings will require further studies (beyond the scope of this one) to give more insight into how these caffeine-mediated effects occur. What is clear is that caffeine overrides DNA damage checkpoint signaling by at least partially inhibiting TORC1 signaling.

      All the septation index graphs require an untreated (i.e., no caffeine or torin1) control.

      We now show in figure 1a that the septation index does not change over the time period studied when cells are left untreated. These assays have been routinely used for many years now and are very reproducible. The graphs clearly show the differential effects caffeine and torin1 exert on cell cycle progression in wt and mutant strains exposed to phleomycin.

      Figure 3 is not quantitative and cannot support the conclusions drawn from it. If, for example, the authors wish to demonstrate ATP can suppress checkpoint override (Figure 3A) they should use the same septation assay used before. If this is not possible, then it should be explained why not and an alternative quantitative assay should be developed. It is unclear why the authors include Figure 3B,C at all.

      Ssp2, on the other hand, appears to regulate cell fate according to the nutritional state. Its sensing of nutritional cues is not limited to ATP/AMP levels as it also regulates the response to amino acid quality (e.g., glutamate versus torin1). Additionally, exposure to stress may induce a transient decline in ATP levels. We thus investigated how ATP might affect caffeine or torin1. We could not detect any major changes in the septation index (not shown). Cells exposed to ATP in the presence of caffeine and phleomycin were shorter. We cannot tell how exactly ATP suppresses the effect of caffeine and torin1 on DNA damage sensitivity.

      It is unclear to this reviewer what the significance of the data with gsk3D cells is (Figure 5). The authors should introduce the protein, why there is an expectation that it would have a role in the pathway and explain its relevance. Similarly when discussing the resulting data.

      Gsk3 lies downstream of TORC2, which is inhibited by torin1 but not caffeine. Gsk3 regulates the stability of Pub1, which is the E3 ligase for Cdc25. We showed previously that caffeine stabilizes Cdc25, suggesting it might interfere with Pub1 activity. Additionally, we are comparing caffeine, as an indirect inhibitor of TORC1, with torin1, which directly inhibits both complexes. Our data provide further evidence for a differential effect of caffeine and torin1 on TORC1 signaling. We have modified the text accordingly.

      Figure 5A shows a similar response of wild type cells to phleomycin regarding checkpoint override as was shown in Figure 1A. However Figure 5C is not recognisable as equivalent to Figure 2A, yet both report sensitivity to phleomycin of wild type cells under equivalent circumstances. This is a major concern as to reproducibility of these data. It is also not possible to conclude from either Figure 5C or 5D that caffeine or torin1 treatment is, or is not, sensitising cells to phleomycin treatment, yet this conclusion is made when discussing the data.

      We agree with this and other reviewers that demonstrating enhanced sensitivity to caffeine is problematic. Nonetheless, our cell cycle data clearly indicate a differential role for Gsk3 in mediating the cell cycle effects of caffeine and torin1. In terms of DNA damage sensitivity, we have reproducibly observed a lower degree of DNA damage sensitivity in gsk3 mutants relative to wt cells. Hence, while caffeine is less effective at enhancing DNA damage sensitivity relative to torin1 in wt cells, we observed that caffeine and torin1 increase DNA damage sensitivity to a similar degree in gsk3 mutants.

      Figure 6A shows that caffeine, but not torin1 results in Ssp2 phosphorylation. Is this experiment reproducible and does the total level of Ssp2 increase reproducibly? This should be done and the results discussed. Ideally, the bands would be quantified against actin intensity and presented as a bar graph with standard deviation.

      We have repeated these experiments alone and in combination with phleomycin. These data convincingly show that caffeine but not torin1 induces Ssp2 phosphorylation. In fact, torin1 suppresses Ssp2 phosphorylation, likely due to inhibition of a feedback mechanism resulting from TORC1 inhibition. In contrast, caffeine likely activates Ssp1 via the stress response, which in turn phosphorylates Ssp2.

      Figure 6B, when introduced should explain the background as to why eIF2alpha phosphorylation is a readout of TORC1 activity. Importantly, the figure should be supported by an actin control and 3 repeats quantified. Figure 6C purports to establish that caffeine moderately attenuates Maf1 phosphorylation. To be able to state this, it would be essential to quantify the gel and report repeated results relative to actin and the total levels of Maf1. Similarly Figure 6D and 6E require an actin control and would benefit from proper quantification.

      We have repeated the Maf1 experiments to clarify the data and show that caffeine suppresses Sck1, an additional TORC1 phosphorylation target.

      Minor comments

      p3 'cigarette smoke and other gases'?

      We have edited the statement.

      P4 torin1 was dissolved in DMSO (not were)

      We have edited the text.

      p5 phospho not phosphor Ssp2

      We have edited the text.

      p6 explain why ppk18 deletion results are surprising. Also this result could be discussed.

      It had been proposed previously that Ppk18 is the Greatwall homologue in S. pombe and thus the major regulator of PP2A and mitosis downstream of TORC1. Later studies suggested a redundant role for Cek1 in this pathway. While deletion of cek1 in a ppk18 background modulated the effect of torin1 on cell cycle progression, it did not interfere with the effects of caffeine. At present we cannot account for this observation. We cannot rule out that caffeine activates an additional kinase that regulates Igo1 activity.

      Together our data show that caffeine advances progression into mitosis in a manner that differs from direct inhibition of TORC1 by torin1.

      We have now added the relevant comments on this unexpected observation within the discussion.

      Explain why Cek1 is not tested

      We have now tested a ppk18 cek1 double mutant.

      p6 introduce what pap1 is when first mentioned

      We have introduced PP2APab1 as requested.

      Reviewer #3 (Significance):

      The data show that fission yeast Ssp1/2 has a role in inhibiting TORC1 in response to caffeine and this influences checkpoint override. This is an incremental, but potentially interesting, observation contributing to understanding mechanism(s) of caffeine action. The lack of quantification, the pleiotropic nature of the mutants used and the rather blunt endpoints assayed make it hard to establish to what extent the direct TORC1 inhibition by Ssp2 causes the checkpoint override, which limits its potential impact. The core observation may, however, be of interest to the wider caffeine field. The referee has the perspective of a yeast cell cycle geneticist.

      We thank the reviewer for identifying the significance of the study in understanding the mechanisms of caffeine effects on the cell cycle. We have added all the suggested experiments with additional mutants and protein markers, as well as quantitative approaches that have appropriately improved the manuscript. We believe that the mechanism provided is of more general interest and not limited to the caffeine field: manipulating the cell cycle and understanding the interplay between growth and stress are of general interest and importance.

      Reviewer #4 (Evidence, reproducibility and clarity):

      The authors provide a series of genetic studies identifying a role for Ssp1-Ssp2 signaling in TORC1-dependent responses to DNA damage. The main assays are cell division (i.e. septation index) and cell viability (i.e. serial dilution spot assays) following treatment with the DNA damaging agent phleomycin. The authors perform these assays in a number of genetic mutant backgrounds to determine which genes and pathways are required for the relevant cellular response. Supporting data also include microscopy images and western blots to test protein phosphorylation. In general, the results support a role for Ssp1-Ssp2 acting upstream of TORC1. However, in several cases the data do not support a straightforward relationship, and it is confusing to parse through a number of intermediate effects, which often vary between different assays. I have provided some specific comments below that might be addressed to strengthen the technical aspects of the manuscript.

      Major

      1. The authors conclude "that caffeine and torin1 indirectly and directly inhibit TORC1 activity respectively" based on Figure 1. This conclusion seems quite strong given the indirect nature of assays in Figure 1, which test septation in the presence of DNA damage. The conclusion would require experiments that assay TORC1 activity itself.

      Both caffeine and torin1 have previously been reported to inhibit TORC1, which controls the timing of mitosis. We sought to investigate if caffeine mediates its effects via the stress response pathway. We have conducted additional experiments which clearly demonstrate that caffeine inhibits TORC1 at least partially via the activation of Ssp2. These observations make sense as we have previously shown that caffeine activates the stress response pathway to activate Srk1, which inhibits Cdc25. More recent studies by others indicate that Ssp1 is required to suppress Srk1 to allow progression into mitosis. This accounts for the failure of ssp1 mutants to advance mitosis under stress conditions. Additionally, Ssp1 activates Ssp2, which leads to the downstream inhibition of TORC1.

      1. Figure 2 needs some explanation to introduce the idea that cell growth reflects an intact DNA damage response that prevented division in the presence of phleomycin. I also felt that the conclusions were very strong given the data, and the authors should discuss each case more carefully. For example, deletion of ssp1 does not really suppress the ability of torin1 to enhance phleo sensitivity (Figure 2C).

      We would not expect the deletion of ssp1 to suppress the effect of torin1 under stress conditions. We have provided further evidence to show that Ssp1 is required to facilitate progression into mitosis at least in the presence of phleomycin or heat stress.

      1. Microscopy imaging in Figure 3 nicely complements some of the other assays. However, it seems important to know if the cells are actively growing in each of these cases. An example is torin and rapamycin shortening ssp1 mutants at 35 degrees: are these cells actively cycling?

      Our aim was to demonstrate that caffeine exacerbates the ssp1 phenotype. This would provide further evidence to show that caffeine exerts its effects at least in part by activating Ssp1. Cells do not cycle in the presence of torin1 as it inhibits both TORC complexes. We have provided additional evidence to show that caffeine does indeed interact with Ssp1. As the primary aim of the study was to determine if caffeine overrides DNA damage via Ssp1, we have not investigated if they are cycling. Their shortened size suggests that rapamycin and torin1 affect cell division in a different manner from caffeine.

      1. From Figure 6A, the authors conclude that caffeine induces phosphorylation of Ssp2. However, it appears that both Ssp2 protein levels and its phosphorylation levels are increased, which seems an important distinction.

      We have repeated these experiments several times under different conditions. Some proteins become more stable when phosphorylated as has been previously demonstrated for Srk1 for instance.

      1. In Figure 6D, the authors should show separate gsk3 and ssp1 mutants. It seems likely that all phosphorylation of Ssp2 is due to Ssp1, but this should be shown.

      We have replaced the figure with an ssp1 single mutant.

      1. I am confused about Maf1 phosphorylation in Figure 6C. It is increased upon torin1 treatment, but it is discussed as an indicator of TORC1 activity. Does that mean that loss of its phosphorylation correlates with increased TORC1 activity? As written, I thought it was a TORC1 substrate, which led to confusion about its increased phosphorylation upon torin1 treatment.

      Maf1 is phosphorylated by TORC1. Inhibition of TORC1 would thus lead to a loss of phospho-Maf1 moieties and the accumulation of the unphosphorylated form. We have conducted additional experiments under various conditions to show that caffeine weakly inhibits Maf1 phosphorylation. We note, however, that different stresses result in differential outcomes following TORC1 inhibition. As such, we have included new data to show that caffeine suppresses the TORC1 target Sck1. In S. pombe, Sck1 and Sck2 regulate progression into mitosis.

      Minor

      1. An untreated control should be shown for assays in Figure 1.

      We have included this data for figure 1a.

      1. An untreated control should be shown for assays in Figure 4.

      We have noted in the results for figure 1, that untreated cells and phleomycin only treated cells do not show any changes in septation index over the time course studied in these experiments.

      Reviewer #4 (Significance):

      The study has significance in connecting several conserved and central signaling pathways including TORC1, AMPK, and PP2A. Also, the study uses caffeine and torin1 that have effects in many different cell types. The connection between caffeine and torin1 effects on phleomycin-treated cells was previously established by these researchers. The significance of the current study is providing a genetic pathway for this connection. The significance is partly limited by some of the technical points raised in the previous section, such as some inconsistencies in the strength of results from different assays. Also, the role of these pathways in DNA damage response signaling is not new. While the main significance of this work might relate to a more specialized audience, it does add to a broader body of literature regarding these conserved pathways and processes.

      My expertise is yeast cell biology.

      While the roles of the pathways in DNA damage have been reported using genetic and pharmacological combinations, we dissect their relationships and provide mechanistic connections.

      We thank the reviewer for identifying the significance of this study. We believe we have now addressed the technical issues raised.

    1. Reviewer #3 (Public Review):

      This paper proposes a computational account for the phenomenon of pattern differentiation (i.e., items having distinct neural representations when they are similar). The computational model relies on a learning mechanism of the nonmonotonic plasticity hypothesis, fast learning rate and inhibitory oscillations. The relatively simple architecture of the model makes its dynamics accessible to the human mind. Furthermore, using similar model parameters, this model produces simulated data consistent with empirical data of pattern differentiation. The authors also provide insightful discussion on the factors contributing to differentiation as opposed to integration. The authors may consider the following to further strengthen this paper:

      The model compares different levels of overlap at the hidden layer and reveals that partial overlap seems necessary to lead to differentiation. While I understand this approach from the perspective of modeling, I have concerns about whether this is how the human brain achieves differentiation. Specifically, if we view the hidden layer activation as a conjunctive representation of a pair that is the outcome of encoding, differentiation should precede the formation of the hidden layer activation pattern of the second pair. Instead, the model assumes such pattern already exists before differentiation. Maybe the authors indeed argue that mechanistically differentiation follows initial encoding that does not consider similarity with other memory traces?

      Related to the point above, because the simulation setup is different from how differentiation actually occurs, I wonder how valid the prediction of asymmetric reconfiguration of hidden layer connectivity pattern is.

      Although, as the authors mentioned, there haven't been formal empirical tests of the relationship between learning speed and differentiation/integration, I am also wondering to what degree the prediction of fast learning being necessary for differentiation is consistent with current data. According to Figure 6, the learning rates that lead to differentiation in the 2/6 condition achieved differentiation after just one shot most of the time. On the other hand, Guo et al. (2021), for example, showed that humans may need a few blocks of training and test to start showing differentiation.

      Related to the point above, the high learning rate prediction also seems to be at odds with the finding that the cortex, which has slow learning (according to the theory of complementary learning systems), also shows differentiation in Wammes et al (2022).

      More details about the learning dynamics would be helpful. For example, equation(s) showing how activation, learning rate and the NMPH function work together to change the weight of connections may be added. Without the information, it is unclear how each connection changes its value after each time point.

      In the simulation, the NMPH function has two turning points. I wonder if that is necessary. On the right side of the function, strong activation leads to strengthening of the connectivity, which I assume will lead to stronger activation on the next time point. The model has an upper limit of connection strength to prevent connection from strengthening too much. The same idea can be applied to the left side of the function: instead of having two turning points, it can be a linear function such that low activation keeps weakening connection until the lower limit is reached. This way the NMPH function can take a simpler form (e.g., two line-segments if you think the weakening and strengthening take different rates) and may still simulate the data.
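The contrast raised in this last point can be made concrete with a small sketch. It is only an illustration under stated assumptions: the review does not give the model's equations, so the breakpoints, slopes, weight bounds, and all function names below (`nmph_u_shaped`, `nmph_two_segment`, `update_weight`) are hypothetical and are not taken from the paper. The first function is a U-shaped curve with two turning points; the second is one possible reading of the simpler two-line-segment alternative, with clipping at the weight limits doing the bounding.

```python
def piecewise_linear(x, points):
    """Linearly interpolate through sorted (x, y) points; clamp outside the range."""
    if x <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]


# Hypothetical U-shaped curve: no change at low activation, weakening at
# moderate activation, strengthening at high activation (two turning points).
U_SHAPED_POINTS = [(0.0, 0.0), (0.2, 0.0), (0.45, -1.0), (0.7, 0.0), (1.0, 1.0)]


def nmph_u_shaped(activation):
    """Plasticity signal from the assumed two-turning-point curve."""
    return piecewise_linear(activation, U_SHAPED_POINTS)


def nmph_two_segment(activation, threshold=0.6, weaken_rate=1.0, strengthen_rate=2.0):
    """Simpler alternative: one line below the threshold (weakening) and one
    above it (strengthening), with possibly different slopes."""
    rate = weaken_rate if activation < threshold else strengthen_rate
    return rate * (activation - threshold)


def update_weight(weight, activation, learning_rate, plasticity, w_min=0.0, w_max=1.0):
    """Apply one plasticity step and clip the weight to its assumed limits."""
    new_weight = weight + learning_rate * plasticity(activation)
    return max(w_min, min(w_max, new_weight))


if __name__ == "__main__":
    for a in (0.1, 0.45, 0.9):
        print(a, nmph_u_shaped(a), nmph_two_segment(a))
```

Under the two-segment form, weakly active connections keep weakening until they reach `w_min`, whereas the U-shaped form leaves them unchanged. Whether that difference matters for the simulated differentiation results is the empirical question posed above.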

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      __Reviewer #1 (Evidence, reproducibility and clarity (Required)):__

      __Summary: In this manuscript by Berg et al the authors demonstrate that RNA polymerase activity is important for the formation of nuclear blebs. This is an interesting and significant finding because prior work has suggested nuclear bleb formation is a result of changes in nuclear rigidity (lamins) or chromatin (via histone modifications). Overall I thought the manuscript was quite interesting and the data well presented. I think the inclusion of multiple mechanisms of blebbing (VPA treatment, as well as lamin B KO) helps to further support the importance of RNA polymerase/transcription activity in the blebbing process. However, I do have some concerns regarding the conclusions of the data that I think should be addressed as a revision.__

      We appreciate that the Reviewer states that “the manuscript was quite interesting and the data well presented”, it is a “significant advancement”, and “the first report of this phenomena, and thus will be impactful to the nuclear mechanics field.”

      In the points below, the Reviewer specifically suggests that we: 1) clarify possible contributions from RNA pol III, 2) address how global vs. local chromatin motion might contribute to our findings, and 3) discuss the force production capabilities of RNA pol II. We also appreciate the feedback regarding the conclusions and have made the specific changes requested in the revision.

      __Major Comments:__

      __1. One concern I have is that the alpha-amanitin inhibitor has been shown to also inhibit RNA polymerase III. In an old study (1974 Weinmann PNAS) it appears that the inhibitor starts to act on RNA polymerase III at 1 to 10 ug/ml. In this study the authors are using 10 uM alpha-amanitin, which is ~ 9 ug/ml and within the range of inhibiting some RNA polymerase III. Additionally, the other drug (actinomycin D) is even less specific for RNA polymerase II. I would suggest that the authors consider one of the following approaches: 1) acknowledge in the manuscript the potential for RNA polymerase III to be important in the blebbing process, 2) try a 10-fold lower dose of alpha-amanitin and see if that also inhibits blebbing, 3) try to find a way to demonstrate that RNA polymerase III activity is not inhibited at the 10 uM alpha-amanitin dosage, or 4) consider an alternate method to perturb RNA polymerase II activity (see Zhang Science Advances 2021 for an auxin-based approach to downregulate RNA polymerase II).__

      The Reviewer raises the point that alpha-amanitin inhibits both RNA pol II and III. In the revised manuscript, we provide new data to further support that the observed effects arise from RNA pol II. We now include new data from cells treated with the transcription inhibitors flavopiridol (which inhibits RNA pol II elongation) and triptolide (which inhibits RNA pol I and II initiation). These transcription inhibitors also suppress nuclear blebbing in VPA-treated nuclei (Figure 2C) as well as three other nuclear blebbing perturbations in chromatin and lamins (Supplemental Figure 1A). These new experiments directly show that nuclear bleb suppression by transcription inhibitors can be observed without possible inhibition of RNA pol III by alpha-amanitin.

      __ A second concern I have is that the inhibition of RNA polymerase is global. Thus it is difficult to know for sure the biophysical function of the polymerase occurs immediately at the bleb, or instead is somehow affecting the overall chromatin state throughout the entire nucleus. I agree that figure 3 does provide some evidence that major mechanical and biophysical properties of the nuclei are not changed in response to the inhibition of the polymerase. However, micromanipulation experiments are done with isolated nuclei, which may be somehow mechanically altered already by isolation from cells. I feel that there still must be given some consideration in the discussion of the possibility that RNA polymerase activity outside of the bleb may be having some role in the stabilization of the chromatin and blebbing propensity.__

      We appreciate the Reviewer’s insightful comments and we have revised the manuscript to clarify that we do not attribute blebbing purely to local effects. Instead, we argue that global changes in chromatin motion driven by transcription could contribute to nuclear blebs.

      We did not intend to communicate that alterations to chromatin or its dynamics were necessarily only local. Indeed, we found that relative levels in RNAP Ser2 and Ser5 phosphorylation were different inside the blebs (Figure 6). Nonetheless, transcription was perturbed globally in our experiments, so we realized that blebbing could be driven by global changes (Figure 1). We hypothesize that global regulation of transcription can stimulate nuclear blebbing since transcription and its inhibition can, respectively, drive and suppress correlated chromatin motion throughout the entire nucleus (as previously observed by Zidovska et al. (PNAS 2013) and Shaban et al. (NAR 2018, Genome Biol. 2020), among others). We have revised the manuscript to clarify this point (Discussion section, page 15). We have also added new simulation snapshots showing global chromatin motions and how these motions are coupled to nuclear morphology (Figure 7C).

      In response to the concern that isolated nuclei exhibit different mechanical properties than nuclei inside of cells, we refer to our previously published micromanipulation measurements (Stephens et al. MBoC 2017). There, we found that nuclei within the cell and outside of the cell have quantitatively similar spring constants and qualitatively similar force-extension curves. Therefore, we are confident that the lack of change in nuclear stiffness measured by micromanipulation accurately reflects the mechanics of nuclei inside of cells across different perturbations.

      __ While I lack expertise to evaluate the basis of the model, I appreciate the model can show that motor activity can influence bulge. But it is not clear in the manuscript that RNA polymerase can generate these kinds of forces. The Liu citation is a model, and does not provide direct evidence that the RNA polymerase can generate force, or forces large enough to be meaningful. To me the model in this paper (Figure 7) felt as if it was only a possible hypothesis of why the RNA polymerase has an effect on blebbing, but I imagine there could be other hypotheses that would cause the same effect. The authors state (in the abstract) that RNA pol II can generate active forces, but I am concerned this is not sufficiently established. Since this motor/force activity of RNA polymerase is not experimentally demonstrated in this paper the authors should either do a better job of including evidence of this from the literature or consider removing this part of the manuscript.__

      RNA polymerase is capable of exerting forces in excess of 10 pN (e.g., see Wang et al. Science 1998; Herbert et al., Annu Rev Biochem 2008). The collective activity of many motors (tens of thousands, e.g., see Zhao et al. Proc. Natl. Acad. Sci. 2014) may generate even larger forces. As discussed in our earlier modeling paper, this force scale is consistent with the motor strengths studied in our simulations (Liu et al. Phys. Rev. Lett. 2021); in the present work, we present simulation results for motors that generate 0.14 pN forces. Thus, transcription, in principle, could generate forces even larger than the ones we considered in the model.

      Additional experiments indicate that at larger length scales, RNA polymerase activity appears to drive coherent motions of chromatin throughout the cell nucleus (Zidovska et al. PNAS 2013; Shaban et al. NAR 2018; Shaban et al. Genome Biol 2020). It is these motions, driven by motors, that appear to drive the formation of nuclear bulges in our model (please see new panel Figure 7C).

      Therefore, the aim of the model is to build on established and new results to better understand how transcription could alter nuclear morphology. Our model is adapted from earlier models, which could reproduce observations of chromatin-based nuclear rigidity (Stephens et al. MBoC 2017, Banigan et al. Biophys J 2017, Strom et al. eLife 2021) and some aspects of nuclear morphology (Banigan et al. Biophys J 2017, Lionetti et al. Biophys J 2020), and could possibly explain how nonequilibrium motor activity (such as that of RNA pol II) can drive coherent chromatin dynamics (Liu et al. PRL 2021), which have been observed in live-cell imaging experiments (e.g., Zidovska et al. PNAS 2013; Shaban et al. NAR 2018; Shaban et al. Genome Biol. 2020, among others). The precise form of the motor activity is not the focus of our model (or the previous motor model in Liu et al. PRL 2021). Instead, our simulation result indicates that the relatively small motor forces that generate coherent chromatin dynamics could explain the surprising observation that transcription is a critical component of nuclear blebbing.

      To address the Reviewer’s comment, we have added additional text to the Introduction and the Results sections to support the inclusion of motors to model the possible effects of transcription on chromatin dynamics and nuclear shape.

      In the Introduction (page 4), we now write:

      Simulations suggest that chromatin connectivity combined with the forces generated by polymerase motor activity (~10 pN per polymerase (Herbert et al. 2008)) could generate these dynamics (Liu et al., 2021).

      In the Results section (page 10), we write:

      We consider motors that generate sub-pN forces, well below the 10 pN forces that may be generated by individual RNA polymerases (Herbert et al. 2008).

      Additionally, we have updated Table 1 to include the simulated motor strength.

      __ Minor Comments: 1. Did the authors do any analysis to see if the increased RNA transcription with VPA treatment (Figure 1B) has any spatial relationship to where the bleb occurs? Could an analysis of this be done similar to Figure 6 (with a bleb/body ratio)?__

      The Reviewer raises an interesting point about measuring RNA localization relative to the bleb. We measured RNA intensity in the bleb and the nuclear body for wild type cells only. We find that RNA levels are significantly decreased in the bleb (80% of body signal, p

      __ Is there anything known about lamin B1 KO cells as to whether or not they have increased transcription? Or could the authors do an analysis like they did with VPA treatment to check this? If they were to have increased transcription this would further support the authors' proposed mechanism of transcription itself (or RNA polymerase activity) driving blebbing.__

      In the revised manuscript, we show that several nuclear perturbations that are known to decrease nuclear stiffness and cause increased nuclear blebbing also rely on active transcription. Lamin B1 knockout or knockdown has been shown to result in changes in transcription. However, it was difficult to find data that show whether the overall level of transcription changes. Collaborators of ours have unpublished data indicating that twice as many genes are upregulated as downregulated upon lamin B1 knockdown, but this still does not assess the total level of transcription within the nucleus. Alternatively, increasing transcription via other means is fraught with off-target effects, which would require many additional complementary experiments. We thank the Reviewer for this interesting suggestion, but we believe this is beyond the scope of this manuscript, in which we have focused on showing that transcription inhibition suppresses bleb formation.

      __ Figure 1D, the VPA ser2 image appears much brighter than the untreated image. Yet the graph shows they are similar. Perhaps a more representative image should be used?__

      The image used reflects the data that Ser2 signal is brighter (by ~10%) in VPA-treated cells but is not significantly altered compared to wild type (unt), and thus it is an accurate reflection of the data.

      __ Can the authors comment if there is less DNA at the bleb site? In Figure 6 A this appears to be the case (based on the VPA image). If true, is the alpha-amanitin treatment rescuing this such that there is more DNA at the bleb (maybe causing the bleb to be smaller?).__

      We find that there is less DNA signal intensity per unit area in the nuclear bleb as compared to the nuclear body (bleb has ~60% the signal of the body; see teal dots/data in Figure 6B). This agrees with previously published work from our lab (Stephens et al. 2018 MBoC).

      Alpha-amanitin treatment does not rescue this effect. Decreased DNA enrichment in the bleb remains with alpha-amanitin treatment (p > 0.05, comparing across all 4 conditions in Figure 6B).

      __ What is the significance of bleb vs non-bleb nuclear rupture? Is there anything known in the literature as to how these ruptures may be different in terms of biophysics, impact to DNA, repair? It would be helpful to have some context, as well as to understand if non-bleb rupture is something that may have been previously missed in other contexts.__

      The Reviewer asks a valid and interesting question that this manuscript only begins to address. In general, we believe that ruptures occurring with blebs vs. without blebs may reflect aspects of the underlying mechanism(s) of blebbing and rupture, in the presence or absence of transcription. We offer a few further thoughts below.

      1) Non-bleb nuclear ruptures have been reported in a few papers by our group (Stephens et al., 2019 MBoC) and others (Chen et al., 2018 PNAS), but much is still unknown.

      2) Non-bleb nuclear rupture is part of normal nuclear behavior, as it accounts for ~20% of nuclear ruptures in wild type and perturbed cells (VPA and LMNB1-/-).

      3) Overall, we think that bleb-based and non-bleb-based ruptures may occur through different mechanisms. The simplest difference is that bleb-based nuclear ruptures follow the nucleus’ ability to form blebs, whereas non-bleb-based nuclear rupture occurs in cases where there is less bleb formation, suggesting that factors other than the ability to form blebs may also be important for rupture. In the current study, we observed that bleb-based nuclear ruptures (and bleb formation) require transcription. In another manuscript from our lab under review, we show that bleb-based nuclear ruptures (and nuclear blebbing) can be suppressed by inhibiting actin contraction and increased by increasing actin contraction (Pho et al., biorxiv 2022).

      Additionally, we note it was reported that non-bleb-based nuclear ruptures, at least some of which are driven by microtubule prodding, result in increased levels of DNA damage (Earle et al. Nat Mater 2020), as has been observed for bleb-based ruptures (Stephens et al., 2019 MBoC; Xia et al. J Cell Bio 2018). Thus, nuclear rupture in general is thought to lead to DNA damage. However, total levels of DNA damage due to rupture may be controlled by different cellular processes.

      In the revision, we have clarified our motivation for quantifying ruptures with and without blebs. We have also added a few remarks, drawn from the above comments, to the Discussion section (pages 11-14).

      Reviewer #1 (Significance (Required)): General assessment: This study is a careful analysis of how RNA polymerase inhibition reduces nuclear blebbing. The study demonstrates this very well, using a variety of approaches. However, some limitations are the overstatement of some conclusions (specifically that it is RNA polymerase II when the inhibitor may also affect RNA polymerase III; that the RNA polymerase activity is important at the bleb and involves motor activity). Advance: This paper is a significant advancement because it shows the role of transcription in the biophysics of the nuclear shape. To my knowledge this is the first report of this phenomena, and thus will be impactful to the nuclear mechanics field. Audience: I think the findings are of broad interest, including beyond the nuclear mechanics field. I think the audience would be the entire cell biology community. Expertise: My expertise is in cell mechanics, including forces at the nuclear LINC complex. While I do not work in the field of nuclear blebbing and rupture, I follow this field quite closely.

      We greatly appreciate the Reviewer’s statement that “To my knowledge this is the first report of this phenomena, and thus will be impactful to the nuclear mechanics field.” We thank the Reviewer for their thoughtful comments and suggestions, which have helped to improve the manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors present data supporting the potential involvement of active transcription in the formation of nuclear blebs when the global deacetylase inhibitor valproic acid (VPA) has been applied to cells

      Reviewer #2’s greatest concern throughout the review was that we focused on the use of VPA as a model for generating increased nuclear blebbing and 24-hour treatment with alpha-amanitin as a transcription inhibitor. In the revised manuscript, we provide new data to show that nuclear blebbing generated by a variety of different nuclear perturbations (VPA, DZNep, LMNB1-/-, and LA KD; Figure 2D and Supplemental Figure 1A) is reliant on active transcription in two different cell lines (MEF and HT1080; Figure 2, A and B). This is supported by use of four different transcription inhibition drugs, which work over varying time periods (24 hrs in alpha-amanitin, triptolide, or flavopiridol; actinomycin D for 1.5 hrs; Figure 2C). We also performed timelapse imaging during drug treatment to show that the transcription inhibitors for which we used 24-hour incubation times can suppress nuclear blebs within 8 hours (Supplemental Figure 1B). We also show that nuclear bleb formation and stability in wild type cells is transcription dependent (Figure 5). We believe the new data added in our revised manuscript addresses the Reviewer’s concern that the findings were specific to the combination of VPA and alpha-amanitin.

      Reviewer #2 (Significance (Required)):

      The authors present data supporting the potential involvement of active transcription in the formation of nuclear blebs when the global deacetylase inhibitor valproic acid (VPA) has been applied to cells.

      While somewhat interesting, this is a rather specific condition that is further restricted by the limited use of experimental approaches. For example, the only deacetylase inhibitor used is VPA. Is this because VPA is the only one to trigger the effect? The authors should expand their approach to include additional inhibitors or, preferably, a directed knockdown tactic that targets the specific HDACs driving their phenomena.

      The Reviewer is concerned that we have used limited experimental approaches by focusing on VPA treatment to induce nuclear blebs and alpha-amanitin overnight treatment to suppress nuclear blebbing. VPA treatment is a well-established perturbation to induce nuclear blebbing via HDAC inhibition, and it is similar to a variety of other nuclear perturbations that also induce blebs (Stephens et al. MBoC 2018, 2019; Kalinin et al. MBoC 2021; Pho et al. biorxiv 2022).

      Nonetheless, to clearly address the Reviewer’s concerns we have provided new data which shows that four different nuclear perturbations are suppressed by transcription inhibition and that four different transcription inhibitors suppress nuclear blebbing. In addition to these perturbations, we also note that transcription inhibition affects bleb formation and stability in wild type cells. Below we outline the diverse experimental approaches that support the major conclusion of our manuscript.

      Our data shows that transcription inhibition suppresses nuclear blebbing across the following conditions:

      1. Multiple cell lines (MEF and HT1080, Figure 2, A and B) – original data
      2. Multiple transcription inhibitors (Figure 2C and Supplemental Figure 1):
         a. Alpha-amanitin (RNA pol II and III degradation) – original data
         b. Triptolide (RNA pol I and II initiation inhibition) – new data
         c. Flavopiridol (RNA pol II elongation inhibition) – new data
         d. Actinomycin D (DNA intercalation) – original data
      3. Multiple perturbations that cause nuclear blebbing (Figure 2D and Supplemental Figure 1):
         a. VPA histone deacetylase inhibitor, which increases euchromatin and chromatin decompaction; used because it is the most highly studied treatment by our lab (Stephens et al., 2017, 2018, 2019 MBoC; Pho et al., 2022 biorxiv) – original data
         b. DZNep histone methyltransferase inhibitor, which decreases heterochromatin and chromatin decompaction (Stephens et al., 2018, 2019 MBoC) – new data
         c. Lamin B1 null cells (LMNB1-/- or LB1-/-) (many previous works, including Stephens et al. MBoC 2018) – original data
         d. Lamin A constitutive knockdown cells (LA KD) (Vahabikashi et al., 2022 PNAS) – new data
      4. Nuclear bleb formation and stabilization in wild type cells is dependent on transcription in addition to VPA (Figure 5). – original data
      5. Time dependence of suppression of nuclear blebbing requested by Reviewers 2 & 3:
         a. Actinomycin D treatment of 1.5 hrs is sufficient to suppress nuclear blebs (Figure 2C) – original data
         b. Transcription inhibition with alpha-amanitin, triptolide, and flavopiridol all show an increased rate of nuclear bleb reabsorption in the first 8 hrs of treatment for both VPA and LMNB1-/- perturbations (Supplemental Figure 1B) – new data.
            i. This new data indicates that even formed blebs require active transcription to remain blebbed for long times
            ii. This new data also shows that the effect of transcription inhibition on nuclear blebbing does not require 24 hours of treatment.

      __Moreover, the authors imply that VPA works through histone deacetylation yet do not provide direct evidence. It is equally likely that the application of VPA alters the acetylation pattern of a non-histone protein that eventually alters nuclear blebbing. __

      The Reviewer questions whether histone deacetylation due to VPA treatment is responsible for nuclear blebbing. As the Reviewer notes in their next point below, histone deacetylation (e.g., by VPA or TSA treatment) as a mechanism for nuclear blebbing was previously established by work from our lab (Stephens et al., 2018 and 2019 MBoC) and others (Kalinin et al. MBoC 2021). This was described and referenced in the original manuscript’s introduction.

      To summarize previous work, inhibition of histone deacetylation by VPA induces chromatin decompaction (Stypula-Cyrus et al. PLoS One 2013, Lleres et al. J Cell Bio 2009), increasing histone acetylation/euchromatin (Göttlicher et al. EMBO J 2001; Krämer et al. EMBO J 2003). In turn, this softens the nucleus (Stephens et al. MBoC 2017; Shimamoto et al. MBoC 2017), which succumbs to nuclear blebbing (Stephens et al., MBoC 2018). Softening and blebbing effect can also be induced by histone hyperacetylation via TSA or histone demethylation via DZNep (Stephens et al., MBoC 2018). This effect can be reversed by chromatin compaction via increased histone methylation/heterochromatin formation (Stephens et al. MBoC 2019).

      In the present work, we measured histone acetylation (H3K9ac) in both VPA and VPA+alpha-amanitin perturbations to ensure that alpha-amanitin does not simply reverse the increase in VPA-based histone acetylation and thereby decrease nuclear blebbing, which it does not (Figure 3, A and B).

      Altogether, inhibition of histone deacetylation by VPA as a mechanism for nuclear blebbing is established by the previous literature. The present work builds on those results to uncover a surprising new driver of nuclear blebbing: transcription. Therefore, we consider it unnecessary to provide further confirmatory measurements of VPA-treated cells beyond what is already provided in the manuscript. Finally, we point to the inclusion of new data from three other nuclear perturbations that cause nuclear blebbing that can be suppressed by transcription inhibition (Figure 2).

      __Regardless, the reported findings with VPA were previously reported (Stephens et al. 2018) and the influence of alpha amanitin only represents an incremental advancement in our understanding of nuclear blebs. __

      The finding that alpha-amanitin inhibits nuclear blebbing implies that a previously unknown mechanism/pathway, involving an essential genomic process, is critical to nuclear shape regulation. We therefore strongly disagree with the Reviewer that bleb inhibition upon alpha-amanitin treatment represents an incremental advance.

      Moreover, the existing literature generally argues that nuclear blebbing is caused by actin-based compression and confinement. It is widely believed that the cytoskeleton deforms the nucleus, which can herniate a nuclear bleb in softer nuclei. Here, we show that with transcription inhibition there are no overt changes to actin contraction (Supplemental Figure 2), actin confinement (Figure 3E), and nuclear mechanics (Figure 3G). However, levels of blebbing change anyway! This will be a new and surprising result to those who believe the current prevailing narrative from the literature. We have now shown for the first time that transcription is also needed to form and stabilize nuclear blebs; to our knowledge, this was almost entirely unknown until now.

      Further supporting our belief in the significance of our findings, Reviewer #1 and Reviewer #3 clearly state that our work is novel and important:

      Reviewer #1 “To my knowledge this is the first report of this phenomena, and thus will be impactful to the nuclear mechanics field.”

      Reviewer #3 “This is an interesting study that shows, for the first time, that inhibition of transcription reduces the occurrence of nuclear blebs in cells that have been pre-treated with valproic acid.”

      To address the Reviewer’s concern, we have revised the manuscript to clarify that active transcription is required to form nuclear blebs across all of the perturbations now presented in this manuscript. Furthermore, we have clarified that transcription inhibition appears to suppress blebbing without altering other cellular components and properties (actin, nuclear stiffness) that are widely believed to control blebbing (see Results page 7, Results page 10, Discussion page 14).

      Adding to the concern is that actinomycin D does not have the same level of influence as alpha amanitin (Figure 2), which suggests the alpha amanitin is having a pleiotropic impact on blebbing. To validate that the changes in blebbing in the presence of VPA are dependent upon active transcription, the authors should use the anchor-away technique to remove RNAP from the nucleus thereby avoiding any indirect effects of the drugs (i.e., alpha amanitin) in use. Further adding concern that it is an indirect outcome is the prolonged incubation period (16-24 hours) that is apparently needed to observe the changes (page 5 paragraph 4). If it is active transcription that is causing the change in blebbing, then this should be apparent in a much shorter time frame (

      The Reviewer is worried about possible differences between transcription inhibitors actinomycin D and alpha amanitin. To further address these concerns in the revised manuscript, we now present new data for VPA without transcription inhibitor and VPA with transcription inhibition by four different transcription inhibitors (Figure 2C). Inhibitors include alpha-amanitin (RNA pol II degradation), triptolide (transcription initiation inhibition), flavopiridol (transcription elongation inhibition), and actinomycin D (DNA intercalation). All VPA plus transcription inhibitor treatments result in a significant decrease in nuclear blebbing relative to VPA treatment alone (p (p > 0.05, Figure 2C). Thus, there is no significant difference in the degree of nuclear blebbing suppression between the four different transcription inhibitors used.

      Furthermore, the Reviewer raises concerns about the time interval from the start of transcription inhibitor treatment to suppression of nuclear blebbing. We agree that considering this time interval is valuable. However, we need to consider that the time interval for each of the different transcription inhibitors to take effect is different (Bensaude 2011 Transcription). Alpha-amanitin inhibits transcription in 4-8 hours (10 µM, Nguyen et al., 1996 NAR), triptolide (1 µM, Chen et al. 2014 Genes Dev) and flavopiridol (0.5 µM, Chen et al., 2005 Blood) work in 2-4 hours, and actinomycin D works in about 1 hour (10 mg/mL, Lai et al. 2019 Methods). These times are now mentioned in the manuscript (Figure 2 legend and Methods section).

      It was not, however, known in advance how long it would take for transcription inhibition to have an effect on nuclear morphology. Therefore, the time to observe bleb suppression could have been longer than these treatment durations. As mentioned above, treatment with actinomycin D for 1.5 hours results in a similar decrease in nuclear blebbing as compared to the other inhibitors with 24-hour treatment (Figure 2C). To further address these concerns, we provide new data in the revised manuscript showing tracking of nuclear bleb reabsorption during the first 8 hours of treatment with alpha amanitin, triptolide, and flavopiridol via live cell imaging. Nuclear bleb reabsorption for both VPA and LMNB1-/- perturbations goes from ~5 % to 30% or greater during the first 8 hours of treatment with each of the transcription inhibitors (Supplemental Figure 1B), consistent with the time required to fully inhibit transcription. This supports our conclusion that transcription is essential to stabilizing nuclear blebs.

      __In addition to these issues, the authors rely on immunofluorescence signals to measure the levels of various factors including the Ser5 and Ser2 phosphorylation, which is capturing the total levels of these factors and not the DNA bound forms. If the changes in blebbing actually involve transcription initiation, then the authors should include measurements on the DNA-bound factors. __

      We are measuring Ser5 and Ser2 phosphorylation of RNA polymerase to track the population that is actively transcribing DNA. These markers appear on DNA-bound RNAP. Ser5 and Ser7 of RNAP are phosphorylated during initiation and subsequently dephosphorylated during transcription elongation, while Ser2 phosphorylation is added at that time (Hsin and Manley 2012 Genes Dev). Ser2 phosphorylation is removed at transcription termination. Therefore, we expect immunofluorescence to measure DNA-bound RNAP.

      __As reported the authors conclude that there is no changes in Ser2 and Ser5 phosphorylation yet they report that total RNA levels rise (Figure 1). How is the disconnect between RNA levels and Ser2 and Ser5 phosphorylation occurring? __

      The Reviewer raises a question about how VPA treatment increases RNA levels but not levels of active RNA pol Ser2 and Ser5. While this is an interesting question, without a dedicated investigation, we can only speculate, at best; this question is beyond the scope of the paper, which is focused on how transcription inhibition suppresses nuclear blebbing. The point of this data is to show that treatment with alpha-amanitin, alone or together with VPA, causes decreases in both RNA and RNA pol II Ser2 and Ser5 phosphorylation, confirming transcription inhibition.

      __Comparably, they use H3K9ac immunofluorescence as a measure of euchromatin. While the authors might be gaining a view on the total levels of H3K9ac under these experimental conditions, it is not clear whether this is DNA associated or not. Minimally, the authors should perform ATAC-Seq to judge the changes in euchromatin. __

      The Reviewer questions the use of H3K9ac immunofluorescence as a measurement of euchromatin levels, particularly in VPA-treated cells. The relationship between VPA and chromatin decompaction / euchromatin levels has been previously established (e.g., Stypula-Cyrus et al. PLoS One 2013, Felisbino et al. J Cell Biochem 2014, Lleres et al. J Cell Bio 2009). New data in Figure 3B shows that the heterochromatin marker H3K9me2,3 also is not altered by alpha-amanitin treatment. In the case of VPA + alpha-amanitin treatment, micromanipulation and nuclear height measurements provide further evidence that chromatin decompaction remains, since chromatin-based force response is unchanged from VPA treatment alone (Figure 3, E and G).

      Again, we note that our manuscript focuses on the effects of transcription on nuclear blebbing and rupture, which were not previously reported and differ from the current understanding in the literature. Furthermore, ATAC-seq is a major undertaking that is simply not appropriate for further proving an auxiliary point about a previously established effect.

      In summary, the original manuscript addresses this point. The specific experiment requested by the Reviewer is not necessary and is far beyond the scope of this study.

      A final major concern is the lack of a correlation between the blebbing and nuclear ruptures (page 7 paragraph 3; Figure 4). If ruptures are not correlating with the blebbing, what is the relevance of the blebbing?

      The Reviewer is asking for a clarification of the importance of nuclear blebbing in relation to nuclear ruptures. We have revised the manuscript to add new text to the Figure 4 legend clarifying the measurements and to the Discussion section describing the importance of this data (Discussion pages 12-13 and page 14). We discuss this in more detail below.

      We would like to clarify that blebbing and nuclear rupture are not uncorrelated, as suggested by the Reviewer. We and others have shown that nuclear blebs are sites of high curvature that result in nuclear ruptures. In the present manuscript, timelapse imaging shows that nuclear bleb formation results in nuclear rupture within minutes in all imaged cases (Figure 5). This data agrees with previously published data from our lab showing that bleb formation leads to rupture >95% of the time (Stephens et al., 2019 MBoC). Furthermore, stabilized nuclear blebs persist for hours (Supplemental Figure 1B) and undergo more rupture, as shown in Figure 4D. Therefore, ruptures remain correlated with nuclear blebs in our study.

      What we have shown, however, is that the percentage of cells that undergo at least one nuclear rupture during the time lapse is not statistically significantly decreased from VPA-treated levels by the addition of alpha-amanitin (Figure 4B). This appears to be due to two factors: 1) a basal level of nuclear rupture (see wild type data in Figure 4) and 2) an increase in the level of non-bleb-based nuclear rupture. However, importantly, non-bleb-based ruptures appear to occur less frequently for cells that undergo nuclear ruptures. Of the cells that exhibit nuclear rupture, those with non-bleb-based ruptures on average undergo only a single rupture over a 3-hour timelapse whereas those undergoing bleb-based rupture undergo an average of > 2 ruptures over the same time (Figure 4D).

      Altogether, these data point to a correlation between blebbing and rupture, where blebbing can promote nuclear rupture but is not essential for rupture. Therefore, observations of blebs are important in that they correspond to increases in nuclear rupture and corresponding nuclear dysfunction, such as DNA damage. The observation of non-bleb-based rupture, while not entirely new (Chen et al. PNAS 2018, Stephens et al. MBoC 2019, Pho et al. bioRxiv 2022), is interesting because it may be driven by a different mechanism: transcription is not essential for nuclear ruptures in the absence of nuclear blebs but promotes rupture in the presence of blebs. These results add to our knowledge of the factors regulating nuclear integrity and shape, and we anticipate that they will be further investigated in future studies.

      Finally, beyond these findings, we speculate that blebbing itself may be harmful to cell nuclear function. Previous studies have observed that nuclear deformations can cause DNA damage (Shah et al. Curr Biol 2021), chromatin reorganization (Jacobson et al. BMC Biol 2018, Golloshi et al. EMBO J 2022), and alterations to mechanotransduction (reviewed in Kalukula et al. Nat Rev Mol Cell Biol 2022). The extent to which the changes associated with these “nuclear deformations” require blebbing, rupture, or both is under investigation by various labs. Furthermore, previous studies (Shimi et al. Genes Dev 2008; Pfleghaar et al. Nucleus 2015) along with the present study (RNA Pol Ser2 and Ser5; Figure 6) have shown that chromatin content and, possibly, functionality are different within the nuclear bleb. Data in another manuscript in preparation from our lab further suggest that there is limited exchange of biomolecular content between the nuclear body and bleb. Therefore, while we cannot conclusively claim that blebs are themselves deleterious to function, there is a growing body of suggestive evidence that this is the case.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This is an interesting study that shows, for the first time, that inhibition of transcription reduces the occurrence of nuclear blebs in cells that have been pre-treated with valproic acid. The data that supports this is in Figure 2, collected in two different cell types (MEFs and HT1080 cells). The effect appears robust. New data is also provided that a marker of initiation of transcription but not transcriptional elongation is enriched in valproic acid-induced blebs.

      We thank Reviewer #3 for the positive comments that our study is “interesting” and “reproducible” and that our data show the effect of transcription on nuclear blebbing “for the first time”.

      This Reviewer asks for clarifications on 1) how transcription is a new mechanism for nuclear bleb formation and not part of the traditional view, 2) the generality of our conclusions (similar to Reviewer #2) since we report “on the inhibition of transient, small, valproic acid-induced blebs by alpha-amanitin”, and 3) the insight the modeling provides. We have provided new data and made changes to the manuscript to address all the Reviewer’s comments.

      __ Major comments

      1. The paper makes general claims about transcription and nuclear shape, when in reality, it is only reporting on the inhibition of transient, small, valproic acid-induced blebs by alpha-amanitin. This scenario under which the experiments were performed, for which there is no obvious physiological counterpart, ought not to be construed to challenge or contrast with the current understanding that the nucleus maintains its shape by resisting cytoskeletal forces. Cytoskeletal forces are well-known to establish nuclear shape; nuclear shape in this context, is generally taken to refer to the gross shape of the nucleus (e.g. elliptical, circular, etc.), and not small local blebs that may form due to F-actin based confinement or other mechanisms. Thus, this interpretation is overstated:

      "Surprisingly, we find that while nuclear stiffness largely controls nuclear rupture, it is not the sole determinant of nuclear shape. This contrasts with previous studies, which suggested that the nucleus maintains its shape by resisting cytoskeletal and/or other external antagonistic forces (Khatau et al., 2009; Le Berre et al., 2012; Hatch and Hetzer, 2016; Stephens et al., 2018; Earle 12 et al., 2020)."

      __

      The Reviewer appears to be concerned with two issues in this comment. First, the Reviewer is concerned about our use of the word shape, which could be interpreted too generally, rather than as categorizing the blebbing and rupture phenomena that we observe in this study. We appreciate the Reviewer’s feedback and have made changes to this sentence as well as the paper in general to clarify that we are focused on nuclear blebs. Second, there is the issue of to what degree our results modify our understanding of the role of nuclear stiffness in nuclear blebbing and rupture. We discuss this below.

      To address the Reviewer’s comment that the results are limited to “the inhibition of transient, small, valproic acid-induced blebs by alpha-amanitin” we provide new data and context for our results. The revised manuscript includes 1) new data using four transcription inhibitors and four nuclear blebbing perturbations and 2) original data showing that nuclear blebs are persistent rather than small and transient, and they alter gross nuclear shape. Our results are relevant to a wider range of blebbing/rupture and bleb/rupture suppression scenarios, as exemplified by the different nuclear perturbations, transcription inhibitors, cell types tested in our experiments, and long lifetimes for nuclear blebs. More specifically:

      1) The Reviewer notes that our original studies were done with VPA and alpha-amanitin, similar to Reviewer #2's concerns. We now provide new data showing that 4 different transcription inhibitors can suppress nuclear blebbing across 2 chromatin and 2 lamin perturbations (Figure 2 and Supplemental Figure 1). Thus, our new data support the idea that transcription is broadly required for nuclear blebbing.

      2) The Reviewer states that blebs are small and transient, and that “shape” is meant to reflect the gross shape (e.g., circular). In fact, blebs are long-lived: our new data show that most (>95%) VPA and LMNB1-/- blebs remain at the end of an 8-hour timelapse (Supplemental Figure 1B). Furthermore, on average, nuclear blebs account for 15% of the nuclear size in VPA-treated cells (Figure 6E). While not measured in this paper, many studies have shown that nuclear blebs cause gross circularity to decrease significantly and that changes in circularity are associated with nuclear rupture (e.g., Stephens et al. MBoC 2018, Xia et al. JCB 2018). Most recently, we show that nuclear blebs decrease nuclear circularity significantly in another manuscript under review (Pho et al., 2022 biorxiv).

      The Reviewer also argues that our data showing the importance of transcription in nuclear blebbing “ought not to be construed to challenge or contrast with the current understanding that the nucleus maintains its shape by resisting cytoskeletal forces.” We acknowledge that our results are not sufficient to rule out the broad assertion made by the Reviewer. However, our data shows for the first time that nuclear blebbing relies on transcriptional activity, while we measure no change in actin contraction or confinement or nuclear stiffness (respectively, Supplemental Figure 2 and Figure 3, C-E). Consequently, these results are a challenge to the current understanding, which must be updated by our results and future experiments. At the same time, we note that this manuscript’s Discussion section acknowledges that we have data in another preprint in which inhibition of actin contraction decreases nuclear blebbing to near 0% in wild type and perturbations (Pho et al., 2022 biorxiv). Together, these observations suggest a complicated picture in which multiple factors are jointly responsible for regulating nuclear blebbing and rupture.

      __ As an aside, the data in the paper does not appear to support the interpretation that "nuclear stiffness largely controls nuclear rupture". It is unclear what the authors mean by this statement.__

      We originally intended that comment to state the previous understanding in the literature, but we realize it was unclear. We appreciate the Reviewer’s feedback and have revised the text.

      __ 2. Further to point 2, treatment with alpha-amanitin does nothing to the occurrence of blebbing in normal cells. Thus, the data are specifically applicable to valproic acid-treated cells. As such, the broad interpretations related to nuclear shape and mechanics should be tempered.__

      The Reviewer is concerned that we cannot support the claim that this effect is broad and general; these concerns are also raised by Reviewer #2. We have provided new data and highlight original data to support that this effect is in fact broad and general, and moreover, that the data supports a role for transcription in nuclear blebbing.

      We specifically address the Reviewer’s statement: “treatment with alpha-amanitin does nothing to the occurrence of blebbing in normal cells”. In the original manuscript, we provided data that showed that wild type nuclear bleb formation and stability are suppressed upon transcription inhibition (Figure 5) even though the percentage of wild type nuclei exhibiting a bleb is not changed by alpha-amanitin treatment (Figure 2). We also provided data showing that the predominant type of nuclear rupture changes with alpha-amanitin treatment, including in wild type cells (blebbed vs. not, Figure 4C). Thus, while the effects of transcription inhibition are most easily visible in VPA-treated cells, they are also present in wild type cells in how blebs are formed and stabilized (Figure 5). We have revised the manuscript to better highlight this important point.

      In addition, we again emphasize that our results extend beyond VPA-induced blebs. Our revised manuscript now includes new data of 4 different perturbations (to chromatin histone modifications and lamins A and B) that induce nuclear blebs, which can be suppressed by 4 different transcription inhibitors (Figure 2 and Supplemental Figure 1). As previously noted by both Reviewers 1 and 3, this effect is reproducible in different cell lines. This new data directly addresses the concern that the effect is only applicable to VPA and alpha amanitin.

      Nonetheless, we agree with the Reviewer that we cannot support broader claims that nuclear mechanical properties are unaltered by transcription inhibitors across all scenarios, as we only measured this change in VPA-treated cells. Micromanipulation force experiments are detailed and time consuming, making it difficult to include data for multiple perturbations. We chose VPA because we have the most measurements of this perturbation which have remained consistent over the life of micromanipulation force measurements. Therefore, we have revised our statements on nuclear mechanics in the revised manuscript (page 14).

      __ The motor model for RNA pol II activity assumes that the motor 'repels' nearby chromatin units. It is not clear how this is related to the mechanism of motor action of RNA pol II on chromatin during transcription.__

      The point of the model is not to precisely reproduce the manner in which transcribing RNA pol II exerts forces on the chromatin fiber. Instead, we have developed a coarse-grained model to study how the collective activity of molecular motors might drive chromatin dynamics and consequently, changes in nuclear shape, either global or local.

      The model itself is based on our earlier models, which were used to recapitulate and understand how changes to chromatin mechanical properties governed nuclear rigidity (Stephens et al. MBoC 2017, Banigan et al. Biophys J 2017, Strom et al. eLife 2021; also see a similar model by Lionetti et al. Biophys J 2020) and how nonequilibrium activity due to molecular motors, such as RNA pol II, can drive coherent chromatin dynamics (Liu et al. PRL 2021), which have been observed in live-cell imaging experiments (e.g., Zidovska et al. PNAS 2013; Shaban et al. NAR 2018; Shaban et al. Genome Biol. 2020, among others). The current model therefore explores how the newly observed connection between transcription and nuclear blebbing could be explained by known phenomena.

      We note that the "repelling” motors used to model RNA pol II activity in the present work are in many ways qualitatively similar to the dipolar “extensile” motors used by other researchers to model motor-driven chromatin dynamics (e.g., see Saintillan et al. PNAS 2018). More generally, study of “active matter” over the last 20-30 years (and statistical physics over the last century) has shown that precise details of active molecular agents are often unimportant to the larger-scale behavior of the system (e.g., see Marchetti et al. Rev Mod Phys 2013). Thus, we view the repulsive motors as modeling the effective behavior of many RNA pol II within a sub-micron region of chromatin. Better establishing the differences between different choices of motor activities is the subject of a modeling paper in preparation.

      To address the Reviewer’s concern, we have more clearly stated the scientific foundations of the model, and we have revised our description of the model to clarify that we do not intend to model the behavior of individual RNA pol II by individual repulsive motors (see Results section, page 10).

      __The motor model also does not seem to add conclusive insight to the manuscript, as the nuclear shapes predicted are not directly comparable to the experimental shapes which are flat and smooth with only an occasional, single, local bleb. __

      The Reviewer raises two related points with this comment: that bulges and blebs are not directly comparable, and therefore, that the model “does not seem to add conclusive insight to the manuscript.”

      We agree with the Reviewer that bulges in the simulations are not blebs as they are observed in the experiments. However, it seems likely to us that bulges are necessary precursors to bleb formation; it is difficult to envision how a large local nuclear protrusion could form without first bulging outward from the nuclear body. Furthermore, we disagree with the assertion that nuclei are generally flat and smooth, as qualitative and quantitative analysis of imaging data reveals that nuclei exhibit shape fluctuations and irregularities across multiple scales (see, for example, Chu et al. PNAS 2017, Patteson et al. JCB 2019, Stephens et al. MBoC 2019, Liu et al. PRL 2021).

      Nonetheless, the observation of bulges but not blebs is a shortcoming of the simulation model. We believe this shortcoming reflects a tradeoff made in developing this model; we chose to develop and study a model with relative simplicity compared to a real cell nucleus. A more complicated model might better capture some aspects of nuclear blebbing at the expense of additional complexity. For example, the current model does not allow lamin-lamin or chromatin-lamin bonds to rupture, either stochastically or due to high forces. This effect, which is likely present in vivo, might be necessary for generating more bleb-like structures in simulations. Developing and refining such a model is an active pursuit within our collaboration, but for the moment, it is beyond the present purpose of the model.

      Instead, the purpose of the model is to determine whether the observed effect of transcription inhibition on nuclear blebbing / localized shape deformations can be understood through known biophysical phenomena. Established models, to the extent that they exist, were insufficient because they typically relied on changes in nuclear mechanics, whereas our experiments indicate that transcription inhibition does not change nuclear mechanical rigidity. The current model demonstrates how motor activity within chromatin can alter the structure and dynamics of the lamina. The simulations are certainly not proof that transcription affects nuclear blebbing through the proposed mechanism. However, they are a first-of-their-kind demonstration of how nonequilibrium biophysical activity (such as that generated by transcription) within a biopolymer system (chromatin) can emergently alter the geometry of the confining boundary (the lamina). This new result provides a plausible interpretation for the experiments in the manuscript.

      In the revised manuscript, we have clarified our modeling approach and objectives in the Results and Discussion sections, and we have more clearly identified and discussed the limitations of the model (Results pages 10-11, Discussion page 15).

      The model offers 'proof of principle', but is not capable of ruling out alternative mechanisms (such as nuclear pressurization by confinement, chromatin decompaction, or changes to osmotic pressure). It may be more appropriate to include the model in the discussion as opposed to presenting it as a new result that can be reliably interpreted through comparisons with experiment.

      We respectfully disagree with the suggestion to include the model in the Discussion section instead of the Results. As discussed above, the model is new biophysics research and the simulations produced new scientific results, even if the overall interpretation remains open.

      However, we have some thoughts about the alternatives suggested by the Reviewer. These are discussed in detail below, but briefly: experimental data, rather than the model itself, suggest that the alternative mechanisms mentioned by the Reviewer do not explain the effects of transcription. After treatment with alpha-amanitin, we do not observe changes to actin-based confinement or contraction (Figure 3E, Supplemental Figure 2), and there are no changes to chromatin histone modifications or nuclear rigidity (Figure 3). We also are skeptical of osmotic pressure arguments since 1) fluid, ions, and small biomolecules should freely flow through nuclear pores to maintain osmotic pressure balance between the nucleus and the cytoplasm, especially on hours-long time scales, and 2) increasing the osmotic pressure by fragmenting chromatin has previously been observed to have either no effect or a suppressive effect on nuclear stiffness (Stephens et al. MBoC 2017, Belaghzal et al. Nat Genet 2021), which would potentially increase blebbing (the opposite of the effect suggested by the Reviewer). We have addressed this further in the revised Results section (page 10) and below.

      __ 4. The data in the paper is not strong enough to rule out the more conventional mechanism of nuclear pressurization, which could be caused by F-actin based confinement or chromatin decompaction, or changes to osmotic pressure. Immunostaining of myosin is not a reliable way to compare myosin activity across conditions. It is possible that the long treatment with alpha-amanitin (up to 24 h, Fig. 2) relieves the pressure in the nucleus without measurable changes in the already established cell shape and hence the nuclear shape (height changes in spread cells are small at best -- valproic acid appears to reduce height by ~0.5 microns in Figure 3E which is smaller than the optical resolution along the z-axis of a typical confocal microscope).__

      The Reviewer has proposed several alternative mechanisms and questioned the use of immunostaining and nuclear height measurements in the manuscript. We address each of these below.

      Specifically, the Reviewer is concerned that we cannot rule out the more conventionally believed mechanisms of 1) actin confinement, 2) actin contraction, 3) chromatin decompaction, and/or 4) osmotic pressure. We have revised the text to clarify that our data and data from others strongly support the conclusion that these four “conventional” mechanisms are not responsible for transcription inhibition-based nuclear blebbing suppression (revisions on pages 7, 10, 14).

      1) Actin confinement, as measured by nuclear height, does not change upon transcription inhibition (Figure 3, C-E). Thus, our data support the idea that transcription inhibition suppresses nuclear blebbing through a different mechanism. The Reviewer objects to this measurement on the basis that even the 0.5 µm change observed for VPA-treated cells is below optical resolution. However, optical resolution is not relevant to this measurement because we are not resolving two objects; rather, we are measuring the size of one object, the nucleus.

      When two dots/objects are separated in the same frame or in different z slices, one needs to clearly distinguish the two Gaussian point-spread functions of the two objects a distance X apart. That is resolution, and it is not the relevant limitation here. We measure the size of one object (the nucleus) using the full width at half maximum (FWHM), which can quantify changes in nuclear height at scales finer than the optical resolution. For example, the FWHM of a fluorescent bead can be observed to change by just tens of nm depending on the wavelength of the emitted light; shorter wavelengths give a smaller FWHM (from the Rayleigh criterion, θ = 1.22λ/D, where λ is the wavelength of the light). Our measurements are taken through a z-stack at 200 nm steps; thus, the 0.5 µm change in height from wild type to VPA-treated cells corresponds to 2.5 z steps (not less than one z step). Finally, we have additional data demonstrating our ability to measure these differences many times over (Pho et al. 2022 biorxiv).

      Image left is from: https://en.wikipedia.org/wiki/Full_width_at_half_maximum

      Image right is a crop of Figure 3D from the manuscript.
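To illustrate why a size measurement is not bounded by the two-point resolution limit, here is a minimal Python sketch (it is not the analysis pipeline used for Figure 3). It models an axial intensity profile as a uniform slab blurred by a Gaussian, samples it at 200 nm steps as in the z-stacks described above, and recovers the slab thickness from the FWHM by linear interpolation. The slab model, the blur width `sigma_um = 0.4`, and the heights of 4.0 and 3.5 µm are hypothetical values chosen only for illustration.

```python
import math


def blurred_slab(z, height_um, sigma_um):
    """Axial profile of a uniform slab of given height blurred by a Gaussian.

    Hypothetical stand-in for a nuclear intensity profile along z.
    """
    s = sigma_um * math.sqrt(2.0)
    half_h = height_um / 2.0
    return [0.5 * (math.erf((zi + half_h) / s) - math.erf((zi - half_h) / s))
            for zi in z]


def fwhm(positions, intensities):
    """Full width at half maximum of a single-peaked profile.

    Half-maximum crossings are located by linear interpolation between
    neighbouring samples, so the result is not quantised to the z step.
    """
    peak, base = max(intensities), min(intensities)
    half = base + 0.5 * (peak - base)
    above = [i for i, v in enumerate(intensities) if v >= half]
    first, last = above[0], above[-1]

    def crossing(i_out, i_in):
        # i_out is below half maximum, i_in is at or above it.
        x0, x1 = positions[i_out], positions[i_in]
        y0, y1 = intensities[i_out], intensities[i_in]
        return x0 + (half - y0) * (x1 - x0) / (y1 - y0)

    left = crossing(first - 1, first) if first > 0 else positions[first]
    right = (crossing(last + 1, last)
             if last < len(positions) - 1 else positions[last])
    return right - left


# z positions from -5 to +5 um in 0.2 um (200 nm) steps.
z = [round(-5.0 + 0.2 * i, 10) for i in range(51)]
sigma_um = 0.4  # assumed blur; its own FWHM is about 2.355 * 0.4 = 0.94 um

height_a = fwhm(z, blurred_slab(z, 4.0, sigma_um))  # hypothetical taller nucleus
height_b = fwhm(z, blurred_slab(z, 3.5, sigma_um))  # hypothetical shorter nucleus
print(f"FWHM a = {height_a:.2f} um, FWHM b = {height_b:.2f} um, "
      f"difference = {height_a - height_b:.2f} um")
```

In this toy setting the assumed blur is wider than the 0.5 µm height difference, yet the FWHM of each profile still tracks the slab thickness closely, because the half-maximum crossing sits at the slab edge regardless of how wide the blur is.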

      2) Actin contraction, as measured by γMLC2, does not change either (Supplemental Figure 2). However, we know that actin contraction is a major determinant of nuclear blebbing (Mistriotis et al., 2019 JCB and Pho et al., 2022 biorxiv). Therefore, our data support that transcription affects blebbing in some other way than actin contraction.

      The Reviewer disputes this finding by stating that “immunostaining of myosin is not a reliable way to compare myosin activity across conditions.” Published reports show that γMLC2 immunostaining is a reliable way to measure actin contractility changes (Wan et al. MBoC 2012; Ramachandran et al. Mol Vision 2011; Duan et al. Cell Cycle 2016; Nishimura et al. PLOS One 2020). We have another preprint showing that alterations to actin contraction as measured by immunostaining of phosphorylated myosin light chain 2 (γMLC2) determine nuclear blebbing, independent of changes in actin confinement (Pho et al., 2022 biorxiv). There, we clearly show that changes in γMLC2 immunostaining can measure changes in actin contraction due to well-established modulators. Similarly, the ROCK inhibitor Y27632 in Supplemental Figure 2 can be viewed as a positive control in that γMLC2 immunostaining is clearly decreased after treatment with the inhibitor.

      3) Chromatin decompaction via H3K9ac and chromatin-based nuclear rigidity are not rescued by transcription inhibition. New data also show that levels of the heterochromatin marker H3K9me2,3 do not change upon transcription inhibition (Figure 3B). The new data presented in this manuscript show that transcription inhibition also suppresses blebbing in DZNep-treated cells (Figure 2D), where chromatin compaction by heterochromatin formation is inhibited (Stephens et al. MBoC 2019). Together, these experiments suggest that transcription inhibition is not suppressing nuclear blebs through increases in heterochromatin-based chromatin compaction.

      Furthermore, the lack of change in the measurement of nuclear stiffness via micromanipulation (Figure 3G) provides a complementary metric suggesting that chromatin compaction is unchanged, at least in the case of VPA + alpha-amanitin.

      Altogether, these results are inconsistent with transcription inhibition suppressing blebs through alterations to chromatin compaction.

      4) Osmotic pressure is the least established, if it is established at all, of the four “traditional” mechanisms. The Reviewer proposes that transcription inhibitors, such as alpha-amanitin, could relieve osmotic pressure within the nucleus. We disagree with this explanation because it is implausible for the nucleus to maintain an osmotic pressure imbalance in VPA-treated cells over long periods of time. Fluid, ions, and small biomolecules can likely flow through nuclear pores to maintain osmotic balance between the nucleoplasm and cytoplasm, especially over the hours-long duration of VPA treatment. Furthermore, we are skeptical that VPA treatment, even with its chromatin-decompacting effects, significantly increases osmotic pressure, because nuclear stiffness actually decreases after VPA treatment (Stephens et al. MBoC 2017, 2018, 2019; Krause et al. Phys Bio 2013; Shimamoto et al. MBoC 2017; Hobson et al. MBoC 2020). Increased osmotic pressure should cause the nucleus to be stiffer. Moreover, nuclei in VPA-treated cells consistently undergo blebbing and rupture, which would naturally relieve any pressure imbalance. Thus, the notion that the measurements after hours of VPA or VPA + alpha-amanitin treatment (Figures 2-5) are the result of a steady-state change in osmotic pressure is simply inconsistent with the experimental data.

      We note that in cases of acute osmotic shock, where the osmotic pressure balance of the nucleus may be altered, the nucleus changes in size (e.g., see Finan et al., 2009 Ann Biomed Eng), which we do not observe in our experiments. Our measurements of nuclear area (Figure 6C) and height (Figure 3E) show no change in nuclear size upon transcription inhibition (for more on the issue of height measurement, see the previous point).

      To further address concerns about overnight treatment causing off-target effects, we have provided new data from a shorter treatment duration in the manuscript. The new data shows that within 8 hours, blebs exhibit more reabsorption after alpha-amanitin, triptolide, and flavopiridol treatment in both VPA-treated and LMNB1-/- cells (Supplemental Figure 1B). Additionally, we note that actinomycin D decreased nuclear blebbing in 1.5 hours, and thus did not require overnight treatment.

      In summary, our original and new data clearly show that transcription contributes to nuclear blebbing. Transcription inhibition does not change other factors (such as actin-based confinement or contraction, changes in chromatin compaction, or osmotic pressure), that have been shown or may be thought to contribute to nuclear blebbing. The revised manuscript addresses this issue through the inclusion of new data, as discussed above.

      __

      Further to point 4, the data in Figure 4B and 4D both show a decrease in the mean of the % of ruptured nuclei and rupture frequency (please provide units for this frequency on the Y-axis). With more experiments, perhaps the data would have reached statistical significance?__

      The Reviewer is asking for clarification on the data included in Figure 4 B and D reporting the percentage of cells that display a nuclear rupture.

      We have revised the manuscript to clarify that Figure 4B is the percentage of all nuclei that show at least one nuclear rupture. The measurement unit, percent (listed as “[%]”), is shown on the y-axis. The revised manuscript also clarifies that Figure 4D reports, for the nuclei that rupture, the average number of times a nucleus ruptures during the 3-hour time-lapse.

      The Reviewer states that “with more experiments, perhaps the data would reach statistical significance?” To address this comment, we have altered the text to explain that the percentage of all nuclei that rupture at least once shows a decrease that is not statistically significant by t-test. The data in Figure 4B show that VPA causes 18.5 +/- 2.7 % rupture and VPA+alpha-amanitin causes 12.4 +/- 1.5 % rupture. Student’s t-test gives p = 0.08, which is not statistically significant (p > 0.05), for six biological replicates, each consisting of n = 100-300 cells. We feel the data speak for themselves without further experiments performed solely to obtain a lower p value. The stronger data are in Figure 4D, which clearly shows fewer nuclear ruptures per nucleus. We appreciate the Reviewer’s perspective and have modified the text in the Results and Discussion sections to reflect these important points (pages 8 and 14).
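As a rough consistency check on the summary statistics quoted above for Figure 4B, the following Python sketch recomputes a two-sample Student's t-test from the means and error values alone. It is not the analysis used in the manuscript. It assumes that the "+/-" values are standard errors of the mean, that each condition has six replicates, and that a pooled-variance, two-sided test was used; none of these details is stated in the response. The function names are hypothetical.

```python
import math


def student_t_from_sem(mean1, sem1, mean2, sem2, n):
    """Pooled-variance Student's t for two groups of equal size n.

    With equal n, the pooled standard error of the difference reduces to
    sqrt(sem1**2 + sem2**2). Assumes the inputs are SEMs, not SDs.
    """
    t = (mean1 - mean2) / math.sqrt(sem1 ** 2 + sem2 ** 2)
    df = 2 * n - 2
    return t, df


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p value via Simpson integration of the density on [0, |t|]."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3
    return 1 - 2 * central_half


# Values quoted in the response; interpreting +/- as SEM is an assumption.
t_stat, dof = student_t_from_sem(18.5, 2.7, 12.4, 1.5, n=6)
p_value = two_sided_p(t_stat, dof)
print(f"t = {t_stat:.2f}, df = {dof}, two-sided p = {p_value:.3f}")
```

Under these assumptions the result falls between 0.05 and 0.10, in line with the reported p = 0.08. If the error values were standard deviations instead, the t statistic would be larger by a factor of √6 and the conclusion would differ.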

      __ Minor comments.

      1. Confirmatory data, which has already been published in the same cell line in the past, could be moved if possible to supplemental information. Figure 1 seems to be a characterization of the efficacy of alpha-amanitin which is well-known, and therefore does not represent an original finding. It should perhaps be in supplemental information.__

      We understand the Reviewer’s point but would like to leave Figure 1 as a main text figure to provide a clearer story for all readers of our manuscript.

      __ 2. Did the counting method used to collect data in Figure 4B exclude nuclei that rupture multiple times? This should be specified in the manuscript.__

      No. Figure 4B reports the percentage of nuclei that rupture; a nucleus that ruptures any number of times is counted once, as a single nucleus that ruptures. We have revised the Figure 4 legend to clarify this point.
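The two quantities described for Figure 4B and Figure 4D can be made concrete with a short Python sketch. It is illustrative only: the function name and the per-nucleus rupture counts are hypothetical and are not data from the manuscript.

```python
def rupture_metrics(ruptures_per_nucleus):
    """Return (percent of nuclei rupturing at least once,
    mean ruptures per nucleus among nuclei that rupture)."""
    n_total = len(ruptures_per_nucleus)
    ruptured = [count for count in ruptures_per_nucleus if count > 0]
    # Figure 4B-style metric: each ruptured nucleus counts once,
    # however many times it ruptures.
    percent_ruptured = 100.0 * len(ruptured) / n_total if n_total else 0.0
    # Figure 4D-style metric: averaged only over nuclei that rupture.
    ruptures_per_ruptured = sum(ruptured) / len(ruptured) if ruptured else 0.0
    return percent_ruptured, ruptures_per_ruptured


# Hypothetical rupture counts for ten nuclei over one time-lapse.
counts = [0, 0, 1, 0, 3, 0, 0, 2, 0, 0]
percent_ruptured, ruptures_per_ruptured = rupture_metrics(counts)
print(percent_ruptured, ruptures_per_ruptured)
```

Because the second metric is conditioned on rupture, it can change between treatments even when the first does not, which is the distinction drawn between Figures 4B and 4D.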

      __ 3. This statement should be rephrased: "Since transcription is needed to form and stabilize nuclear blebs, at least some aspect of nuclear shape deformations appears to be non-mechanical" - deformation in the model in Figure 7 is clearly 'mechanical' - driven by motor force.__

      We appreciate the Reviewer’s feedback and have changed the text to “independent of the bulk mechanical strength of the nucleus”.

      __ 4. It is important to specify the times for which cells were treated with the various drugs in each figure (and not just in figure 2).__

      We appreciate the Reviewer’s feedback and have added this information to each figure legend.

      __

      Reviewer #3 (Significance (Required)):

      This paper reports new data that nuclear blebbing induced by treatment with valproic acid can be inhibited by co-treatment with alpha-amanitin. The data provided are reproducible across different cell lines. The data suggest that inhibition of transcription inhibits blebs which are induced by valproic acid treatment, but it does not inhibit blebs in cells untreated with valproic acid. Immunostaining reveals some enrichment of RNA pol II phosphorylated at Ser5 in valproic acid-induced blebs, suggesting an enhancement of transcription-initiation (but not transcriptional elongation) in the bleb. Alpha-amanitin treatment reduces bleb formation and bleb lifetime.

      While the data are clearly presented, and interesting in terms of relating transcription to blebbing, the proposed interpretation in terms of a new mechanism of blebbing is not strongly supported by the data or by the computational model. More definitive evidence is required to rule out that blebbing in valproic acid treated cells is caused by a pressurization of the nucleus due to valproic acid treatment, which could be released by alpha-amanitin treatment for up to 24 h. The manuscript generalizes the findings to 'nuclear shape', and interprets them as suggestive of an alternative mechanism of establishment of nuclear shape; this generalization seems unsupported by the data.__

      Overall, the data provided is novel and interesting to cell biologists, provided more definitive evidence can be provided to rule out other models and to establish the new proposed model for nuclear blebbing. Else, the claims of an alternative mechanism for blebbing could be toned down, and the data on the relation between transcription and blebbing, which is the novel and interesting finding in this paper, could be presented in a more focused way.

      We appreciate that the Reviewer points out that “the data are clearly presented and interesting” and “reproducible across different cell lines.” The Reviewer’s main concerns appear to be with: 1) the effect of transcription inhibition on blebbing that is not induced by VPA, 2) alternatives or limitations to our proposed interpretation of the results, and 3) describing our results as applicable to “nuclear shape” in general.

      We have addressed each of these concerns in detail in the above response and the revised manuscript. To summarize:

      • We have included new data to show that four different transcription inhibitors combined with four different nuclear perturbations exhibit the same effects (Figure 2 and Supplemental Figure 1). Furthermore, we have clarified in the revised manuscript that even wild type (“untreated”) nuclei exhibit changes to blebbing dynamics (decreased stability, increased reabsorption) after transcription inhibition (Figure 5). Finally, concerns about time intervals were addressed by time lapse imaging showing that bleb reabsorption (return to normal shape) increases six-fold in the first 8 hours of transcription inhibitor treatment (Supplemental Figure 1B).
      • The original manuscript, new data, and previous data from the literature provide evidence that alternative mechanisms involving “pressurization” (discussed above), the actin cytoskeleton (Figure 3E and Supplemental Figure 2), and chromatin and nuclear rigidity (Figure 3) do not explain the observed effects of transcription inhibition. We discuss this in detail in the revised manuscript and the above response. Furthermore, we have revised our presentation and discussion of the simulation model to describe its relevance more clearly to the results, support its inclusion in the manuscript, and provide appropriate caveats on our computational findings.
      • We have revised the manuscript to clarify that our results primarily concern nuclear blebbing and rupture. The Reviewer is correct that the current investigation does not particularly focus on larger-scale shape such as circularity/ellipticity. In summary, our data clearly indicate that transcription contributes to nuclear blebbing and rupture. Previously suggested mechanisms of blebbing are generally inconsistent with the observed effect in combination with our other measurements. The model investigates a plausible new, complementary mechanism, which in itself represents an advance in biophysical modeling and ties the manuscript together.

      We thank the Reviewer for their thorough critique, which we have now addressed. We believe that the new experimental data and analysis and computational modeling in our manuscript significantly advances our overall understanding of nuclear blebbing, even as it raises new questions to be addressed by future work.

    1. Author Response

      Reviewer #1 (Public Review):

      The Introduction starts by setting up a straw-man argument, claiming that the assumption is that gene expression is set up as stable expression domains that undergo little or no subsequent change. I don't think that any current developmental biologist thinks this is true. The references used to support this claim are from the 1990s up to the early 2000s. There are numerous examples since then that show that developmental gene expression is dynamic as a rule.

      Our argument might seem like a strawman to the sector of developmental biologists who work in the field of pattern formation, or who are aware of the latest advances in the field. However, a look at current publications on developmental enhancers reveals that the dominant model with which enhancer biologists interpret their data is still the French Flag model (specifically, the eve-stripe-2 model of enhancer function). We meant to address this audience, and attempted to clarify this from the very beginning by stating that “Much of our models of how enhancers work during development relies on the assumption that …”. Please note that we are talking here about “models of how enhancers work”, not models of pattern formation in general.

      The Introduction then continues as a rather detailed review of enhancers, Tribolium methodology, tools for identifying enhancers, and more. The Introduction cites 99 references, which seems excessive for what is essentially an experimental paper. Significant parts of the Introduction can be trimmed or removed. There is no need to mention all the tools available for Tribolium if they are not used in the described experiments. A thorough analysis of the advantages and disadvantages of different modes of ATAC-seq is also beyond the scope of the Introduction. The authors should explain why they chose the tools they chose without excessive background.

      In the revised manuscript, we shortened the discussion of Tribolium methodologies and imaging techniques. However, we think that the paragraph discussing ATAC-seq strategies is important to justify why we took the effort to cut the embryos and perform tissue-specific ATAC-seq analysis instead of whole-embryo ATAC-seq.

      Having said that, the Introduction actually overlooks a lot of significant work that is relevant to the subject of the paper. Specifically, the authors completely ignore all of the work on development in hemimetabolous insects such as Oncopeltus and Gryllus - the omission is glaring. There has been a lot of relevant work on dynamic gene expression patterns coming out of these species.

      You are right indeed, and we apologize for the omission. We have now added citations to relevant work on those two insects to the manuscript.

      The experimental setup involves cutting embryos into three sections at two time points. The results then discuss differences in "space" and "time" but there is no discussion of the embryological meaning of these terms. What is happening at the two time points from a developmental perspective? What is the difference between the three sections? There is a lot of relevant development going on at these stages and important regional differences, which have been well-studied in Tribolium and in other insects but are not even mentioned.

      A good point. Correlating chromatin landscape changes with embryological events is an interesting question that needs further analysis and the application of ATAC-seq to additional time points. We chose to leave this to future work (possibly using single cell ATAC-seq). In this work, we restricted our analysis to the benefits of applying time- and tissue-specific ATAC-seq in predicting active enhancers. We added a note on this point in the discussion.

      In the preliminary results of the ATAC-seq analysis, it is clear that there are significant differences between the sections, which should come as no surprise, but fairly minor differences between the same section at the two time points. This could be because the two time points are pretty close together at a stage when there is a lot of repetitive patterning going on. A possible interpretation, which the authors don't mention because it goes against their main thesis, is that maybe most of the processes that are taking place at this stage are not dynamic enough to show up at the temporal resolution they have applied. This is worth at least a mention.

      We agree with this observation. We would like to draw the reviewer’s attention to our statement “Together, our findings indicate that changes in chromatin accessibility in Tribolium at this developmental stage are primarily associated with space rather than time…”. A detailed analysis of chromatin dynamics across time would require more time points, which we plan to collect in future work.

      The authors link each accessible site to the nearest gene when looking at putative enhancer function. This is a risky assumption since there are many examples of enhancer sites that are far upstream or downstream of the target gene and often closer to an unrelated gene than to the target gene. The authors should at least acknowledge this problem with their functional annotation.

      The reviewer is correct in that, in particular for large eukaryotic genomes, enhancers are often located far away from their target genes. We have no comprehensive enhancer-target data that would enable us to perform a more accurate analysis. Furthermore, the assumption that at least for some of the enhancers the nearest genes will also be their targets, and hence, provide insight into the function of the enhancers themselves seems reasonable given the relatively compact organization of the Tribolium genome. In any case, the analysis was just presented as one of several sanity checks for our ATAC-seq data; for the sake of streamlining the manuscript we no longer include this analysis in the current version of the manuscript.

      In the Discussion, the authors claim that contrary to how it may seem, the question they are addressing is not a "fringe problem". Once again, I think this is a straw man. No active researcher thinks that the question of dynamic regulation of gene expression during development is a fringe problem. On the contrary, most researchers will accept that this is one of the most interesting and important questions in current developmental biology.

      This whole argument was removed from the Discussion in the revised manuscript.

      Perhaps the most significant problem with the manuscript is that it is all built around the premise of enhancer switching between dynamic enhancers and static enhancers. The authors find one site that is consistent with their prediction for a dynamic enhancer and one site - regulating a different gene - that is consistent with their prediction for a static enhancer and claim that they have provided support for their model. I think this claim is grossly exaggerated. They present data that can be seen as consistent with their model but are a long way from providing evidence for it.

      We actually thought we were cautious enough about this. Nowhere in our text did we mention that our data “support” the enhancer switching model. We stated quite early (in the abstract, actually) that:

      “We found our data consistent with a model in which the timing of gene expression during embryonic pattern formation is mediated by a balancing act between enhancers that induce rapid changes in gene expressions (that we call ‘dynamic enhancers’) and enhancers that stabilizes gene expressions (that we call ‘static enhancers’).”

      To make this message clearer, we added the following sentence to the abstract of the revised manuscript: “However, more data is needed for a strong support for this or any other alternative models.” And again at the end of the Introduction: “While these data are in line with our Enhancer Switching model, more data is needed as a strong support for the model.” Also, at the end of the Results section examining runB enhancer dynamics, we stated: “However, this merely shows that runB activity dynamics are consistent with our model, but is still far from strongly supporting the model (more on that in the Discussion).” Also for the Results section on enhancer hbA dynamics: “Again, this merely shows that hbA activity dynamics are consistent with our model, but is still far from strongly supporting it.”

      Moreover, in the opening paragraph of the Discussion, we explicitly and quite openly addressed this point, and suggested what kind of observations and experiments would be needed in the future to qualify as “strong support” for the model. We even ran simulations of what one should expect to observe in enhancer deletion experiments if the model is correct (Figure 7).

      But it seems that discussing the enhancer switching model in detail gives the impression that it is of central importance to the paper. In our view, our experimental system is quite general and does not depend on that model; we mention it as an example of how an alternative model of enhancer regulation could be relevant to the problem of dynamic gene expression. This would not be obvious without this or a similar model to demonstrate it, even if it is hypothetical. But since our presentation evidently gives the impression that our claims are stronger than they really are, we altered our phrasing in the introduction of the revised manuscript to make our point clearer:

      “Despite its potential inaccuracies, the Enhancer Switching model exemplifies the type of alternative frameworks we need to explore in order to elucidate the mechanisms driving the generation of gene expression waves during development. Consequently, an appropriate model system is required, allowing us to test not only the Enhancer Switching model but also any other prospective model that provides a satisfactory explanation for the initiation of gene expression waves at the enhancer level.”

      We hope that this addresses the reviewer’s quite legitimate concerns.

      Like the Introduction, the Discussion includes long paragraphs (lines 450-480) that are more suitable for a review/hypothesis paper. The data presented in this manuscript has little relevance to the question of kinematic vs. trigger waves, and therefore there is no real reason for the question to be discussed here.

      We have now significantly shortened the discussion.

      Reviewer #2 (Public Review):

      Open questions:

      What happens with the runB enhancer at later stages of embryogenesis? With what kind of dynamics do the anterior-most stripes fade and does that agree with the model? Do they show the same dynamics throughout segmentation? I think later stages need to be shown because the prediction from the model would be that the dynamics are repeated with each wave. I am not so sure about the prediction for ageing stripes – yet it would have been interesting to see the model prediction and the activity of the static enhancer.

      Yes, the dynamics repeat in the germband. This is shown in Supplementary Figure 8. The dynamics in the germband were shown by visualizing yellow mRNA and intronic probes. MS2 imaging could not be used because the embryo dives into the yolk for a while, after which it becomes difficult to capture the germband in the right orientation for imaging. We are currently working on using light sheet microscopy to image germband stages.

      I understand that the mRNA of the reporter gene yellow is more stable than the runt mRNA. This might interfere with the possibility to test your prediction for static enhancers: The criterion is that the stripes should increase in strength as the wave migrates towards the anterior. You show this for runB – but given that yellow has a more stable transcript – could this lead to a “false positive” increase in intensity with the slower migration and accumulation of transcripts? I would feel more comfortable with the statement that this is a static enhancer if you could exclude that the signal is blurred by an artifact based on different mRNA stability. What about re-running the simulation (with the parameters that have shown to well reflect endogenous runt mRNA levels) but increasing the parameter for the stability of the mRNA? Are static and dynamic enhancers still distinguishable? The claim of having found a static enhancer rests on this increase in signal, hence, other explanations need to be excluded carefully.

      Good questions. Note that runB reporter dynamics were examined not only by visualizing yellow mRNAs (which indeed seem to be more stable than endogenous run mRNA; see Supplementary Figure 10), but also using MS2 (with virtually zero mRNA stability, although stability was simulated in the shown movies to show virtual mRNA dynamics), and intronic yellow mRNA (showing de novo transcription; Supplementary Figure 10; you will need to zoom in to see intronic de novo transcripts). The expected dynamics of a static enhancer reporter are quite distinctive: expression progressively increases as it propagates from posterior to anterior, then progressively decreases as it slows down and stabilizes at the anterior, and eventually fades. This full range of dynamics is evident in germband embryos stained for intronic yellow to show de novo transcription of the runB enhancer reporter (Supplementary Figure 10; you will need to zoom in to see intronic de novo transcripts).

      Running the simulation for the model using different degradation rates for the enhancer reporter made the static enhancer’s expression either less or more persistent, but gave the same overall result: the static enhancer has diminished expression at the very posterior, but high expression as its expression wave exits the growth-zone/SAZ. This is consistent not only with yellow mRNA expression of runB, but with its intronic expression as well (Supplementary Figure 10; you will need to zoom in to see intronic de novo transcripts).

      What about the head domain of the runB enhancer (e.g. Fig. 6A lowest row): This seems to be different from endogenous expression in your work and in Choe et al. Is that aspect different from endogenous expression and can this be reconciled with your model?

      Yes, indeed this aspect cannot be explained by our model. We believe that head patterning in insects is regulated by a different regulatory network. This network might be (de)-activated by missing repressors in the selected DNA segment for runB enhancer. We mentioned this issue in the revised manuscript.

      The claim of similar dynamics of expression visualized by in situ and MS2 in vivo relies on comparing Fig. 6C with 6A. To compare these two panels, I would need to know to what stage in A the embryo in C should be compared. Actually, the stripe in 6C appears more crisp than the stripes in 6A.

      Were the enhancer dynamics tested in vivo at later stages as well? I would appreciate a clear statement on what stages can be visualized and where the technical boundaries are because this will influence any considerations by others using this system.

      One cannot be highly precise about the timing of a process as dynamic in space and time as the one we are studying. We believe that Figure 6D shows clearly that runB activity dynamics are similar to endogenous run expression.

      How do the reported accessibility dynamics of runA enhancer correlate with the activity of the reporter: E.g. is the enhancer open in the middle body region but closed at the posterior part of the embryo? Or is it closed at the anterior – and if so: why is there a signal of the reporter in the head?

      You show that chromatin accessibility dynamics help in identifying active enhancers. Is this idea new or is it based on previous experience with Drosophila (e.g. PMID: 29539636 or works cited in https://doi.org/10.1002/bies.201900188)? Or in what respect is this novel?

      Our manuscript contributes to the growing body of evidence confirming that accessibility per se does not imply activity. Of course, this is not a new idea, but given the wide use of accessibility as a proxy for enhancer activity in the genomics community, we do feel it is important to reiterate the message. As the reviewer correctly indicates, several published findings point to a correlation between accessibility dynamics and enhancer activity. However, to our knowledge, this is the first example in Tribolium. It is important to point out that what “dynamic” means strongly depends on the experimental design. Even in Drosophila, not enough studies have been conducted to fully understand the relationship (e.g., ideally, this should be done on a continuous time scale and at single cell level). We acknowledge in the manuscript that this relationship has been observed before in other species (and have added the references suggested by the reviewer, for which we are very grateful), but still believe that our observations are highly significant to the Tribolium community.

      Reviewer #3 (Public Review):

      I have two major concerns: First, the claim about differential accessibility being related to enhancer activity is not really established from the presented data, in my view. This needs to be clarified. (I do believe in the claim to some extent, but not based on presented evidence.)

      We agree with the reviewer that more data – and, more importantly, independent replication – are necessary to confirm this finding. Please, refer to our response to your comment regarding the statistical significance of the findings.

      Second, the evidence in support of the Enhancer Switching model for runt should be accompanied by identification of and spatiotemporal profiling of the “speed regulator”, if this is not established yet.

      Experiments supporting the role of Cad as a speed regulator for both pair-rule and gap genes have been published in El-Sherif et al PLOS Genetics 2014 and Zhu et al PNAS 2017. We added a comment stressing this fact.

      In addition to these two concerns, the simulations of the Enhancer Switching model need to be described, at least in the outline, in the Methods section.

      Done

    1. Author Response

      Reviewer #1 (Public Review):

      Specifically, the authors define "efficacy" (eta) of a ligand as the fractional change in binding free energy between the open and the closed states of the channel.

      We assume that the word in quotes is a typo; η is efficiency, not efficacy (now given the symbol λ). We now emphasize the distinction immediately after Eq. 2.

      1) One concern regards the clustering of the data sets in Fig. 5 into exactly 5 eta-classes. First, two clusters contain only two data points each. Second, the proposed "catch&hold LFER model" (Fig. 2) does not predict the existence of a discrete number of such eta-classes. How strong is the evidence that there are exactly 5 classes as opposed to a continuum of possible eta values.

      Statistical (x-means cluster) analysis indicates that the 23 agonists segregate into 5 η classes. Groups with only 2 members (plus the intercept) are less well defined (Fig 4) but are supported by the 5 mutational η classes (Fig. 7). (see above)

      2) The authors do not discuss the uniqueness of the proposed model.

      see above. Ln 405 Induced fits are common.

      In fact, it seems to me that the existence of eta-classes might be explained just as well by an alternative model which assumes a single gating mechanism for the receptor,

      We are not sure what a “single gating mechanism” means. Does non-single refer to i) the 2-stage induced fits (catch-hold LFER)? … η classes make this conclusion unavoidable. ii) our conjecture that there are 5 different C versus O binding site structural pairs…? Energy derives from structure, so the 5 energy ratios indicate 5 structural pairs. iii) multiple steps inside gating (ϕ)? …So far there have not been any alternative explanations for the organized map of ϕ. iv) catch itself?... Evidence for this induced fit is given in Fig 2 and 7 SI, and on Ln 528-547 we discuss the implications of kon to C versus O. Ln 405 Local ‘Induced fit’ rearrangements in enzymes are common. We think the evidence is strong for the bottom scheme in Fig 2A.

      but distinct patterns of ligand-protein interactions for the different agonists.

      η classes derive from distinct interactions for different agonists, but what these are and whether the ‘contact number’ idea is useful are uncertain (see above).

      The pore opening-associated increase in agonist affinity is typically caused by a tightening of the substrate binding site (often called clamshell closure) …

      Ln 379-386 In the Discussion we now relate catch-hold to induced fit

      Ln 455, 461-463, 471-474 Fig 2SI and the induced fit to clamshell closure

      Reviewer #2 (Public Review):

      This is an interesting manuscript with a worthwhile approach to receptor mechanisms. The paper contains an impressive amount of new data. These single molecule concentration response curves have been compiled with care and the authors deserve great credit for obtaining these data.

      Ln 233 η can be estimated from a CRC built from whole-cell currents…

      Ln 150 …or indeed any method that estimates KdC and KdO (for example binding assays, or perhaps in silico simulations of AC and AO structures)
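
      To make the point concrete, the following is a minimal sketch of how efficiency (eta) could be computed from any pair of estimated dissociation constants, whatever method produced them. It assumes the definition implied by the reviewer's summary, namely the fractional change in binding free energy between closed and open states, so that efficiency = 1 - log(KdC)/log(KdO) with both constants in molar units. The function name and the example constants are hypothetical and are not taken from the manuscript.

```python
import math

def efficiency(kd_closed: float, kd_open: float) -> float:
    """Fraction of open-state binding free energy gained between closed and open.

    Assumes binding free energy is proportional to log(Kd), with Kd in molar
    units relative to a 1 M standard state, so that
    efficiency = 1 - log(KdC) / log(KdO).
    """
    if not (0 < kd_open < 1 and 0 < kd_closed < 1):
        raise ValueError("Kd values must be in molar units, between 0 and 1 M")
    return 1.0 - math.log(kd_closed) / math.log(kd_open)

# Hypothetical constants for illustration only (not values from the manuscript).
example = efficiency(kd_closed=1e-4, kd_open=1e-8)
```

      Because only the ratio of logarithms enters, the choice of logarithm base does not matter, but the units of the two constants do.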

      I judge the main result to be that there are different values of the recently-proposed agonist-related quantity "efficiency".

      Ln 21, 26-27, 535-547 OK, but to us the most interesting insight is that in AChRs binding IS gating.

      These values are clustered into 5 quite closely spaced groups. The authors propose that these groups are the same whether considering mutations in the binding site or different agonists.

      see above

      It was unclear to me in several places, what new data and what old data are included in each figure. Therefore readers may have difficulty judging the claimed advance. This difficulty is not helped by the discussion, which includes some previous findings as "results".

      see above.

      A further weakness is that it is unclear how general or how specific these concepts are. The authors assert that they are, by definition, completely universal. However, we do not have reference to previous work or current data on any other receptor than the muscle nicotinic. I could not square the concept that "every receptor works like this" with the evident lack of desire to demonstrate this for any other receptor.

      Ln 132-136 There are reasons to think that receptors in general work according to Figure 1A. A thermalized ligand (for instance TriMA, MW 60) has the momentum of only ~3 water molecules. A momentum sensor would have terrible signal/noise.

      Reviewer #3 (Public Review):

      This work attempts to introduce a new attribute of the receptor- efficiency, a fraction of an agonist binding energy consumed by conformational transition of the receptor from resting to active (open) states. Furthermore, the authors use an impressive set of experimental data (single channel recordings with 23 agonists and 53 mutations) to measure the efficiency for each agonist and mutant receptor. All the estimated efficiencies fall into a few groups and inside each of the efficiency groups there is a strong correlation between agonist affinity and receptor opening efficacy.

      The main finding in this study is that estimated efficiencies fall into 5 groups.

      see above.

      There is no clear description of the method how the efficiencies were allocated into different groups. Most importantly, it is not clear if the method used takes into account the uncertainty of the efficiency estimate. The study does not show any statistical metrics of the efficiency estimates as well as any other calculated variable such as dissociation equilibrium constants to resting or open states. Surely, the uncertainty of the efficiency should matter especially considering how near the efficiency group values are (eg. difference about 10% between 0.51 and 0.56 or 0.41 and 0.45).

      see above

      All the tested agonists fell into groups according to the efficiency value attributed to them. It is difficult to see why some of the agonists belong to the same group. For example, it is not obvious at all why such agonists as epibatidine, decamethonium and TMP are in the same group. The question, I guess, arises if this grouping based on efficiency has any predictability value. Furthermore, if a series of mutations with the same agonist fall into different groups, the prediction power of this approach is very limited if one attempts to design a new agonist or look for a new mutation.

      see above and Ln 548-561 (last para of text). Efficiency is a relatively new idea. This report is one of only a few on the subject. More experiments with different receptors by more labs using other approaches are needed to ascertain whether η is general.

    1. Author Response:

      The following is the authors' response to the current reviews.

      We appreciate the thoughtful critiques of the reviewers. While we agree that performing additional experiments and analyses probing the sensitivity of the technique would be useful for future studies, we are unable to perform additional experiments as our lab has closed. We share this technique as a starting point for further investigation, but it may need to be modified for success in other contexts. We have provided details of the scenarios (life stage, feeding, day, number of ticks) where we successfully sequenced B. burgdorferi from ticks, as well as one where we did not (unfed nymphs) as a starting point. We will clarify in proofing that our qPCR experiments show that we capture the vast majority of B. burgdorferi flaB mRNA from our input samples, suggesting that we are likely capturing the majority of the B. burgdorferi.

      In this work, we were most interested in using RNA-seq to perform differential expression analysis between annotated mRNAs across our timepoints. We have provided the number of genes detected in each sample (92% of annotated transcripts on average) as well as the median number of reads covering each gene (604 on average) in the supplemental file containing sequencing statistics. This coverage is highly reproducible across replicates, with an average Pearson correlation of 0.99 between gene expression levels (as Transcripts Per Million) between any two replicates. These data and the fact that many of the gene expression changes we observed align with previous observations of others give us confidence in our differential expression analysis. For those interested in tRNAs or sRNAs, we think that it would be best to modify the protocol to focus specifically on capturing those sequences in the library preparation. We encourage others interested in other aspects of our data to download it and explore it.
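
      For readers who download the data, the replicate comparison described above can be sketched roughly as follows. This is an illustrative outline only, assuming a simple length-normalized TPM calculation and a Pearson correlation on untransformed TPM values; the gene lengths and counts are made-up toy numbers, and the function names are hypothetical rather than part of the authors' pipeline.

```python
import math

def tpm(counts, lengths_bp):
    """Convert raw read counts to Transcripts Per Million."""
    rates = [c / (l / 1000.0) for c, l in zip(counts, lengths_bp)]  # reads per kb
    total = sum(rates)
    return [r / total * 1e6 for r in rates]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Toy counts for four genes in two replicates (made-up numbers).
lengths = [1000, 2000, 500, 1500]
rep1 = tpm([100, 400, 50, 300], lengths)
rep2 = tpm([110, 380, 55, 310], lengths)
r = pearson(rep1, rep2)
```

      With real data, the same comparison would be repeated for every pair of replicates within a time point.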

      We will correct remaining wording issues in proofing.

      —————

      The following is the authors' response to the original reviews.

      Dear Reviewing Editor,

      We thank you and the reviewers for the thoughtful comments on our manuscript, and we are excited to submit a revised version of our manuscript “Longitudinal map of transcriptome changes in the Lyme pathogen Borrelia burgdorferi during tick-borne transmission.” In response to the reviews, we have made the following changes to our manuscript:

      1. We updated the text for increased clarity around experimental details, including statistical analyses.

      2. We added additional details about the mapping of non-Bb reads as well as more information about Bb read coverage.

      3. We compared our differentially expressed genes to 4 previous studies of global transcriptional changes in different tick feeding contexts.

      4. We updated the discussion to address these comparisons as well as caveats of our study more directly.

      Please see our responses to individual comments below.

      Reviewer #1 (Public Review):

      In this study, Sapiro et al sought to develop technology for a transcriptomic analysis of B. burgdorferi directly from infected ticks. The methodology has exciting implications to better understand pathogen RNA profiles during specific infection timepoints, even beyond the Lyme spirochete. The authors demonstrate successful sequencing of the B. burgdorferi transcriptome from ticks and perform mass spectrometry to identify possible tick proteins that interact with B. burgdorferi. This technology and first dataset will be useful for the field. The study is limited in that no transcripts/proteins are followed-up by additional experiments and no biological interactions/infectious-processes are investigated.

      Critiques and Questions:

      We thank the reviewer for these thoughtful critiques and helping us improve our manuscript.

      This study largely develops a method and is a resource article. This should be more directly stated in the abstract/introduction.

      We edited the abstract and introduction to more directly state that we are sharing a new method and a resource for future investigations. (Lines 29-32; 101-103)

      Details of the infection experiment are currently unclear and more information in the results section is warranted. State the species of tick and life-stage (larval vs nymphal ticks) used for experiments. For RNA-seq, are mice infected and ticks naïve, or are ticks infected and transmitting Borrelia to uninfected mice?

      We updated the results section to more clearly state the tick species and life stage and to make it more clear that infected ticks are transmitting Bb to naïve mice. (Lines 113-115)

      What is the limit of detection for this protocol? Experimental data should be provided about the number of B. burgdorferi required to perform this approach.

      We performed this protocol on pools of 6 (for later feeding stages) to 14 (for early stages) infected nymphs. Published studies (PMID: 7485694, PMID: 11682544) suggest that one day after attachment, there may be a few thousand Bb per tick, suggesting what we’ve measured here may come from on the order of 10^4 Bb. We were not able to capture consistent data from Bb from unfed ticks, which may be due to lower numbers or to an altered transcriptional state caused by lack of nutrients in the unfed tick. We updated the discussion to reflect some of these limitations and uncertainties. (Lines 461-465)

      More information regarding RNA-seq coverage is required. Line 147-148 "read coverage was sufficient"; what defines sufficient? Browser images of RNA-seq data across different genes would be useful to visualize the read coverage per gene. What is the distribution of reads among tRNAs, mRNAs, UTRs, and sRNAs?

      As we were interested in differential expression analysis, we defined sufficient as the number of reads needed per gene to determine statistically significant expression changes across days, which with DESeq2 is typically 10 reads. We reworded this section for clarity and added additional information about the median number of reads per gene, which is also useful in thinking about differential expression analysis. (Lines 163-170) As we chose to focus on differential expression analysis here, we believe these are the most relevant metrics to cover.
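
      The two coverage metrics named in this response (the fraction of genes with at least 10 reads and the median number of reads per gene) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, gene names, and counts are hypothetical, and the 10-read threshold is taken from the response above.

```python
from statistics import median


def coverage_summary(read_counts, min_reads=10):
    """Return (fraction of genes with >= min_reads reads, median reads per gene)."""
    counts = list(read_counts.values())
    if not counts:
        raise ValueError("no genes supplied")
    detected = sum(1 for c in counts if c >= min_reads)
    return detected / len(counts), median(counts)


# Hypothetical per-gene read counts for one sample (illustrative values only).
example_counts = {"geneA": 604, "geneB": 12, "geneC": 3, "geneD": 1500}
fraction_detected, median_reads = coverage_summary(example_counts)
```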

      My lab group was excited about the data generated from this paper. Therefore, we downloaded the raw RNA-seq data from GEO and ran it through our RNA-seq computational pipeline. Our QC analysis revealed that day 4 samples have a different GC% pattern and that a high percentage of E. coli sequences were detected. This should be further investigated and addressed in the paper: Are other bacteria being enriched by this method? Why would this be unique to day 4 samples? Does this affect data interpretation?

      We appreciate the interest in our data and pointing out this anomaly. We found that the day 4 samples do have a high percentage of reads that mapped to a bacterial species, Pseudomonas fulva, rather than ticks as we expected. (The reads that map to E. coli also map to P. fulva.) We have updated the results to include this information (Lines 156-165). We believe this is likely due to contamination from collecting ticks after they have fallen off mice in cages on day 4, rather than pulling ticks off the mice as in days 1-3. Unfortunately, as our lab has shut down, we cannot investigate the source further. We do think the high percentage of P. fulva reads suggests that other bacteria can be enriched with the anti-Bb antibody we used. We’ve updated the discussion to highlight this caveat. (Lines 459-460)

      While the presence of these bacterial reads did lower our overall Bb mapping rate and necessitate deeper sequencing for the day 4 samples, the Bb sequencing coverage of these samples is on par with samples from the other days in terms of percentage of genes with at least 10 reads and median number of reads per gene. Fewer than 0.0002% of the reads that map to Bb genes in any day 4 sample also map to P. fulva. We found that this small fraction of reads is dispersed across 334 genes in which an average of 0.05% (maximally 2.3%) of day 4 reads also map to P. fulva. Therefore, these bacterial reads do not change our interpretation of the results comparing gene expression across days, including day 4.

      Comprehensive data comparisons of this study and others are warranted. While the authors note examples of known differentially expressed genes (like lines 235-241), how does this global study compare to other global approaches? Are new expression patterns emerging with this RNA-seq approach compared to other methods? What differences emerged from day 1 to day 4 ticks compared to differences observed in unfed to fed ticks or fed ticks to DMC experiments? Directly compare to the following studies (PMID: 11830671; PMID: 25425211; PMID: 36649080).

      We added comparisons of our list of DE genes to those noted to change between “unfed tick” and “fed tick” culture conditions (PMID: 11830671 and 12654782), as well as fed nymph to DMC (PMID: 25425211 and 36649080) (Lines 231-252, Figure S4). These comparisons pointed us to two main findings: that global changes to Bb in different culture conditions generally agreed with the most dramatic changes we saw in our data, and that the timing of expression increases during feeding may relate to whether genes are more highly expressed in fed ticks or in mammalian conditions. Overall, the majority of our DE genes have been identified in at least one of these studies or in the other studies we compared to outlining RpoS, Rrp1, and RelBbu regulons. As many of these studies were asking slightly different questions and using different conditions and vastly different technology, we would expect some differences to arise from different contexts and some to be purely technical. The genes that were not seen in these previous studies tended to follow the same functional patterns we saw overall, heavily skewing towards genes of unknown function, outer surface proteins, and a handful of genes related to other functions. With the current state of the functional annotation of the genome, it is difficult to assess whether these amount to new expression patterns in and of themselves, so we focused on the overall trends in our data rather than those that were different from other studies.

      Details about the categorization of gene functions should be further described. The authors use functional analysis from Drecktrah et al., 2015, but that study also lacks details of how that annotation file was generated. Here, the authors seem to have supplemented the Drecktrah et al., 2015 list with bacteriophage and lipoprotein predictions, which are the same categories on which they focus their findings. Have they introduced a bias to these functional groups? While it can be noted that many lipoproteins are upregulated (or comment on specific gene classes), there are even more "unknown" proteins upregulated. I argue that not much can be inferred from functional analysis given the current annotation of the B. burgdorferi genome.

      We strongly agree that the current annotation of the Bb genome makes it difficult to perform meaningful global functional analysis, but we feel it is useful to get a general overview of gene functions. We described our methods for classifying genes into functional categories in the methods, in which we relied on previously published papers to make our best estimate of gene category (noted for each gene in Table S4). Due to the lack of annotations for many genes, we focused on the relatively well-defined category of lipoproteins, as these are overrepresented as a group in our upregulated genes, as well as phage genes, which are not necessarily overrepresented, but are still interesting to us. We hope that others will look at the data (particularly in Table S4, but also Table S3, or download the raw data and do their own analysis) with their own interests and biases and dig more into genes that we did not highlight specifically. We provide this data as a resource with the hope that some of the genes of unknown function that we see change here will be the subject of future functional studies so that this is less of a problem in the future.

      Reviewer #1 (Recommendations For The Authors):

      In general, the paper is well written and digestible for a broad audience. However, some of the figure graphics are unnecessary and take away from the data. Please label tick species and tick life-stage in Figure 1 drawings. The legend of Figure 1 requires citations. The Figure 4B graphic is unnecessary and the colors are confusing as they are too similar to the color palette of Figure 4A, where the colors have meaning. The Figure 5A graphic is unnecessary and takes away from the data embedded within it.

      We more clearly labeled the species in Figure 1 and added citations to the legend. We have simplified Figures 4A and 5A for clarity.

      Clarify lines 220-259 and Figure 3. What days are being compared? Downregulated genes should also be commented on.

      We considered our set of differentially expressed genes as those that changed two-fold (multiple hypothesis adjusted p-value < 0.05) in any of the three comparisons shown in Figure 2 (day 1 to day 2, day 1 to day 3, day 1 to day 4). We clarified this at multiple points in the results (e.g., Line 273). We commented on downregulated genes throughout, although as there were fewer genes and the magnitude of change was smaller, we focused more on upregulated genes.

      Line 327-329, state numbers not percentages. How many Bb proteins were actually detected?

      We updated this section to include numbers (Lines 371-374). In concordance with our sequencing data, we found (and were looking for) mainly tick proteins in this experiment.

      Data availability: B. burgdorferi and tick oligo sequences used for DASH should be provided in a supplemental table.

      We added a supplemental table of these sequences (Table S9). Please note they have been previously published in Dynerman et al. 2020 and Ring et al. 2022.

      Reviewer #2 (Recommendations For The Authors):

      The manuscript is overall well written and easy to follow. The data are compelling and support the conclusions. The discussion of this work is however highly insufficient and needs to be thoroughly edited:

      - Statistical analysis: The authors mention that DESeq2 was used. Please provide information on the type and the stringency of the tests used for differential gene expression analysis, including any additional potential correction for p-values (Bonferroni). The authors mention that genes with fold changes >2 were used for analysis, yet there is no information on the p-value cut off or if the genes with fold changes >2 were statistically significant. Please provide detail and rationale for the analysis.

      We clarified in the results and methods (Lines 200, 642-644) that we required an adjusted p-value < 0.05 from DESeq2’s Wald test with Benjamini-Hochberg correction along with a two-fold change when determining our genes of interest. As small fold changes showed statistically significant differences, we chose to set a fold change cutoff in most of our analysis to help us focus on the most highly expressed genes, like other studies we compared our data to. We included all of the DESeq2 results in Table S3 so that others may explore the data with different cutoffs if desired.
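
      The selection rule described in this response (Benjamini-Hochberg adjusted p-value < 0.05 together with a two-fold change) can be sketched as follows. This is a plain reimplementation of the Benjamini-Hochberg step-up adjustment for illustration, not DESeq2's own code, which also applies steps such as independent filtering. The function names, gene names, and values are hypothetical, and treating "two-fold" as an absolute log2 fold change of at least 1 is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_de_genes(results, alpha=0.05, min_abs_log2fc=1.0):
    """Filter (gene, log2_fold_change, raw_pvalue) tuples by both cutoffs."""
    padj = benjamini_hochberg([p for _, _, p in results])
    return [
        gene
        for (gene, log2fc, _), q in zip(results, padj)
        if q < alpha and abs(log2fc) >= min_abs_log2fc
    ]


# Hypothetical results for four genes (illustrative values only).
example_results = [
    ("geneA", 2.5, 0.001),   # significant and large change
    ("geneB", 0.4, 0.002),   # significant but below the fold-change cutoff
    ("geneC", -1.2, 0.03),   # significant downregulation
    ("geneD", 1.8, 0.2),     # large change but not significant
]
selected = select_de_genes(example_results)
```

      In this toy example geneB illustrates the point made above: a small fold change can be statistically significant, so the fold-change cutoff is what removes it.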

      - The field has been generating data on gene expression in ticks for decades. Yet, many of these studies are not referenced here. There is no discussion of how the data described here compares to what is known in the literature. For example, Venn diagrams or tables could be included for comparison with the data described lines 208-216. Extensive description and comparison of the data to the literature should be added in the discussion, and similarities/discrepancies should be discussed appropriately.

      We added additional comparisons to four different papers looking at global gene expression in Bb in the fed tick or tick-like culture conditions (Lines 231-252, Figure S4). This information as well as comparisons to transcriptional regulons (Figure S3) is available in Table S4. In addition to discussing some examples in the results, we added more information in the discussion regarding these comparisons (Lines 420-425). The majority of the genes that we see change over feeding have been previously noted to change expression during the enzootic cycle or be regulated by transcriptional programs active during this timeframe, and we have more clearly stated that. We focused on similarities here as these papers all ask slightly different questions in different contexts and use different technology which could all account for the many differences in individual genes between all of them and our work.

      - There is no discussion of the caveats of the study: for example, the authors are using an anti-OspA antibody, which could induce bias. The authors provide in-vitro pull down data supporting that this should not be an issue, but the pull down is performed from BSK-grown bacteria. This caveat should be discussed.

      We’ve added a paragraph to the discussion including this caveat and others (Lines 453-463).

      - Timing of RNA extraction: There is over 1h of delay between initial tick collection and RNA fixation. The effects of time on gene expression should be discussed.

      Although we were able to show that this timeframe did not affect cultured Bb gene expression, we added this to the discussion.

      - Gene expression is compared to Day 1. This introduces analysis bias as it does not allow identification of transcripts that first change upon initial feeding. This caveat should also be discussed.

      We added this caveat – that we may miss gene expression changes in the first 24 hours of feeding – to the discussion.

      - This study is performed with 1 strain of B. burgdorferi on one tick species. Please provide perspective on the impact of these findings on Lyme disease causing spirochetes and their vectors broadly.

      We believe this method could be easily adaptable to study gene expression in other spirochete/vector pairs to determine similarities and differences and we added a comment to the discussion.

      - The discussion should also include insights on how to build on this work and include additional areas of method development to increase the recovery of B. burgdorferi from ticks or other organisms and facilitate future transcriptomic studies.

      We added a few ideas to the discussion noting that this protocol could be modified for use in other timeframes, with other antibodies, or in other organisms. We also highlight the recent advent of TBDCapSeq by Grassmann et al. that may be used in conjunction with this type of protocol.

      Minor comments:

      - Consider re-wording the description of the methods and findings to the third person for coherence.

      The majority of the methods are now written in third person.

      - Over 90% of the reads did not map to B. burgdorferi: please provide additional information on what these reads mapped to (tick or mouse), and if the data reflects what is known in the literature

      We have updated the results and discussion with information about the reads that do not map to Bb (Lines 156-166). The majority of reads mapped to the tick genome, which is what we expected. While a large number of reads in our day 4 samples unexpectedly mapped to Pseudomonas fulva, we do not believe this affects the interpretation of our data as we were still able to get broad genome coverage of Bb in these samples.

      - Please be more clear in the result section on the life stage of the ticks used for these studies.

      We have updated the results to clarify throughout.

      - Indicate how many total reads were generated for each sample

      This information is present in Table S1.

      - Provide statistical analyses for Figures 1C and D.

      We added t tests to determine statistical differences for these panels.

      Reviewing Editor (Recommendations for The Authors):

      1. It is important to mention in the abstract (line 27) that 'upregulated genes' is in comparison to day 1. This is also true in the introduction (lines 92-93).

      We updated the results and introduction to state more clearly that day 1 is our baseline measurement.

      2. It is also important to discuss in the manuscript that because your 'controls' are day 1 samples, initial transcriptome changes in response to the tick environment might be missed.

      This has been added in the discussion as a caveat (Lines 460-463).

      3. As someone who does not work with Bb, I would like to have seen a clearer description of what the feeding event looks like. Although there is some text in the introduction that touches on that ('prolonged nature of I. scapularis feeding'), I would like to see something even clearer. Maybe stating that feeding may take from x-y days would clarify that for the non-specialist.

      We updated the results to more clearly state that the tick falls off of the mice by around 4 days after feeding, our last time point (Lines 113-115). Additional details of tick feeding are also in the Figure 1 legend.

      4. In Fig. 3 linear DNA molecules seem to be drawn to scale. Is that also the case for plasmids? This could be clarified in the legend.

      The genome is drawn approximately to scale. We noted this and updated the legend with more information about how linear and circular plasmid names denote their size.

      5. Figure 5C: Colors are a bit confusing here. The legend indicates that they refer to fold changes, but the scale in the panel shows expression levels, not fold changes. Please clarify. Also, is this really TPM or RPKM? If comparisons of relative levels between different genes are made, number of reads should be normalized by gene length.

      The heatmap in Figure 4C does show expression levels, and we updated the legend to more clearly state this. The highlighted gene names are meant to show which genes change two-fold during this time (those present in panel A). The data are presented as TPM (transcripts per million), which, like RPKM, is normalized by gene length (PMID: 20022975).
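
      The point about length normalization can be illustrated with a short sketch of the standard RPKM and TPM definitions. This is a generic illustration, not the authors' code: the function names, gene names, read counts, and gene lengths are hypothetical.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("no mapped reads")
    return {g: counts[g] * 1e9 / (total_reads * lengths_bp[g]) for g in counts}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rates = {g: counts[g] / lengths_bp[g] for g in counts}
    total_rate = sum(rates.values())
    if total_rate == 0:
        raise ValueError("no mapped reads")
    return {g: rates[g] * 1e6 / total_rate for g in counts}


# Two hypothetical genes with equal read counts but different lengths.
example_counts = {"geneA": 100, "geneB": 100}
example_lengths = {"geneA": 1000, "geneB": 2000}
tpm_values = tpm(example_counts, example_lengths)
rpkm_values = rpkm(example_counts, example_lengths)
```

      Both measures assign the shorter gene twice the expression of the longer one here, since both divide by gene length. TPM values additionally sum to one million within a sample, which makes proportions easier to compare across samples.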

    1. Author Response:

      The following is the authors' response to the original reviews.

      We have now incorporated the changes recommended by the reviewers to improve the interpretations and clarity of the manuscript. We are grateful for their thoughtful comments and suggestions, which have significantly strengthened the manuscript.

      Reviewer #1 (Public Review):

      Park et al demonstrate that cells on either side of a BM-BM linkage strengthen their adhesion to that matrix using a positive feedback mechanism involving a discoidin domain receptor (DDR-2) and integrin (INA-1 + PAT-3). In response to its extracellular ligand (Collagen IV/EMB-9), DDR-2 is endocytosed and initiates signaling that in turn stabilizes integrin at the membrane. DDR-2 signaling operates via Ras/LET-60. This work's strength lies in its excellent in vivo imaging, especially of endogenously tagged proteins. For example, tagged DDR-2:mNG could be seen relocating from seam cell membranes to endosomes. I also think a second strength of this system is the ability to chart the development of BM-BM linkage over time based on the stages of worm larval development. This allows the authors to show DDR signaling is needed to establish linkage, rather than maintain it. It likely is relevant to many types of cells that use integrin to adhere to BM and left me pondering a number of interesting questions.

      We thank the reviewer for highlighting the strengths and impact of our work in expanding our understanding of tissue linkages and how DDR and integrins might work in other contexts.

      For example: (1) Does DDR-2 activation require integrin? Perhaps integrin gets the process started and DDR-2 positively reinforces that (conversely is DDR-2 at the top of a linear pathway)?

      DDR activation by receptor clustering upon exposure to its ligand collagen is well documented (Juskaite et al., 2017 eLife PMID: 28590245). Clustered DDR is rapidly internalized into endocytic vesicles, where full activation of tyrosine kinase activity is thought to occur (Fu et al., 2013 J Biol Chem PMID: 23335507). Supporting this model, we found that concentrated type IV collagen is required for vesicular DDR-2 localization in the utse and seam cells at the utse-seam connection. Whether DDR-2 activation requires integrin has not been fully established. However, one study using mouse and human cell lines showed that DDR1 activation occurs independent of integrin (Vogel et al., 2000 J Biol Chem PMID: 10681566), consistent with the latter possibility raised by the reviewer that DDR-2 is upstream of integrin.

      To test these hypotheses, we require an experimental condition where loss or near complete loss of INA-1 integrin is achieved by the mid-to-late L4 larval stage, when DDR-2 is activated by collagen and taken into endocytic vesicles. Currently, we can only partially deplete INA-1 by RNAi (Figure 5—figure supplement 2E), and strong loss of function mutations in ina-1 result in early larval arrest and lethality (Baum and Garriga, 1997 Neuron PMID: 9247263). To overcome these obstacles, we are adapting the new FLP-ON::TIR1 system developed for precise spatiotemporal protein degradation in worms (Xiao et al., 2023 Genetics PMID: 36722258). We hope to achieve a near complete knockdown of ina-1 with this timed depletion strategy. In the future, we will use this system to block DDR-2 and integrin function specifically in the utse or seam cells, to complement our current dominant negative mis-expression approach.

      (2) In ddr-2(qy64) mutants, projections seem to form from the central portion of the utse cell. Does this reveal a second function for DDR-2, regulating perhaps the cytoskeleton?

      We thank the reviewer for their observation and agree with their interpretation. We think it is important to comment on this and have stated in the results text, lines 208-212: “In addition, membrane projections emanating from the central body of the utse were detected in ddr-2(qy64) animals. These projections were first observed at the mid L4 stage and persisted to young adulthood (Figure 2C). These observations suggest that DDR-2 functions around the mid L4 to late L4 stages to promote utse-seam attachment, and that DDR-2 may also regulate utse morphology.”

      And (3) can you use the forward genetic tools available in C. elegans to find new genes connecting DDR-2 and integrin?

      This is an excellent suggestion. We found that loss of ddr-2 strongly enhanced the uterine prolapse (Rup) defect caused by RNAi mediated depletion of integrin. To find new genes connecting DDR-2 and integrin, a targeted screen for the Rup phenotype could be performed in an integrin reduction of function condition. As we cannot work with null or strong loss-of-function ina-1 alleles (described above), the screen could be conducted with either timed depletion of INA-1 with candidate RNAi treatments, or combinatorial ina-1 RNAi with candidate RNAi treatments.

      I do see two areas where the manuscript could be improved. First, the authors rely on imprecise genetic methods to reach their conclusions (i.e. systemic RNAi, or expression of dominant negative constructs.) I think their conclusion would be stronger if they used tissue specific degradation to block ddr-2 function specifically in the utse or seam cells. Methods to do this are now regularly used in C. elegans and the authors have already developed the necessary tissue-specific promoters.

      We agree with the reviewer that tissue specific degradation of DDR-2 in the utse and seam cells will complement and strengthen our evidence for the site of action of DDR-2. As described earlier, we are currently adapting the FLP-ON::TIR1 tissue degradation system to perform these experiments and will provide our findings in a follow-up manuscript.

      Second, the manuscript is presented in the introduction as a study on formation and function of BM-BM linkage. The authors start the discussion in a similar manner. But their results are about adhesion between cells and BM. In fact they show the BM-BM linkage forms normally in ddr-2 mutants. Thus it seems like what they have really uncovered is an adhesion mechanism that works in parallel to the BM-BM linkage. Since ddr-2 appears to function equally in both utse + seam cells (based on their dominant negative data), there are likely three layers of adhesion (utse-BM, BM-BM, BM-seam) and if any of those break down, you get a partially penetrant rupture phenotype.

      The reviewer raises an important and interesting point, and we agree that we did not articulate the organization of the utse-seam tissue connection clearly. The utse-seam connection is comprised of the utse and seam BMs, each ~50 nm thick, and a connecting matrix bridging the two BMs, which is ~100 nm thick (Vogel and Hedgecock, 2001 Development PMID: 11222143). Type IV collagen builds up to high levels within the connecting matrix and links the utse and seam BMs, and its concentration is required for DDR-2 vesiculation. An important point we did not highlight is that type IV collagen is approximately 400 nm long (Timpl et al. 1981, Eur J Biochem PMID: 6274634). Thus, collagen molecules within the connecting matrix could span the entire length of the utse-seam connection and project into the utse and seam BMs to interact with cell surface receptors. Consistent with this possibility, we found that buildup of type IV collagen that spans the utse-seam BM-BM linkage correlated with the timing of DDR-2 activation/vesiculation within utse and seam cells. In addition, super-resolution imaging of the mouse kidney glomerular basement membrane (GBM), a tissue connection between endothelial BM and epithelial (podocyte) BM, showed that type IV collagen, which spans the BMs, projects into the endothelial and podocyte BMs (Suleiman et al., 2013 eLife PMID: 24137544). We carefully considered these points to generate the schematics in Figure 1A and Figure 8, but failed to articulate this point in the manuscript. We are grateful to the reviewer for bringing up our error and have now stated these details in the text to address the reviewer’s concern as outlined below.

      In the introduction (lines 93-96): “A BM-BM tissue connection between the large, multinucleated uterine utse cell and epidermal seam cells stabilizes the uterus during egg laying. The utse-seam connection is formed by BMs of the utse and the seam cells, each ~50 nm thick, which are bridged by an ~100 nm connecting matrix (Vogel and Hedgecock 2001, Morrissey, Keeley et al. 2014, Gianakas, Keeley et al. 2023).”

      In the discussion (lines 507-520): “We also found that internalization of DDR-2 at the utse-seam connection correlated with the assembly of type IV collagen at the BM-BM linkage and was dependent on type IV collagen deposition. Type IV collagen is ~400 nm in length and the utse-seam connecting matrix spans ~100 nm, while the utse and seam BMs are each ~50 nm thick (Timpl, Wiedemann et al. 1981, Vogel and Hedgecock 2001). Thus, collagen molecules in the connecting matrix could project into the utse and seam BMs to interact with DDR-2 on cell surfaces. Consistent with this possibility, super-resolution imaging of the mouse kidney glomerular basement membrane (GBM), a tissue connection between podocytes and endothelial cells, showed type IV collagen within the GBM projecting into the podocyte and endothelial BMs (Suleiman, Zhang et al. 2013). As DDR-2 is activated by ligand-induced clustering of the receptor (Juskaite, Corcoran et al. 2017, Corcoran, Juskaite et al. 2019), it suggests that the BM-BM linking type IV collagen network, which is specifically assembled at high levels, clusters and activates DDR-2 in the utse and seam cells to coordinate cell-matrix adhesion at the tissue linkage site.”

      These concerns do not undercut the significance of this work, which identifies an interesting mechanism cells use to strengthen adhesion during BM linkage formation. In fact, I am excited to read future papers detailing the connection between DDR-2 and integrin. But before undertaking those experiments the authors should be certain which cells require DDR-2 activity, and that should not be determined based solely on mis expression of a dominant negative.

      We thank the reviewer for recognizing the significance of our work and reiterate that we will use tissue-specific degradation for site of action experiments in future studies on the biology of the utse-seam tissue linkage.

      Reviewer #2 (Public Review):

      This paper explores the mechanisms by which cells in tissues use the extracellular matrix (ECM) to reinforce and establish connections. This is a mechanistic and quantitative paper that uses imaging and genetics to establish that Type IV collagen and its receptor DDR-2 (discoidin domain receptor 2) signal through Ras to strengthen an adhesion between two cell types in C. elegans. This connection needs to be strong and robust to withstand the pressure of the numerous eggs that pass through the uterus. The major strengths of this paper are in crisply designed and clear genetic experiments, beautiful imaging, and well supported conclusions. I find very few weaknesses, although, perhaps the evidence that DDR-2 promotes utse-seam linkage through regulation of MMPs could be stronger. This work is impactful because it shows how cells in vivo make and strengthen a connection between tissues through ECM interactions involving collaboration between discoidin and integrin.

      We appreciate the reviewer’s assessment of the impact of our work in detailing a mechanism for how cells increase their adhesion to the ECM to establish connections between adjacent tissues. We have softened the interpretation of our MMP localization data to address the reviewer’s concern (detailed below).

      Reviewer #1 (Recommendations For The Authors):

      Regarding Figure 1D, is it possible to show when the BM forms on the cartoons more clearly (something like the 3rd section of Fig 3A)? I can see it in the timeline but it's hard to follow in the diagrams.

      We agree with the reviewer that we could show when the BM-BM connecting matrix forms more clearly in Figure 1D. Hemicentin and fibulin, the earliest components of the connecting matrix, are detected at very low levels at the utse-seam connection during the mid-L4 stage and are more prominently localized by the mid-to-late L4 stage (Gianakas et al., 2023 J Cell Biol PMID: 36282214). For this reason, we only show the connecting matrix in yellow from the mid-to-late L4 stages onward. We have now made the BM-BM connection more prominent in the figure 1D cartoons with boxed outlines (similar to Figure 3A as the reviewer suggested). We also added a label for the time window when the BM-BM connection forms.

      Regarding the RNAi induced prolapse phenotype, looking at 2B, it appears that between 5% and 10% of animals have uterine prolapse when fed control RNAi. Is this correct? It seems very high. This prolapse in control animals was not observed in other RNAi experiments such as Figure 5C.

      We thank the reviewer for pointing this out. For Figure 2B, the control used was wild-type N2 animals fed with OP50 E. coli bacteria, rather than HT115 bacteria carrying the L4440 empty vector (control RNAi). This is because the main comparisons were to five ddr-1 and ddr-2 mutant strains. We did notice a slightly higher baseline uterine prolapse frequency (5% on average, detailed in Figure 2—Source data 1) in wild-type animals fed OP50 bacteria, compared to HT115 bacteria fed animals (approximately 1-2% on average). It is possible this could be linked to the nutritional differences in the two bacterial strains. However, we are confident of our data in Figure 2B as we carried out 3 independent trials, and the uterine prolapse frequencies in ddr-1 mutant animals matched the baseline in wild-type animals, while the frequencies for ddr-2 mutants were all increased over the baseline in all trials (as detailed in Figure 2—Source data 1).

      Relating to the point above, in reading the methods to try to understand how they did the RNAi, I noticed that they measure prolapse continually over five days. I didn't realize it takes a long time to occur. I think they should explain this in the text and in the figures. Reading the manuscript I thought prolapse occurred as soon as mutant animals began laying eggs. In the text they should explain this when they first assay the phenotype (page 7), and for figures the Y axis on the graphs could say "% uterine prolapse after 5 days."

      We thank the reviewer for their suggestions. We did not articulate clearly that the utse-seam connection is able to withstand some mechanical stress, even when key components are lost. It’s only over time and repeated use that the connection breaks down. This is likely because a number of components contribute to the connection and, as we have shown previously, there is feedback, such that when one component, such as collagen, is reduced, hemicentin levels increase at the BM-BM connection. Since ruptures arising from utse-seam detachments typically occur sometime after the onset of egg-laying, we screened the entire egg-laying period (days two to five post-L1) as described in Gianakas et al. 2023. We have now incorporated these points in the text and figures as follows:

      In the introduction, we clarified that the utse-seam BM-BM connection breaks down over time, by adding (lines 99-105): “Hemicentin promotes the recruitment of type IV collagen, which accumulates at high levels at the BM-BM tissue connection and strengthens the adhesion, allowing it to resist the strong mechanical forces of egg-laying. The utse-seam connection is robust, with each component of the tissue-spanning matrix contributing to the BM-BM connection (Gianakas, Keeley et al. 2023). This likely accounts for the ability of the utse-seam connection to initially resist mechanical forces after loss of any one of these components, delaying the uterine prolapse phenotype until sometime after the initiation of egg-laying.”

      We expanded the results text when we first describe the Rup phenotype (lines 183-184): “We first screened for the Rup phenotype caused by uterine prolapse, observing animals every day during the egg-laying period, from its onset (48 h post-L1) to end (120 h) (Methods)”.

      We provided more detail in the Methods section (lines 784-790): “Uterine prolapse frequency was assessed as described previously (Gianakas et al 2023). Briefly, synchronized L1 larvae were plated (~20 animals per plate) and after 24 h, the exact number of worms on each plate was recorded. Plates were then visually screened for ruptured worms (uterine prolapse) every 24 h during egg-laying (between 48 h to 120 h post-L1). We chose to examine the entire egg-laying period as ruptures arising from utse-seam detachments do not usually occur at the onset of egg-laying, but after cycles of egg-laying that place repeated mechanical stress on the utse-seam connection (Gianakas et al 2023).”
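
      As a minimal sketch of how a pooled prolapse frequency could be computed from the scoring scheme quoted above, assuming each plate contributes one count of animals recorded at 24 h and one cumulative count of ruptured animals by 120 h post-L1. The function name and the example counts are hypothetical and are not data from the study.

```python
def prolapse_frequency(plates):
    """Pooled percentage of ruptured animals across plates.

    Each plate is a (worms_counted_at_24h, ruptured_by_120h) pair.
    """
    total = sum(n for n, _ in plates)
    ruptured = sum(r for _, r in plates)
    if total == 0:
        raise ValueError("no animals scored")
    return 100.0 * ruptured / total


# Hypothetical counts for three plates (illustrative only).
example_plates = [(20, 1), (19, 0), (21, 2)]
print(f"{prolapse_frequency(example_plates):.1f}% uterine prolapse after 5 days")
```

      Pooling across plates before dividing weights every animal equally, which matters when plates differ slightly from the nominal ~20 animals.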

      Finally, we modified the Y-axes of graphs in Figure 2B and 5C and the respective figure legends as suggested by the reviewer.

      Then I went back and compared to the previous publication (Gianakas, 2023). I would be interested to see a time course of how many animals prolapse after 1 day, 2 days, etc.? Is this consistent with their data on hemicentin?

      We agree with the reviewer that a time course of uterine prolapse would be interesting as we saw ruptures occur throughout the egg-laying period. However, for the hemicentin knockdown experiments in Gianakas et al. 2023 as well as the experiments in this study, we recorded only the pooled number of animals with ruptures at the end of the experimental window. In future studies we will also record the uterine prolapse frequencies on each day to generate time courses that will provide more insight into the function of proteins at the utse-seam connection.

      Lines 183-184: I'm not sure what it means to say "trended towards displaying a significant Rup phenotype?" Since the difference was not statistically significant, it would be better to say something like "increased but not statistically significant."

      We agree with the reviewer and have now modified this sentence (lines 190-193): “Animals carrying the ddr-2(ok574) allele, which deletes a portion of the intracellular kinase domain (Unsoeld, Park et al. 2013), also showed an increased frequency of the Rup phenotype compared to wild-type animals, although this difference was not statistically significant (Figure 2A and B)”.

      Line 186: 'penetrant' needs a qualifier to indicate the magnitude of the proportion of individuals with the phenotype.

      As we provide the Rup frequency numbers in Figure 2—Source data 1, we modified the sentence as follows (lines 193-195): “We further generated a full-length ddr-2 deletion allele, ddr-2(qy64), and confirmed that complete loss of ddr-2 led to a significant uterine prolapse defect (Figure 2A and B).”

      Lines 206-208; could the mounting/imaging procedure (which I assume requires squeezing the worm between agarose pad and coverslip) alter the occurrence of prolapse? I would think prolapse would occur more frequently under these conditions as compared to worms laying eggs on a plate.

      The reviewer brings up an important concern. The mounting and imaging procedure does require placing the worm between an agarose pad and a coverslip. However, this did not alter the occurrence of uterine prolapse in this experiment. We were careful to perform the same procedure on both wild-type and ddr-2(qy64) animals to control for this. As detailed in the manuscript, none of the eight wild-type animals we mounted underwent uterine prolapse after recovery off the coverslip, and among the ddr-2(qy64) mutants we mounted, only the ones that exhibited utse-seam detachments went on to rupture later.

      We articulated these points more clearly by modifying lines 214-216 as follows: “Wild-type and ddr-2(qy64) animals were mounted and imaged at the L4 larval stage for utse-seam attachment defects, recovered, and tracked to the 72-hour adult stage, where they were examined for the Rup phenotype.”

      In seam cells you can see that DDR-2:mNG is present at membranes from early to mid L4, which makes sense. But I cannot see it on the membrane at any time point in the utse. Perhaps it is obscured by the yellow dotted line. Should it be visible on utse membranes before it is endocytosed?

      The reviewer raises an interesting question. We think it is likely that DDR-2 is initially on the membrane of the utse like it is on the seam cells. However, we have not observed this, possibly due to the complex shape and thin membrane extensions of the utse. We are unable even to detect clear membrane enrichment of membrane markers in the utse (for example, compare the utse and seam membrane markers in Figure 3B). Thus, we refrained from speculating on DDR-2 utse membrane localization in the manuscript, and instead focused on the pattern of vesicular DDR-2 peaking at the late L4 stage, which was clearly visible in both the utse and seam cells.

      Sup Fig 3A - please show quantification of seam cells not contacting utse at the same Y-axis scale as for regions that do contact utse.

      We have modified the Y-axis scale for the quantification of the seam region not contacting the utse.

      Figure 4A - I don't see a difference between WT and ok574 - what am I missing?

      In the representative ok574 animal shown, a portion of the utse arm on the top right is detached from the seam. To make this phenotype clearer, we have recropped the image panels, readjusted the brightness and contrast of the utse and the seam, and redrawn the outline of the detachment to make this clearer.

      Figure 4C+D, and lines 296-298: I'd bet that both are needed to recruit DDR-2 to membranes. But him-4 has a more severe phenotype because the RNAi knockdown is much more effective (perhaps b/c they are using the newer t444t vector).

      We agree with the reviewer that the him-4 knockdown phenotype is likely more severe than emb-9 knockdown. Type IV collagen at the utse-seam connection is very stable compared to hemicentin (Gianakas et al 2023, J Cell Biol PMID: 36282214, see Fig. 5C), which could explain the lower knockdown efficiency.

      We modified our interpretation of the data in the text as follows (lines 308-312): “In addition, we did not detect DDR-2 at the cell surface, suggesting that hemicentin has a role in recruiting DDR-2 to the site of utse-seam attachment. It is possible that collagen could also function in DDR-2 recruitment, but we could not assess this definitively due to the lower knockdown efficiency of emb-9 RNAi (Figure 4—figure supplement 1A).”

      Reviewer #2 (Recommendations For The Authors):

      Line 218 DDR-2 (typo)

      We have corrected this typo.

      Evidence (line 344-348) may not be strong enough to say whether or not DDR-2 promotes utse-seam linkage through regulation of MMPs.

      We agree with the reviewer and have softened our conclusions as follows (lines 356-363): “The C. elegans genome harbors six MMP genes, named zinc metalloproteinase 1-6 (zmp-1-6) (Altincicek, Fischer et al. 2010). We examined four available reporters of ZMP localization (ZMP-1::GFP, ZMP-2::GFP, ZMP-3::GFP, and ZMP-4::GFP) (Kelley, Chi et al. 2019). Only ZMP-4 was detected at the utse-seam connection and its localization was not altered by knockdown of ddr-2 (Figure 5—figure supplement 1F). These observations suggest that DDR-2 does not promote utse-seam linkage through regulation of MMPs, although we cannot rule out roles for DDR-2 in promoting the expression or localization of ZMP-5 or ZMP-6.”

      The authors show the critical period is in late L4, however, is the signaling needed later too? For example, is the linkage strengthening moderated by DDR-2 important as more eggs accumulate?

      The reviewer raises an interesting question. We observed that the vesicular localization of DDR-2 sharply declined before the onset of egg-laying. By young adulthood, very few punctate structures of DDR-2 were observed in the seam cells, and none in the utse (Figure 3B). Furthermore, the frequency of utse-seam detachments in ddr-2 mutant animals peaked by the late L4 stage and did not increase after this time, suggesting DDR-2 function is no longer required after the late L4 stage (Figure 2D). Thus, we believe that DDR-2 signaling strengthens tissue linkage only during the early formation of the utse-seam connection between the mid and late L4 stage.

      We incorporated these points in the discussion (lines 477-485): “Through analysis of genetic mutations in the C. elegans receptor tyrosine kinase (RTK) DDR-2, an ortholog to the two vertebrate DDR receptors (DDR1 & DDR2) (Unsoeld, Park et al. 2013), we discovered that loss of ddr-2 results in utse-seam detachment beginning at the mid L4 stage. The frequency of detachments in ddr-2 mutant animals peaked around the late L4 stage and did not increase after this time. This correlated with the levels of DDR-2::mNG at the utse-seam connection, which peaked at the late L4 stage and then sharply declined by adulthood. Together, these findings suggest that DDR-2 promotes utse-seam attachment in the early formation of the tissue connection between the mid and late L4 stage.”

      Fig. 3B is the fluorescence quantification normalized to the area?

      Yes, it is. We used mean fluorescence intensity for all fluorescence quantifications to normalize for the area where the signal was measured. We added a line in Methods to emphasize this (lines 739-740): “We measured mean fluorescence intensity for all quantifications in order to account for linescan area.”
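
      To illustrate why a mean intensity accounts for linescan area while an integrated (summed) intensity does not, here is a minimal sketch. The function name and the pixel values are hypothetical, and any background subtraction the authors may apply is deliberately left out.

```python
def mean_intensity(pixel_values):
    """Mean fluorescence intensity: summed signal divided by the number of pixels."""
    if not pixel_values:
        raise ValueError("empty linescan")
    return sum(pixel_values) / len(pixel_values)


# Hypothetical linescans (arbitrary units): same per-pixel brightness, different areas.
short_scan = [100, 120, 110]
long_scan = [100, 120, 110, 100, 120, 110]

# The integrated signal doubles with the area; the mean stays the same.
print(sum(short_scan), sum(long_scan))
print(mean_intensity(short_scan), mean_intensity(long_scan))
```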

      Fig. 4B a statistical assessment of the degree of co-localization of DDR-2::mNG and the endosomal markers might be a nice addition.

      We believe the reviewer is referring to Figure 3—figure supplement 1B. We have now added the statistical assessment of the degree of co-localization of DDR-2::mNG and the endosomal markers.

      We want to sincerely thank the two reviewers for their thoughtful comments and suggestions. The changes we have made in response to these comments have substantially improved the manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      This study by Park et al. describes an interesting approach to disentangle gene-environment pathways to cognitive development and psychotic-like experiences in children. They have used data from the ABCD study and have included PGS of EA and cognition, environmental exposure data, cognitive performance data and self-reported PLEs. Although the study has several strengths, including its large sample size, interesting approach and comprehensive statistical model, I have several concerns:

      • The authors have included follow-up data from the ABCD Study. However, it is not very clear from the beginning that longitudinal paths are being explored. It would be very helpful if the authors would make their (analysis) approach clearer from the introduction. Now, they describe many different things, which makes the paper more difficult to read. It would be of great help to see the proposed path model in a Figure and refer to that in the Method.

      We clarified the specific longitudinal paths explored in our study at the end of the Introduction section (line 149~160). We also added a figure of the proposed path model (Figure 1) and refer to it in the Method section (line 232~239).

      • There is quite a lot of causal language in the paper, particularly in the Discussion. My advice would be to tone this down.

      We corrected and toned down all causal language used in our manuscript. Per your suggestion, we deleted statements like ‘unbiased estimates’ and used expressions such as ‘adjustment for observed/unobserved confounding’ instead.

      • I feel that the limitation section is a bit brief, and can be developed further.

      We specified additional potential constraints of our study, including limited representativeness, limited periods of follow-up data, possible sample selection bias, and the use of non-randomized, observational data. These corrections can be found in line 518~538.

      • I like that the assessment of CP and self-reports PEs is of good quality. However, I was wondering which 4 items from the parent-reported CBCL were used and how did they correlate with the child-reported PEs? And how was distress taken into account in the child self-reported PEs measurement? Which PEs measures were used?

      We believe that Reviewer #1’s question about the correlations between PLEs derived from the PQ-BC (total score and distress score PLEs) and from the CBCL (parent-rated PLEs) may refer to the prior version of our manuscript submitted to a different journal. We obtained Pearson’s correlation coefficients between the PLEs (baseline year: r = 0.095~0.0989, p<0.0001; 1-year follow-up: r = 0.1322~0.1327, p<0.0001; 2-year follow-up: r = 0.1569~0.1632, p<0.0001) and added this information in the Method section for PLEs (line 198~201).

      • What was the correlation between CP and EA PGSs?

      We also added the Pearson’s correlation between the two PGSs (r =0.4331, p<0.0001) in the Methods section for PGS (line 214~215).
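
      For readers less familiar with the statistic reported here and for the PLE measures above, the following is a minimal sketch of Pearson's correlation coefficient in plain Python. The function name is hypothetical, and the snippet does not reproduce the authors' analysis or the reported values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical standardized scores for five individuals (illustrative only).
score_a = [-1.2, -0.3, 0.1, 0.8, 1.5]
score_b = [-0.7, 0.4, -0.2, 0.3, 1.1]
print(round(pearson_r(score_a, score_b), 3))
```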

      • Regarding the PGS: why focus on cognitive performance and EA? It should be made clearer from the introduction that EA is not only measuring cognitive ability, but is also a (genetic) marker of social factors/inequalities. I'm guessing this is one of the reasons why the EA PGS was so much more strongly correlated with PEs than the CP PGS. See the work by Abdellaoui and the work by Nivard.

      We thank the reviewer for the feedback to clarify that educational attainment (EA) is not only a genetic marker of cognitive ability but also that of socioeconomic outcomes. Per your suggestion, we included the associations of EA PGS with multiple biological and socioeconomic outcomes found in prior studies (e.g., Abdellaoui et al., 2022) in the Introduction (line 131~142).

      Abdellaoui, A., Dolan, C. V., Verweij, K. J. H., & Nivard, M. G. (2022). Gene–environment correlations across geographic regions affect genome-wide association studies. Nature Genetics. doi:10.1038/s41588-022-01158-0

      • Considering previous work on this topic, including analyses in the ABCD Study, I'm not surprised that the correlation was not very high. Therefore, I don't think it makes a whole lot of sense to adjust for the schizophrenia PGS in the sensitivity analyses; in other words, it's not really 'a more direct genetic predictor of PLEs'.

      We conducted this adjustment considering that PLEs often precede the onset of schizophrenia. In addition, prior studies found that schizophrenia PGS is significantly associated with cognitive intelligence within psychosis patients (Shafee et al., 2018) and individuals at-risk of psychosis (He et al., 2021), and that significant distress psychotic-like experiences had greater positive correlation with schizophrenia PGS than PGS for psychotic-like experiences (Karcher et al., 2018).

      For these reasons, we thought that it is necessary to assess whether the effects of cognitive phenotypes PGS (i.e., CP PGS and EA PGS) in the linear mixed model are significant after adjusting for schizophrenia PGS. We believe our results from the mixed linear model showed the sensitivity and specificity of the association between cognitive phenotype PGS and PLEs.

      He, Q., Jantac Mam-Lam-Fook, C., Chaignaud, J., Danset-Alexandre, C., Iftimovici, A., Gradels Hauguel, J., . . . Chaumette, B. (2021). Influence of polygenic risk scores for schizophrenia and resilience on the cognition of individuals at-risk for psychosis. Translational Psychiatry, 11(1). doi:10.1038/s41398-021-01624-z

      Karcher, N. R., Paul, S. E., Johnson, E. C., Hatoum, A. S., Baranger, D. A. A., Agrawal, A., . . . Bogdan, R. (2021). Psychotic-like Experiences and Polygenic Liability in the Adolescent Brain Cognitive Development Study. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging. doi:https://doi.org/10.1016/j.bpsc.2021.06.012

      Shafee, R., Nanda, P., Padmanabhan, J. L., Tandon, N., Alliey-Rodriguez, N., Kalapurakkel, S., . . . Robinson, E. B. (2018). Polygenic risk for schizophrenia and measured domains of cognition in individuals with psychosis and controls. Translational Psychiatry, 8(1). doi:10.1038/s41398-018-0124-8

      • How did the FDR correction for multiple testing affect the results?

      For all analysis results presented in our study, False Discovery Rate (FDR) correction for multiple testing compared p-values of nine key study variables: PGS (cognitive performance or educational attainment), family income, parental education, family’s financial adversity, Area Deprivation Index, years of residence, proportion of population below 125% of the poverty line, positive parenting behavior, and positive school environment. An exception was the sensitivity analysis that included schizophrenia PGS in the linear mixed model for adjustment: with another PGS variable added, FDR correction compared p-values of ten key variables. Overall, the effects of FDR correction on the results were limited; i.e., the majority of associations between the key variables and the outcomes, which were deemed highly significant, remained unchanged after the FDR correction.

      Overall, I feel that this paper has the potential to present some very interesting findings. However, at the moment the paper misses direction and a clear focus. It would be a great improvement if the readers would be guided through the steps and approach, as I think the authors have undertaken important work and conducted relevant analyses.

      We express our appreciation to the reviewer for the constructive feedback and guidance, which has significantly contributed to the improvement of our manuscript. As addressed in the preceding sections, we have implemented the necessary corrections and clarifications in response to the reviewer's suggestions. We remain open to making further amendments as needed, and thus invite any additional comments should any aspect of our revisions be deemed inadequate or inappropriate.

      Reviewer #2 (Public Review):

      This paper tried to assess the link between genetic and environmental factors on psychotic-like experiences, and the potential mediation through cognitive ability. This study was based on data from the ABCD cohort, including 6,602 children aged 9-10y. The authors report a mediating effect, suggesting that cognitive ability is a key mediating pathway in the link between several genetic and environmental (risk and protective) factors on psychotic-like experiences.

      While these findings could be potentially significant, a range of methodological unclarities and ambiguities make it difficult to assess the strength of evidence provided.

      Strengths of the methods:

      The authors use a wide range of validated (genetic, self- and parent-reported, as well as cognitive) measures in a large dataset with a 2-year follow-up period. The statistical methods have the potential to address key limitations of previous research.

      We sincerely thank the reviewer for recognizing these methodological strengths of our study. The reviewer’s positive comments are highly supportive and encouraging for us.

      Weaknesses of the methods:

      The rationale for the study is not completely clear. Cognitive ability is probably a more likely mediator of traits related to negative symptoms in schizophrenia, rather than positive symptoms (e.g., psychosis, psychotic-like symptom). The suggestion that cognitive ability might lead to psychotic-like symptoms in the general population needs further justification.

      We sincerely thank the reviewer for raising this concern regarding our proposal that cognitive ability may serve as a mediator of psychotic-like experiences. To the best of our knowledge, it has been proposed that cognitive ability can be a mediator of positive symptoms in schizophrenia (including psychotic-like experiences), as well as negative symptoms. This mediating role of cognitive ability was proposed in several prior studies on cognitive models of schizophrenia/psychosis. Per your suggestion, we included further justification in the Introduction section of our study (line 104~107). Specifically, we highlighted that cognitive ability has been theoretically proposed as a potential mediator of genetic & environmental influence on positive symptoms of schizophrenia such as psychotic-like experiences. We refer to studies conducted by Howes & Murray (2014) and Garety et al. (2001).

      Howes, O. D., & Murray, R. M. (2014). Schizophrenia: an integrated sociodevelopmental-cognitive model. The Lancet, 383(9929), 1677-1687. doi:https://doi.org/10.1016/S0140-6736(13)62036-X

      Garety, P. A., Kuipers, E., Fowler, D., Freeman, D., & Bebbington, P. E. (2001). A cognitive model of the positive symptoms of psychosis. Psychological Medicine, 31(2), 189-195. doi:10.1017/S0033291701003312

      Terms are used inconsistently throughout (e.g., cognitive development, cognitive capacity, cognitive intelligence, intelligence, educational attainment...). It is overall not clear what construct exactly the authors investigated.

      Thank you for your comment. We corrected the term ‘cognitive capacity’ to ‘cognitive phenotypes’ throughout our manuscript. We also added in the Introduction (line 141~143) that we will collectively refer to these two PGSs of focus as ‘cognitive phenotypes PGSs’, which is similar to the terms used in prior research (Joo et al., 2022; Okbay et al., 2022; Selzam et al., 2019).

      Joo, Y. Y., Cha, J., Freese, J., & Hayes, M. G. (2022). Cognitive Capacity Genome-Wide Polygenic Scores Identify Individuals with Slower Cognitive Decline in Aging. Genes, 13(8), 1320. doi:10.3390/genes13081320

      Okbay, A., Wu, Y., Wang, N., Jayashankar, H., Bennett, M., Nehzati, S. M., . . . Young, A. I. (2022). Polygenic prediction of educational attainment within and between families from genome-wide association analyses in 3 million individuals. Nature Genetics, 54(4), 437-449. doi:10.1038/s41588-022-01016-z

      Selzam, S., Ritchie, S. J., Pingault, J.-B., Reynolds, C. A., O’Reilly, P. F., & Plomin, R. (2019). Comparing Within- and Between-Family Polygenic Score Prediction. The American Journal of Human Genetics, 105(2), 351-363. doi:https://doi.org/10.1016/j.ajhg.2019.06.006

      Not the largest or most recent GWASes were used to generate PGSes.

      Thank you for mentioning this point. We were not able to use the largest GWAS for cognitive intelligence, educational attainment and schizophrenia because (unfortunately) our study started before the GWAS studies by Okbay et al. (2022) and Trubetskoy et al. (2022) were published. We corrected the text to state that our study used ‘a GWAS of European-descent individuals for educational attainment and cognitive performance’ instead of the largest GWAS (line 206~208).

      It is not fully clear how neighbourhood SES was coded (higher or lower values = risk?). The rationale, strengths, and assumptions of the applied methods are not fully clear. It is also not clear how/if variables were combined into latent factors or summed (weighted by what). It is not always clear when genetic and when self-reported ethnicity was used. Some statements might be overly optimistic (e.g., providing unbiased estimates, free even of unmeasured confounding; use of representative data).

      Consistent with the illustration of neighborhood SES in the Methods section, higher values of neighborhood SES indicate risk. In the original Figure 2, higher values of neighborhood SES link to lower intelligence (direct effects: β=-0.1121) and higher PLEs (indirect effects: β=-0.0126~ -0.0162). We think such confusion might have been caused by the difference in coding between family SES (higher values = lower risk) and neighborhood SES (higher values = higher risk). Thus, we changed the terms to ‘High Family SES’ and ‘Low Neighborhood SES’ in the corrected figure (Figure 3) for clarification.

      Considering that shorter duration of residence may be associated with instability of residency, it may indicate neighborhood adversity (i.e., higher risk). This definition of the ‘years of residence’ variable is in line with the previous study by Karcher et al. (2021).

      We represented PGSs, family SES, neighborhood SES, positive family and school environment, and PLEs as composite indicators (derived from a weighted sum of relevant observed variables). To the best of our knowledge, prior studies suggest that these variables are less likely to share a common factor and have assessed them as composite indices during analyses. For instance, Judd et al. (2020) and Martin et al. (2015) analyze genetic influence of educational attainment and ADHD as composite indicators. Also, as mentioned in Judd et al. (2020), socioenvironmental influences are often analyzed as composite indicators. Studies on the psychosis continuum (e.g., van Os et al., 2009) suggest that psychotic disorders are likely to have multiple background factors instead of having a common factor, and note that numerous prior studies use composite indices to measure psychotic symptoms. These are the reasons why we used components for these constructs instead of generating latent factors (which is done in the standard SEM method). In contrast, we represented general intelligence as a common factor that determines the underlying covariance pattern of fluid and crystallized intelligence, based on the classical g theory of intelligence. We added this explanation in line 269~285.

      Moreover, during estimation, the IGSCA determines weights of each observed variable in such a way as to maximize the variances of all endogenous indicators and components. We added this explanation in the description about the IGSCA method (line 266~268).

      We deleted overly optimistic statements like ‘unbiased estimates’ and used expressions such as ‘adjustment for observed/unobserved confounding’ instead, throughout our manuscript.

      Judd, N., Sauce, B., Wiedenhoeft, J., Tromp, J., Chaarani, B., Schliep, A., ... & Klingberg, T. (2020). Cognitive and brain development is independently influenced by socioeconomic status and polygenic scores for educational attainment. Proceedings of the National Academy of Sciences, 117(22), 12411-12418.

      Karcher, N. R., Schiffman, J., & Barch, D. M. (2021). Environmental Risk Factors and Psychotic-like Experiences in Children Aged 9–10. Journal of the American Academy of Child & Adolescent Psychiatry, 60(4), 490-500. doi:10.1016/j.jaac.2020.07.003

      Martin, J., Hamshere, M. L., Stergiakouli, E., O'Donovan, M. C., & Thapar, A. (2015). Neurocognitive abilities in the general population and composite genetic risk scores for attention‐deficit hyperactivity disorder. Journal of Child Psychology and Psychiatry, 56(6), 648-656.

      van Os, J., Linscott, R., Myin-Germeys, I., Delespaul, P., & Krabbendam, L. (2009). A systematic review and meta-analysis of the psychosis continuum: Evidence for a psychosis proneness–persistence–impairment model of psychotic disorder. Psychological Medicine, 39(2), 179-195. doi:10.1017/S0033291708003814

      It appears that citations and references are not always used correctly.

      We thoroughly checked all citations and specified the references for each statement. We deleted Plomin & von Stumm (2018) and Harden & Koellinger (2020) and cited relevant primary studies (e.g., Lee et al., 2018; Okbay et al., 2022; Abdellaoui et al., 2022) instead. We also specified the references supporting the statement that educational attainment PGS links to brain morphometry (Judd et al., 2020; Karcher et al., 2021). As Okbay et al. (2022) use PGS of cognitive intelligence (which mentions the analyses results in their supplementary materials) as well as educational attainment, we decided to continue citing this reference. These corrections can be found in line 131~141.

      Strengths of the results:

      The authors included a comprehensive array of analyses.

      We thank the reviewer for the positive comment.

      Weaknesses of the results:

      Many results, which are presented in the supplemental materials, are not referenced in the main text and are so comprehensive that it can be difficult to match tables to results. Some of the methodological questions make it challenging to assess the strength of the evidence provided in the results.

      As you rightly identified, we inadvertently failed to reference Table S2 in the main text. We have since corrected this omission in the Results section for the IGSCA (SEM) analysis (line 375). The remainder of the supplementary tables (Table S1, S3~S7) have been appropriately cited in the main manuscript. We recognize that the quantity of tables provided in the supplementary materials is substantial. However, given the comprehensiveness and complexity of our analyses, which encompass a wide array of study variables, these tables offer intricate results from each analysis. We deem these results, which include valuable findings from sensitivity analyses and confound testing, too significant to exclude from the supplementary materials. That said, we are open to, and would greatly welcome, any further suggestions on how to present our supplementary results in a more accessible and digestible format. We are ready and willing to implement any necessary modifications to ensure clarity and ease of comprehension. Your guidance in this matter is highly valued.

      Appraisal:

      The authors suggest that their findings provide evidence for policy reforms (e.g., targeting residential environment, family SES, parenting, and schooling). While this is probably correct, a range of methodological unclarities and ambiguities make it difficult to assess whether the current study provides evidence for that claim.

      Impact:

      The immediate impact is limited given the short follow-up period (2y), possible concerns for selection bias and attrition in the data, and some methodological concerns.

      We added as study limitations (line 518~538) that the impact of our findings for understanding cognitive and psychiatric development during later childhood may be limited due to the relatively short follow-up period, the possibility of sample selection bias, and the problems of interpreting results from an observational study as causal (despite the novel causal inference methods, designed for non-randomized, observational data, that we used).

      As noted in our responses above, we made the necessary corrections and clarifications for the points raised by the reviewer. We are willing to make additional revisions, so please feel free to comment if you feel that our corrections are insufficient or inappropriate.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements

      We thank the reviewers for their positive statements on the significance of our work.

      2. Point-by-point description of the revisions


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This paper contains a set of highly valuable information on the physicochemical parameters of betaine lipids - which are synthesized in microalgae and some other lower eukaryotic organisms.

      The authors, using advanced biophysical techniques - neutron diffraction and small-angle scattering (SANS) as well as molecular dynamics (MD) simulations - established key physicochemical parameters of synthetic betaine lipid DP-DGTS, and compared it with those of the DPPC phospholipid. They "show that DP-DGTS bilayers are thicker, more rigid, and mutually more repulsive than DPPC bilayers". These are important findings.

      The authors also analyzed the phylogenetic tree of the appearance and disappearance of DGTS biosynthesis enzymes, which - together with the observed "different properties and hydration response of PC and DGTS" led them to explain "the diversity of betaine lipids observed in marine organisms and for their disappearance in seed plants". The authors tentatively suggest "A physicochemical cause of betaine lipid evolutionary loss in seed plants" (Title with "?")

      We put a question mark because our work suggests that the difference of sensitivity to hydration between DGTS and PC bilayers could be an explanation for the betaine lipid disappearance in seed plants due to the dry stage of the seed. In our hands, we never managed to obtain 35S-BTA1 overexpressing plants that produce seed. However, we do not have formal evidence for this. We propose to change the title into: “The possible role of lipid bilayer properties in the evolutionary disappearance of betaine lipids in seed plants.”

      My major concerns with this suggestion are:

      • In thylakoid membranes (TMs) the only phospholipid, PG, plays key roles in PSII and PSI functions (Wada and Murata 2007 Photosynth Res, Hagio et al. Plant Physiol 2000, Domonkos et al. 2004 Plant Physiol); it is difficult to explain how these roles would be taken over by betaine lipids. In fact, data of Huang et al. (https://www.sciencedirect.com/science/article/pii/S2211926418309366) indicate that betaine lipids constitute the major compounds of non-plastidial membranes and that a compensation mechanism operates "by the increase of PG in thylakoid membranes, suggesting a transfer of P from non-plastidial membranes to chloroplasts that would maintain a stable lipid composition of thylakoid membranes".
      • Although neutron diffraction and SANS data, as well as MD simulations, might indicate important differences, the behavior of TMs (e.g. stacking interactions, overall structure and structural dynamics, protein embedding conditions / membrane thickness etc.) is more dominantly determined by protein-protein interactions, mainly because these membranes contain only small areas occupied by the bilayer phase. Similar arguments hold true for the inner mitochondrial membranes (IMMs). I suggest taking these severe limitations into account when extrapolating the data and trying to reach general conclusions. In general, I suggest a more cautious interpretation of the data.

      We fully agree with the reviewer’s comments. We indeed wrote in the introduction: “In algae, under phosphate starvation, a situation commonly met in the environment, betaine lipids replace phospholipids in extraplastidic membranes. Because betaine lipids are localized in these membranes [11, 12] and share a common structural fragment with the main extraplastidic phospholipid phosphatidylcholine (PC) (Figure 1A and B), it can be speculated that these two lipid classes are interchangeable, but this was never demonstrated.”

      Plastidial membranes are mainly composed of the non-phosphorus glycerolipids MGDG, DGDG and SQDG. It is well known that in phosphate starvation, in plants and algae, the main phospholipid present in thylakoid membranes, PG, is replaced by SQDG because they are both anionic and bilayer forming lipids (Hölzl G, Dörmann P. Chloroplast Lipids and Their Biosynthesis. Annu Rev Plant Biol. 2019 Apr 29;70:51-81. doi: 10.1146/annurev-arplant-050718-100202; Endo K, Kobayashi K, Wada H. Sulfoquinovosyldiacylglycerol has an Essential Role in Thermosynechococcus elongatus BP-1 Under Phosphate-Deficient Conditions. Plant Cell Physiol. 2016 Dec;57(12):2461-2471; Van Mooy BA, Rocap G, Fredricks HF, Evans CT, Devol AH. Sulfolipids dramatically decrease phosphorus demand by picocyanobacteria in oligotrophic marine environments. Proc Natl Acad Sci U S A. 2006 Jun 6;103(23):8607-12.; Kobayashi K, Fujii S, Sato M, Toyooka K, Wada H. Specific role of phosphatidylglycerol and functional overlaps with other thylakoid lipids in Arabidopsis chloroplast biogenesis. Plant Cell Rep. 2015 Apr;34(4):631-42.). We recently showed by the same kind of neutron diffraction approaches that PG and SQDG share similar physicochemical properties that can explain their conserved replacement by each other in plastidial membranes (Bolik S, Albrieux C, Schneck E, Demé B, Jouhet J. Sulfoquinovosyldiacylglycerol and phosphatidylglycerol bilayers share biophysical properties and are good mutual substitutes in photosynthetic membranes. Biochim Biophys Acta Biomembr. 2022 Dec 1;1864(12):184037. ). However, nothing is known about mitochondrial membranes and DGTS localization. Because PC is a major lipid component of mitochondria in plants and fungi and PC is absent in Chlamydomonas reinhardtii, mitochondria membranes could contain DGTS at least in Chlamydomonas.

      To clarify this statement, we added in the introduction the sentences: “Betaine lipid synthesis is located in the ER [13,14] and betaine lipids are expected to be absent in photosynthetic membranes [12]. Therefore, this PC-betaine lipid replacement is not expected to occur in photosynthetic membranes. However, it might occur at the surface of the chloroplast envelope where PC might be present [15–17]. Nothing is known about the composition of mitochondrial membranes in algae but because PC is a major lipid component in plant and fungal mitochondria, this replacement might also occur in mitochondria.” In the discussion, we replaced “cellular membrane” with “extraplastidial membrane”.

      A minor point - just to avoid possible misunderstanding: betaine can be present in large quantities in many photosynthetic organisms. A short statement on betaine would help.

      To avoid any confusion with betaine as a soluble molecule and betaine lipid, we added this sentence in the introduction: “The presence of betaine lipids is not linked to the synthesis of betaine, a soluble compound present in almost every organism including most animals, plants, and microorganisms, acting as protectant against osmotic stress [22].”

      **Referee cross-commenting**

      I agree with the evaluation of Reviewer #2 - while keeping mine

      Reviewer #1 (Significance (Required)):

      The physico-chemical properties of betaine lipids have not been established. These lipids - under P starvation of microalgae - accumulate in large quantities. Thus, their detailed characterization and comparison to (otherwise similar) phospholipids are of high importance and advance our knowledge about the roles of these lipids and the organization and structural / functional plasticity of biological membranes.

      As outlined above, I suggest a more cautious interpretation of the data and conclusions regarding e.g. the energy-converting membranes.

      I think the audience is relatively broad: (i) basic research of lipid models and (ii) methodology as well as calling the attention of membrane biologists to the scarcely studied betaine lipids.

      My field is the biophysics of photosynthesis - the stability and plasticity of the oxygenic photosynthetic machinery at different levels of complexity; closest to this topic is the polymorphic lipid phase behavior of plant TMs.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This manuscript nicely presents the effect of phosphate depletion on how betaine lipids function as effective replacements in a water-rich environment. The mix of computational and wet lab experiments provides details on membrane structure and general effects when phospholipids are changed to betaine lipids. I found this manuscript easy to read and understand and is worthy of publishing. However, I do have a few minor comments below to improve the manuscript.

      Minor Comments:

      1. Phases in PC lipids with saturated tails: The authors present a gel to liquid crystalline phase change for DPPC at 40 °C. However, this is the ripple-liquid crystalline phase transition, and the gel doesn't occur until about 34-35 °C. This should be noted in the manuscript.

      We have completed the sentence in the first Results section as follows: “The DSC data show a sharp phase transition at 40.2 ± 0.1°C for DPPC corresponding to the transition between the ripple phase and the fluid phase, which is consistent with earlier reports on DPPC large unilamellar vesicles [25].”

      Page 4: I am confused by the following phrase: "indicating either weak cooperativity between lipid bilayers or that phase co-existence is not a thermodynamic disadvantage, while this phenomenon is not observed for DPPC bilayers." What is meant by phase co-existence is not a thermodynamic disadvantage? Could this also be due to some frustration in phase coexistence and the presence of a ripple phase that kinetically is inhibited and thus a sharp transition is not observed?

      We did not observe a ripple phase in DP-DGTS as it is defined in DPPC bilayer either by DSC, neutron diffraction or SANS experiments. We don’t know if it exists in DP-DGTS bilayers. What we observe in neutron diffraction is a coexistence of gel phase and fluid phase domains in oriented multilayer films of DP-DGTS over a wide range of humidity whereas for DPPC we observe only a gel phase or a fluid phase. Because the thicknesses of the DP-DGTS bilayers are not so different between the gel phase and the fluid phase, we suppose that the free energy difference between the two phases is very small over a wide osmotic pressure range and that could explain the broad phase transition.

      To further clarify our point, we have reworded the sentence in the following way: “As seen in Figure 2A, by increasing the humidity, DPPC molecules transit from the gel to the fluid phase via a ripple phase through a narrow window of osmotic pressures as previously reported [30,31]. In contrast, DP-DGTS bilayers show a phase coexistence that can be observed over a wide P-range and without the appearance of a third phase that could be attributed to a distinct ripple phase (Figure 2B) before forming a single fluid phase at high humidity (i.e., at low P). Based on DSC and neutron diffraction as two independent techniques, we can safely conclude that the phase transition for DP-DGTS is broad. This observation indicates that the free energy difference between the two phases is very small over a wide osmotic pressure range and may be connected to the shapes of the pressure-distance relations in the two phases, which are discussed further below.” We also added in the legend of Figure 4 (SANS experiment): “No ripple phase Pb was detected for DP-DGTS bilayers.”

      DOI for computational methods: The DOI listed computational files (https://doi.org/10.18419/darus-2360) does not work.

      Unfortunately, we did not ask for publication of the URL upon submission of the manuscript, and we thank the reviewer for carefully checking this. Since DaRUS is a peer-reviewed repository ensuring high quality data sets according to the FAIR principle, peer review is still ongoing. The provided link will only work once the manuscript is published. In the meantime, we provide a temporary link for review:

      https://darus.uni-stuttgart.de/privateurl.xhtml?token=cbfac341-0e4a-4403-8f73-87bce31ca805

      Reviewer #2 (Significance (Required)):

      This work has broad significance and would be of general interest to those in membrane biophysics to plant biology and evolution. The work nicely touches on all these topics, and I find this fills a gap in details of these betaine lipids structure and relation to evolution in terrestrial vs. marine plants.

    1. Because here’s something else that’s weird but true: in the day-to-day trenches of adult life, there is actually no such thing as atheism. There is no such thing as not worshipping. Everybody worships. The only choice we get is what to worship.

      I find this to be true because in reality, as much as some people may not believe in worshipping anything, be it spiritual, supernatural, or anything of the sort, in practice people believe in or worship things that keep them going. For instance, one may find himself or herself in a critical situation with no certainty of how to get out of it, and may wish to get out of that situation without having anyone in mind, simply believing; if the wish comes to pass, the act of wishing alone was a prayer made. Just as Wallace mentions, atheism does not exist, as people find themselves worshipping various things: in reality, whatever we humans dedicate so much of our time to that we believe we cannot do without it (whatever we deem worthy) is actually a form of worship. This includes money, spending much time with TV, social media, and the like. Hence, we give reverence to these things, which makes us prisoners within ourselves. Therefore, as humans, we should learn to be conscious about what is real and important, so we can control how we think and make choices.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript reports new findings about the role of the glutamate transporter EAAC1 in controlling neural activity in the striatum. The significance is two-fold - it addresses gaps in knowledge about the functional significance of EAAC1, as well as provides a potential explanation for how EAAC1 mutations contribute to striatal hyperexcitability and OCD-associated behaviors. The manuscript is clearly presented, and the well-designed experiments are rigorously performed and analyzed. The main results showing that EAAC1 deletion increases the dendritic arbor of MSN D1 neurons and increases excitatory synaptic connectivity, as well as reduces D1-to-D1 mediated IPSCs are convincing. These results clearly demonstrate that EAAC1 deletion can alter excitatory and inhibitory synaptic function. Modelling the potential consequences for these changes on D1 MSN neural activity, and the behavior changes are interesting. Minor weaknesses include incomplete support for the conclusions about how EAAC1 regulates GABAergic transmission.

      We would like to take this opportunity to thank the reviewer. New sets of pharmacology experiments now address the minor concern about supporting the conclusions about the regulation of GABAergic transmission by EAAC1. The revised manuscript also includes new behavioral assays that allow us to examine in more depth the cell- and region-specificity of the effects of EAAC1.

      Reviewer #2 (Public Review):

      The manuscript by Petroccione et al., examines the modulatory role of the neuronal glutamate transporter EAAC1 on glutamatergic and GABAergic synaptic strength at D1- and D2-containing medium spiny neurons within the dorsolateral striatum. They find that pharmacological and genetic disruption of EAAC1 function increases glutamatergic synaptic strength specifically at D1-MSNs. They show that this is due to a structural change in release sites, not release probability. They also show that EAAC1 is critical in maintaining lateral inhibition specifically between D1-MSNs. Taken together, the authors conclude that EAAC1 functions to constrain D1-MSN excitation. Using a computational modeling technique, they posit that EAAC1's modulatory role at glutamatergic and GABAergic inputs onto D1-MSNs ultimately manifests as a reduction of gain of the input-output firing relationship and increases the offset. They go on to show that EAAC1 deletion leads to enhanced switching behavior in a probabilistic operant task. They speculate that this is due to a dysregulated E/I balance at D1-MSNs in the DLS. Overall, this is a very interesting study focused on an understudied glutamate transporter. Generally, the study is done in a very thorough and methodical manner and the manuscript is well written.

      We thank the reviewer for the thorough analysis and insightful comments on the manuscript. Our point-by-point responses to the concerns raised on the initial submission of this work are reported below:

      Major Comments/Concerns:

      Regional/Local manipulations in behavior study: The manuscript would be greatly improved if they provided data linking the ex vivo electrophysiological findings within the DLS with the behavior. Although they are using a DLS-dependent task, they are nonetheless, using a constitutive EAAC1 KO mouse. Thus, they cannot make a strong conclusion that the behavioral deficits are due to the EAAC1 dysfunction in the DLS (despite the strong expression levels in the DLS).

      Corrected - We concur with the reviewer. To address this concern, we performed new experiments to assess the cell- and regional-specificity of the effects of EAAC1 on task-switching behaviors.

      First, we repeated the behavioral assays described in Fig. 8 in two mouse lines (D1Cre/+:EAAC1f/f and A2ACre/+:EAAC1f/f) lacking EAAC1 expression in D1- or D2-MSNs, respectively (Supp. Fig. 8-1). As in the case of EAAC1+/+ and EAAC1-/- mice, when the switch time was short (<15 s), D1Cre/+:EAAC1f/f and A2ACre/+:EAAC1f/f mice collected a similar number of rewards (Supp. Fig. 8-1K, L) and performed a similar number of lever presses (Supp. Fig. 8-1M, N). As the switch time increased (30-75 s), D1Cre/+:EAAC1f/f mice collected more rewards than A2ACre/+:EAAC1f/f mice, at low and high reward probabilities (Supp. Fig. 8-1L, N). Overall, the task switching behavior of D1Cre/+:EAAC1f/f mice was similar to that of EAAC1-/- mice, whereas that of A2ACre/+:EAAC1f/f mice was similar to that of EAAC1+/+ mice (cf. Supp. Fig. 8 and Supp. Fig. 8-1). This suggests that loss of expression of EAAC1 from D1-MSNs is sufficient to reproduce the task switching behavior of EAAC1-/- mice. Because EAAC1 limits excitation onto D1-MSNs (Fig. 2, 3) and lateral inhibition between D1-MSNs (Fig. 4-6), these findings suggest that increased excitation onto D1-MSNs and reciprocal inhibition among D1-MSNs limit execution of reward-based behaviors with task-switching intervals >30s.

      Second, as noted by the reviewer, another potential limitation of the experiments performed on constitutive EAAC1-/- mice is that, on their own, they do not allow us to say whether the behavioral effects are due to changes in E/I onto D1-MSNs within a specific domain of the striatum like the DLS. Although the DLS is recruited during task-switching, reward-based flexibility in executive control relies on neuronal activity in the VMS (Wallis 2007; Gu et al. 2008). Therefore, we asked whether limiting excitation in D1-MSNs and strengthening D1-D1 lateral inhibition via EAAC1 in the VMS could also alter reward-based task-switching behaviors. To address this question, we repeated the task switching test in EAAC1f/f mice that received stereotaxic injections of a Cre-dependent viral construct (AAV-D1Cre) that we used to remove EAAC1 expression from D1-MSNs in the DLS or VMS, respectively (Supp. Fig. 8-2). The results showed that the task switching behaviors of EAAC1f/f mice receiving AAV-D1Cre injections in the DLS or VMS were similar to each other and to those of EAAC1-/- mice, while being statistically different from those of EAAC1+/+ mice. This finding is important, as it suggests that: (i) the DLS and VMS are both recruited for the execution of task switching behaviors; (ii) the modulation of E/I onto D1-MSNs by EAAC1 may not be limited to the DLS but could extend to the VMS.

      Third, we performed further tests to examine the regional-specificity of the effects of EAAC1 in D1-MSNs. D1 receptor expressing cells are present not only throughout the striatum, but also in the substantia nigra (pars compacta and reticulata; SN) and ventral tegmental area (VTA) (Cadet et al. 2010; Savasta, Dubois, and Scatton 1986; Boyson, McGonigle, and Molinoff 1986; Wamsley et al. 1989). To determine whether lack of EAAC1 in D1-expressing cells in the SN/VTA could also contribute to increased compulsivity, we repeated the task switching behavioral assays in EAAC1f/f mice that received injections of AAV-D1Cre in the SN/VTA (Supp. Fig. 8-3). The task switching behavior of these mice was similar to that of EAAC1+/+, not EAAC1-/- mice, suggesting that altering EAAC1 expression in D1-MSNs of the DLS/VMS, but not the SN/VTA, is implicated in the control of task switching of reward-based behaviors in mice.

      The results of these new sets of experiments are included in the revised version of the manuscript and their implications are reported in the Discussion section of the paper.

      Statistics used in the study: There are some missing details regarding the precise stats used for the different comparisons. I am particularly concerned that the electrophysiology studies that were a priori designed as a 2-factor analysis did not have 2-way ANOVAs performed, but rather a series of t-tests. For example, in Figure 3b, the two factors are 1) cell type and 2) genotype. Was a 2-way ANOVA performed? It is hard for me to tell from the text.

      Corrected - We apologize for any potential confusion. The statistical analysis for the experiments included in this work includes paired and unpaired t-tests, one-way ANOVA, two-way ANOVA, and ANOVA for repeated measures tests followed by post hoc t-test comparisons (reported in the text). To ensure both accuracy and readability of the manuscript, we report the results of the statistical comparisons in the main text of the manuscript, but also provide a fully detailed statistical analysis of all datasets in the data repository for this manuscript, deposited on Open Science Framework. We revised the methods section to clarify the use of different statistical tests and values reported in the manuscript.
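The distinction the reviewer draws can be illustrated with a minimal sketch of a balanced two-way ANOVA, which partitions variance into two main effects and their interaction instead of running separate t-tests. This is not the authors' analysis: the factor labels and all numbers below are made up for illustration, the function name is hypothetical, and the sketch assumes equal sample sizes in every cell. It returns F statistics only; p-values would additionally require an F distribution.

```python
from statistics import mean

def two_way_anova_balanced(cells):
    """F statistics for a balanced two-factor design.

    cells maps (level_of_A, level_of_B) -> list of observations.
    Assumption: every cell holds the same number of observations (n >= 2).
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if any(len(v) != n for v in cells.values()):
        raise ValueError("design must be balanced")

    grand = mean(x for v in cells.values() for x in v)
    a_mean = {a: mean(x for b in b_levels for x in cells[(a, b)]) for a in a_levels}
    b_mean = {b: mean(x for a in a_levels for x in cells[(a, b)]) for b in b_levels}
    cell_mean = {k: mean(v) for k, v in cells.items()}

    # Sums of squares for main effects, interaction, and within-cell error
    ss_a = len(b_levels) * n * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = len(a_levels) * n * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_cells = n * sum((m - grand) ** 2 for m in cell_mean.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_err = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_ab = df_a * df_b
    df_err = len(a_levels) * len(b_levels) * (n - 1)
    ms_err = ss_err / df_err
    return {
        "F_A": (ss_a / df_a) / ms_err,
        "F_B": (ss_b / df_b) / ms_err,
        "F_AxB": (ss_ab / df_ab) / ms_err,
    }

# Made-up numbers: factor A = cell type, factor B = genotype
toy = {
    ("D1", "WT"): [1, 2, 3],
    ("D1", "KO"): [4, 5, 6],
    ("D2", "WT"): [1, 2, 3],
    ("D2", "KO"): [1, 2, 3],
}
result = two_way_anova_balanced(toy)
```

In this toy design only one cell differs from the rest, so the interaction term carries as much variance as either main effect. Pairwise t-tests alone do not quantify that interaction.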

      Moderate Concerns:

      Control mice: I am moderately concerned that littermates were not used for controls for the EAAC1 KO, but rather C57Bl/6NJ presumably ordered from a vendor. It has been shown that issues like transit and rearing conditions can have long term effects on behavior. Were the control mice reared in house? How long was the acclimation time before use?

      Corrected - Sorry for the potential confusion. The EAAC1-/- mice are bred in house and have been backcrossed with C57BL/6J for more than 10 generations. We perform backcrossing regularly and routinely in our animal colony. The C57BL/6J are also bred in house. They are replaced every 10 generations to avoid genetic drift. Therefore, there is no concern about transit from vendors and rearing affecting the results of our experiments. This information has been added to the Methods section of the paper.

      OCD framework: I generally find the OCD framework unnecessary, particularly in the Introduction. Compulsive behaviors are not restricted to OCD. Indeed, the link between the behavioral observations and OCD phenotype seems a bit tenuous. In addition, studying the mechanisms of behavioral flexibility in and of itself is interesting. I do not think such a strong link needs to be made to OCD throughout the entirety of the paper. The authors should consider tempering this language or restricting it to the discussion and end of the abstract.

      Corrected - We concur with the reviewer and have revised the manuscript accordingly. At the end of the Abstract, we refer only to behavioral flexibility. We have toned down our emphasis on OCD in the Introduction, broadening the genetic link between the gene encoding EAAC1 (SLC1A1) and neuropsychiatric diseases like OCD, ADHD and ASD. This is now limited to a single sentence. We also revised the Discussion section because we agree with the reviewer that compulsive behaviors are not limited to OCD.

    1. Author Response

      Reviewer #2 (Public Review):

      1) The authors in reality do not analyze oscillations themselves in this manuscript but only the power of signals filtered at determined frequency bands. This is particularly misleading when the authors talk about "spindles". Spindles are classically defined as a thalamico-cortical phenomenon, not recorded from hippocampus LFPs. Thus, the fact that you filter the signal in the same frequency range matching cortical spindles does not mean you are analyzing spindles. The terminology, therefore, is misleading. I would recommend the authors to change spindles to "beta", which at least has been reported in the hippocampus, although in very particular behavioral circumstances. However, one must note that the presence of power in such bands does not guarantee one is recording from these oscillations. For example, the "fast gamma" band might be related to what is defined as fast gamma nested in theta, but it might also be related to ripples in sleep recordings. The increase of "spindle" power in sleep here is probably related to 1/f components arising from the large irregular activity of slow wave sleep local field potentials. The authors should avoid these conceptual confusions in the manuscript, or show that these band power time courses are in fact matching the oscillations they refer to (for example, their spindle band is in fact reflecting increased spindle occurrence).

      We thank the reviewer for allowing us to clarify this subject. We completely agree with concerns raised in the comments. To avoid any confusion, we have replaced throughout the manuscript the word ‘spindle’ with ‘beta’.

      2) The shuffling procedure to control for the occupancy difference between awake and sleep does not seem to be sufficient. From what I understand, this shuffling is not controlling for the autocorrelation of each band which would be the main source of bias to be accounted for in this instance. Thus, time shifts for each band would be more appropriate. Further, the controls for trial durations should be created using consecutive windows. If you randomly sample sleep bins from distant time points you are not effectively controlling for the difference in duration between trial types. Finally, it is not clear from the text if the UMAP is recomputed for each duration-matched control. This would be a rigorous control as it would remove the potential bias arising from the unbalance between awake and sleep data points, which could bias the subspace to be more detailed for the LFP sleep features. It is very likely the results will hold after these controls, given it is not surprising that sleep is a more diverse state than awake, but it would be good practice to have more rigorous controls to formalize these conclusions.

      We are grateful to the reviewer for suggesting an alternative analysis. Following this suggestion, we created surrogate datasets by time shifting each band and obtained their respective UMAP projections (see modified Figure 2D). Additionally, as suggested, for duration-matched controls, we have selected consecutive windows, rather than random points (Figure 2 – figure supplement 1C). UMAP projections were obtained for each duration-matched control and occupancy was computed. The text in the Methods section has been modified to describe the analysis. As expected, the results were identical.
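The two controls discussed here can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each band-power time course is a plain list of samples, and the function names, the `min_shift` parameter and the toy data are hypothetical. A circular time shift keeps each band's own autocorrelation intact while breaking its alignment with the other bands. A consecutive window keeps the temporal continuity that randomly sampled bins would destroy.

```python
import random

def circular_shift(series, offset):
    """Rotate a sequence by `offset` samples, wrapping around the end."""
    series = list(series)
    offset %= len(series)
    return series[-offset:] + series[:-offset] if offset else series

def time_shift_surrogate(bands, min_shift, rng):
    """Shift every band independently by a random circular offset.

    bands: dict mapping band name -> list of power values (equal lengths).
    min_shift: smallest allowed offset, so bands are not left nearly aligned.
    Assumption: every series is at least 2 * min_shift samples long.
    """
    surrogate = {}
    for name, series in bands.items():
        n = len(series)
        offset = rng.randrange(min_shift, n - min_shift + 1)
        surrogate[name] = circular_shift(series, offset)
    return surrogate

def consecutive_window(series, length, rng):
    """Pick one contiguous window of `length` samples at a random start."""
    start = rng.randrange(0, len(series) - length + 1)
    return list(series[start:start + length])

# Toy data: monotonic ramps make the effect of the shift easy to see
rng = random.Random(0)
bands = {"delta": list(range(100)), "theta": list(range(100, 200))}
shifted = time_shift_surrogate(bands, min_shift=10, rng=rng)
window = consecutive_window(bands["delta"], 20, rng)
```

Each shifted band contains exactly the same values as the original, in the same cyclic order, which is why the within-band temporal structure survives the shuffle.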

      3) Lots of the observations made from the state space approach presented in this manuscript lack any physiological interpretation. For example, Figure 4F suggests a shift in the state space from Sleep1 to Sleep2. The authors comment there is a change in density but they do not make an effort to explain what the change means in terms of brain dynamics. It seems that the spectral patterns are shifting away from the Delta X Spindle region (concluding this by looking at Fig4B) which could be potentially interesting if analyzed in depth. What is the state space revealing about the brain here? It would be important to interpret the changes revealed by this method otherwise what are we learning about the brain from these analyses? This is similar to the results presented in Figure 5, which are merely descriptions of what is seen in the correlation matrix space. It seems potentially interesting that non-REM seems to be split into two clusters in the UMAP space. What does it mean for REM that delta band power in pyramidal and lm layers is anti-correlated to the power within the mid to fast gamma range? What do the transition probabilities shown in Figures 6B and C suggest about hippocampal functioning? The authors just state there are "changes" but they don't characterize these systematically in terms of biology. Overall, the abstract multivariate representation of the neural data shown here could potentially reveal novel dynamics across the awake-sleep cycle, but in the current form of this manuscript, the observations never leave the abstract level.

      We thank the reviewer for allowing us to clarify this aspect of the manuscript. We have now edited the main text to include considerations on the biological relevance of the findings of Figure 4, 5 and 6.

      Additions to figure 4: In particular, non-REM states in sleep2 tended to concentrate in a region of increased power in the delta and beta bands, which could be the results of increased interactions with cortical activity modulated in the same range. It is also likely that such effect was induced by the exposure to relevant behavioral experience. In fact, changes in density of individual oscillations after learning have been reported using traditional analytical methods and are thought to support memory consolidation (Bakker et al., 2015; Eschenko et al., 2008, 2006). Nevertheless, while traditional methods provide information about individual components, the novel approach used here provides additional information about the combinatorial shift in the dynamics of network oscillations after learning or exploration. Thus, it provides the basis for identifying how coordinated activity among different oscillations supports memory consolidation processes, as those occurring during non-REM sleep after exploration, which cannot be elucidated using traditional analytical methods.

      Additions to Figure 5: Gamma segregation and delta decoupling offer a picture of hippocampal REM sleep as being more akin to awake locomotion (with the major difference of a stronger medium gamma presence) while also suggesting a substantial independence from cortical slow oscillations. On the other hand, the across-scale coherence of non-REM sleep is consistent with this sleep stage being dominated by brain-wide collective fluctuations engaging oscillations at every range. Distinct cross-frequency coupling among various individual pairs of oscillations, such as theta-gamma and delta-gamma, has already been reported (Bandarabadi et al., 2019; Clemens et al., 2009; Hammer et al., 2021; Scheffzük et al., 2011). However, computing cross-frequency coupling on the state space provides additional information on how multiple oscillations, obtained from distinct CA1 hippocampal layers (stratum pyramidale, stratum radiatum and stratum lacunosum moleculare), are coupled with each other during distinct states of sleep and wakefulness. Furthermore, projecting the correlation matrices onto a 2D plane provides a compact tool for visualizing the cross-frequency interactions among various hippocampal oscillations. Altogether, this approach reveals the complex nature of the coupling dynamics occurring in the hippocampus during distinct behavioral states.
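
      As a rough illustration of the kind of correlation matrix discussed for Figure 5, the sketch below computes pairwise Pearson correlations between band-power time series. It is not the authors' pipeline: the function names, the band/layer labels and the toy values are all hypothetical, and plain Pearson correlation is assumed as the coupling measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long power series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(bands):
    """Pairwise correlations between every band/layer power series.

    `bands` maps a (hypothetical) label such as "delta_pyr" to its
    binned power values; all series must have the same length.
    """
    names = list(bands)
    return {a: {b: pearson(bands[a], bands[b]) for b in names} for a in names}


# Toy data only: delta power falls while gamma power rises.
toy = {
    "delta_pyr": [4.0, 3.0, 2.0, 1.0],
    "gamma_pyr": [1.0, 2.0, 3.0, 4.0],
    "theta_lm": [8.0, 6.0, 4.0, 2.0],
}
matrix = correlation_matrix(toy)
```

      In this toy case the delta and gamma series are perfectly anti-correlated, which is the sign pattern the reviewer asks about for REM sleep; one such matrix per state is what would then be projected onto a low-dimensional plane.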

      Additions to Figure 6: We found that transitions occurring from REM-to-REM sleep and non-REM-to-non-REM sleep (intra-state transitions) are more vulnerable to plasticity after exploration as compared to inter-state transitions (such as non-REM-to-REM, REM-to-intermediate etc.) (Fig 6E, F). These changes in intra-state transitions were observed to be beyond randomness (Fig S9 E, F), indicating a specificity in plastic changes in state transitions after exploration. In particular, while the average REM period duration is unaltered after exploration (Fig 4G), REM temporal structure is reorganized. In fact, the increased probability of REM-to-REM transitions indicates a significant prolongation of REM bout duration. Similarly, the increase in non-REM-to-non-REM transition probability reflects an increased duration of non-REM bouts. Therefore, environment exploration was accompanied by an increased separation between REM and non-REM periods, possibly as a response to increased computational demands. More generally, the network state space makes it possible to characterize the state transitions in the hippocampus and how they are affected by novel experience or learning. By observing the state transition patterns, this analytical framework makes it possible to detect and identify state-specific changes in the hippocampal oscillatory dynamics, beyond the possibilities offered by more traditional univariate and bivariate methods. We next investigated how fast the network flows on the state space and assessed whether the speed is uniform or exhibits specific region-dependent characteristics.

      Reviewer #3 (Public Review):

      1) My primary concern is to provide clear evidence that this approach will provide key insights of high physiological significance, especially for readers who may think the traditional approaches are advantageous (for example due to their simplicity). I think the authors' findings of distinct sleep state signatures or altered organization of the NLG3-KO mouse could serve this purpose. However, right now the physiological significance of these results is unclear. For example, do these sleep state signatures predict later behavior performance, or is altered organization related to other functional impairments in the disease model? Do neurons with distinct sleep state signatures form distinct ensembles and code for related information?

      We are thankful to the reviewer for raising a very interesting line of questioning regarding sleep signatures and distinct ensembles. In this study, we show that sleep state signatures can predict how individual cells may participate in information processing during open field exploration. However, further analyses exploring the recruitment of neuronal ensembles are in preparation for another manuscript and are beyond the scope of this article.

      We have further modified the description of the results (as also suggested by other reviewers) to highlight the key advantages of this approach over traditional methods.

      Regarding functional impairment: as described in the manuscript, the altered organization in the animal model of autism could possibly be due to alterations in cellular and synaptic mechanisms such as those described in previous reports (Modi et al 2019, Foldy et al 2013).

      2) For cells with different mean firing rates during exploration: is that because they are putative fast-spiking interneurons and pyramidal cells? From the reported mean firing rates, I think some of these cells are interneurons. Since mean firing rates are well known to vary with cell type, this should be addressed. For example, the sleep state signatures may be distinct for different putative pyramidal cells and interneurons. This would be somewhat expected considering prior work that has shown different cell types have different oscillatory coupling characteristics. I think it would be more interesting to determine if pyramidal cells had distinct sleep state signatures and, if so, whether pyramidal cells with the same sleep state signature have similar properties, for example whether they code for similar things or commonly fire together in an ensemble. The number of cells in Fig. 8 may be limited for this analysis. The authors could use the hc-11 data in addition, which was also tested in this work.

      We thank the reviewer for suggesting this additional analysis to better describe the data. To this end, we have added an additional figure in the supplementary data (analysis of the hc11 dataset: Figure 8 – figure supplement 3) to demonstrate that interneurons and pyramidal cells have distinct sleep signatures. These findings are in agreement with the dataset presented in Figure 8D, E.

      As shown in the manuscript, the spatial firing (sparsity) has large variability for cells having similar network signatures (Fig 8E). Thus, additional parameters besides oscillations may be involved in cell encoding. Different network state spaces need to be explored in future studies to understand this phenomenon in detail.

      We agree that investigating neuronal ensembles and the state space is an interesting direction to follow. In another study (in preparation), we are investigating in detail the recruitment of neuronal ensembles by the oscillatory state space. Thus, those findings are beyond the scope of this introductory article.

      3) Example traces are needed to show how LFPs change over the state-space. Example traces should be included for key parts of the state-space in Figures 2 and 3.

      We thank the reviewer for this key insight on data representation. Example traces of how LFP varies on the state space have been added (see Figure 4 – figure supplement 1).

      4) What is the primary rationale for 200ms time bins? Is this time scale sufficient to capture the slow dynamics of delta rhythm (1-5Hz) with a maximum of 1s duration?

      The time scale of binning depends on the scale of investigation. We also replicated the results with different time bins (such as 50 ms and 1 second) and the results are identical. For delta rhythms, with 200 ms time bins, the dynamics will be captured across multiple bins. Additionally, the binned power time series are also smoothed before obtaining projections.
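
      To make the binning argument concrete, here is a minimal sketch of averaging an instantaneous power trace into 200 ms bins and then smoothing the binned series. It is only an illustration: the function names, the moving-average smoother and its window length are assumptions, since the response does not specify how the smoothing was done.

```python
def bin_power(power, fs_hz, bin_ms=200):
    """Average an instantaneous power trace into non-overlapping bins.

    `power` is a sequence sampled at `fs_hz`; trailing samples that do
    not fill a complete bin are dropped.
    """
    samples_per_bin = int(round(fs_hz * bin_ms / 1000))
    if samples_per_bin < 1:
        raise ValueError("bin is shorter than one sample")
    n_bins = len(power) // samples_per_bin
    return [
        sum(power[i * samples_per_bin:(i + 1) * samples_per_bin]) / samples_per_bin
        for i in range(n_bins)
    ]


def moving_average(values, window=5):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Toy trace at 500 Hz: 200 ms of low power followed by 200 ms of high power.
trace = [1.0] * 100 + [3.0] * 100
binned = bin_power(trace, fs_hz=500, bin_ms=200)
smoothed = moving_average(binned, window=3)
```

      With 200 ms bins, one cycle of a 1 Hz delta oscillation spans five bins and one cycle of a 5 Hz oscillation spans a single bin, which is why slow fluctuations in delta power appear across several consecutive bins rather than within one.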

      5) Since oscillatory frequency and power are highly associated with running speed, how does speed vary over the state space? Is the relationship between speed and state-space similar to the results of previous studies for theta (Slawinska and Kasicki, Brain Res 1998; Maurer et al, Hippocampus 2005) and gamma oscillations (Ahmed and Mehta J. Neurosci 2012; Kemere et al PLOS ONE 2013), or does it provide novel insights?

      We thank the reviewer for highlighting this crucial link between oscillation and locomotion. While various articles have focused on individual oscillations, the combinatorial effects of multiple oscillations from multiple brain areas in regulating the speed of the animal during exploration are definitely worth exploring with this novel approach. This set of results will be introduced in another study, currently in preparation.

      6) The separation of 9 states (Fig. 6ABC) seems arbitrary, where state 1 (bin 1) is never visited. I suggest plotting the density distribution of the data in Fig. 2A or Fig. 6A to better determine how many states there are within the state space. For example, five peaks in such a density plot might suggest five states. Alternatively, clustering methods could be useful to determine the number of states.

      We thank the reviewer for this useful suggestion. We agree that additional clustering methods can be used to identify non-canonical sleep states. These are currently being explored in our lab and will be part of future studies. As for this dataset, the density plots are available in Figure 4E, which indicates how many states are in each part of the state space.

      7) The results in Fig. 4G are very interesting and suggest more variation of sub-states during non REM periods in sleep1 than in sleep2. What might explain this difference? Was it associated with more frequent ripple events occurring in sleep2?

      The reviewer is right in looking for the source of the decrease in state variability in sleep2. Considering the distribution of relative frequency power in the state space, the higher concentration in sleep2 corresponds to higher content in the slower delta and spindle frequency bands, rather than the higher frequencies of SWRs. This result can be interpreted in the light of enhanced cortical activity (which is known to heavily recruit those bands) and possibly of enhanced cortical-hippocampal communication following relevant behavioral experience. It is also necessary to mention that with our recording setup we cannot rule out the effects of volume conduction completely, and thus we cannot exclude that the increase in the delta and spindle bands in the hippocampus was a spurious effect of purely cortical frequency modulations.

      8) The state transition results in Fig. 6 are confusing because they include two fundamentally different timescales: fast transitions between oscillatory states and slow dynamics of sleep states. I recommend clarifying the description in the results and the figure caption. Furthermore, how can an animal transition between the same sleep state (Fig. 6EF)? Would they both be in a single sleep state?

      The transitions capture the fast oscillatory scales (as they are investigated over a timeframe of 1 second). The sleep stages (REM, non-REM etc.) are used as labels from which the states originate on the state space. This allows us to characterize fast oscillatory dynamics in various sleep stages.

      Regarding same-state transitions: an increase in same-state transition probability corresponds to a prolongation of that particular state, thereby altering the temporal structure of a given sleep state.
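
      The link between same-state transition probability and bout duration can be sketched as follows. This is an illustration under an explicit assumption that is not stated in the response: if the label sequence is treated as a first-order Markov chain, bout lengths are geometrically distributed and the expected bout length is 1 / (1 - p) bins, where p is the same-state transition probability. The function names and the toy label sequence are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(labels):
    """Estimate first-order transition probabilities from a label sequence."""
    counts = defaultdict(Counter)
    for current, following in zip(labels, labels[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }


def expected_bout_bins(p_self):
    """Expected bout length in bins under the Markov assumption."""
    if not 0.0 <= p_self < 1.0:
        raise ValueError("p_self must lie in [0, 1)")
    return 1.0 / (1.0 - p_self)


# Toy sequence of per-bin labels (not real data).
labels = ["REM", "REM", "REM", "NREM", "NREM", "REM"]
probs = transition_probabilities(labels)
nrem_bout = expected_bout_bins(probs["NREM"]["NREM"])
```

      Under this assumption, raising the REM-to-REM probability from 0.5 to 0.75 doubles the expected bout length from 2 to 4 bins, which is the sense in which a higher same-state probability reflects longer bouts.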

    1. When we don’t think certain messages meet our needs, stimuli that would normally get our attention may be completely lost. Imagine you are in the grocery store and you hear someone say your name. You turn around, only to hear that person say, “Finally! I said your name three times. I thought you forgot who I was!” A few seconds before, when you were focused on figuring out which kind of orange juice to get, you were attending to the various pulp options to the point that you tuned other stimuli out, even something as familiar as the sound of someone calling your name.

      This happens with my boyfriend and I all of the time. He will be playing a video game or on his phone, and when I try to get his attention, this happens. I also thought it was because he was tuning me out on purpose or something. I also heard that humans are not meant to focus their attention on multiple things at once, so this makes sense. I think that this concept is super interesting and now I know why people do this.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity:

      Summary: the paper suggested a new approach to study in vivo possible interaction between glioblastoma cells and glioblastoma associated macrophages. By using single cells transcriptome profiling and in vitro and in vivo functional experiments the authors also suggested LGALS1 as possible key factor in the suppression of the immune system and a new target for immune modulation in glioma patients. The experimental plan is well described, and the results are beautifully presented using images, clear drawings, and videos.

      Major comments: none

      Minor comments:

      • The number of zebrafish embryos analyzed after the xenograft is highly variable (e.g. 3-18; 4-22 in Figure 6). These numbers can be reported in the results section (not only in the legends) and the authors may comment on them in the discussion. The reproducibility of the xenotransplant experiments is always challenging, as it is quite difficult to inject the same number of cells in every embryo and to have the same survival rate of injected cells and of transplanted embryos. For these reasons the volume of each xenograft can vary significantly in different embryos and in different experimental sessions. Accordingly, the number of macrophages associated with the tumor can vary, and the statistical analysis can be deeply influenced by the number of replicates for each experimental group (a group with 3 embryos is very different in terms of quality and quantity of information from a group of 18 embryos). It could be useful for the reader who has no experience in this technique to be aware of the advantages and disadvantages of the procedure, including the possible influence of the temperature (34°C instead of 37°C) on embryo survival, the replication rate of glioma cells and macrophage behavior. Commenting on these aspects does not weaken the power and the relevance of the model but unveils the critical aspects that every scientist has to evaluate before planning these kinds of experiments.

      Response: We agree with Reviewer #1 that the zebrafish avatar model is challenging, and it is difficult to obtain reproducible tumor sizes and survival rates. To be even more transparent about this, we have added a few sentences about the variable n number in the Results section and a critical comment about it in the Discussion section.

      • An aspect that could be interesting to address, to further validate the avatar model, is to monitor the level of pro-inflammatory cytokines (Tumor Necrosis Factor and Interleukin 1, 6, and 8) that are expressed at basal level in the early developing zebrafish embryos. Do their expression level increase after the xenotransplantation? Can the zebrafish cytokines affect the behavior of glioma associated macrophages (i.e. macrophages polarization)?

      Response: This is an interesting point, indeed. We have injected murine melanoma (B16) cells into Tg(mpeg1:mCherry-F); Tg(TNFa:eGFP-F) embryos, a TNFa reporter line. Some (but not all) macrophages expressed TNFa and their expression decreased over time, which is consistent with previous reports (Póvoa et al, 2021). We further observed that TNFa-expressing macrophages mostly had a round, “tumor-attacking” phenotype. This is in line with our hypothesis that the tumor induces a phenotype switch in GAMs. Of note, we did not see TNFa expression in the rest of the brain tissue. We would be happy to add this data if deemed useful.

      We did not investigate other cytokines in the developing zebrafish, but we believe this is not essential for the following reasons: We are mainly interested in the differences between the patient-derived GBM stem cell cultures (GSCCs), and since they are all used in the same avatar model, we expect that if zebrafish cytokines had an effect on GAMs and their polarization, this effect would be consistent in all avatars, and can thus be ignored when comparing different GSCCs. More importantly, our findings in the zebrafish avatar model were consistent with those in the in vitro model. We observed the same phenotype switch in the co-culture model, indicating that the key interaction is between tumor cells and macrophages.

      Significance:

      Strengths and limitation. The manuscript is the result of a well-orchestrated effort to dissect a biological problem by complementary approaches and provide new data with high impact translational value. The image processing pipeline developed by the authors is a step forward in the in vivo analysis of cells interaction in living embryos. The identification of LGALS1 as a potential target for immune modulation can support the development of new therapeutical strategy implementing chemo- or immunotherapy protocols. The described zebrafish avatar can represent a new tool for personalized drug testing recapitulating in a in vivo model the heterogeneity of GBM found in patients.

      Audience: All the scientist interested in cell biology, cancer cell biology, imaging techniques, translational medicine, in vivo models for cancer research, precision medicine.

      Reviewer expertise: applied developmental biology

      Reviewer #2

      Evidence, reproducibility and clarity:

      Finotto et al aim to address the polarisation of macrophages within GBM in their study. To do this, they have developed two different models. The first model is an in-vitro co-culture model of patient derived GSC lines and human monocyte derived macrophages. This model was used for single cell sequencing to understand the transcriptomic changes of macrophages upon contact to GBM cells. The second model is a zebrafish xenograft model. Here GFP labeled GBM cells were transplanted into the larval zebrafish ventricle. These experiments were done in the transgenic mpeg zebrafish which allowed to monitor responses of macrophages in vivo.

      In my opinion both models are not sophisticated enough to draw solid conclusions on macrophage polarisation in GBM. The in vitro model is highly artificial and is far from the complex situation in GBM. Within GBM the GAM population represents a heterogenous mix of resident microglia and infiltrating macrophages. These are influenced by the heterogeneous environment (which consists of tumour cells but also other host cells) and show diverse transcriptomic adaptations as shown in rodent models as well as sequencing studies of patient derived tumour samples. Studying monocyte derived macrophages in vitro does not provide any reliable insight.

      Response: We understand the reviewer’s concern about the complexity of our in vitro model. However, these simple models are needed to gain more insight into the complex in vivo situation. Others have demonstrated their usefulness in the past (C. Jayakrishnan et al, 2019; Zhou et al, 2022; Hubert et al, 2016; Chen et al, 2020; Coniglio et al, 2016; Li et al, 2022). Moreover, it may be advantageous to look at only two different cell types and unravel their reciprocal interaction, without the influence of other cell types, making it too complex to draw conclusions. We acknowledge that GAMs are a heterogeneous mix of both microglia and bone marrow-derived macrophages. Considering that bone marrow-derived macrophages have been shown to play an important role in tumor progression and are by far the most abundant immune cell population in GBM tumors (which even increases in recurrent GBM) (Pombo Antunes et al, 2021; Abdelfattah et al, 2022), we chose to focus initially on bone marrow-derived macrophages. Notably, it has already been reported that microglia were associated with significantly better survival, suggesting that they are anti-tumorigenic, whereas macrophages were associated with worse survival, suggesting that they are pro-tumorigenic (Pombo Antunes et al, 2021; Abdelfattah et al, 2022). This justifies our approach to focus on this cell type. Furthermore, although this model may be rather simplistic, it allowed us to screen different GSCCs side by side in a standardized way, through which we found an apparent phenotype switch within the macrophages, even without the complex interplay with other cell types. Because the results obtained using the in vitro model were also confirmed in GBM patient material and KO experiments in the zebrafish avatar model, our work shows that reliable and important insights can be derived. 
      This, combined with its simplicity, makes our co-culture model an exceptionally relevant model that is scalable, screenable and allows us to study the effect of perturbations. Finally, the immunosuppressive role of the target we identified using this model, LGALS1, has been previously demonstrated by others (Verschuere et al, 2014; Van Woensel et al, 2017; Chen et al, 2019), which proves our approach is valid.

      Although the zebrafish can be a great model to understand the progression of tumours and the role of immune cells, I don't think that the model developed by the authors is suitable to address their questions. Transplantation of GBM cells into the ventricle of larval zebrafish doesn't seem to be the right approach here. The poor survival of the transplanted cells is a clear indication of that. Many other groups have reported growth and proliferation of human cancer cells in the larval zebrafish. Direct transplantation into the brain parenchyma would be the better approach here. The brain parenchyma would provide the right environment for the GBM cells, including a resident microglial population. This would also make it possible to study the complex mix of microglia and infiltrating macrophages in the context of GBM.

      Response: The reviewer does not specify which articles have reported growth and proliferation of human cancer cells in zebrafish larvae. Most research groups reporting this, did not follow tumor growth/proliferation over time or used immortalized cell lines (Vargas-Patron et al, 2019; Pan et al, 2020; Pudelko et al, 2018; Breznik et al, 2017; Vittori et al, 2017; Hamilton et al, 2016), which obviously have a much higher proliferation rate than the patient-derived cell lines used in this work. Second, although the number of patient-derived tumor cells decreases over time, we observed a clear invasive and migratory behavior, indicating that the human tumor cells reside well in the zebrafish microenvironment. Furthermore, it is important to note that the zebrafish avatars are grown at 34°C, a temperature that is suboptimal for tumor cell growth. The tumor cells still proliferate, albeit at a lower rate than at 37°C.

      To our knowledge, there is only one publication that reports the growth of patient-derived GBM tumors over time (Almstedt et al, 2022). However, here, zebrafish embryos were grown at 33°C. Also, prior to injection, patient-derived GBM cells were resuspended in medium containing polyvinylpyrrolidone, a polymer that enhances extracellular matrix deposition and cell proliferation. Furthermore, the authors observed substantial differences in proliferative capacity, ranging from growth to decline of signal, and represented only two patient-derived cell lines with growing tumors. Similar to our findings, another article has demonstrated that injected patient-derived GBM tumor cells progressively underwent mitotic arrest, while maintaining an invasive and aggressive growth pattern (Rampazzo et al, 2013).

      Although the tumor cells are injected into the hindbrain ventricle, they end up in the brain parenchyma, as evidenced by the presence of the typical brain vasculature of the zebrafish embryo. Notably, in Tg(mpeg1:mCherryF)ump2 zebrafish embryos, both macrophages and microglia are labeled with mCherry, meaning that we have studied both cell types in our zebrafish avatar model. Therefore, we consider the reviewer’s comment to be unfounded.

      Reviewer #3

      Evidence, reproducibility and clarity:

      In this study, Finotto and colleagues developed patient-derived Glioblastoma (GBM) stem cell cultures from 7 patients. These GBM stem cell cultures were either co-cultured in vitro with human macrophages combined with single-cell RNA sequencing or injected into the orthotopic zebrafish xenograft to study live GBM-macrophage/microglia interactions. The authors aimed at studying tumor heterogeneity and GBM-associated macrophages (GAMs), which often exhibit immunosuppressive features that promote tumor progression. Their analyses revealed substantial heterogeneity across GBM patients in GBM-induced macrophage polarization and the ability to attract and activate GAMs, features that correlated with patient survival. The authors also show 3 distinct macrophage subclusters (MC1-3), highlighting that the simple M1/M2 polarization phenotype is too reductive and there are no clear "markers". The authors associate these profiles with morphology and macrophage behaviour. Differential gene expression analysis, immunohistochemistry on original tumor samples, and knock-out experiments in zebrafish subsequently identified and confirmed LGALS1 as a primary regulator of immunosuppression.

      Chen et al. (DOI: 10.1002/ijc.32102) had previously shown the immunosuppressive effect of LGALS1, but this work shows as a proof of concept that the authors' approach is a valuable and interesting way to find immune regulators.

      Response: We fully agree with Reviewer #3. In fact, the immunosuppressive role of LGALS1 has already been described by several research groups (Van Woensel et al, 2017; Verschuere et al, 2014), which indeed proves that our approach is valid. The reference cited by the reviewer was already included in the manuscript, along with other references.

      Major comments:

      In general, claims are supported by data: very carefully presented and well characterized data with numbers and stats. It is an interesting descriptive study that illustrates the complexity and diversity of glioblastoma and the induced TME. I just have a few comments or clarifications that I would like to have elucidated:

      • I did not understand why not single cell sequence the original tumor - without in vitro passaging and have the original patient population of MACs/microglia and monocytes sequenced? In other words why sequence the in vitro system-with its inherent caveats of in vitro culturing and not the original tumor? Can you please clarify.

      Response: We agree with Reviewer #3 that our in vitro model does indeed have caveats inherent to patient-derived cell culture models. However, we chose this model to specifically focus on the reciprocal interaction between GBM tumor cells and macrophages in a way that also allows us to investigate how perturbations affect these interactions. This is not possible when using original tumors (e.g. we cannot make KO cells, as we did for LGALS1, and study the effects of genes of interest). (See also the response to the comment of Reviewer #2)

      We do have scRNAseq data from one original tumor sample (LBT123) that is currently being analyzed. Unfortunately, scRNAseq is not available for the other tumor samples. Also, for some of the patients, there is no original material left to use for sequencing. For LBT123, we will compare the scRNAseq data from the original tumor with the in vitro data from the co-culture model.

      • Mac signatures - out of curiosity- authors could not find TNFa and IFN signatures in any population?

      Response: Our analyses did not reveal TNF or IFN as cluster signature genes. However, we did find that TNF expression was slightly higher in MC2, the pro-inflammatory macrophages, although still at low levels. We did not find IFN expression in the macrophage subclusters, but we did find low expression of some IFN receptors. We found a gradient for IFNGR1 with the highest expression in MC3, followed by MC1 and the lowest expression in MC2. IFNGR2 was expressed at slightly higher levels in MC1 compared to the other subclusters. IFNAR1 and IFNAR2 were expressed at comparable low levels in all subclusters. Finally, IFNLR1 expression was higher in MC3 compared to the other two macrophage subclusters. Considering the overall low expression of IFN receptors, we believe that the differences in expression are rather negligible. Furthermore, it has been previously shown that IFN exerts its anti-tumor effect primarily through the responsiveness of endothelial cells and not of myeloid cells, such as macrophages (Kammertoens et al, 2017). Since vascular cells were not present in the co-culture model, low IFN receptor expression is not surprising. We are happy to investigate this in more detail and include it if deemed useful.

      • Figure 8: please show controls side by side with the KO.

      Response: We thank Reviewer #3 for this comment. We are not quite sure which panel the reviewer is referring to. If it is panel F, we agree with Reviewer #3 and have changed the order of the bars in the revised version. If it is panel E, the corresponding control images are shown in Figure 5I. Since we believe that these images should not be repeated, we have added a figure reference to Figure 5I in the figure legend of Figure 8, in addition to the figure reference already provided in the text. Furthermore, images of all embryos are presented side by side in Figure S8D-E.

      • Figure 5: if each pair of images were separated and had the legend on top, it would be easier to read and follow.

      Response: We appreciate the comment that the figure should be intuitively easy to read and follow. However, we have chosen a compromise between overview and visibility of details (e.g. morphological features of GAMs). Since this figure already has the maximum width, the images would become smaller if they needed to be separated. Reducing the size would compromise the visibility of important details.

      Significance:

      It is a very interesting study, carefully designed and performed, that highlights the heterogeneity of glioblastoma and how GBM can modulate the macrophage population into 3 different subsets. This study constitutes a proof of concept of the combination of an in vitro approach and an in vivo approach to find new players and treatments in glioblastoma. I believe that it would be important and interesting to have the original tumor sequenced to compare to the in vitro platform and understand how the in vitro selection impacts the tumor biology, and even whether it changes the heterogeneity and differential composition of the tumor and macrophage profiles.

      References:

      Abdelfattah N, Kumar P, Wang C, Leu JS, Flynn WF, Gao R, Baskin DS, Pichumani K, Ijare OB, Wood SL, et al (2022) Single-cell analysis of human glioma and immune cells identifies S100A4 as an immunotherapy target. Nat Commun 13

      Almstedt E, Rosen E, Gloger M, Stockgard R, Hekmati N, Koltowska K, Krona C & Nelander S (2022) Real-time evaluation of glioblastoma growth in patient-specific zebrafish xenografts. Neuro Oncol 24: 726–738

      Breznik B, Motaln H, Vittori M, Rotter A & Turnšek TL (2017) Mesenchymal stem cells differentially affect the invasion of distinct glioblastoma cell lines. Oncotarget 8: 25482–25499

      Jayakrishnan P, H. Venkat E, M. Ramachandran G, K. Kesavapisharady K, N. Nair S, Bharathan B, Radhakrishnan N & Gopala S (2019) In vitro neurosphere formation correlates with poor survival in glioma. IUBMB Life 71: 244–253

      Chen JWE, Lumibao J, Leary S, Sarkaria JN, Steelman AJ, Gaskins HR & Harley BAC (2020) Crosstalk between microglia and patient-derived glioblastoma cells inhibit invasion in a three-dimensional gelatin hydrogel model. J Neuroinflammation 17

      Chen Q, Han B, Meng X, Duan C, Yang C, Wu Z, Magafurov D, Zhao S, Safin S, Jiang C, et al (2019) Immunogenomic analysis reveals LGALS1 contributes to the immune heterogeneity and immunosuppression in glioma. Int J Cancer 145: 517–530

      Coniglio S, Miller I, Symons M & Segall JE (2016) Coculture assays to study macrophage and microglia stimulation of glioblastoma invasion. Journal of Visualized Experiments 2016

      Hamilton L, Astell KR, Velikova G & Sieger D (2016) A zebrafish live imaging model reveals differential responses of microglia toward glioblastoma cells in vivo. Zebrafish 13: 523–534

      Hubert CG, Rivera M, Spangler LC, Wu Q, Mack SC, Prager BC, Couce M, McLendon RE, Sloan AE & Rich JN (2016) A three-dimensional organoid culture system derived from human glioblastomas recapitulates the hypoxic gradients and cancer stem cell heterogeneity of tumors found in vivo. Cancer Res 76: 2465–2477

      Kammertoens T, Friese C, Arina A, Idel C, Briesemeister D, Rothe M, Ivanov A, Szymborska A, Patone G, Kunz S, et al (2017) Tumour ischaemia by interferon-γ resembles physiological blood vessel regression. Nature 545: 98–102

      Li H, Yan X & Ou S (2022) Correlation of the prognostic value of FNDC4 in glioblastoma with macrophage polarization. Cancer Cell Int 22

      Pan H, Xue W, Zhao W & Schachner M (2020) Expression and function of chondroitin 4-sulfate and chondroitin 6-sulfate in human glioma. FASEB Journal 34: 2853–2868

      Pombo Antunes AR, Scheyltjens I, Lodi F, Messiaen J, Antoranz A, Duerinck J, Kancheva D, Martens L, De Vlaminck K, Van Hove H, et al (2021) Single-cell profiling of myeloid cells in glioblastoma across species and disease stage reveals macrophage competition and specialization. Nat Neurosci 24: 595–610

      Póvoa V, Rebelo de Almeida C, Maia-Gil M, Sobral D, Domingues M, Martinez-Lopez M, de Almeida Fuzeta M, Silva C, Grosso AR & Fior R (2021) Innate immune evasion revealed in a colorectal zebrafish xenograft model. Nat Commun 12

      Pudelko L, Edwards S, Balan M, Nyqvist D, Al-Saadi J, Dittmer J, Almlöf I, Helleday T & Bräutigam L (2018) An orthotopic glioblastoma animal model suitable for high-throughput screenings. Neuro Oncol 127: 415

      Rampazzo E, Persano L, Pistollato F, Moro E, Frasson C, Porazzi P, Della Puppa A, Bresolin S, Battilana G, Indraccolo S, et al (2013) Wnt activation promotes neuronal differentiation of glioblastoma. Cell Death Dis 4

      Van Woensel M, Mathivet T, Wauthoz N, Rosière R, Garg AD, Agostinis P, Mathieu V, Kiss R, Lefranc F, Boon L, et al (2017) Sensitization of glioblastoma tumor micro-environment to chemo- and immunotherapy by Galectin-1 intranasal knock-down strategy. Sci Rep 7: 1–14

      Vargas-Patron LA, Agudelo-Dueñas N, Madrid-Wolff J, Venegas JA, González JM, Forero-Shelton M & Akle V (2019) Xenotransplantation of human glioblastoma in zebrafish larvae: in vivo imaging and proliferation assessment. Biol Open 8

      Verschuere T, Toelen J, Maes W, Poirier F, Boon L, Tousseyn T, Mathivet T, Gerhardt H, Mathieu V, Kiss R, et al (2014) Glioma-derived galectin-1 regulates innate and adaptive antitumor immunity. Int J Cancer 134: 873–884

      Vittori M, Breznik B, Hrovat K, Kenig S & Lah TT (2017) RECQ1 helicase silencing decreases the tumour growth rate of U87 glioblastoma cell xenografts in zebrafish embryos. Genes (Basel) 8

      Zhou F, Shi Q, Fan X, Yu R, Wu Z, Wang B, Tian W, Yu T, Pan M, You Y, et al (2022) Diverse macrophages constituted the glioma microenvironment and influenced by PTEN status. Front Immunol 13

    1. Author Response

      Reviewer #1 (Public Review):

      The paper describes a robotic system that can be used for prolonged recording of forced activity in crawling Drosophila larvae. This is mostly intended to be a proof of principle description of a tool potentially useful for the community. The system - whose value lies completely in its reproducibility and adoption - is only superficially described in the paper, but a more detailed description is made available through Github, along with the software used for the collection and analysis of data.

      There is good, convincing evidence this can work as some sort of "larval conveyor belt", used to artificially prolong food crawling behaviour in the animals. More could be said about the ecological implications of the assay (for instance: how relevant is it to an animal's natural behaviour? Does the system introduce artifactual distortions in the analysis, driven by the fact that animals crawl greater distances than they would normally crawl in nature? Will this extensive activity affect their development to pupation or adulthood?).

      In addition to all our code being available on GitHub, we have added substantially to Materials and Methods in the manuscript (1-1.5 pages), detailing the analysis pipeline more thoroughly.

      We agree that a more thorough comparison of ecological vs. laboratory conditions was warranted here, and have addressed this in new Discussion section material (6th paragraph especially). The developmental effect due to prolonged locomotion is a very good point – with only a single animal measured for more than 24 hours, we do not yet know whether instar molting or pupation is delayed, but this could certainly be a concern in longer experiments moving forward.

      Reviewer #3 (Public Review):

      "Continuous, long-term crawling behavior characterized by a robotic transport system" by Yu et al. presents their new robotic device to track, reposition, and feed Drosophila larvae as they crawl on an arena. By using a water droplet (or if necessary, suction) to transport larvae from the edge of the arena to the middle, long behavior trajectories can be recorded without losing larvae from the arena or camera field of view. The picker robot is also able to dispense small amounts of apple juice at precise locations to keep larvae alive for extended periods although the food was not sufficient to trigger molting and the development to the next instar stage.

      The approach is interesting, but the authors could provide more details on why the approach is necessary for non-expert readers. For example, what are the advantages of using the robot picker compared to simply confining larvae in a closed arena? It's not obvious (to me) that being picked back to the center of the arena is a smaller perturbation compared to running into a chamber wall and changing direction.

      Thank you for this suggestion, it’s a very good point. We have expanded our Introduction considerably, and directly address this issue (4th paragraph in particular). We do quantify the perturbation due to robot pick-ups and drop-offs (Fig. 3D), but that only addresses the short term. We prefer not to use a closed arena for three reasons: (1) in a gradient navigation experiment, reaching the edge would effectively end “navigation” and we would be unable to study that behavior over longer times, (2) larvae can crawl up the sides of walls and will be lost to the tracker (they do this all the time in the Petri dishes they are raised in), and (3) larvae often do not bounce off walls and resume crawling, they tend to dwell near edges they find. To this last point, we have added a new Supplemental figure (Figure 1 – supplement 1) illustrating this effect with a representative example.

      The first paragraph of the introduction emphasizes the multiple time scales that are relevant for behavior from rapid stimulus response up to developmental times. This is to set the context of the authors' contribution but I'm not sure it's a fair representation of the state of the art. For example, the authors state that high-bandwidth measurement over long times is prohibitive and cite three Drosophila papers, but there are home-cage monitoring systems that allow continuous recording of mouse behavior over long times with high resolution. At the other end of the spectrum, there have been some long-term behaviour experiments done on worm behaviour with reasonably high time resolution (e.g Stern et al. 10.1016/j.cell.2017.10.041).

      This is absolutely correct, the context needed to be much broader than our own prior larva results. We have overhauled that section and written a wider introduction that includes the C. elegans paper you mentioned, and also brings in other model systems like adult flies, mice, and rats. We frame our own work as (1) in a new animal, for long term measurements; (2) investigating non-confined free locomotion over a long time scale.

      The authors train a neural network to segment and track the larvae, however, little information is given on the training process and I don't think it would be possible to reproduce the model based on the description. More details of the network, hyperparameters, and training data would be required to evaluate it.

      Definitely! We have added a new section to Materials and Methods (1-1.5 pages in length), detailing our analysis pipeline, with sections for position tracking, postural analysis, and behavioral classification.

      The authors also state several times that larval identity is maintained throughout the recording, but this isn't quantified. It's not clear whether identity is maintained across collisions of two or more animals by the tracking algorithm or whether these collisions simply don't happen in their data because density is low.

      This has also been addressed and clarified in the same new part of the Materials and Methods section. We quantify collision rates and give the accuracy of maintaining identity after collisions.

      The environment is nominally isotropic, but once larvae have been crawling on the surface for hours, including periodic feeding, there will likely be multiple gradients the larvae may sense. This may not be observable in the data, but should perhaps be mentioned in the text.

      This is certainly true. Other than the single animal 30-hour experiment described in the manuscript, there is no food introduced to the larvae during our 6-hour experiments. Looking ahead, the presence of food remnants in the arena could become a serious confounding factor in nominally isotropic experiments, as the reviewer points out. We have added substantially to the Discussion section to discuss various limitations of the design and experiments, and directly talk about the odor/taste stimuli being introduced by food (second to last paragraph in Discussion).

      The authors show that the picking action results in a small but detectable increase in speed. The degree of perturbation overall depends on the picking frequency so some quantification of the inter-pick time interval would help to interpret whether this perturbation is relevant for a particular experiment. Is there a difference in excitation when larvae are picked successfully on the first try compared to when multiple tries or suction are required?

      We have now quantified the amount of time between pickups and added that in the Materials and Methods section directly (it’s 0.87 pick-ups per hour per animal). We do not have a sufficient amount of data to determine whether there is a statistically significant difference in behavior for multiple pickup attempts – this can also be confounded because sometimes an unsuccessful pickup is one that does not touch the larva at all (so would presumably not introduce additional perturbations).

      From the reconstructed trajectory in Figure 4, this interval looks very long compared to the speed increase after picking. When reconstructing the trajectory, how are the segments joined? Is it simply by resetting the xy position, or also by rotating to match the previous direction of travel? (I'm guessing the larva can rotate during transport?)

      We have updated the Figure 4 caption to make it clear that the segments are only joined translationally, by resetting the xy position.
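
      A minimal sketch of translation-only joining, as described above: each segment after a pick-up is shifted so that it starts where the previous segment ended, and no rotation is applied. The function name and the list-of-points data layout are hypothetical and chosen only for illustration; this is not the code from the manuscript's pipeline.

```python
def stitch_segments(segments):
    """Join trajectory segments by translation only.

    Each segment is a list of (x, y) points recorded between two pick-ups.
    Every segment after the first is shifted so that its first point lands
    on the last point of the stitched path; headings are left unchanged.
    """
    stitched = []
    for seg in segments:
        if not seg:
            continue  # ignore empty segments
        if not stitched:
            stitched.extend(seg)
            continue
        # Offset that moves the new segment's start onto the current end point.
        dx = stitched[-1][0] - seg[0][0]
        dy = stitched[-1][1] - seg[0][1]
        # The shifted first point coincides with the join point, so skip it.
        stitched.extend((x + dx, y + dy) for x, y in seg[1:])
    return stitched


# Hypothetical example: the second segment starts at the drop-off point (5, 5).
path = stitch_segments([[(0.0, 0.0), (1.0, 0.0)], [(5.0, 5.0), (5.0, 6.0)]])
```

      Because only an offset is applied, any rotation of the larva during transport shows up as a change of heading at the join.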

      The authors present a simple model in Figure 6 to illustrate the differences between individuals that can be hidden when looking at population distributions. However, the differences they show in the simulation don't seem relevant to the differences they observe in the experiments. Specifically, Fig. 6A and B show a contrast between individuals with similar mean speeds compared to individuals with different (but still unimodal) mean speeds. In contrast, the experimental data in Fig. 6D shows individual distributions that are quite similar but that are bimodal. So, there is indeed a difference between the individual distributions that is obscured in the population distribution, but is there evidence of larval personality types (line 444)? Similarly, the sentence beginning line 381 doesn't seem right either.

      We are really glad this was brought up so that we could clarify better in the text, as it’s an important point. We have edited the text in the Results subsection related to Figure 6 and the Figure 6 caption to clear things up. The individual distributions in 6D are not bimodal, there are 38 traces shown that are all essentially unimodal. In addition to stating this directly in the text, we have quantified this by adding the average BC for individuals in both isotropic and thermal gradient contexts (they are essentially the same, i.e. equally unimodal in both cases).

    1. Author Response

      Reviewer #1 Public Review:

      1) “…The authors make reasonable assertions, but all of these need to be validated by electrophysiological studies before they can be treated as fact. Instead, they should be treated as predictions. For example, in the conclusions from the model section, that endbulb size does not strictly predict synaptic efficacy should be modified from an assertion to a prediction.”

      The reviewer makes an important point. We realize that, despite describing the data as the output of a model, we needed to be clearer that the model output is in fact a set of predictions to be tested experimentally. In the reorganization of the results, we collect the model output explicitly in a section named “Model Predictions”, and list five classes of predictions that describe explorations of bushy cells. The fifth set of predictions was previously a separate section but should now be better appreciated as conveying hypotheses since it is incorporated into this newly named section. Please note that the hypotheses are constrained to varying extents by the high-resolution structural data we present, such as the estimation of synaptic weights from the counts of synapses. The compartmental models for each bushy cell also are constrained by the structural data and published biophysical and electrophysiological properties of the cells. The pipeline to create the models is described in its own section now using that terminology: “A pipeline for translating high-resolution neuron segmentation into compartmental models consistent with in vitro and in vivo data.”, which we hope conveys the notion that the modeling framework is indeed a template that can be applied to future experimental data. We explicitly make this latter point in the new Discussion section “Toward a complete computational model for globular bushy cells: strengths and limitations”.

      Reviewer #2 Public Review:

      1) …” While this is technically impressive (in regards to both the structure and modelling) there are significant weaknesses because this integration makes massive assumptions and lacks a means of validation; for example, by checking that the results of the structural modelling recapitulate the single-cell physiology of the neuron(s) under study. This would require the integration of in vivo recorded data, which would not be possible (unless combined with a third high throughput method such as calcium imaging) and is well beyond the present study.

      We appreciate the support for our approach, and we now make explicit in the manuscript that the output of the models should be interpreted as predictions for eventual experimental testing. We also consider in the Discussion some experimental procedures that might be used to test the predictions. Ca2+ imaging is currently too slow a reporter for the rapid synaptic events and integration time constant for bushy cells, as the reviewer knows, and we think (and present in the Discussion, section 2) that focal optical stimulation simultaneous with recording from fast voltage sensors are potential avenues to achieve this goal.

      2) The authors need to be more open about the limitations of their observations and their interpretations and focus on the key conclusions that they can glean from this impressive data set.

      As indicated in response to a similar comment from Reviewer 1, we have collected the primary limitations and discuss them in a new section within the Discussion, entitled “Toward a complete computational model for globular bushy cells: strengths and limitations”.

      3) The manuscript would be considerably improved by re-writing to focus the science on the most important results and provide clear declarations of limitations in interpretation.

      We have extensively re-organized and re-written the text to highlight the key structural observations (Figures 1-3, 7-8), the pipeline from structure to model (Figure 4) and interleave structural observations with the outputs of the model (Figures 5-6, 8). The latter are explicitly detailed in a new section called “Model Predictions”. These predictions are organized into five classes. We think that this new organization will improve communication of the key results, and further highlights the key discoveries from structural analysis and predicted functional mechanisms as explored in the compartmental models.

      Reviewer #3 Public Review:

      1) The authors extract here from the longer introductory commentary a one-sentence summary of the strengths of the manuscript, and thereafter focus on the weaknesses, since this document emphasizes our response to those critiques. To quote reviewer #3: “The strengths of this paper are that the authors obtained unprecedented high-resolution 3-D images of the AN-bushy cell circuit, and they implemented a biophysical model to simulate the neural processing of AN inputs based on these structural data. … The biophysical modeling, although lacking comparison with in vivo physiological data due to the chosen species (mice), is also solid and well documented.”

      We appreciate that the reviewer acknowledges the attention to detail that entered into the nanoscale imaging, cell reconstructions, building the modeling pipeline and constructing the compartmental models.

      2) Despite the high quality of the data, the paper is marred by the species they chose: there are very few published in vivo single-unit results from mouse bushy cells, so it is hard to evaluate how well the model predictions fit the real-world data, and how the structural findings address the “fundamental questions” in physiology. … No rationale (e.g. use of molecular tools or in vitro physiology) is given why the authors focus on the mouse. It seems that the analyses provided here could as well have been done on a species with good low-frequency hearing, which may have provided a much more interesting case for understanding the spectacular temporal transformation performed by bushy cells.

      We now report our reasons, in the first paragraph of the Results, for selecting the mouse. One reason for choosing mouse was that biophysical properties of bushy cells, which were important parameters to constrain the compartmental models, were collected from mice. These data are collected from dissociated cells and from brain slices, and these experiments continue to be more tractable in mice. The second reason is that mice are used in nanoscale and light microscopy connectomic studies because their neurons, cell groups and entire brain are smaller, so that a given volume of imaged brain will contain more cellular elements. These other connectomic studies provide a template for eventual comparisons among brain regions. Our overall goal is to image the entire cochlear nucleus, and the size of the mouse brain makes this goal tractable given current technology. Indeed, we are currently analyzing an image volume of the more rostral ventral cochlear nucleus that is about 5x larger than this image volume and collected with a much better signal to noise ratio. The third reason for choosing mouse was so that the current project could be augmented by genetic tools to further classify cochlear nucleus (CN) neurons and their extrinsic inputs, and potentially manipulate neural circuits in future studies. For example, the atoh7 (math5) and hhip gene products are markers for subsets of bushy cells, suggesting the presence of molecular subtypes of this cell class (Jing et al. 2023).

      3) If we look at data from other animals such as cats and gerbils, it is true that high-frequency (globular) bushy cells show envelope phase locking, but compared to ANs they are at best only moderately enhanced (gerbils: Frisina et al. 1990: Fig 7 and 10; cats: Joris and Yin 1998 Fig 4); the most prominent enhancement is actually to the temporal fine structures of low-frequency bushy cells (cells tuned to < 1 kHz), which mice lack. Furthermore, the temporal modulation transfer function (tMTF, i.e. the vector strengths vs modulation frequency plots in Fig 7O of the paper) of (globular) bushy cells are mostly low-pass filtered, with a cutoff frequency close to 1 kHz, and the highest vector strength rarely surpasses 0.9 (cats: Rhode 1994 Fig 9, 16, Rhode 2008 Fig 8G, Joris and Yin 1998 Fig 7; and there's one report from mice: Kopp-Scheinpflug et al 2003 Fig 8). Thus, the band-pass tMTFs tuned to 100-200 Hz with vector strengths > 0.9 or 0.95 in this paper (Fig 7O, Fig 8M) do not really match known physiology (in non-mouse species). Again, we know very little about in vivo physiology of mouse (globular) bushy cells and there is of course a possibility that responses in mice may be closer to the predictions of this paper.

      We agree that there are (unfortunately) few studies in mouse that can be compared with our simulations. With regard to the tMTFs, we can make a couple of points. First, we note that the stimuli used for all the panels except P2 in Figure 6 (previous Figure 7) were at 15 dB SPL, which is the level where maximal envelope phase-locking occurs in the low-threshold ANF inputs. This choice was based on previous experimental work that examined the intensity dependence for SAM stimuli in the auditory nerve (Smith and Brachman, 1980; Joris and Yin, 1992; Cooper et al, 1993; Dreyer and Delgutte, 2006, Figure 2B, Figure 3). Second, Figure 6, Supplemental Figure 1 confirms the behavior of the auditory nerve model used for input to the bushy cells (Rudnicki and Hemmert (2017) implementation), replicating Zilany et al., 2009, Figure 13D. These results show that phase-locking decreases at higher intensities as expected from the experimental work. Relevant to this topic, the lone report of responses to SAM stimuli in mice (Kopp-Scheinpflug et al. 2003) used 100% SAM at CF at 80 dB SPL. At this high intensity, it is expected that the envelope phase locking at CF will be less than at lower intensities because of rate saturation in the high and medium spontaneous rate ANFs (Carney, JARO 2019; Joris and Yin, 1998). In guinea pig, envelope phase locking is greater in low-SR fibers at 80 dB SPL than in medium and high SR fibers, but it is still lower than at its peak at about 50 dB SPL (Cooper et al., 1993). All of these experimental observations therefore lead to the prediction that the SAM envelope locking in Kopp-Scheinpflug et al. (2003) should be lower than in our simulations.

      In addition, Kopp-Scheinpflug et al. (2003) did not report from which VCN cell populations the cells were recorded. If the recorded cells were a heterogeneous mixture of bushy and multipolar cells, then their data are not directly comparable to our model predictions. The stimulus intensity also needs to be considered for comparison with the work of Rhode (1994), whose lowest stimulus level is 30 dB SPL (Figure 9), and who also used a different stimulus, 200% SAM, and with the work of Frisina et al. (1990), who used 50 dB SPL. Interestingly, Figure 14D in Rhode (1994) shows a synchrony coefficient ranging from 0.5 to 0.9 at 30 dB SPL at 300 Hz modulation, which is similar to what we predict in Figure 6P2. We also remind the reviewer that our simulations did not include the effects of feed-back inhibition at CF (Caspary and Palombi, 1994; Campagnola and Manis, 2014; Xie and Manis, 2014, Keine et al. eLife 2016), which may affect phase synchrony in complex ways (Gai and Carney, 2008). One important feedback pathway arises from the tuberculoventral cells of the DCN (Wickesberg and Oertel, 1991; Campagnola and Manis, 2014), but the envelope synchrony behavior of those cells is not known.

      Thus, we now emphasize in the revised manuscript (in the Discussion) considerations of stimulus intensity used across published studies, citing the works above, the relatively high vector strengths at low modulation frequency, and that these simulation results are currently predictive. The simulations are also limited in that we used only one configuration of ANF inputs (low-threshold, high SR). This ANF SR category was selected to be consistent with the suggestion by Liberman (1991) that the globular BCs receive input principally from the low-threshold high-SR fibers. Mixtures of input SR classes would be expected to change the envelope representation at higher intensities. Finally, the parameter space is quite large (intensity x frequency x [ANF distributions], x inhibition) and is better explored in a separate study once we are able to provide better or additional constraints to the modeling framework. Also, to put the selection of SAM stimuli in context, we indicate that mice can encode temporal fine structure although only as low as 1 kHz, but at similar VS to larger rodents such as guinea pig (Taberner and Liberman 2005; Palmer and Russell 1986).
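
      For readers less familiar with the vector strength (VS) values discussed above, the following sketch computes the standard quantity VS = |(1/N) Σ exp(i 2π f t_k)| for spike times t_k and modulation frequency f. The function name and the example spike trains are hypothetical and serve only to illustrate the definition; they are not taken from the simulations.

```python
import cmath
import math


def vector_strength(spike_times, mod_freq_hz):
    """Vector strength of spikes relative to a modulation frequency.

    Each spike is mapped to a unit vector at its phase within the modulation
    cycle; VS is the length of the mean vector. It ranges from 0 (no phase
    locking) to 1 (all spikes at exactly the same phase).
    """
    if not spike_times:
        raise ValueError("need at least one spike")
    total = sum(cmath.exp(2j * math.pi * mod_freq_hz * t) for t in spike_times)
    return abs(total) / len(spike_times)


# Hypothetical spike trains (seconds) for a 100 Hz modulation:
# one spike per cycle at the same phase, then spikes spread evenly over a cycle.
locked = vector_strength([0.0, 0.01, 0.02, 0.03], 100.0)
spread = vector_strength([0.0, 0.0025, 0.005, 0.0075], 100.0)
```

      In this toy case `locked` is close to 1 and `spread` is close to 0, which brackets the 0.5 to 0.95 range of values compared in the text.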

      Reviewer 4: Public comments

      1) The authors have collected an impressive array of physiological data and provided some beautiful 3D images of SBCs with dendrites. These are clearly strengths. The computational models for mechanisms of SBC responses, however, are made to fit what may be inadequate anatomical data. Instead of conclusions, perhaps they need to reword their discussions to refer to the anatomy as hypothetical substrates.

      It is true that the SBEM image volumes have strengths and limitations. We now collect these considerations in the second section of the Discussion, “Toward a complete computational model for globular bushy cells: strengths and limitations”. One limitation of this volume is that we do not have sufficient resolution to categorize synaptic vesicles by shape and must infer their excitatory or inhibitory nature. Note that tracing inputs to a source neuron, such as tracing the endbulbs to parent auditory nerve fibers, solves this problem, but the smaller terminals remain problematic in this regard. The goal is to not only assign excitatory or inhibitory phenotype, but also a cell type of origin, so that actual spike patterns, evoked by sound, can be provided as inputs to the model. The compartmental model is detailed, and amenable to mapping this information from other experiments as it becomes available. Nanoscale imaging does provide detailed structural information in terms of surface areas, volumes and process diameters that is important in constraining the compartmental models, and that is not attainable by standard light microscopy approaches. These points are now made in the Results and in the Discussion, as mentioned earlier in this paragraph. And, as indicated in the responses to other reviewers, we highlight the model outputs as predictions to be tested experimentally.

    1. Author Response

      Reviewer #1 (Public Review):

      This work reports an important demonstration of how to predict the mutational pathways to antimicrobial resistance (AMR) emergence, particularly in the enzyme DHFR (dihydrofolate reductase). Epistasis, or non-additive effects of mutations due to their background dependence, is a major confounding factor in the predictability of protein evolution, including proteins that confer antimicrobial resistance. In the first approach, they used Rosetta to predict the mutant DHFR-drug binding affinity and the resulting selection coefficient, which then became inputs to a population genetics model. In the second approach, they use the observed clinical/environmental frequency of the variants to estimate the selection coefficient. Overall, this work is a compelling demonstration that a mechanistic model of the fitness landscape could recapitulate AMR evolution; however, considering that the number of mutations and pathways is small, a more compelling description of the robustness of the results and/or limitations of the model is needed.

      Major strengths:

      1) This is a compelling multi-disciplinary work that combines a mechanistic fitness landscape of DHFR (previously articulated in literature and cited by the authors), Rosetta to determine the biophysical effects of mutations, and a population genetics model.

      2) The study takes advantage of extensive data on the clinical/environmental prevalence of DHFR mutations.

      3) Provides a careful review of the surrounding literature.

      Major weakness:

      1) Considering that the number of mutations and pathways being recapitulated is rather small, I would suggest a more detailed description of the robustness of the results. For example:

      a) Please report the P-value for the correlation of the predicted DDG_{binding, theory} and DDG_{binding, experimental}.

      We thank the reviewer for the suggestion. We agree that the available experimental dataset is small, limiting the statistical power of the Pearson's correlation test to determine how well Flex ddG predicts binding free energy change. However, as highlighted in the manuscript, two earlier studies by Aldeghi et al. 2018 & 2019 considered much larger datasets and found a correlation in a similar range to the one we found here. Furthermore, as suggested by the Reviewer, we carried out a one-sided t-test with the alternative hypothesis that the correlation is greater than 0 and found a p-value of 0.040, suggesting the correlation we observed is significant. We have included this test and p-value in the Results section.
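
      A minimal sketch of the kind of test described above, assuming the usual t-test for a Pearson correlation, where t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The function names and the example values are hypothetical; the data below are not the ΔΔG values from the manuscript.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic for testing a correlation r from n pairs (n - 2 d.o.f.)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical predicted vs. measured values, for illustration only.
predicted = [0.1, 0.8, 1.5, 2.1, 2.4]
measured = [0.3, 0.5, 1.9, 1.8, 2.9]
r = pearson_r(predicted, measured)
t = correlation_t_statistic(r, len(predicted))
# The one-sided p-value (alternative: r > 0) is the upper tail of the
# t distribution with n - 2 degrees of freedom, e.g. scipy.stats.t.sf(t, n - 2)
# when SciPy is available.
```

      With few data points the degrees of freedom are small, so even a fairly large r can yield a p-value close to the conventional 0.05 threshold.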

      If interested in showing the correct assignment of mutational effects, perhaps use a contingency matrix to derive a P-value.

      As suggested by the Reviewer, we used a contingency matrix known as a confusion matrix to determine how accurate Flex ddG is at classifying mutations as stabilising or destabilising. This gave an accuracy of 0.89, a sensitivity of 0.83 and a specificity of 1. The p-value associated with this contingency table was 0.14, despite the high accuracy, sensitivity and specificity. This is likely due to the small sample size making it difficult to determine significance. This analysis has been included in the Results section.
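
      The three summary statistics quoted above follow directly from the four cells of a 2x2 confusion matrix. The sketch below shows the definitions; the function name is hypothetical, which class counts as "positive" is an assumption, and the counts are illustrative only, since the underlying table is not given in this response.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from 2x2 confusion-matrix counts.

    tp/fn: positive-class mutations classified correctly/incorrectly.
    tn/fp: negative-class mutations classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Illustrative counts, not the authors' data.
metrics = classification_metrics(tp=8, fn=2, tn=5, fp=1)
```

      With so few entries per cell, a single reclassified mutation shifts each metric noticeably, which is consistent with the remark that significance is hard to establish at this sample size.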

      b) Although the DDG_binding calculation in Rosetta seems to converge (Appendix figures 3 and 4), I do not think the DDG values before equilibration should be included in the final DDG estimate. In practice, there is a "burn in" number of runs where the force field optimizes the calculation to account for potential clashes in the structure, etc. This is particularly important since the starting structures are modeled from homology. Consequently, the distributions of DDG that include the equilibration runs are multimodal (Appendix figure 2), which means that calculating an average may be inappropriate.

      Each Flex ddG prediction is independent (see Figure 1 of Barlow et al. 2018 for a summary of the Flex ddG method), i.e. the distribution of values does not represent an MCMC process in which there is a burn-in in order to equilibrate. The structures of both the wild-type and mutant are equilibrated in each run using the backrub algorithm. So many runs are required because each prediction is drawn from a distribution of possible ddG values associated with that specific mutation, and the authors of Flex ddG suggest running 35 runs or more and taking the average of the distribution. Therefore, in order to get an accurate prediction, enough simulations must be run per mutation to adequately characterise the distribution so that the average converges to a constant value.
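
      The convergence criterion described above can be sketched as follows, assuming the per-run ddG values are available as a plain list. The values here are simulated from a hypothetical bimodal mixture rather than taken from any Flex ddG output, and the tolerance and the helper names (`running_mean`, `runs_until_stable`) are illustrative choices.

```python
import random

def running_mean(values):
    """Mean of the first k values, for k = 1 .. len(values)."""
    means, total = [], 0.0
    for k, value in enumerate(values, start=1):
        total += value
        means.append(total / k)
    return means

def runs_until_stable(values, tol=0.05):
    """Smallest run count after which the running mean stays within tol of the final mean."""
    means = running_mean(values)
    final = means[-1]
    stable_from = len(means)
    # Walk backwards until the running mean first leaves the tolerance band.
    for i in range(len(means) - 1, -1, -1):
        if abs(means[i] - final) > tol:
            break
        stable_from = i + 1
    return stable_from

# Hypothetical independent runs drawn from a two-mode mixture (kcal/mol).
rng = random.Random(0)
ddg_runs = [rng.gauss(0.5, 0.3) if rng.random() < 0.6 else rng.gauss(2.0, 0.4)
            for _ in range(200)]

print("mean ddG over all runs:", round(running_mean(ddg_runs)[-1], 3))
print("runs needed for a stable mean:", runs_until_stable(ddg_runs, tol=0.05))
```

      Because the runs are independent, every value contributes to the average and none is discarded as burn-in. A multimodal distribution simply means more runs are needed before the running mean settles.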

      2) The geographical areas over which the mutational pathways are independently estimated are not isolated, allowing for the potential that an AMR variant in one region arose due to "migration" from another area. For example, the S58R-S117N is the most frequent double mutant of PvDHFR in geographically proximate Southern/Southeastern Asia (Fig. 4). To a certain extent, similar mutational patterns occur for PfDHFR in Southern/Southeastern Asia (Fig. 3). Although accounting for mutant migration in the model may be beyond the scope of the study, a clear argument for the validity of the "isolated island" assumption is needed.

      The Reviewer is correct that some variants in one region may have arisen due to “migration” from another area. This would impact the method for inferring mutational pathways from regional isolate frequency data but not when considering the worldwide population. If this occurred, we would expect to see a multiple mutant appearing in a region without the precursor (single, double etc) mutations, even in the case of large sample size. However, this does not seem to have been an issue for the pathways we have been predicting here. If it were the case that a variant migrated, and the precursor mutations could not be found in that region, we could look to mutations from neighbouring regions to infer the pathway, under the assumption of migration.

      We have added some discussion on this between lines 517-523:

      “When inferring pathways at a regional level, it is possible we may encounter instances where genotypes with multiple mutations are observed in a specific region, but the precursor mutations in the pathway are absent. This could happen either due to insufficient sampling of the region or due to "migration" of the variant from a neighbouring region. To infer pathways in the former case more samples would be required, whereas in the latter case we can look to the data from neighbouring regions where the variant is present and use the frequency data of the precursor mutations.”
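
      The check described in this passage can be sketched as a simple set operation. The sketch assumes each region's isolates are summarised as a set of genotypes, each genotype being a set of mutations. The region names are hypothetical, the mutation labels are reused from the reviewer's example, and only immediate precursors (genotypes with exactly one fewer mutation) are considered.

```python
from itertools import combinations

def missing_precursors(region_genotypes):
    """Return, per region, the multi-mutation genotypes with no immediate precursor observed there."""
    flagged = {}
    for region, genotypes in region_genotypes.items():
        for genotype in genotypes:
            if len(genotype) < 2:
                continue  # single mutants have no mutant precursor
            # All genotypes that are exactly one mutation short of this one.
            precursors = {frozenset(c) for c in combinations(genotype, len(genotype) - 1)}
            if not precursors & genotypes:
                flagged.setdefault(region, []).append(genotype)
    return flagged

# Hypothetical regional observations (region names are placeholders).
observed = {
    "Region A": {frozenset({"S58R"}), frozenset({"S117N"}), frozenset({"S58R", "S117N"})},
    "Region B": {frozenset({"S58R", "S117N"})},
}

flagged = missing_precursors(observed)
for region, genotypes in flagged.items():
    for genotype in genotypes:
        print(region, sorted(genotype), "has no precursor in this region")
```

      A flagged genotype does not by itself distinguish insufficient sampling from migration. As the passage notes, the former calls for more samples and the latter for the precursor frequencies of a neighbouring region.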

    1. Author Response

      Reviewer #2 (Public Review):

      1) Analytical approaches are in the current form preliminary and not enough to draw firm biological conclusions. While the datasets are large (which is highly appreciated), they represent a relatively early stage of ENS development and possible differences between vagal and sacral-derived populations could partially be attributed to difference in maturity. Maturity will surely not explain the whole difference observed but needs to be factored into the interpretation. As scRNA-seq datasets from the mature chicken ENS are lacking (as well as detailed IHC-based neural classification system) the inference made in the paper between molecular classes and functional types are premature.

      We appreciate this comment and think it is an excellent suggestion that we definitely plan to do. This made us realize that we failed to clarify in the text why we chose this particular time point for our study, which is two-fold.

      First, we are particularly interested in how neural crest cells choose their prospective fates. E10 is a time when the post-umbilical gut has been completely populated by both vagal and sacral neural crest cells for 2 days, so cells are in the process of differentiation but there still exists a large precursor pool. For this reason, we can capture both precursors and some differentiated neuronal subtypes. We have clarified this point in the revised manuscript and now focus much more on the precursor population to identify both genes that are common to vagal and sacral neural crest cells and those that are distinct. This enables us to formulate testable hypotheses for the potential role of particular transcription factors in allocation of cell fate. Of particular interest, we find that at E10, the sacral neuronal precursor pool is largely depleted whereas the vagal crest has a substantial neuronal precursor pool. Thus, we believe this is the perfect time point for initial analysis.

      Second and perhaps even more important, in the US, chick embryos are not considered vertebrates until after E10. Thus, E10 represents the last timepoint we can raise embryos without animal approvals which are not currently in hand. We completely agree that performing experiments at later timepoints will be incredibly valuable and therefore are now applying for approvals. But realistically, these take several months and thus would delay publication of our datasets (already delayed due to Covid restrictions) for at least another year. Therefore, we propose to publish the mature dataset as a Research Advance that would focus on differences between mature neuronal subtypes between preumbilical vagal, post-umbilical vagal and sacral datasets that would nicely complement the current work. Instead, we have refocused this paper on the precursor to differentiated neuron transition.

      I should mention that this refocusing seems particularly important given that our original aim was to explore differences between vagal and sacral neural crest contributions to the gut. However, the single cell data reveals strong overlap between sacral and vagal neural crest contributions to the postumbilical gut, suggesting a strong environmental influence on cell fate decisions.

      Specific concerns:

      1) Analysis of scRNA-sequenced sacral- versus vagal-derived ENS reveals clusters consistent with a non-ENS identity (endothelial, muscle, vascular and more). Previous studies in mouse using the neural crest tracing line Wnt1-Cre have not demonstrated such diverse progenies of neural crest from any region, an exception being a small population of mesenchymal-like cells (Ling and Sauka-Spengler, Nat Cell Biol. 2019; Zeisel et al., Cell 2018; Morarach et al., 2021; Soldatov et al., Science 2019). Therefore, the claimed broad potential of 6 of 13 neural crest giving rise to diverse gut cell populations warrants more validating experiments.

      We thank the reviewer for this comment. We clarify that hematopoietic clusters have dropped out upon reanalysis. We believe the other clusters are real, based on gene markers used in previous studies to identify cell types, such as Mlana, Dct, and Mitf for neural crest-derived melanocytes.

      2) Several earlier studies have revealed that parts of the ENS are derived from neural crest cells that attach to nerve bundles, obtain a Schwann cell precursor-like identity and thereafter migrate into the gut (Uesaka et al. J Neurosci 2015 and Espinosa-Medina et al, PNAS 2017). The current work in chicken needs to be interpreted in the light of these findings, and the publications should be discussed in relevant sections of the introduction and discussion.

      Thank you for this suggestion. We agree and indeed our data cannot differentiate between SCPs, which are neural crest-derived, versus early migrating neural crest cells. We have added this point to the discussion and also discuss these papers in more detail.

      3) The analysis indicates the presence of melanocytes. It is not clear why they are part of the GI-tract preparations. Could they correspond to another cell type, with partially overlapping gene expression profile as melanocytes?

      We have assigned these as melanocytes based on expression of Mlana, Mitf, and Dct as highly upregulated genes. These have been used in previous studies to identify neural crest-derived melanocytes in the heart (Chen et al., 2021).

      4) As is evident, the sacral- and vagal-derived ENS are not clonally related. To decipher differentiation paths and relations between clusters, individual analyses of the different datasets are needed. With only one UMAP representing the merged datasets combined with little information on markers, it is hard to evaluate the soundness of the conclusions regarding cell-identities of clusters and lineage differentiation.

      This is an excellent suggestion and we apologize for not including this previously. We have now added individual pre-umbilical vagal, post-umbilical vagal and sacral neural crest datasets as well as trajectory analysis for each.

      5) E10 is a relatively early stage in chicken ENS development. Around E7, the intestines do not even contain differentiated neurons. The relatively high expression of Hes5 (marking mature enteric glia in the mouse; Morarach et al., 2021) in the vagal neural crest population might be explained by the more mature state of vagal versus sacral ENS. As also outlined below, Th/Dbh are known to be transiently expressed in the developing ENS, so they could indicate the relative immaturity of sacral neural crest rather than differential neural identities. These issues need to be taken into account when interpreting biology from scRNA-seq data.

      We completely agree. We now clarify that we are particularly interested in how neural crest cells choose their prospective fates. We chose the E10 time point because this reflects a time point when the post-umbilical gut has been completely populated by both vagal and sacral neural crest cells for 2 days so cells are in the process of differentiation but there still exists a large precursor pool. For this reason, we can capture both precursors and some differentiated neuronal subtypes. Notably, the sacral derived precursors seem to be glial in flavor whereas neuronal precursors appear to be absent. We have clarified this point in the revised manuscript.

      6) Unlike the guinea pig, and to some extent pig and murine ENS, the physiology of chicken enteric neurons has not been well characterized yet. Therefore, it is highly advisable to refrain from a nomenclature of clusters designating functions. Several key molecular markers are known to differ between murine, guinea pig, rat and human systems. IPANs are a good example where differential expression is seen (SST in human but not mice; CGRP labels some IPANs in mouse, but not in guinea pig, where Tac1 instead is expressed). IPANs are not defined in the chicken very well, and molecular markers found in other species may not be valid. Adrenergic and noradrenergic neurons have not been validated in the ENS (although TH and Dbh have been observed, especially in the submucosal ENS). Cholinergic neurons are also mentioned in the text, but do not appear in the figures as a defined group.

      Another reason to refrain from functional nomenclature is that a rather early stage is analysed in the present study, without possibilities to compare with scRNA-seq data from the mature chicken ENS (which was performed in Morarach et al, 2021 for the mouse). Recent data suggest that considerable differentiation may occur even in postmitotic neurons, and several markers are known to display a transient expression pattern (TH, DBH and NOS1; Baetge and Gershon 1990; Bergner et al., 2014; Morarach et al., 2021), so caution should be taken when assigning neuronal identities to clusters.

      This is an excellent point and we thank the reviewer for this valuable input. Accordingly, we have now renamed the clusters based on prominent gene expression rather than neuronal or precursor subtype. Indeed we struggled with finding appropriate names making this comment all the more useful.

      7) The immunohistochemical analysis (Figures 5 and 6) is an essential complementary addition and validation of scRNA-seq. However, it is very difficult to discern staining when magenta and red are combined to display coexpression.

      Good point. This has been changed to be more readily discernible and higher magnification views have been added.

      8) To give more information to the field and body of evidence for claims made, quantifications relating to the analysis in Figures 5 and 6 are warranted as well as an expanded set of marker genes that align with the scRNA-seq results.

      Good point. We have added additional markers as suggested. In terms of quantitation, we can include numbers of labeled cells in a particular region but this may give a false impression of degree of contribution since we are using different viruses for vagal vs sacral that may have different titers making it a bit like comparing apples and oranges. We now emphasize that our labeling approach does not mark the entire population and that the degree of labeling can be variable.

      9) Correlations between genes and functions/neuron class are in many cases wrong (including Grm3, Gad1, Nts, Gfra3, Myo9d, Cck and more).

      Good point. We have toned this down.

      10) Attempts to subcluster neuronal populations are needed (Figure 7). However, to understand the biology, it is important to address which cells are sacral versus vagal-derived. Additionally, related to previous comment, as the vagal and sacral neurons are not clonally related, it would be important to make separate analysis of neurons relating to each region.

      Good point. We have added additional analysis to address this important point in what is now Fig 6 and in particular validated sacral contributions to glial cells (new Fig 8).

    1. Author Response

      Reviewer #1 (Public Review):

      In the current work, the authors aimed to investigate the genetic and non-genetic factors that impact structural asymmetry.

      A major strength is the number of data samples included in the study to assess brain structural asymmetry. A consequence of the inclusion of many samples is then also the sample size.

      We thank the reviewer for their supportive and insightful comments that have helped improve our paper.

      Comment #1: Given that the authors also work with longitudinal data, it would be nice to be able to appreciate the individual effects across time points, this is now a little unclear.

      Our lifespan analysis incorporated both single and repeat measures over time in the trajectory estimation, and hence these will be an intermediate estimate of cross-sectional and longitudinal trajectories. We have clarified this in the Methods (see 1). A comprehensive analysis of the individual-specific asymmetry change effects in the current paper is thus hindered by many properties of the data, including that many participants contribute a single measure, that participants vary in their number of repeat-measures (1-6 timepoints), that the number of repeat-measures is dependent on age, and that the degree of asymmetry change differs between cortical metrics, clusters, and along the age variable. Most importantly, the average degree of asymmetry change is small; Fig. 3 indicates thickness asymmetry typically corresponds to a ~0.1 - 0.2mm difference, such that changes therein will be smaller and thus likely unclear at the individual level. Nevertheless, we have modified the average plots in Figures 2 and 3 to allow better visualization of the individual hemispheric measures across timepoints, as well as an appreciation of the density of our longitudinal data.

      1 – (line 646) “GAMMs incorporate both single and repeat measures over time to capture nonlinearity of the mean level trajectories across persons, resulting in population estimates that are intermediate between cross-sectional and longitudinal trajectories”

      Comment #2: A possible less well-developed approach is the genetic basis, as this was stated as the main question, here the investigations are not that deep and may only touch upon the question.

      We agree the previous formulation of our Abstract did convey this impression, and have thus made the following important amendment:

      (Abstract) “Cortical asymmetry is a ubiquitous feature of brain organization that is subtly altered in some neurodevelopmental disorders, yet we lack knowledge of how its development proceeds across life in health. Achieving consensus on the precise cortical asymmetries in humans is necessary to uncover the developmental timing of asymmetry and extent to which it arises through genetic or later influences in childhood.”

      Our paper aims to serve as a critical reference for the normative childhood development and lifespan change of cortical asymmetry. We performed heritability analyses as they are informative regarding development and shed light on the timing of influences shaping cortical asymmetry (also possibly prior to age ~4 at which our sample starts). Similarly, genetic correlation analysis sheds light on whether the replicable interregional correlations are underpinned by genetic differences, indicative of coordinated genetic development of asymmetries. We apologize the rationale behind these analyses was not well-specified, and have clarified this (see response #4). Thus, we respectfully disagree the genetic aspect represented the main research question, but rather lends support to our developmental perspective.

      Given the density of analyses already included and that these are well-specified within the context of our overarching question, we do not see how adding more genetic analyses will be beneficial for our paper. However, we agree with the Reviewer’s subsequent comment (#8) that the genetic correlations in HCP data should also have been reported, and now incorporate these (see response #8).

      Comment #3: Moreover, the association with cognition, handedness, sex, and ICV is somewhat interesting yet seems also a bit minimal to fully grasp its implications.

      In the asymmetry field it has been commonplace to assume these factors are strongly related to asymmetry, particularly sex. Here, despite optimizing the delineation of asymmetries, associations with factors purportedly related to it were all very small. We believe this is an important message that may help reorient the field away from entrenched views; unless we show it is not the case, researchers may think the effects of these factors are larger than they are. Further, because questions pertaining to sex and handedness differences will certainly arise for many, we chose to address them by quantifying the average effects in big data, because our lifespan trajectory analysis was not well-suited to assessing e.g. sex differences in asymmetry trajectories (i.e. 3-way non-linear interactions; sex × age × hemisphere). We have strengthened the reasoning for this analysis in the Introduction (see 1):

      1 – (line 118) “Therefore, as a final step, we reasoned that combining an optimal delineation of population-level cortical asymmetries with big data would optimize detection and quantification of the effects of factors commonly assumed important for asymmetry, namely general cognitive ability, handedness and sex.”

      Contrary to approaches that often place emphasis on p-values (e.g. pheWAS), our targeted approach using variables long considered important for asymmetry enabled transparent reporting of the effect sizes and directions. We hope the Reviewer agrees we have taken care in this regard, and are careful to communicate the found effects are small. The small effects seem typical of structural brain associations in big data, as may be expected when relating complex phenotypes to any single structural measure. For these reasons, we opt not to extend the analysis beyond our initial targeted approach, arguing instead that the size of the effects is reason enough to report them.

      Despite being small, however, we argue they are not negligible (see 2-4). Of note, though it may appear so in Fig. 7, the p-value for the cognitive association was far from just surviving Bonferroni correction (it would survive >13,000 comparisons at our alpha level [⍺=.01], whereas we corrected for our 136). Note we did not accept a 5% false positive rate. We have clarified this in the Results (see 5):

      2 – (line 485) “Other factors commonly espoused to be important for asymmetry were associated with only small average effects in adults. For example, we found one region – SMG/perisylvian – wherein higher leftward areal asymmetry related to subtly higher cognitive ability. Since interhemispheric anatomy here is likely related to brain torque 2,3, this may agree with work suggesting torque relates to cognitive outcomes 4,5. Interestingly, that ~94% of humans exhibit leftward asymmetry in this region (Figure 1G) suggests tightly regulated genetic-developmental programs control its lateralized direction in humans (see Figure 6). This result may therefore suggest disruptions in areal lateralization early in life are associated with cognitive deficits detectable in later life as small effects in big data 6. While speculative, this may also agree with evidence that differences in general cognitive ability that show high lifespan stability 6 relate primarily to areal phenotypes formed early in life 7–9.”

      3 – (line 461) “We also found areal asymmetry in anterior insula is, to our knowledge, the most heritable asymmetry yet reported with genomic methods 10–14, with common SNPs explaining ~19% variance. This is notably higher than in our recent report (< 5%) 14, illustrating a benefit of our approach. As we reported recently 14, we confirm asymmetry here associates with handedness.”

      4 - (line 495) “Consistent with our recent analysis in UKB 14, we confirmed leftward areal asymmetry of anterior insula, and leftward somatosensory thickness asymmetry is subtly reduced in left-handers. Sha et al. 14 reported shared genetic influences upon handedness and asymmetry in anterior insula and other more focal regions. Anterior insula lies within a left-lateralized functional language network 15, and its structural asymmetry may relate to language lateralization 16–18 in which left-handers show increased atypicality 19–21. Since asymmetry here emerges early in utero 22 and is by far the most heritable (Figure 6), we agree with others 16 that this ontogenetically foundational region of cortex may be fruitful for understanding genetic-developmental mechanisms influencing laterality 23,24. Less leftward somatosensory thickness asymmetry in left-handers also echoes our recent report 14 and fits a scenario whereby thickness asymmetries may be partly shaped through use-dependent plasticity and detectable through group-level hemispheric specializations of function. Still, the small effects show cortical asymmetry cannot predict individual handedness. Associations with other factors typically assumed important were similarly small, and mostly compatible with the ENIGMA report 25 and elsewhere 26,27.”

      5 - (line 3221) “Although small, we note this association was far from only just surviving correction at our predefined alpha level (⍺ = .01; corrected for 136 tests; Methods).”

      6 - (line 348) “we … uncover novel and confirm previously-reported associations with factors purportedly related to asymmetry – all with small effects”

      Thus, in quantifying effects we could not include in our lifespan analysis we preempt the questions likely to arise for many researchers, provide a sobering account of the effect sizes of factors typically assumed important for asymmetry, and find results that fit the developmental framework we lay out in the paper. We therefore opt to keep these together with the lifespan and heritability results in the current paper.
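
      The Bonferroni argument made in this response (point 5 above) can be made concrete with a short sketch. The alpha level (.01) and the number of tests (136) are taken from the text. The p-value below is a hypothetical placeholder, chosen only to be consistent with the statement that the association would survive more than 13,000 comparisons, and the function names are illustrative.

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def max_tests_survived(p_value, alpha):
    """Largest number of comparisons n for which p_value <= alpha / n still holds."""
    return math.floor(alpha / p_value)

alpha = 0.01          # predefined alpha level stated in the response
n_tests = 136         # number of tests corrected for, as stated in the response
p_value = 7.5e-7      # hypothetical p-value, not reported in the response

print("corrected threshold:", bonferroni_threshold(alpha, n_tests))
print("survives correction:", p_value <= bonferroni_threshold(alpha, n_tests))
print("comparisons survivable:", max_tests_survived(p_value, alpha))
```

      The gap between 136 and the number of comparisons a p-value could survive is one way to express how far an association is from the significance boundary, independent of how it appears on a plot.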

      Comment #4: To some extent, the aim of the study could still be written with more clarity. However, the authors have in part achieved their aims, assuming the aim was to find a consensus on the brain asymmetry patterns in humans, as stated in the abstract.

      Alongside the amendment to the Abstract that better clarifies our aims (response #2), we have restated the aims in the Introduction:

      1 - (line 121) Here, we first aimed to delineate population-level cortical areal and thickness asymmetries using vertex-wise analyses and their overlap in 7 international datasets. With a view to gaining insight into cortical asymmetry development, we then aimed to trace a series of lifespan and genetic analyses. Specifically, we chart the developmental and lifespan trajectories of cortical asymmetry for the first time longitudinally across the lifespan. Next, we examine phenotypic interregional asymmetry correlations, under the assumption correlations indicate coordinated development of left-right asymmetries through genes or lifespan influences. To shed light on the extent to which differences in asymmetry are genetic, we test heritability of asymmetry using genome-wide single nucleotide polymorphism (SNP) and extended twin data, and examine whether or not phenotypic associations are underpinned by genetic correlations suggestive of coordinated development through genes. Finally, we screen our set of robust, population-level asymmetries for association with general cognitive ability and factors purportedly related to asymmetry in UK Biobank (UKB). 28

      Comment #5: Overall the results support the conclusions, yet the strong interpretation of early life factors in particular is not empirically investigated as far as I gather.

      The reviewer is correct that we do not have data on neonates to directly support interpretations of prenatal factors. We have therefore tempered strong interpretations pertaining to prenatal accounts accordingly, have added text at the start of the Discussion to address this (see 1), and qualified all discussion of prenatal factors:

      1 – (line 366) “Tracing their lifespan development, we show the trajectories of areal asymmetry primarily suggest this form of asymmetry is developmentally stable at least from age ~4, maintained throughout life, and formed early on – possibly in utero 13,29,30 (while we cannot extrapolate to ages before our sample begins, we note this agrees with findings in neonates 29,30). One interpretation of lifespan stability combined with low heritability may be stochastic early-life developmental influences determine individual differences in areal asymmetry more than later developmental change, but work linking prenatal and childhood trajectories is needed to affirm this”

      2 – (Abstract) “Results suggest areal asymmetry is developmentally stable and arises early in life through genetic but mainly subject-specific stochastic effects”

      We have also added argumentation regarding a just-published study suggesting the average pattern of neonatal areal asymmetry is largely similar to adults 1. In addition, we reiterate what our data can and cannot say about the developmental timing of asymmetry in several places in the Discussion (see 3 & 5). In other places, we have removed reference to prenatal factors (see 4). Still, while we agree we previously used the terms “prenatal” and “early life factors” interchangeably, we note the latter often encompasses periods of early childhood covered here and is not necessarily restricted to factors present at birth 2,3. Thus, we have amended the Discussion to qualify the age-range the interpretation pertains to (see 5), and then retain the conclusion as follows (see 6).

      3 - (line 383) “For areal asymmetry, adult-like patterns of lateralization were strongly established before age ~4, indicating areal asymmetry traces back further and does not primarily emerge through later cortical expansion 33. Rather, the lifespan trajectories predominantly show stability from childhood to old age, as asymmetry was maintained through periods of developmental expansion and aging-related change that were region-specific and bilateral. This may align with evidence indicating areal asymmetry may be primarily determined in utero 29,30, including evidence suggesting little change in areal asymmetry from birth to 2 years 29,33,34, and little difference between maps derived from neonates and adults 29,30. It may also fit with the principle that the primary microstructural basis of cortical area 8 – the number of and spacing between cortical minicolumns – is determined in prenatal life 8,9, and agree with work suggesting asymmetry at this microstructural level may underlie hemispheric differences in surface area 35. The developmental trajectories agree with studies indicating areal asymmetry is established and strongly directional early in life 29,36. That change in surface area later in development follows embryonic gene expression gradients may also agree with a prenatal account for areal asymmetry 9”

      4 - (line 439) “The strongest relationships all pertained to asymmetries that were proximal in cortex but opposite in direction. Several of these were underpinned by high asymmetry-asymmetry SNP-based genetic correlations, illustrating some lateralizations in surface area exhibit coordinated genetic development.”

      5 - (line 481) “Regardless, these results support a differentiation between early-life (i.e. before age ~4) and later developmental factors in shaping areal and thickness asymmetry, respectively.”

      6 - (Conclusion) “Developmental and lifespan trajectories, interregional correlations and heritability analyses converge upon a differentiation between early-life and later-developmental factors underlying the formation of areal and thickness asymmetries, respectively. By revealing hitherto unknown principles of developmental stability and change underlying diverse aspects of cortical asymmetry, we here advance knowledge of normal human brain development.”

      Overall this is a nice and thorough work on asymmetry that may inform further work on brain asymmetry, its genetic basis, development, environmentally induced change, and link to behavioural variation.

    1. Author Response

      Reviewer #1 (Public Review):

      Bacterial carboxysomes are compartments that enable the efficient fixation of carbon dioxide in certain types of bacteria. A focus of the current work is on two protein components that provide spatial regulation over carboxysomes. The McdA system is an ATPase that drives the positioning of carboxysomes. The McdB system is essential for maintaining carboxysome homeostasis, although how this role is achieved is unclear. Previous studies, by the lead author's lab, showed that the McdB system is a driver of phase separation in vitro and in cells. They proposed a putative connection between McdB phase separation and carboxysome homeostasis. The central premise of the current work is as follows: In order to understand if and how phase separation of McdB impacts carboxysome homeostasis, it is important to know how the driving forces for phase separation are encoded in the sequence and architecture of McdB. This is the central focus of the current work. The picture that emerges is of a protein that forms hexamers, which appear to be trimers of dimers. The domains that drive dimerization and trimerization appear to be essential for driving phase separation under the conditions interrogated by the authors. The N-terminal disordered region regulates the driving forces for phase separation - referred to as the solubility of McdB by the authors. To converge upon the molecular dissections, the authors use a combination of computational and biophysical methods. The work highlights the connection between oligomerization via specific interactions and emergent phase behavior that presumably derives from the concentration (and solution condition) dependent networking transitions of oligomerized McdB molecules.

      Having failed to obtain specific structural resolution for the full-length McdB as a monomer or oligomer, the authors leverage a combination of computational tools, the primary one being iTASSER. This, in conjunction with disorder predictors, is used to identify / predict the domain structure of McdB. The domain structure predictions are tested using a limited proteolysis approach and, for the most part, the predictions stand up to scrutiny affirming the PONDR predictions. SEC-MALS data are used to pin down the oligomerization states of McdB and the consensus that emerges, through the investigations that are targeted toward a series of deletion constructs, is the picture summarized above.

      Is the characterization of the oligomerization landscape complete and likely perfect? Quite possibly, the answer is no. Deletion constructs pose numerous challenges because they delete interactions and inevitably impose a modularity to the interpretation of the totality of the data.

      This is a good point and always a possibility with truncations – the protein McdB may not be as modular in nature as it seems in our tripartite model. But the deletion constructs were more so intended to be tools for identifying key regions of oligomerization and condensate formation as others have done, and for this, they were indeed useful. Additionally, we were able to strategically aim our substitution mutations based on data from the deletion constructs. These substitutions provided data consistent with the deletions, but in the context of the full-length protein (see Fig. 5 vs. Figs. 2, 4). However, we ultimately agree with the reviewer that this is always a possibility with truncations, and we have therefore mentioned this caveat in the discussion.

      Line 415 “Truncated proteins have been useful in the study of biomolecular condensates. But it is important to note that using truncation data alone to dissect modes of condensate formation can lead to erroneous models since entire regions of the protein are missing. However, data from our truncation and substitution mutants were entirely congruent. For example, deletion of the CTD or substitutions to this region caused destabilization of the hexamer to a dimer, and deletion of the IDR or substitutions to this region caused solubilization of condensates without affecting hexamer formation.”

      Accordingly, we are led to believe that the N-terminal IDR plays no role whatsoever in the oligomerization.

      Our updated data still strongly supports this interpretation. Both truncation of the IDR (Fig. 2) and the six-Q-substitution mutant in the IDR (Fig. 5) form a monodispersed hexamer in solution via SEC-MALS, as does wild-type McdB.

      Close scrutiny, driven by the puzzling choice of nomenclature and the Lys to Gln titrations in the N-terminal IDR, raises certain unresolved issues. First, the central dimerization domain is referred to as being Q-rich. This does not square with the compositional biases of this region. If anything, it is Q/L-rich or just L-rich. This in fact makes more sense because the region does have the architecture of canonical Leu-zippers, which do often feature Gln residues. However, there is nothing about the sequence features that mandates the designation of being Q-rich nor are there any meaningful connections to proteins with Q-rich or polyQ tracts. This aspect of the analysis and discussion is a serious and erroneous distraction.

      We changed the language here, and no longer refer to the central region as “Q-rich”. However, we would like to note that the second half of the McdB central domain is indeed enriched in glutamines (14/53 = 26.4%) to a comparable extent as the region of FUS, which has been shown to help drive condensate formation via glutamine H-bonding (14/44 = 31.8%; Murthy et al 2019). We were simply proposing that, at a molecular level, there was some insight to be gained from this comparison. We agree, however, that there is no functionally meaningful comparison between McdB and polyQ-tract proteins, as we may have previously alluded to in our discussion, and that text has been removed.
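
      The two enrichment figures quoted above follow from simple counts. As a minimal sketch (the function name is our own, hypothetical choice; the counts are the ones stated in the response, not recomputed from sequence), the percentages can be checked as follows:

```python
def residue_fraction(count: int, length: int) -> float:
    """Return the fraction of positions in a region occupied by one residue type."""
    if length <= 0:
        raise ValueError("region length must be positive")
    return count / length


# Counts quoted in the response: 14 Gln in 53 residues (second half of the
# McdB central domain) and 14 Gln in 44 residues (the FUS region cited).
mcdb_q = residue_fraction(14, 53)
fus_q = residue_fraction(14, 44)

print(f"McdB central domain (second half): {mcdb_q:.1%}")
print(f"FUS region: {fus_q:.1%}")
```

      Both values agree with the percentages given in the text (26.4% and 31.8%).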

      Back to the middle region that drives dimerization, the missing piece of the puzzle is the orientation of the dimers. One presumes these are canonical, antiparallel dimers. However, this issue is not addressed even though it is directly relevant to the topic of how the trimer of dimers is assembled.

      Indeed, we were unable to resolve the orientation issue, despite much effort. The story we present is not a complete and final model of McdB structure, nor its molecular modes of oligomerization or condensate formation. However, we now provide a discussion section “McdB homologs have polyampholytic properties between their N- and C-termini” that highlights this issue. We also mention the remaining dimer orientation issue at the end of the results section “Se7942 McdB forms a trimer-of-dimers hexamer”. However, we believe the data presented still provides useful initial models, which for example, allowed us to create a series of substitutions that tune McdB condensate solubility and verify that they do not affect oligomerization. We would like to further add that for other condensate forming proteins in bacteria, like the PopZ protein we mention in the text, there remains no detailed structural model beyond the resolution we provide here for McdB, despite PopZ being first identified in 2008. Over 40 publications on PopZ have progressively provided useful and more detailed models that are only now being used to develop PopZ as a tool for condensate technologies that are furthering our understanding of the biological implications of condensate formation across all cell types. The intention with our current report is therefore not to generate a finalized molecular model of this entirely unstudied class of McdB proteins, but instead to generate useful insight into McdB biochemistry that can advance our understanding of this class of protein’s function in vivo. To this end, we now add in vivo data based on these initial models where we specifically link cellular phenotypes to McdB condensate solubility (Fig. 8). Of course, there are several follow-up studies that come from the current report, but we believe that speaks to the value of the presented research in advancing this field.

      If the trimer is such that all binding sites are fully satisfied (with the binding sites presumably being on the C-terminal pseudo-IDR), then the hexamer should be a network terminating structure, which it does not seem to be based on the data. Instead, we find that only the full-length protein can undergo phase separation (albeit at rather high concentrations) in the absence of crowder. We also find that the driving forces for phase separation are pH dependent, with pH values above 8.5 being sufficient to dissolve condensates. Substitution of Lys to Gln in the N-terminal IDR leads to a graded weakening of the driving forces for phase separation. The totality of these data suggests a more complex interplay of the regions than is being advocated by the authors.

      Thank you and we agree. As we discuss above in response #4 and below in response #7, we have changed the focus and tone of our report to say that, while the models we have generated are useful, we are aware they are incomplete at a molecular level. Furthermore, as we describe in response #6, we have added several new McdB mutants to investigate more deeply the role of the CTD, but this region was not amenable to mutagenesis as these mutants affected McdB oligomerization. Lastly, while network forming interactions are certainly important for condensate formation as the reviewer describes, so are solvent interactions. We have added new text and data related to Figs. 3, 4 that address these issues.

      Almost certainly, there are complementary electrostatic interactions among the N-terminal IDR and C-terminal pseudo IDR that are important and responsible for the networking transition that drives phase separation, even if these interactions do not contribute to hexamer formation. The net charge per residue of the 18-residue N-terminal IDR is +0.22 and the NCPR of the remainder is ≈ -0.1. To understand how the N-terminal IDR is essential, in the context of the full-length protein, to enable phase separation (in the absence of crowder), it is imperative that a model be constructed for the topology of the hexamer. It is also likely that the oligomer does not have a fixed stoichiometry.
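
      For readers unfamiliar with the quantity, net charge per residue (NCPR) is the number of basic residues minus the number of acidic residues, divided by the sequence length. The sketch below is illustrative only: it assumes Lys/Arg count as +1 and Asp/Glu as -1 (histidine and the termini are ignored), and the 18-residue sequence is a hypothetical toy string chosen to carry a net charge of +4, not the actual McdB IDR.

```python
def ncpr(seq: str) -> float:
    """Net charge per residue: (K + R - D - E) / length.

    Simplifying assumption: His and the chain termini are treated as neutral.
    """
    if not seq:
        raise ValueError("sequence must be non-empty")
    seq = seq.upper()
    positive = sum(seq.count(res) for res in "KR")
    negative = sum(seq.count(res) for res in "DE")
    return (positive - negative) / len(seq)


# Hypothetical 18-residue sequence (NOT the McdB IDR): 3 Lys, 1 Arg, no Asp/Glu.
toy_idr = "MSKAKTRSQKLNAGSTPG"

print(len(toy_idr))             # number of residues in the toy sequence
print(round(ncpr(toy_idr), 2))  # net charge of +4 spread over 18 residues
```

      A net charge of +4 over 18 residues gives 4/18, or about +0.22, which matches the value the reviewer quotes for the N-terminal IDR.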

      We agree and thank the reviewer for these comments. We have added several new substitution mutants aimed at addressing this (Figs. 5, S6). However, the C-terminus was not amenable to substitutions as the trimer-of-dimers was significantly destabilized in these mutants (Figs. 5, S7). Therefore, in this report we were unable to determine specifically how the basic residues in the IDR contribute to condensate formation. However, with the addition of new data in Fig. 8, we think we adequately show that the IDR mutants can be used to investigate McdB condensate formation in vivo, and that follow-up studies will be aimed at investigating these details. We have also added a new discussion section “McdB homologs have polyampholytic properties between their N- and C-termini” that highlights this very likely possibility suggested by the reviewer.

      Therefore, the central weakness of the current work is that it is too preliminary. A set of interesting findings are emerging but by fixating on Lys to Gln titrations within the N-terminal IDR and referring to these titrations as impacting solubility, a premature modular and confused picture emerges from the narrative that leaves too many questions unanswered.

      The work itself is very important given the growing interest in bacterial condensates. However, given that the focus is on understanding the molecular interactions that govern McdB phase behavior - a necessary pre-requisite in the authors' minds for understanding if and how phase separation impacts carboxysome homeostasis - it becomes imperative that the model that emerges be reasonably robust and complete. At this juncture, the model raises far too many questions.

      We agree that our previous report was focused mainly on the molecular basis of McdB condensate biochemistry, and in that report we left the model short. In this revised version, we have added several pieces of new data that strengthen the model (Figs. 3-5), although it is still incomplete. However, in this revised version, we have also shifted the focus from a complete biochemical understanding of McdB condensates to a study that links McdB condensate formation in vitro to phenotypes in vivo. In this regard, we have added the in vivo data in Fig. 8 and somewhat changed the focus in the text.

      The MoRF analysis is a distraction from the central focus.

      The MoRF analysis has been removed.

      The problem, as I see it, is that the authors have gone down the wrong road in terms of how they have interpreted the preliminary set of results. Further, the methods used do not have the resolution to answer all the questions that need to be answered. Another issue is that a lot of standard tropes are erected and they become a distraction. For example, it is simply not true that in a protein featuring folded domains and IDRs it almost always is the case that the IDR is the driver of phase transitions. This depends on the context, the sequence details of the IDRs, and whether the interactions that contribute to the driving forces for phase separation are localized within the IDR or distributed throughout the sequence. In McdB it appears to be the latter, and much of the nuance is lost through the use of specific types of deletion constructs.

      Thank you. We have removed much of this and changed the diction on how our current model of McdB condensate formation fits into the literature in the discussion.

      Overall, the work represents a good beginning but the data do not permit a clear denouement that allows one to connect the molecular and mesoscales to fully describe McdB phase behavior. Significantly more work needs to be done for such a picture to emerge.

      Reviewer #2 (Public Review):

      In this work, Basalla et al. study the biochemical properties of the carboxysome positioning protein, McdB. Using in vitro experiments, the authors characterize McdB oligomeric states and the domains driving and modulating its phase separation. Based on bioinformatics analysis, the authors identify a putative binding recognition motif between McdB and its two-component system counterpart McdA. As McdAB-like systems emerge as spatial regulators of bacterial compartments, the data presented here may be of general interest. The study is well executed and provides exciting hypotheses to be tested in vivo.

      The authors found that McdB from S. elongatus PCC 7942 consists of three domains: an N-terminal 18 aa disordered region, a Q-rich helical domain, and a helical C-terminal domain (CTD). Analyzing these domains, the authors present three key results: (i) The Q-rich domains form dimers, and the CTD drives the formation of trimers of dimers (ii) Phase separation is pH sensitive, driven by the Q-rich domain, and modulated by basic residues in the IDR, (iii) The IDR contains a putative recognition motif that binds McdA. While these three sets of results are rich in data, they are disjointed. Relating the three datasets (oligomeric states of the protein, its phase separation behavior, and its ability to bind McdA) is required to provide a complete picture of the molecular mechanism driving McdB condensation.

      Specific comments:

      1) The main limitation of this manuscript is the lack of integration between the three areas of results. In particular: how do the IDR basic residues disrupt phase separation? Is that through interference with either the dimer or trimer interface? Does the McdB IDR regulate phase separation behavior when bound to McdA? Or, in other words, is the MoRF acting both as a binding interface and as a solubility regulator, and if so, can both functions be achieved simultaneously? It seems like the MoRF includes at least three basic residues.

      Indeed, we were unable to fully resolve the specific molecular interactions that give rise to condensates versus those that give rise to oligomers, and how these two modes of self-association contribute to one another. One limitation was that, as shown in our new data, the CTD was not amenable to mutagenesis, as it caused destabilization of the trimer-of-dimers (Fig. 5, Fig. S7). Therefore, we could not dissect how the CTD contributes to oligomerization versus driving condensates. However, we did include in vivo data showing how the IDR mutations allowed us to specifically link phenotypes to McdB condensate solubility (Fig. 8). As we discuss above in responses #4, #6, and #7, we changed the focus of the revised manuscript from the molecular basis of McdB condensate formation to linking McdB condensate formation in vitro and its functionality in vivo. To this end, we think the IDR mutation set has been useful, and follow-up studies will be done to further the molecular model of McdB condensate formation. Reviewers 1 and 3 deemed the MoRF section a distraction. Therefore, MoRF analysis and discussions of McdA interactions with this potential MoRF have been removed.

      Finally, what is the effective concentration of McdB in cells, and how does that translate to the in vitro studies?

      In our previous version, we used McdB concentrations between 50-100 µM. We do not know the in vivo concentration of McdB. We have tried several antibodies against McdB, and a few were good enough to detect the presence of McdB, but not quantifiably. We therefore believe in vivo McdB levels are low (sub-micromolar), and definitely lower than the range we previously used in our in vitro studies. In our revised manuscript, we include a titration of McdB at lower concentrations, and see condensates at McdB concentrations lower than 2 µM.

      2) How general are the conclusions made here to other McdBs? The authors have published nice work surveying the commonalities and differences between homologous McdB proteins. Can you comment on the applicability of your findings to other McdB proteins?

      This is a great point, which we have added to a new discussion section titled “McdB homologs have polyampholytic properties between their N- and C-termini”.

      Additional issues:

      3) Using SEC and SEC-MALS, the authors demonstrated that the Q-rich domain forms a stable dimer and that the full-length protein forms hexamers, suggesting trimers of dimers assembly. The authors also suggest that the CTD is responsible for forming those trimers of dimers based on SEC-MALS measurements. However, Figure 2D shows that while the full length runs at 6.6x the monomer, the Q-rich+CTD runs at 5.4x the monomer. First, I could not find SEC-MALS of the full-length protein, and it is not clear whether SEC-MALS was used for all or a fraction of the constructs discussed in Figure 2D. Second, could it be that the Q-rich domain+CTD is an ensemble of hexamers and dimers? Perhaps the IDR is playing a secondary role in stabilizing the hexamer?

      We have repeated the SEC-MALS experiments and included the full-length protein (Fig. 2). Furthermore, we have included SEC-MALS for some of the key substitution mutants (Figs. 5, S7). With the additional findings, our conclusions remain the same as in our previous version of the manuscript.

      4) The analysis of the phase separation results needs some extra quantification. The authors show that at 100 µM protein with 10% PEG the full-length protein phase separates, as does IDR+Q-rich. Lines 176-178: "The CTD, on the other hand, has no effect on the Q-rich domain condensates; Q-rich+CTD condensates formed at the same protein concentration and with identical droplet morphologies at the Q-rich domain alone." It is hard to draw this conclusion solely based on the data presented in Figure 3. An alternative interpretation might be that Q-rich+CTD reduces csat. I suggest the authors include turbidity assays (as shown for pH effect) to quantitatively determine csat for these different constructs and perhaps perform FRAP to determine the mobility of these different constructs. In addition, how long after the addition of PEG were these droplets imaged?

      We now include an additional figure where we characterize condensates for full-length McdB (Fig. 3), including FRAP as suggested by the reviewer. We also include additional experiments for the truncations as requested (Fig. 4), and relate the truncation data to the model we propose for the full-length protein. All condensate samples were incubated for 30 mins prior to imaging unless otherwise stated, which we have added to the methods section “Microscopy of protein condensates”.

      5) Solubility assays shown in Figures 4A, B, D, and 5C are missing error bars. Without replicates, it is difficult to assess, for example, the effect of KCl.

      We have included replicates and error bars. Apologies for the omission.

      Also, please indicate the physiological ranges of KCl and pH in Figure 6. The phase separation sensitivity to pH is intriguing. By changing basic residues to glutamines, the authors conclude that the positive charge of the IDR modulates solubility. The Q-rich domain, however, is negatively charged. Can the authors comment on the role of acidic residues in the Q-rich domain? Are they required for phase separation? Also - based on your previous bioinformatics analysis, are the charges of the IDR and the Q-rich domains conserved across McdB homologs?

      Data from this report, and as described by reviewer #1, suggest that charge in the CTD, and not the central region, may be important. Our previous report (MacCready et al., Mol Biol Evol. 2020) touches on the conservation of charge in the NTD and CTD, which we have now added to the discussion section titled “McdB homologs have polyampholytic properties between their N- and C-termini”. However, we were unable to experimentally verify electrostatic associations between the NTD and CTD because the CTD was not amenable to mutagenesis, as shown in our new data added to the manuscript (Figs. 5, S7).

      6) In previous work, the authors showed that an RKR segment in the IDR is highly conserved but missing in S. elongatus PCC 7942 (MacCready et al., Mol Biol Evol. 2020). Given the current finding, it would be important to understand whether the RKR deletion carries functional implications for phase separation behavior.

      The RKR segment is not missing, but likely relates to the KKR residues from S. elongatus PCC 7942. We describe this in more detail elsewhere (MacCready et al., Mol Biol Evol. 2020). However, as we show here, these specific residue locations do not seem to be especially important for condensate formation, but instead the overall net charge of the IDR mediates condensate solubility regardless of the specific residues mutated (Fig. 6).

      7) McdB proteins with 2Q left mutated vs. 2Q middle and 2Q right seem to result in condensates with different material properties (e.g., DIC pictures show different droplet morphologies for the different constructs). Is that the case? And if so, can you comment on that?

      We have included a brief mention of this in the text. However, the overall interpretation of these results remains that regardless of the residues mutated, there is a comparable degree of condensate solubilization for constructs with the same IDR net charge (Fig. 6).

      Reviewer #3 (Public Review):

      Through a series of rigorous in vitro studies, the authors determined McdB's domain architecture, its oligomerization domains, the regions required for phase separation, and how to fine-tune its phase separation activity. The SEC-MALS study provides clear evidence that the α-helical domains of McdB form a trimer-of-dimers hexamer. Through analysis of a small library of domain deletions by microscopy and SDS-PAGE gels of soluble and pellet fractions, the authors conclude that the Q-rich domain of McdB drives phase separation while the N-terminal IDR modulates solubility. A nicely executed study in Figure 4 demonstrated that McdB phase separation is highly sensitive to pH and is influenced by basic residues in the N terminal IDR. The study demonstrates that net charge, as opposed to specific residues, is critical for phase separation at 100 micromolar. In addition, the experimental design included analysis of McdB constructs that lack fluorescent proteins or organic dyes that may influence phase separation. Therefore, the observed material properties have full dependence on the McdB sequence.

      Thank you for the kind words and this perspective. We have added a brief mention to it in the discussion section titled “McdB condensate formation follows a nuanced, multi-domain mechanism”: “Furthermore, it should be noted that the McdB constructs used in our in vitro assays were free from fluorescent proteins, organic dyes, or other modification that may influence phase separation. Therefore, the observed material properties of these condensates have full dependence on the McdB sequence.”

      Studies of proteins often neglect short, disordered segments at the N- or C- terminus due to unclear models for their potential role. This study was interesting because it revealed a short IDR as a critical regulator of phase separation. This includes experiments that remove the IDR (Fig 2 & 3) and mutate the basic residues to show their importance towards McdB phase separation. In a nice set of SDS-PAGE experiments, the authors showed that as the net charge of the IDR decreased the construct became more soluble.

      One challenge in the experimental design is choosing which substitutions to use when mutating residues to assess their impact on phase separation. The authors avoided substitutions to alanine, as alanine substitutions have synthetically stimulated phase separation in other systems. The authors, therefore, have a good rationale for selecting potentially milder mutations of lysine/arginine to glutamine. A potential caveat of mutation to glutamine is that stretches of glutamines have been associated with amyloid/prion formation. So, the introduction of glutamines into the IDR may also have unexpected effects on material properties. Despite these caveats, the authors show that mutation of six basic residues in the short IDR abolished phase separation at 100 µM.

      Thank you for the thoughtful consideration, and appreciation of our work! Reviewer 1 had reservations about the Gln substitutions as well. We also used alanine in new data added to the manuscript. But as the reviewer notes, the alanine mutations artificially drove further phase separation activity, and even aggregation. We show that mutants with the introduction of glutamines, however, remain soluble in vitro and in E. coli even at very high concentrations. Furthermore, we now include SEC-MALS of the McdB variant with 6 glutamines introduced in the IDR and show that there is no impact on oligomeric state. Together the data show no amyloidogenic properties of these glutamine-enriched mutants.

      We have added a note to this potential caveat in the discussion section “McdB condensate formation follows a nuanced, multi-domain mechanism”: “Glutamine-rich regions are known to be involved in stable protein-protein interactions such as in coiled-coils and amyloids (52, 53), and expansion of glutamine-rich regions in some proteins lead to amylogenesis and disease (54, 55). However, when we introduced glutamines into the IDR of McdB solubility was increased both in vitro and in vivo, and without any impact on hexamerization. Together, the data show that increasing the glutamine content in the IDR of McdB did not lead to amylogenesis, but rather increased solubility. Our findings therefore underpin the importance of positive charge in the IDR specifically for stabilizing McdB condensates.”

      Computational studies (Fig 7) also suggest that this short N-IDR region may play a role as a MoRF upon potential binding to a second protein, McdA. The formulation of this hypothesis is strengthened by the fact that for other ParA/MinD-family ATPases, the associated partner proteins have also been shown to interact with their cognate ATPase via positively charged and disordered N-termini. This aspect of understanding McdB's N-IDR as a MoRF is at a very early stage. This study lacks experimental evidence for an N-IDR:McdA interaction and experimental data showing conformational change upon McdA binding. However, the computational study sets up future work to consider whether and how the phase separation activity of McdB is related to its structural dynamics and interactions with McdA.

      Based on these comments and those of Reviewer 1, we have removed the MoRF analyses entirely. The MoRF analysis will be coupled to another study in the lab focused on McdB interactions with McdA.

      In summary, this study provides a strong foundation for the contribution of domains to McdB's in vitro phase separation. This knowledge will inform and impact future studies on McdB regulating carboxysomes and on the related ParA/MinD-family ATPases and their cognate regulatory proteins. For example, it is unknown if and how McdB's phase separation is utilized in vivo for carboxysome regulation. However, the revealed roles of the Q-rich domain and N-IDR will provide valuable knowledge in developing future research. In addition, the systematic domain analysis of McdB can be combined with a similar analysis of a broad range of other biomolecular condensates in bacteria and eukaryotes to understand the design principles of phase separating proteins.

    1. Author Response

      Reviewer #1 (Public Review):

      When we tilt our heads, we do not perceive objects to be tilted or rotated. In this study, the authors investigate the neural underpinnings of this by characterizing how neurons in monkey IT respond to objects when the entire body is tilted. They performed two experiments. In the first experiment, the authors record single neuron responses to objects rotating in the image plane, under two conditions - when the animals were tilted +20° or -20° relative to the gravitational vertical. Their main finding is that neural tuning curves for object orientation were highly correlated under these conditions. This high correlation is interpreted by the authors as indicative of encoding of object orientations relative to an absolute gravitational reference frame. To control for the possibility that the whole-body tilt could have induced compensatory torsional rotations of the eyes, the authors estimated the eye torsional rotation between the ±20° whole-body tilt to be only ±6°. In the second experiment, the authors recorded neural responses to objects rotated in the image plane with no whole-body tilt but with a visual horizon that could be tilted by the same ±20° relative to the gravitational vertical. Here too they find many neurons whose tuning curves were correlated between the two horizon tilt conditions. Based on these results, the authors argue that IT neurons represent objects relative to the gravitational or absolute vertical.

      The question of whether the visual system encodes objects relative to the gravitational vertical is an interesting and basic one, and I commend the authors for attempting this question through systematic testing of object selectivity under conditions of whole-body tilt. However, I found this manuscript extremely difficult to read, with important analyses and controls described in a very cursory fashion. I also have several major concerns about these results.

      First, the high tuning correlation in the ±20° whole-body tilt conditions could also occur if IT neurons encoded object orientation relative to other fixed contextual cues in the surroundings, such as the frame of the computer monitor. The authors ideally should have some experiment or analysis to address this potential confound, or else acknowledge that their findings can also be interpreted as the encoding of object orientation relative to contextual cues, which would dilute their overall conclusions.

      We think there are three possible interpretations of this comment. First, that visible edges, including the horizon and ground plane (in the scene stimuli), and the screen edges and other gravitationally aligned edges in the room, could serve as visual cues for the orientation of gravity. We agree with this wholeheartedly, and in fact showed a strong degree of gravitational alignment based purely on visual scene cues in Figures 3 and 4. This is consistent with our previous results suggesting computation of gravity’s direction in the middle channel of IT (Vaziri et al., Neuron 2014; Vaziri and Connor, Current Biology 2016). Our findings would not be diluted by the fact that multiple cues, not just vestibular/somatosensory but also visual, could help in computing the direction of gravity.

      Second, that overlap between objects and horizon could produce a shape-configuration interaction that changes with object orientation and produces a tuning effect that remains consistent across monkey tilts. We agree this was a possibility, and that is why we tested neurons in the isolated object condition. We have added text to better explain this concern and the control importance of the isolated object condition in the discussion of Fig. 1: “The Fig. 1 example neuron was tested with both full scene stimuli (Fig. 1a), which included a textured ground surface and horizon, providing visual cues for the orientation of gravity, and isolated objects (Fig. 1b), presented on a gray background, so that primarily vestibular and somatosensory cues indicated the orientation of gravity. The contrast between the two conditions helps to elucidate the additional effects of visual cues on top of vestibular/somatosensory cues. In addition, the isolated object condition controls for the possibility that tuning is affected by a shape-configuration (i.e. overlapping orientation) interaction between the object and the horizon or by differential occlusion of the object fragment buried in the ground (which was done to make the scene condition physically realistic for the wide variety of object orientations that would otherwise appear improbably balanced on a hard ground surface).”

      The comparable results in the isolated object condition address the reasonable concern about the horizon/object shape configuration interaction: “Similar results were obtained for a partially overlapping sample of 99 IT neurons tested with isolated object stimuli with no background (i.e. no horizon or ground plane) (Fig. 2b). In this case, 60% of neurons (32/53) showed significant correlation in the gravitational reference frame, 26% (14/53) significant correlation in the retinal reference frame, and within these groups 13% (7/53) were significant in both reference frames. The population tendency toward positive correlation was again significant in this experiment along both gravitational (p = 3.63 × 10^-22) and retinal axes (p = 1.63 × 10^-7). This suggests that gravitational tuning can depend primarily on vestibular/somatosensory cues for self-orientation.”

      Third, that the object and screen edges in the isolated object condition have an orientation interaction that influences tuning in a way that remains consistent across monkey tilt. If this was intended, we do not think this is a reasonable concern that needs mentioning in the paper itself. The closest screen edges on our large display were 28° in the periphery, and there is no reason to suspect that IT encodes orientation relationships between distant, disconnected visual elements. Screen edges have been present in all or most studies of IT, and no such interactions have been reported. We will discuss this point in online responses.

      Second, I do not fully understand torsional eye movements myself, but it is not clear to me whether this is a fixed or dynamic compensation. For instance, have the authors measured torsional eye rotations on every trial? Is it fixed always at ±6° or does it change from trial to trial? If it changes, then could the high tuning correlation between the whole-body rotations be simply driven by trials in which the eyes compensated more? The authors must provide more data or analyses to address this important control.

      We now clarify that we could only measure ocular rotation outside the experiment, using high-resolution closeup color photography, which was not possible on individual trials. The extensive literature on ocular counter-rotation gives no indication that the degree of rotation is changed by any condition other than tilt. Our measurements were consistent with previous reports showing that counterroll is limited to 20% of tilt. Moreover, they are consistent with our analyses showing that maximum correlation with retinal coordinates is obtained with a 6° correction for counterroll, indicating equivalent counterroll during experiments. Our analytical compensation for counterroll was based on this value, which optimized results in the retinal reference frame, so our measurements of counterroll are used only to confirm this value. Ocular rotation would need to be five times greater than any previous observations to completely compensate for tilt and mimic the gravitational tuning we observed. For these reasons, counterroll is not a reasonable explanation for our results:

      “Compensatory ocular counter-rolling was measured to be 6° based on iris landmarks visible in high-resolution photographs, consistent with previous measurements in humans (refs. 6, 7), and larger than previous measurements in monkeys (ref. 41), making it unlikely that we failed to adequately account for the effects of counterroll. Eye rotation would need to be five times greater than previously observed to mimic gravitational tuning. Our rotation measurements required detailed color photographs that could only be obtained with full lighting and closeup photography. This was not possible within the experiments themselves, where only low-resolution monochromatic infrared images were available. Importantly, our analytical compensation for counter-rotation did not depend on our measurement of ocular rotation. Instead, we tested our data for correlation in retinal coordinates across a wide range of rotational compensation values. The fact that maximum correspondence was observed at a compensation value of 6° (Figure 1–figure supplement 1) indicates that counterrotation during the experiments was consistent with our measurements outside the experiments.”
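
      The scan over compensation values described above can be illustrated with a minimal sketch on synthetic data. Everything here is an assumption made for illustration, not the analysis code used in the study: a single hypothetical neuron tuned purely in retinal coordinates, a tilt of ±25° (inferred from the 50° range mentioned by the reviewers), 1° orientation sampling, and a true counterroll of 6° built into the simulated responses. The sketch only shows why the correlation in retinal coordinates should peak when the assumed compensation matches the actual counterroll.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def tuning(retinal_deg, preferred_deg=40.0, kappa=2.0):
    """Hypothetical neuron tuned to retinal orientation (von Mises-shaped)."""
    return math.exp(kappa * math.cos(math.radians(retinal_deg - preferred_deg)))

TILT_DEG = 25           # assumed tilt magnitude (left and right)
TRUE_COUNTERROLL = 6    # counterroll built into the synthetic responses
ORIENTATIONS = range(360)  # object orientation relative to gravity, 1 deg steps

# With counterroll, the eye is rotated by less than the full body tilt.
eye_deg = TILT_DEG - TRUE_COUNTERROLL
resp_tilt_pos = [tuning(o - eye_deg) for o in ORIENTATIONS]  # tilt +25
resp_tilt_neg = [tuning(o + eye_deg) for o in ORIENTATIONS]  # tilt -25

def retinal_correlation(compensation_deg):
    """Correlate the two tuning curves after re-expressing both in retinal
    coordinates under an assumed counterroll of `compensation_deg`."""
    assumed_eye = TILT_DEG - compensation_deg
    pos = [resp_tilt_pos[(r + assumed_eye) % 360] for r in ORIENTATIONS]
    neg = [resp_tilt_neg[(r - assumed_eye) % 360] for r in ORIENTATIONS]
    return pearson(pos, neg)

# Scan candidate compensation values and keep the one with the highest correlation.
best = max(range(0, TILT_DEG + 1), key=retinal_correlation)
print(best)  # prints 6
```

      In this toy model, a compensation equal to the full tilt (25°) would correspond to comparing the curves in gravitational coordinates, which is the sense in which counterroll would have to be several times larger than observed to mimic gravitational tuning.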

      Third, I find that when the objects were presented against a visual horizon, different object features are occluded at each orientation. This could reduce the correlation between the neural response in the retinal reference frame, thereby biasing all results away from purely retinal encoding. The authors should address this either through additional analyses or acknowledge this issue appropriately throughout.

      This idea of a shape interaction between object and horizon/ground is essentially the same concern discussed as the second interpretation of the first point, above. As outlined there, we addressed this concern in the best way possible, by removing the horizon/background (in the isolated object condition) and showing that the same results obtained. This comment raises the related point (also cured by the isolated object condition) of differential partial occlusion at the bottom of the object, 15% (by virtual mass) of which was buried below ground to provide a realistic physical interpretation for unbalanced orientations.

      We make both concerns explicit in the revised manuscript: “The Fig. 1 example neuron was tested with both full scene stimuli (Fig. 1a), which included a textured ground surface and horizon, providing visual cues for the orientation of gravity, and isolated objects (Fig. 1b), presented on a gray background, so that primarily vestibular and somatosensory cues indicated the orientation of gravity. The contrast between the two conditions helps to elucidate the additional effects of visual cues on top of vestibular/somatosensory cues. In addition, the isolated object condition controls for the possibility that tuning is affected by a shape-configuration (i.e. overlapping orientation) interaction between the object and the horizon or by differential occlusion of the object fragment buried in the ground (which was done to make the scene condition physically realistic for the wide variety of object orientations that would otherwise appear improbably balanced on a hard ground surface).”

      And we report that the control produces similar results in the absence of horizon/background: “Similar results were obtained for a partially overlapping sample of 99 IT neurons tested with isolated object stimuli with no background (i.e. no horizon or ground plane) (Fig. 2b). In this case, 60% of neurons (32/53) showed significant correlation in the gravitational reference frame, 26% (14/53) significant correlation in the retinal reference frame, and within these groups 13% (7/53) were significant in both reference frames. The population tendency toward positive correlation was again significant in this experiment along both gravitational (p = 3.63 × 10^-22) and retinal axes (p = 1.63 × 10^-7). This suggests that gravitational tuning can depend primarily on vestibular/somatosensory cues for self-orientation.”

      Reviewer #3 (Public Review):

      This is a very interesting study examining for the first time the influence of lateral tilt of the whole body on orientation tuning in macaque IT. They employed two types of displays: one in which the object was embedded in a scene that had a horizon and textured ground surface, and a second one with only the object. For the first type, they examined the orientation tuning with and without tilting the subject. However, the effect of tilt for the scene stimuli is difficult to interpret in terms of gravitational reference frame since varying the orientation of the object relative to the horizon leads to changes in visual features between the horizon and object. If neurons show tolerance for the global orientation of the scene (within the 50° manipulation range) then the consistent orientation tuning across tilts may just reflect tuning for the object-horizon features (like the angle between the object and the horizon line/surface) that is tolerant for the orientation of the whole scene. Thus, the effects of tilt can be purely visually-driven in this case and may reflect feature selectivity unrelated to gravitation. The difference between retinal and gravitational effects can just reflect neurons that do not care about the scene/horizon background but only about the object and neurons that respond to the features of the object relative to the background. Thus, I feel that the data using scenes cannot be used unambiguously as evidence for a gravitational reference frame. The authors also tested neurons with an object without a scene, and these data provide evidence for a gravitational reference frame. The authors should concentrate on these data and downplay the difficult-to-interpret results using scenes.

      We still believe it is important to present these two experimental conditions in parallel, because we believe that visual driving of gravitational tuning by environmental cues is important in real life, and this is substantiated by the effects of visual cues alone. But, we have tried in this revision, in response to these comments and to comments from other reviewers, to clarify the potential concerns about visual effects in the full scene experiment, the importance and meaning of the isolated object condition as a control for concerns about other kinds of tuning, and the relationships between the two experimental conditions:

      Concerns about full scene experiment and the control importance of the isolated object condition: “The Fig. 1 example neuron was tested with both full scene stimuli (Fig. 1a), which included a textured ground surface and horizon, providing visual cues for the orientation of gravity, and isolated objects (Fig. 1b), presented on a gray background, so that primarily vestibular and somatosensory cues indicated the orientation of gravity. The contrast between the two conditions helps to elucidate the additional effects of visual cues on top of vestibular/somatosensory cues. In addition, the isolated object condition controls for the possibility that tuning is affected by a shape-configuration (i.e. overlapping orientation) interaction between the object and the horizon or by differential occlusion of the object fragment buried in the ground (which was done to make the scene condition physically realistic for the wide variety of object orientations that would otherwise appear improbably balanced on a hard ground surface) …

      Similar results were obtained for a partially overlapping sample of 99 IT neurons tested with isolated object stimuli with no background (i.e. no horizon or ground plane) (Fig. 2b). In this case, 60% of neurons (32/53) showed significant correlation in the gravitational reference frame, 26% (14/53) significant correlation in the retinal reference frame, and within these groups 13% (7/53) were significant in both reference frames. The population tendency toward positive correlation was again significant in this experiment along both gravitational (p = 3.63 × 10^-22) and retinal axes (p = 1.63 × 10^-7). This suggests that gravitational tuning can depend primarily on vestibular/somatosensory cues for self-orientation. However, we cannot rule out a contribution of visual cues for gravity in the visual periphery, including screen edges and other horizontal and vertical edges and planes, which in the real world are almost uniformly aligned with gravity and thus strong cues for its orientation (but see Figure 2–figure supplement 1). Nonetheless, the Fig. 2b result confirms that gravitational tuning did not depend on the horizon or ground surface in the background condition.”

      Cell-by-cell comparisons of scene and isolated stimuli, for those cells tested with both, are shown in Figure 2–figure supplement 6. This figure shows 8 neurons with significant gravitational tuning only in the floating object condition, 11 neurons with tuning only in the gravitational condition, and 23 neurons with significant tuning in both. Thus, a majority of significantly tuned neurons were tuned in both conditions. A two-tailed paired t-test across all 79 neurons tested in this way showed that there was no significant tendency toward stronger tuning in the scene condition. The 11 neurons with tuning only in the gravitational condition by themselves might suggest a critical role for visual cues in some neurons. However, the converse result for 8 cells, with tuning only in the floating condition, suggests a more complex dependence on cues or a conflicting effect of interaction with the background scene for a minority of cells.
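
      The counts and the paired comparison above can be made concrete with a short sketch. The three counts (8, 11, 23) are taken from the response; the per-neuron tuning values are hypothetical placeholders, and the helper `paired_t` is an illustrative name rather than code from the study. The sketch computes only the paired t statistic and its degrees of freedom; obtaining a two-tailed p-value would additionally require a t-distribution routine from a statistics package.

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Counts reported in the response: tuned only with isolated (floating) objects,
# tuned only in the other condition, and tuned in both.
only_isolated, only_other, both = 8, 11, 23
tuned_total = only_isolated + only_other + both
fraction_both = both / tuned_total  # 23 of 42 significantly tuned neurons

# Hypothetical per-neuron tuning strengths (e.g. correlations) in each condition.
scene_r = [0.62, 0.15, 0.48, 0.71, 0.30]
isolated_r = [0.55, 0.22, 0.50, 0.64, 0.33]
t_stat, dof = paired_t(scene_r, isolated_r)

print(tuned_total, round(fraction_both, 2))
print(round(t_stat, 3), dof)
```

      A t statistic close to zero on the real data would correspond to the reported absence of a systematic difference in tuning strength between the two conditions.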

      Main text: “This is further confirmed through cell-by-cell comparison between scene and isolated for those cells tested with both (Figure 2–figure supplement 6).”

      Furthermore, the analysis of the single object data should be improved and clarified.

      We have added Figure 1–figure supplements 3–10, which expand the analysis of example cells and additional cells to include all stimuli shown and smoothed tuning curves for individual repetitions of the orientation range.

      We also now present results for individual monkeys in Figure 2–supplements 2,3, and the anatomical locations of individual neurons in Figure 2–supplements 4,5.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      1. General Statements

      Response: Thank you to all the reviewers for their helpful efforts on behalf of our manuscript. At present, we have addressed most of the reviewers’ major comments, including providing additional replicates for many experiments and clarifying ambiguous points in the text. Related data, figures and text have been adjusted accordingly. We believe that these changes have improved our manuscript, both strengthening our main conclusions and clarifying ambiguous text.

      Several still-ongoing experiments are elaborated below. These experiments are well within the abilities of our lab and can be completed in short order.

      Specific responses to the individual concerns addressed by the reviewers are outlined below.

      Please feel free to contact me if I can be of any help in the decision process.

      2. Description of the planned revisions

      [Reviewer 1]

      Comment: Across the manuscript, NIX levels appear to be unresponsive to most treatments in the MDA-MB-231 line, including hypoxia treatment. This is an unusual result and raises questions about the role of NIX in MDA-MB-231 line, mainly that BNIP3 is the primary driver of mitophagy in this system. Indeed, Figure 7D indicates that there is very little mitophagy contribution by NIX since knockout of BNIP3 is sufficient to abolish mitophagy almost completely. Therefore, the effects seen on mitophagy following EMC3 knockout in Figure 7 might be smaller in a line that is responsive to NIX mitophagy. It would be beneficial to analyse basal mitophagy flux in an additional cell line, for example U2OS (FigS1E) in which NIX is responsive to hypoxia.

      Response: Thank you for bringing this intriguing insight to our attention. We have seen that EMC3 knockout prevents lysosomal delivery of BNIP3 in U2OS cells (Fig S2D). However, we don’t know what the effects on mitophagy are in U2OS, or the extent to which mitophagy is dependent on BNIP3 and/or NIX. To test this, we will perform the suggested experiment, using mt-Keima-expressing U2OS cells to test the role of NIX and/or BNIP3 in mitophagy.

      Comment: Following on from comment 1 above, Figure 7 would benefit from an analysis of hypoxia (or DFP, or cobalt chloride) stimulation of mitophagy to assess whether mitophagy levels are higher in EMC3 KOs. The authors argue that BNIP3 is trafficked to the ER during mitophagy and is not turned over by mitophagy itself. It would therefore be interesting to test whether preventing BNIP3 from being removed from mitochondria affects the rate or levels of mitophagy under stimulating conditions.

      Response: To address this question, we will perform mitoflux analysis on EMC3 KO cells +/- hypoxia.

      Comment: Figure 4B: The localisation of tf-BNIP3 is reminiscent of ER in BTZ treated samples. How much of the protein is on mitochondria in the presence of BTZ? Does MLN4924 cause a similar issue?

      Response: To address this question, we will perform fluorescence microscopy of tf-BNIP3 cells co-expressing mito-BFP under these treatments and utilize our Coloc2 plugin pipeline to monitor correlation.

      Comment: Can the authors assess whether BNIP3 that is on mitochondria is transferred to the ER (perhaps through photoswitchable GFP-BNIP, activated on mitos and then observe its transfer to ER)? This seems important in order to address the possibility that BNIP3 that is being turned over by the endolysosome is being delivered directly to the ER.

      Response: This is an interesting question and a curiosity also shared by Reviewer #2. To test this hypothesis, we will utilize a photo-switchable Dendra2 fluorophore to track BNIP3 in the cell via microscopy.

      [Reviewer #2]

      Comment: How is BNIP3 inserted into the outer membrane? A previous study from the Weissman lab proposed that MTCH2 serves as insertase. The authors did not mention MTCH1 and MTCH2 in context of Fig. 2B. Were these proteins not found? Did the authors test the relevance of MTCH2 in their assay? This aspect should be addressed and mentioned.

      Response: Thank you for the insight and suggestion. We were intrigued when the Weissman/Voorhees paper characterizing MTCH1/2 was published. Consistent with their findings, MTCH2 was found in the “suppressor” population of our tf-BNIP3 CRISPR screen, but given our 0.5-fold change threshold, the gene was not validated (fold change value = 0.46, Table S1). We suspect the lack of significance stems from the redundancy with MTCH1. Consequently, we would hypothesize that MTCH1/2 are the responsible insertases. To formally address this suggestion, we plan to genetically perturb MTCH1/2 and look at BNIP3 localization and mitophagy.

      Comment: The authors generated an interesting BNIP3 mutant with a C-terminal Fis1 anchor. This variant is constantly located in the outer membrane (which is shown here). The physiological consequence of the constitutive distribution on mitochondria is however only superficially studied. The authors should characterize this interesting mutant in some more depth.

      Response: In the original manuscript, we characterized BNIP3(Fis1TMD) for lysosomal delivery and mitophagy. Going forward, we will perform Seahorse oxygen consumption experiments and mitochondrial network analysis to view the physiological consequences of constitutive expression of BNIP3(Fis1TMD) on the outer membrane.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      [Reviewer #1]

      Comment: Continuing from comment 2, given that the authors conclude that BNIP3 is not turned over by mitophagy, can they examine whether BNIP3 is excluded from sealed mitophagosomes?

      Response: We have softened the wording of our conclusions to reflect that the vast majority of BNIP3 lysosomal degradation is by this alternative pathway and not mitophagy. However, we do not wish to completely dismiss that BNIP3 is present on mitophagosomes. Rather, if mitophagosomes contain BNIP3, they seemingly account for only a very small portion of BNIP3 degradation in the cell, to the extent that it is not easily detectable by our assays (Lines 414-419). Definitively identifying whether BNIP3 is in sealed mitophagosomes will be part of future studies using CLEM or FIB-SEM techniques.

      Comment: Is the BNIP3(FisTMD) expressed to equivalent levels to WT BFP-BNIP3? Given that the Fis1 form of BNIP3 cannot traffic to endolysosomes, its levels might be higher. In addition, overexpression of the BNIP3-Fis construct was used to make the argument that dimerization is not important for mitophagy. But the authors should also take into account the possibility that with overexpression, the potential efficiency afforded to mitophagy via dimerization of endogenous proteins may be negated, and therefore hidden. Given this, I don’t think that the authors can confidently conclude that dimerization does not contribute to mitophagy, and that instead its main role is ER-endolysosomal turnover of BNIP3.

      Response: We thank the reviewer for pointing out the possible over-interpretation of our data. Overexpression is an important caveat to consider. We would expect the Fis1 form of BNIP3 to be higher in protein levels given its deficiency in endolysosomal trafficking. Still, as the reviewer points out, over-expression could be mitigating the effect of our dimerization mutants. This caveat is now discussed in the manuscript and our interpretations regarding this fact have been greatly softened (Lines 373-376, Lines 449-462).

      Comment: Please include molecular weight markers for all western blots.

      Response: All western blots have now been labeled with molecular weight markers.

      Comment: Figure 5A-G: These data do not make a convincing case for the role of dimerization and are very difficult to follow. Only the mislocalized S172A mutant was responsive to Baf treatment, while the LG swap mutant which is mitochondrial and cannot dimerize is unaffected by Baf treatment. Figure 5H-I utilize a construct of BNIP3 that is missing most of the protein and which has very low turnover (Figure 5B). Unfortunately these results don’t make a highly convincing case about the biology of native, full length, mitochondrial BNIP3. The authors are advised to either strengthen the dimerization argument, or perhaps lighten the language around the main conclusions from these data.

      Response: Thank you for bringing the lack of clarity to our attention. Both dimer mutants of BNIP3 (S172A and LG swap) are insensitive to Baf-A1 treatment. These results hold for full-length BNIP3 using either the tf (Fig 5D) or IRES (Fig 5I) reporter. To demonstrate that defects in lysosomal transport were due to dimerization defects (and not other, unanticipated effects of the mutations), we looked at whether chemically induced dimerization could reverse the trafficking defects. Indeed, forced dimerization of the ER-restricted variant rescued ER-to-lysosome trafficking. From this, we conclude that dimerization is a critical facet of BNIP3 trafficking to the lysosome.

      We have re-worked the relevant text (both in results and discussion) to clarify major points and lighten the language around the conclusions from these data (described below).

      First, as mentioned above, we have added a significant discussion about the limitations of our assay and of possible interpretations. (Lines 300-303, Lines 323-326, Lines 483-489).

      Second, with regards to the specific construct used in this experiment, we have expanded the results section to better describe our rationale and approach (Lines 304-308). In short, because dimerization of native BNIP3 occurs within the membrane, we aimed to place the DmrB domain as close to the TM segment as possible. Due to the topology of TA proteins, a C-terminal tag isn’t possible. Therefore, we used the shortest truncation version of BNIP3 (117-end) that undergoes measurable lysosomal delivery. This was an important experimental consideration, and one we did not sufficiently rationalize in the original manuscript. We now include this point in the text.

      [Reviewer #2]

      Comment: The authors show that BNIP3 on the ER is not stable but degraded by the proteasome. Does this require ERAD factors? Is the mitochondrial BNIP3 protein likewise degraded by proteasomal degradation? It is not clear whether both BNIP3 pools are constantly turned over or whether degradation exclusively/predominantly occurs on the ER surface.

      Response: These are fascinating mechanistic questions. We hope to thoroughly address these questions in a subsequent study. However, as a teaser, we have included the basic answer to these questions in Fig 5I.

      To preliminarily characterize the proteasomal degradation of ER- and mitochondrial-BNIP3, we utilized our IRES reporter system - adapted from Steve Elledge’s system for degron monitoring (Fig 5I). Strikingly, our ER-restricted BNIP3 mutation (S172A) is sensitive to inhibition of both the proteasome and the AAA-ATPase p97/VCP, a key extractase for ERAD substrates. These data tentatively suggest an ERAD-dependent degradation mechanism (although many follow-up studies will be needed to confirm the mechanistic details). In sharp contrast, our mitochondrial-restricted mutant (LG Swap) is sensitive to proteasome inhibition by Bortezomib, but it is insensitive to VCP inhibition. The differential requirement for VCP suggests that proteasomal degradation occurs on both cellular pools of BNIP3 albeit through different mechanisms.

      Comment: The results of the screen shown in Fig. 2B are particularly interesting for readers. The glutathione peroxidase GPX4 was found as a top hit among the EMC components. GPX4 protects membranes (including those of mitochondria) against oxidative damage, is a major component of ferroptosis and linked to mitochondrial dysfunction and mitophagy. The authors should mention this interesting hit in the context of their discussion of the lipid-sensing properties of the dimerizing TM domains of BNIP3.

      Response: Thank you to Reviewer #2 for bringing this to our attention. The relationship between GPX4 and BNIP3 flux is very interesting. We have incorporated GPX4 into the discussion section (Lines 457-459).

      [Reviewer #3]

      Comment: For all of the tf-BNIP3 FACS data (all violin plots), it is unclear how many biological replicates were performed. The author only stated that at least 10,000 cells were analyzed per sample, but I believe this is for each biological replicate. To better demonstrate the biological replicates, the authors should consider using bar graphs of the medians (triplicates) with error bars.

      Response: We have included biological replicates of FACS data in all primary figures (except for Fig.1C). Biological replicates, represented as medians (in triplicate), are indicated in figure legends.

      Comment: In Fig 3D, it is unclear as to why there is no basal state accumulation of BNIP3 protein levels compared to Baf1A treated condition especially with USO1 and SAR1A KO samples. Is this because BNIP3 are targeted for proteasomal degradation? I think Fig 3D should include a BTZ treatment next to Baf1A to account for the lack of basal state accumulation of BNIP3.

      Response: We apologize for the lack of clarity on this point. Yes, the reviewer’s interpretation of the data is correct. This point is more clearly elaborated in the text of our revised manuscript (Lines 219-223). Our results indicate that when lysosomal degradation is diminished, the expected increase in total BNIP3 protein levels is attenuated by proteasomal degradation (as evidenced by the hyperstability of BNIP3 upon Bortezomib treatment in mutant backgrounds). As requested, we have included the same knockout panel, now treated with BTZ (Fig S2E). These genetic data are further supported by Fig 3E, where a small molecule inhibitor of vesicle trafficking, Brefeldin-A, ameliorates the effect of lysosomal inhibition (BafA1) but exacerbates the effect of proteasome inhibition.

      Comment: Truncation of proteins could affect their protein stability even during their synthesis. For Fig 5B and 6B, the authors should show the blots for the expression of the different truncated mutants to prove that the change in BNIP3 stability and their effect of mitoflux (or lack thereof), is not due to poor expression of these mutants.

      Response: These were important potential caveats to document, and we thank the reviewer for their comment.

      We note that, due to differences in transduction efficiency, western blot data is an incomplete measure for relative expression levels – it cannot distinguish between fraction of cells transduced and expression level per cell. However, RFP fluorescence (Fig 5B) and BFP fluorescence (Fig 6B) are fluorescent internal controls allowing us to assess expression levels with single cell resolution. We have provided histograms of RFP and/or BFP intensity (new Fig S4A, Fig S5B), which provides support that overall expression levels of these constructs are similar. Critically, any variation we observe does not correlate with any of the effects we report.

      In addition, we have clarified the figure axis in Fig 5B to indicate that the value we are reporting is the “fold-stabilization upon BafA1 treatment”. The original figure legend wasn’t clear. Our metric (fold-stabilization) is internally normalized to compensate for differences in expression level. This is an important clarification.
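
      The point about internal normalization can be illustrated with a minimal sketch. It assumes, purely for illustration, that fold-stabilization is the ratio of the median reporter signal with BafA1 to the median signal without it; the function name and the intensity values are hypothetical and are not taken from the manuscript. Under that assumption, a construct-wide difference in expression level scales both medians equally and cancels in the ratio.

```python
from statistics import median

def fold_stabilization(untreated, treated):
    """Ratio of median per-cell reporter signal with vs. without BafA1
    (assumed definition, for illustration only)."""
    return median(treated) / median(untreated)

# Hypothetical per-cell reporter intensities for one construct.
untreated = [1.0, 2.0, 3.0]
treated = [2.0, 4.0, 6.0]

# A second construct expressed at ten-fold higher levels, same behaviour.
expression_scale = 10.0
untreated_high = [v * expression_scale for v in untreated]
treated_high = [v * expression_scale for v in treated]

print(fold_stabilization(untreated, treated))            # prints 2.0
print(fold_stabilization(untreated_high, treated_high))  # prints 2.0
```

      The same ratio for both constructs is what is meant here by a metric that compensates for differences in expression level; it does not by itself rule out effects that change with expression in a non-proportional way.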

      Comment: For the data in Fig 7, the authors demonstrated that treating cells with proteasomal inhibitor increases mitoflux. Since the proteasome targets monomeric BNIP3 for degradation, the logical assumption is that BTZ drives dimerization of BNIP3. Can the authors demonstrate this in an approach similar to Fig 5C? This simple experiment will add significant insight into the study.

      Response: Thank you for the suggestion. As Fig 5C relied on BNIP3 over-expression, we thought it even more informative to assess the effects of BTZ on dimerization of endogenous BNIP3. Indeed, we see accumulation of an SDS-resistant BNIP3 dimer in cells treated with BTZ (new Fig S2E, line 221). We hypothesize that BTZ indirectly drives dimerization of BNIP3 by accumulating the total levels of the protein, potentiating monomers to form additional stable dimers.

      Comment: In line 168-169, "In addition, multiple suppressor genes identified from our screen had previously been reported including TMEM11..." -- Unclear what biology they are reported to be involved in

      Response: We have clarified this line to read: "In addition, we recovered multiple known suppressors of BNIP3 flux, including outer membrane protein spatial restrictor TMEM11, mitochondrial protein import factors DNAJA3 and DNAJA11, and mitochondrial chaperone HSPA9"

      Comment: Along the line with Major comment 2, the explanation for Fig 3D needs to be better elaborated, perhaps to include the role of proteasome already at this point (if the authors think this is the reason why basal BNIP3 levels remains low with USO1 and SAR1A KO).

      Response: We have included a discussion about compensation by the proteasome in these genetic backgrounds (lines 219-226) and have referred to the newly incorporated western blot (new Fig S2E).

      Comment: Line 302-304, I believe that statement only refers to Fig S4C and the statement for Fig5G is in the next sentence. Please remove Fig5G from line 304. It was confusing to read.

      Response: The reference to Fig 5G has been removed.

      Comment: Line 367, there is a reference for Fig S5C but that figure is missing.

      Response: The spurious reference has been removed.

      Comment: Line 410-411, are there any reported clinical cases of EMC mutations with phenotypes that could be explained by elevated mitophagy?

      Response: Thank you for the suggestion. There are clinical presentations of EMC mutations and splice variants in diseases and conditions related to the central nervous system (PMID: 23105016, PMID: 26942288, PMID: 29271071). However, all characterization to date has been done in the clinical setting, looking at clinical presentations/symptoms rather than at the molecular or cellular level. We have added a line to the discussion about this speculative correlation between EMC deficiency and mitophagy (lines 516-519).

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      [Reviewer #1]

      Comment: Figure 3B: Are the red puncta observed in USO1 and SAR1A cells a product of higher levels of ER-phagy owing to BNIP3's high presence on the ER membrane?

      Response: This is an intriguing hypothesis. We will test whether this is true using a USO1/ATG9A dual KO. However, we don’t think this result is critical to the overall arc of the manuscript and we will not include these data if they indicate otherwise.

    1. Author Response

      Reviewer #1 (Public Review):

      [...] This study brings a lot of new information on the regulation of flagellar genes, from the identification of novel sigma 28-dependent sRNAs to their effects on flagella production and motility. It represents a considerable amount of work; the experimental data are clear and solid and support the conclusions of the paper. Even though mechanistic details underlying the observed regulations by MotR or FliX sRNAs are lacking, the effect of these sRNAs on fliC, several rps/rpl genes, and flagellar genes and motility is convincing.

      The connection between r-protein genes regulation and flagellar operons is exciting and raises a few questions. First, from the RILseq data, chimeric reads with mRNA for r-proteins (including rpsJ) are not restricted to the sigma 28-dependent sRNAs (e.g. rpsJ-sucD3'UTR, rpsF-DicF, rplN-DicF, rplK-ChiX, rplU-CyaR, rpsT-CyaR, rpsK-CyaR, rpsF-MicA...), suggesting that regulation of r-protein synthesis by sRNAs is not necessarily related to flagella/motility. Second, it would be interesting to know if the flagellar operons are more sensitive than other long operons to antitermination following MotR overexpression? In other words, does pMotR similarly affect antitermination in rrn or other long operons?

      The general effect of pMotR or pFliX on the expression of multiple middle and late flagellar genes is also interesting even though the mechanism is not clear. While it may be difficult to fully address it, testing whether some of these regulatory events depend on the control of fliC and/or the S10 operon could be relevant (by analyzing the effects in strains deleted for fliC or nusB for instance).

      We also think the connection between r-protein genes regulation and flagellar operons is exciting and raises some intriguing questions. While there are other RIL-seq chimeras for r-protein genes, the highest numbers are found for MotR and FliX. Nevertheless, understanding the impact of these other sRNAs on the r-protein operons and elucidating which long operons are most sensitive to antitermination following MotR overexpression are important directions for further studies.

      Reviewer #2 (Public Review):

      [...] This is a very interesting study that shows how sRNA-mediated regulation can create a complex network regulating flagella synthesis. The information is new and gives a fresh outlook at cellular mechanisms of flagellar synthesis. The presented work could benefit from additional experiments to confirm the effect of endogenous sRNAs expressed at natural level.

      We agree that experiments on the effects of endogenous sRNAs expressed at natural levels are important. We provide such data in Figures 8 and S14 for MotR and FliX in a variety of assays: flagella numbers by electron microscopy, motility and competition assays, and expression of flagellar genes by RT-qPCR and western analysis. We went to the trouble of constructing strains carrying point mutations in the chromosomal copies of these genes, rather than deletions, to avoid interfering with expression of motA and fliC, given that MotR and FliX encompass the 5’ and 3’ UTRs respectively.

      Reviewer #3 (Public Review):

      [...] Overall, this comprehensive study expands the repertoire of characterized UTR derived sRNAs and integrate new layers of post-transcriptional regulation into the highly complex flagellar regulatory cascade. Moreover, these new flagella regulators (MotR, FliX) act non-canonically, and impact protein expression of their target genes by base-pairing with the CDS of the transcripts. Their findings directly connect flagella biosynthesis and motility, highly energy consuming processes, to ribosome production (MotR and FliX) and possibly to carbon metabolism (UhpU).

      Specific points to be considered:

      • The authors use a crl- hyper-motile strain as WT strain for the study and sometimes also a crl+ strain is used. Can the authors comment on potential reasons why some phenotypes (e.g., UhpU and MotR effects on motility) are only detectable in the crl+ strain or vice versa? Is σS regulation important for the function of these sRNAs?

      • In several experiments, a variant of MotR sRNA, MotR*, that harbors a 3 nt mutation upstream of the seed sequence is used and seems to mediate stronger phenotypes (impact on flagellar number) upon overexpression compared to WT, or phenotypes not retrieved for WT MotR (increased flagellin expression). It would be helpful to have some more clarification throughout the text as to why this variant was used, even when OE of WT MotR already has an impact on the target, and how these three mutated nucleotides impact target regulation. For example, does MotR* show increased RNA stability or Hfq binding compared to MotR? Does the mutation in MotR* impact MotR structure (e.g., based on secondary structure predictions) or increase the complementarity with selected targets at potential secondary binding sites (e.g., based on target predictions)? For example, Fig. S7 shows additional regions of interaction between MotR and fliC mRNA beside the seed sequence. It is also suggested that MotR might have multiple interaction sites on rpsJ mRNA. Additional structure probing or biocomputational predictions could clarify these points.

      • It is suggested that UhpU impacts motility via regulation of LrhA, which represses transcription of flhDC, and therefore the flagellar cascade. While LrhA-mediated regulation by UhpU is validated based on reporter genes, the effect of UhpU OE on FlhDC levels is not directly examined (Fig. 3). Furthermore, as deletion of LrhA de-represses the flagellar cascade and UhpU was also shown to increase motility, the conclusions could be further strengthened by examining flhDC levels and/or the effect of ∆UhpU (if the sRNA part can be deleted) on motility (reduction) due to relieved down-regulation of LrhA.

      • This study provides many opportunities for future follow-up work. Now that the four sRNAs and some of their targets and opposing effects on flagella biogenesis have been identified, it will be interesting to see how the sRNAs themselves are temporally regulated throughout the flagella biogenesis cascade and which other targets are regulated by them. Future studies could also provide insights into the mechanism and function of FlgO sRNA, which seems to act via a different mechanism than base-pairing to target RNAs, as well as the global effects of regulation of ribosomal genes via FliX and MotR.

      We thank the reviewer for the constructive comments about the variation between the crl- and crl+ strains, and about the use of MotR versus MotR*, and will address these points in a revised version of the manuscript. Regarding the UhpU-mediated regulation, we agree that assays of flhDC expression will strengthen our conclusions. We share the reviewer's opinion regarding the many opportunities for future follow-up work.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Dear Editor and reviewers,

      We would like to thank the three reviewers for their thorough review of our manuscript and their detailed comments and very helpful suggestions to improve the manuscript. Overall, we thought the reviews were very positive with the reviewers commenting that our discovery of a novel genetic code variant is a “cause for celebration” and that our study is “technically solid” and “rigorous”. All three reviewers agree that our manuscript would “stimulate new discussions in the field of genetic code evolution” and also be of broad interest to evolutionary cell biologists, protistologists and the translation/protein synthesis community at large. The reviewers highlight the particular novelty of the genetic code variant described here due to it being an exception to the wobble hypothesis which adds a new level of complexity to stop-codon reassignment. The reviewers share our frustration about the lack of proteomics data due to being unable to establish a stable culture but acknowledge that we address this limitation frankly in our discussion and agree that it is “frustrating but it's not a limitation”.

      We present an updated and improved version of the manuscript after taking on board the reviewers’ suggestions. Our point-by-point responses to their comments and our modifications are detailed below in bold.

      Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary

      This study by J. McGowan and colleagues reports the discovery of a ciliate species that uses a variant genetic code where the codons UAA and UAG, which are stop codons in the canonical code, instead code for lysine and glutamate respectively. The primary data are genomic and transcriptomic sequence libraries from single cells. The genetic code was predicted by aligning coding sequences to references from other species and examining the most frequent amino acids in positions homologous to putative coding-UAA/UAGs. They also identified suppressor tRNAs for UAA and UAG, and tandem in-frame stop UGAs (but not UAA/UAG) in the 3'-UTR, which further support the recoding of UAA and UAG.

      A limitation of this study (and several other recent studies on variant genetic codes) is that the predictions are based on nucleic acid sequencing, without confirmation from proteomics. The authors acknowledge and briefly but frankly discuss the limitations in their manuscript (lines 258-261).

      Major comments

      Controls against contamination and sequence chimeras

      The ciliate species studied here was an environmental isolate, and sequence libraries were prepared by amplification from small pools of cells sorted by FACS. The genome assembly was produced by co-assembly of multiple amplified libraries. Given the potential for contamination and amplification artefacts (such as sequence chimeras) associated with these methods, I think it is important to demonstrate that the data truly originate from one species, so as to rule out the possibility that the co-assembly may be chimeric, i.e. representing two or more organisms with different genetic codes (one with UAA recoded and the other with UAG recoded, for instance). Even if the cell sorting was accurate, contamination could still enter down the line during library preparation so it would be important to show internal evidence from the sequence data too.

      We understand the reviewer's concerns about the possibility of contamination as it can be a major issue in environmental single cell sequencing experiments. We have addressed the individual points below in detail to demonstrate that we have generated a clean genome assembly of a single ciliate species but also summarise here:

      • The cells we sequenced originated from the same clonally isolated cell propagated in culture
      • We have manually curated the assembly
      • The assembly has a unimodal GC content peak with a low BUSCO duplication score
      • Most genes (95.9 %) contain both in-frame UAA and UAG codons
      • We recovered a single identical ciliate 18S rRNA gene across all 10 samples
      • De novo assemblies of the 10 individual gDNA libraries are virtually identical in terms of average nucleotide identity
      • We also predicted the genetic code for each of the genome and transcriptome samples individually
      • 85% of the final assembly is taxonomically classified as Ciliophora. The remainder is either unclassified (i.e. no hits) or has spurious/inconsistent hits

        Specifically:

      (a) From the description in Methods under "Sampling, Ciliate isolation, culturing, and cell-sorting", it is not clear whether all the cells that were ultimately sequenced originated from the same clone (i.e. the same well in the 96-well plate described in line 389). Could the authors confirm whether this was the case?

      Yes. All the sorted cells originated from the same ciliate clone. A single cell was isolated and cleaned (without removing all the environmental bacteria). This single ciliate cell divided and we established a monoclonal ciliate culture that we used for the cell sorting and sequencing. This culture grew, but only for a relatively short period. We could not establish a long-term culture.

      (b) What % of genes have in-frame coding UAA, UAG, or both? How many per gene on average? Counts are given for the conserved genes/domains identified by PhyloFisher or Codetta (lines 192-207), and overall frequencies per codon are addressed later in lines 263 onward, but how often do they occur together in the same genes?

      My reasoning behind this is that if genes with both in-frame coding UAA and UAGs are common then it is very unlikely to be the result of chimeric sequence artefacts from whole-genome amplification.

      We have updated the text to include this information. From the PhyloFisher analysis, we had reported that 58 genes contained in-frame UAA codons and 46 genes contained in-frame UAG codons. We have now added the text “Amongst the genes identified by PhyloFisher, 27 contained both an in-frame UAA codon and an in-frame UAG codon.”

      Additionally, from our annotated gene set, we had reported that 98.6% of genes contain at least one UAA codon and 96.4% of genes contain at least one UAG codon. We have now added text to report how many genes contain both codons “The reassigned codons are widely used across genes with 95.9% of genes containing both a UAA codon and a UAG codon”.

      The example gene (tubulin gamma chain protein) shown in Figure 1 contains both in-frame UAA codons and in-frame UAG codons, with the UAA codons aligning to lysine and the UAG codons to glutamic acid.
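
      As an illustration only, the kind of per-gene tally described above (genes containing both an in-frame UAA and an in-frame UAG) could be sketched as follows. This is not the pipeline used for the manuscript; the function names, the toy sequences, and the choice to ignore the terminal codon (assumed to be the stop) are all assumptions made for the example.

```python
from typing import Dict, Set


def in_frame_codons(cds: str) -> Set[str]:
    """Return the set of in-frame codons of a CDS, excluding the last codon.

    Assumption: the CDS starts in frame and its final codon is the stop,
    so it is dropped before looking for reassigned codons.
    """
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return set(codons[:-1])


def fraction_with_both(cds_by_gene: Dict[str, str],
                       first: str = "UAA", second: str = "UAG") -> float:
    """Fraction of genes whose internal codons include both given codons."""
    if not cds_by_gene:
        return 0.0
    both = sum(1 for cds in cds_by_gene.values()
               if {first, second} <= in_frame_codons(cds))
    return both / len(cds_by_gene)


# Hypothetical toy gene set (not real data).
toy_genes = {
    "g1": "ATGTAAGCATAGTGA",  # AUG UAA GCA UAG | UGA: both codons in frame
    "g2": "ATGTAAGCATGA",     # AUG UAA GCA | UGA: UAA only
    "g3": "ATGGCATAGTGA",     # AUG GCA UAG | UGA: UAG only
    "g4": "ATGGCTAAGTGA",     # "TAA" occurs out of frame only: not counted
}
print(fraction_with_both(toy_genes))
```

      Reading codons strictly in frame matters here: a "TAA" or "TAG" that spans two codons says nothing about reassignment, which is why the fourth toy gene is not counted.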

      (c) What is the sequence identity of conserved marker sequences between the individual amplified replicate libraries?

      I would naively expect that individual replicates may not have the full set of markers because of uneven amplification, but if the sequences originate from the same clone they should have overlapping coverage of the conserved markers, and these should be +/- identical between replicates (save for allele variants). If so this would support the claim that contaminant sequences were mostly removed during sequence QC and that the cells were clonal.

      We generated an individual assembly for each of the 10 gDNA libraries and calculated average nucleotide identity at the whole assembly level. On average, the 10 assemblies are 99.43% identical to each other, with the least similar pair being 99.37% identical to each other. This level of variation includes not only allelic variants but also sequencing/assembly errors as the individual libraries are relatively low coverage. In terms of assembly alignment coverage (i.e. the fraction of each assembly that is aligned to another assembly), the average value is 76.5% and the value for the lowest pair is 59.1%. We have now also made the individual 10 assemblies available in the Zenodo repository (10.5281/zenodo.7944379) and updated the methods section.

      Furthermore, as an additional quality control step, we predicted the genetic code for each of the 10 individual genome assemblies and obtained the same predictions that UAA encodes lysine and UAG encodes glutamic acid for all 10 individual assemblies. We also predicted the genetic code for each individual RNA-Seq sample based on individual transcriptome assemblies which yielded consistent predictions.
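
      For readers who want to reproduce this kind of summary, a minimal sketch of how pairwise identity values can be reduced to a mean and a least-similar pair is given below. The ANI values themselves would come from an external tool; the dictionary layout, the function names, and the toy numbers are assumptions for illustration and are not the values reported above.

```python
from statistics import mean
from typing import Dict, Tuple


def expected_pair_count(n_assemblies: int) -> int:
    """Number of unordered assembly pairs, n * (n - 1) / 2."""
    return n_assemblies * (n_assemblies - 1) // 2


def summarise_pairwise(
    identity: Dict[Tuple[str, str], float]
) -> Tuple[float, Tuple[str, str], float]:
    """Return (mean identity, least similar pair, identity of that pair).

    `identity` maps an unordered pair of assembly names to percent identity.
    """
    lowest_pair = min(identity, key=identity.get)
    return mean(identity.values()), lowest_pair, identity[lowest_pair]


# Hypothetical values for three libraries (not the reported data).
toy_identity = {
    ("lib1", "lib2"): 99.5,
    ("lib1", "lib3"): 99.4,
    ("lib2", "lib3"): 99.3,
}
mean_identity, worst_pair, worst_value = summarise_pairwise(toy_identity)
print(round(mean_identity, 2), worst_pair, worst_value)
print(expected_pair_count(10))  # 10 libraries give 45 pairwise comparisons
```

      Reporting the least similar pair alongside the mean is useful in this setting because a single contaminated or chimeric library would show up as an outlying pair even when the mean stays high.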

      (d) Line 392: "Non-axenic" presumably refers to environmental prokaryotes. This also appears to contradict the statement that the cells were "free of any other contaminant" (line 387). Could authors confirm whether they mean "non-axenic but monoeukaryotic"?

      In line 387, when we say "free of any other contaminant” we mean that we isolated a ciliate single-cell from the environmental sample, and the picked ciliate cell was washed 3 times until it was free of any other eukaryotes, but still containing environmental bacteria. In line 392, when we say non-axenic, we mean that the mono-clonal ciliate culture contained environmental bacteria and was monoeukaryotic.

      We have modified the text in the methods section to say “free from any other eukaryote” and “non-axenic but monoeukaryotic”.

      (e) Lines 448-451: More details should be given on the criteria used to identify and bin out contaminants. MetaBAT typically bins prokaryotic genomes quite well, but not eukaryotic ones. What did the bins look like and how were the eukaryotic ones chosen?

      We routinely use MetaBAT2 to assist with separating bacterial contigs from protist genomes. From our experience we find that it generally performs well but requires careful manual curation. We only use tetranucleotide frequencies when binning single-cell assemblies and not coverage variance as this is heavily skewed due to amplification bias from single-cell amplification. We integrated the binning results from MetaBAT2 with taxonomic classification from tools such as CAT, Blobtools and Tiara, and manually curated the assembly.

      We have modified both the results and methods section to clarify that the assembly was manually curated to remove contaminant contigs.

      For example, using CAT, which taxonomically classifies contigs based on blast/diamond hits to open reading frames:

      The final curated assembly is 69.7 Mb in length.

      59.5 Mb (85.4%) is classified as Ciliophora.

      9.7 Mb (13.9%) is unclassified.

      The remaining 0.5 Mb (0.7%) have inconsistent, low-identity hits to 22 different Eukaryotic and Bacterial phyla (due to lack of closely related species in public databases).

      Furthermore, we recovered only a single ciliate 18S rRNA gene and the final curated assembly has a unimodal GC content peak with a low BUSCO duplication score and high cDNA mapping rate.

      Minor comments

      Line 52: Not strictly true, some germline-limited segments contain mobile elements with coding sequences, e.g. TBE elements in Oxytricha (doi:10.1371/journal.pgen.1003659)

      Thank you for pointing this out. We have rephrased “excision of non-coding sequences” to “excision of micronucleus-limited sequences” to describe the process of macronuclear development more generally.

      Lines 229-231, Supplementary Table 1: Presenting the identity matrix as a distance tree may make it easier to see the pattern of similarity between the tRNAs

      We have added a phylogenetic network of tRNA genes as a supplementary figure to better visualise the relationships between tRNA genes.

      Lines 274-275: Suggest stating the criterion for classifying genes as "highly expressed" on the first mention of this in the Results, although it's explained later on in the Methods.

      We have clarified this in the results section by adding the text: ‘We defined a subset of genes as “highly expressed” based on the 10% of genes with the highest transcripts per million (TPM) values for comparison below.’
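
      The top-10% rule quoted above can be sketched in a few lines. This is an illustrative reading of the definition rather than the authors' code; the function name, the rounding-down of the cutoff, the handling of ties (broken by sort order), and the toy TPM table are assumptions.

```python
from typing import Dict, Set


def highly_expressed(tpm: Dict[str, float], top_fraction: float = 0.10) -> Set[str]:
    """Return the genes in the top `top_fraction` by TPM.

    Assumption: the cutoff count is rounded down, with at least one gene
    kept whenever the table is non-empty.
    """
    n_keep = max(1, int(len(tpm) * top_fraction))
    ranked = sorted(tpm, key=tpm.get, reverse=True)
    return set(ranked[:n_keep])


# Hypothetical expression table: 20 genes with TPM values 1..20.
toy_tpm = {f"gene{i}": float(i) for i in range(1, 21)}
print(sorted(highly_expressed(toy_tpm)))
```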

      Lines 298-299: What is the frequency of tandem UGA stops in the 3'-UTR in genes with coding-UAA/UAG vs. genes without, and is there a significant difference? The argument in this paragraph is that UAA+UAG reassignment increases selective pressure to minimize translational readthrough. Therefore I think that it would make sense to compare the frequency in genes with and without these codons.

      Following the reviewer’s suggestion, we have looked at tandem UGA stop codons in the 3’-UTR of genes that don’t use UAA and genes that don’t use UAG. We found similar enrichment for in-frame UGA codons at the beginning of the 3’-UTR in these small subsets of genes.

      To clarify, the hypothesis from the literature is that there may be stronger selective pressure to maintain tandem stop codons in ciliates with reassigned genetic codes, particularly those that use only UGA as a stop codon. Within a genome, we wouldn’t expect a difference if a gene contains UAA/UAG codons.
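
      To make the notion of a "tandem" stop concrete, the check below looks for an in-frame UGA within the first few codons of a 3'-UTR. It is a simplified sketch, not the analysis performed for the manuscript: it assumes the UTR sequence starts immediately after the annotated stop codon (so the reading frame carries over), and the 10-codon window is an arbitrary illustrative choice.

```python
def has_tandem_stop(utr3: str, stop: str = "UGA", window_codons: int = 10) -> bool:
    """True if `stop` occurs in frame within the first `window_codons` codons.

    Assumption: `utr3` begins directly after the primary stop codon, so
    position 0 is in frame with the upstream coding sequence.
    """
    seq = utr3.upper().replace("T", "U")
    limit = min(len(seq), window_codons * 3)
    for i in range(0, limit - 2, 3):
        if seq[i:i + 3] == stop:
            return True
    return False


print(has_tandem_stop("GCATGAAAA"))  # GCA UGA AAA: in-frame UGA
print(has_tandem_stop("GTGAAAA"))    # GUG AAA A: UGA is out of frame
```

      Comparing the fraction of genes for which this returns true between gene subsets, or against a shuffled-UTR background, is one way such enrichment could be assessed.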

      Lines 353-354, Figure 5: Suggest marking the internal nodes where genetic code changes likely occurred. At the moment only the leaves of the tree are annotated with the genetic codes of the respective species. This would make it clearer how one counts the numbers of independent origins as reported in the text (e.g. "... a fourth independent origin of UGA being translated as tryptophan").

      We have decided not to label the internal nodes on the phylogeny. We think that deeper sampling will reveal that some of these genetic code changes occurred independently, so we don’t want the figure to be misleading. Also, for the species with the genetic code UAA=Q, UAG=Q and UGA=W, we can’t determine the order of events.

      Lines 371-372: Question out of curiosity (not necessary to address for the manuscript at hand): Do the authors think the recoding of UAA and UAG happened simultaneously in both codons or stepwise, or is there insufficient information to speculate?

      An initial guess would be that it happened as a stepwise process but without deeper sampling of this lineage it is not possible to determine the order of events.

      This highlights the need for deeper sampling and sequencing across undersampled lineages of ciliates and demonstrates the utility of single-cell OMICs approaches for species that are not yet amenable to culturing.

      Line 395: "10uL" should use the actual symbol for "micro" prefix. Also, the choice of spacing or no spacing between numerical figure and units should be made consistent in manuscript.

      Fixed

      Line 403: "Biotynilated" should be "Biotinylated"

      Fixed

      Line 414 and elsewhere: "2" in MgCl2 should be subscripted

      Fixed

      Lines 419-420: Clarify whether the "r" and "+" symbols are to be read as prefixes or suffixes, i.e. is the modified base the preceding or succeeding one.

      We have clarified in the text that these symbols are to be read as prefixes.

      Table 1: What is the difference between the two sets of BUSCO completeness scores reported? One is given under "Genome assembly" and the other under "Genome annotation", but the annotation is based on the same assembly, right? I'm assuming this has to do with different modes in which BUSCO can be run, but this should be explained in the Methods (lines 452-453, 496-497) and briefly explained in the Table caption.

      Yes, this is because we ran BUSCO in two different modes. BUSCO is run in genome mode on the genome assembly and in protein mode on the genome annotation. In genome mode, gene prediction is performed by Augustus guided by amino acid BUSCO group block-profiles, while in protein mode the gene set described in our methods is the input to BUSCO classification. The superior BUSCO results for the protein mode reflect the superiority of our final annotation over that generated by BUSCO Augustus. We have added text to the methods section and to the table caption to clarify which mode was used.

      **Referee Cross-commenting** I generally agree with the other reviewers' comments. Specifically I like reviewer #3's suggestion #3 to have a more detailed summary of the codon frequencies, perhaps as a graphic, and to compare the tandem stop frequencies with other ciliate species, especially those with all three canonical stops.

      Reviewer #1 (Significance (Required)):

      Any new genetic code variant discovered is a cause for celebration! This is a basic biological fact with inherent significance and should be generally interesting to biologists because the rarity of variant codes stands in contrast to the diversity of most biological systems.

      This variant code would also stimulate new discussions in the field of genetic code evolution specifically because, as the authors point out, when both UAA and UAG are recoded they both usually encode same amino acid, but here they are recoded to different ones. This is an apparent exception to the "wobble" hypothesis for why these codons often evolve in concert, which was well explained with relevant citations in the Introduction.

      For context: My expertise is in genomics and environmental microbiology.

      END reviewer 1

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This study reports the reassignment of the UAA and UAG stop codons to lysine and glutamic acid, respectively, in the ciliate Oligohymenophorea sp PL0344. The paper is nicely written, easy to read and the experimental approach, ideas and questions are easy to follow. The work is technically solid both at the NGS - in house library preparation, sequencing and data interpretation - as well as phylogeny levels. The conclusions are consistent with the comparative genomic and transcriptomic data obtained by the study.

      Reviewer #2 (Significance (Required)):

      The work extends current knowledge on codon reassignment in ciliates, confirming previous discoveries of existence of very high stop codon assignment flexibility in these organisms. The assignment of UAA and UAG to two different amino acids by two different tRNAs is very interesting and reinforces the idea that stop codon reassignment in ciliates is rather common. It also raises important questions about the parallel evolution of the release factor-1 (eRF1), Lysine and Glutamine tRNAs, as the reassignment requires loss of recognition of both UAA and UAG by eRF1 with parallel appearance of the new Lysine and Glutamic Acid suppressor tRNAs.

      The main issue of this work is the inability to cultivate the ciliate Oligohymenophorea sp PL0344 in the laboratory to prepare protein extracts for direct analysis of the amino acids inserted at UAA and UAG sites by Mass Spectrometry. The comparative genomic and transcriptomic data, as well as the identification of cognate tRNA anticodons for UAA and UAG, are likely correct, but provide indirect evidence for the assignment of UAA to Lysine and UAG to Glutamic Acid. This issue is relevant because one cannot exclude the possibility of insertion of other amino acids at UAA and UAG sites beyond Lysine and Glutamic Acid, respectively; nor can one exclude the possibility that such amino acids are inserted at high level. The authors do acknowledge the limitations of the unavailability of protein extracts for direct MS analysis of the reassignment, but should consider, in particular in the discussion, the possibility of multiple amino acid insertions in a context where Lysine and Glutamic Acid are the major but not the only amino acid species being inserted at those sites.

      Based on my expertise of studying codon reassignments in fungi of the CTG clade, I believe this work is very interesting and appealing to the genetic code community, and is of relevance to the evolution and protein synthesis research communities at large.

      We thank the reviewer for their positive review. They raise an important point about the possibility of amino acids other than lysine and glutamic acid being inserted for UAA/UAG codons which we hadn’t considered. We have added text and relevant references to our discussion to highlight this possibility:

      “Additionally, while the genomic and transcriptomic data provide strong evidence that lysine and glutamic acid are the major translation products of UAA and UAG codons, respectively, we cannot rule out the possibility that other amino acids are (mis)incorporated at these sites which could be detected using mass-spectrometry [38, 39].”

      Krassowski T, Coughlan AY, Shen X-X, Zhou X, Kominek J, Opulente DA, et al. Evolutionary instability of CUG-Leu in the genetic code of budding yeasts. Nat Commun. 2018;9:1887.

      Mordret E, Dahan O, Asraf O, Rak R, Yehonadav A, Barnabas GD, et al. Systematic Detection of Amino Acid Substitutions in Proteomes Reveals Mechanistic Basis of Ribosome Errors and Selection for Translation Fidelity. Molecular Cell. 2019;75:427-441.e5.

      END reviewer 2

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: from genome and transcriptome sequencing of what appears to be a novel ciliate from the class Oligohymenophorea, McGowan et al provide convincing evidence of a protist in which the stop codons UAA and UAG have almost certainly been recoded to specify incorporation of different amino acids (UAA = K; UAG = E) during translation. Several ciliates from different classes use a non-standard genetic code (as do a narrow variety of other protists), but this is an unusual observation in that stop codons which differ only in the wobble position code for different amino acids in the ciliate identified here.

      I say 'almost certainly' the stop codons have been recoded in Oligohymenophorea sp. PL0344 because in the absence of being able to retain the ciliate in culture the authors have not been able to complete the proteomics which would unequivocally (a) show stop codons now code for amino acids and (b) confirm the identity of the amino acids now encoded (the authors discuss this issue on p12).

      Comments: overall this manuscript is straightforward to read and the analyses are taken as far as is realistic in the absence of a continuous culture method. My suggested revisions should be straightforward for the authors to address.

      1) The manuscript appears to report the identification and genome/transcriptome sequencing of a novel ciliate species - clarity should be provided by the authors. However, it disappointed me that this manuscript was crafted entirely from nucleotide sequencing. I would have welcomed seeing the morphology of the ciliate identified here and would have anticipated that there was sufficient material to perform microscopy at the light level (for DIC images) and by scanning or transmission electron microscopy.

      Yes, based on the 18S rRNA sequence and phylogenies of protein-coding genes, this is a novel species that hasn’t been described before. The most similar hits to the 18S rRNA gene are to other unnamed/environmental sequences. We haven’t attempted to name or describe this species as we weren’t able to establish a culture, so have referred to it as Oligohymenophorea sp. PL0344. We have clarified in the text that this is a novel, unnamed ciliate species.

      The genomic and transcriptomic data were generated from a single-cell isolate propagated into micro-cultures of tens of cells. These were done in the strictest conditions in an attempt to minimise contamination. Consistent with this approach, it was not possible to obtain useful SEM/TEM images, as it would be very hard to recover EM imaging from tens of cells (a process that would have drastically reduced our ability to do replete genome sampling). Similarly, our approach to culturing limited our ability to acquire useful DIC images. After discovering that this ciliate uses a novel genetic code, we attempted on a number of occasions to re-isolate the same species from the same and surrounding water bodies but failed.

      2) It is unfortunate that the ciliate could not be maintained in culture (or cryopreserved). Coordinates for the University Parks pond are provided, but I got the impression that this ciliate could be repeatedly isolated. Thus, in the absence of culture methods could the authors indicate the points in the year when the ciliate could be isolated (i.e. is there a season element to when PL0344 could be isolated) and how frequently when sampling was performed could PL0344 be seen? From the environmental sequence data that is publicly available is there any evidence for the presence of PL0344 anywhere else in the world? I'd be surprised if this was a UK-specific ciliate.

      The water sample from which this ciliate was isolated was collected in April 2021. After having sequenced its genome and identifying the genetic code change, we made several attempts to reisolate it from the same pond but were unsuccessful. Regarding the geographic distribution of this ciliate, in the text we mention that the most similar 18S rRNA sequence in GenBank is to an unnamed species recovered in a metabarcoding study in France with 99.81% identity. We assume that this is the same species. We also examined other publicly available environmental datasets such as the PR2/metaPR2 database. The most similar match in the metaPR2 database was to a sequence “OLIGO4_XX_sp”. In the metaPR2 database this sequence is unique to Lake Garda in Italy (sample name: “Lake_Garda-LTER-euphotic-water”). However, this hit was only 98% identical with a partial alignment so we did not discuss it in the text. We agree that it is very unlikely that this is a UK-specific ciliate but cannot determine its geographic range based on the publicly available environmental sequence data, other than the single hit to a sequence from France. We think it is important to stress that it was not the aim of our paper to describe the taxonomy and biogeographical range of this ciliate but rather to report the exciting shift in codon usage.

      3) I felt the statistics presented on pages 13-14 (lines 277-301) for codon usage were a little superficial. It would be helpful to see how frequently other E and K codons are used in PL0344 and ideally to see how similar codon usage differs in the more model ciliates Paramecium, Tetrahymena or Stentor. To complete an analysis and justify/confirm conclusions drawn, I would also like to see how frequently in-frame, downstream stop codons are seen in ciliates where stop codons have NOT been reassigned - although the data in Fig 5 indicates genome/transcriptome sequences are not necessarily complete for many ciliate species (where stop codons are not reassigned), there is certainly more varied data to look at than when Fleming and Cavalcanti published their PLoS One work (which is cited in the manuscript).

      We have shortened this section about UAA and UAG usage, with supplementary table 3 showing usage of all codons in all genes compared to our subset of highly expressed genes.

      We have also added a sentence stating how many genes contain both in-frame UAA and UAG codons based on the point from Reviewer 1: “The reassigned codons are widely used across genes with 95.9% of genes containing both a UAA codon and a UAG codon.“
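The statistic quoted above (the share of genes containing both an in-frame UAA and an in-frame UAG codon) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names and the toy sequences are hypothetical, and the sketch assumes each coding sequence starts in frame and ends with its terminal stop codon, which is excluded from the count.

```python
def codons(cds):
    """Split a coding sequence into in-frame codons (RNA alphabet),
    ignoring any trailing partial codon."""
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def fraction_with_both(cds_list, codon_a="UAA", codon_b="UAG"):
    """Fraction of coding sequences with both codons in frame,
    not counting the final codon (assumed to be the terminal stop)."""
    if not cds_list:
        return 0.0
    hits = 0
    for cds in cds_list:
        internal = set(codons(cds)[:-1])  # drop the terminal codon
        if codon_a in internal and codon_b in internal:
            hits += 1
    return hits / len(cds_list)


# Hypothetical toy genes, each ending in a UGA (TGA) stop codon.
toy_genes = [
    "ATGTAAGAATAGTGA",        # contains UAA and UAG
    "ATGTAAAAATGA",           # contains UAA only
    "ATGGAATAGTGA",           # contains UAG only
    "ATGAAATAAGGGTAGCCCTGA",  # contains UAA and UAG
]
print(fraction_with_both(toy_genes))
```

On real data the input would be the annotated coding sequences rather than toy strings; only the counting logic is shown here.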

      To our knowledge, there are no new genome assemblies available for ciliates that use the canonical genetic code since the Fleming and Cavalcanti publication from 2019, certainly not any with annotated gene sets available for comparison. The species in Fig 5 which use the canonical genetic code are all from transcriptome data (other than Stentor) that have generally low completeness. We do not think comparison with low-quality transcriptome assemblies would make a fair comparison as they would be biased towards transcripts with higher expression. Furthermore, they likely include many fragmented transcripts which are not suitable for detailed comparisons of the stop codon/3′-UTR region.

      4) Given the presence of just one stop codon in PL0344, have the authors looked genome-wide at nucleotide composition 5' and 3' to UGA? The nucleotide sequences 5' and 3' to a stop codon can influence whether readthrough occurs and thus potentially limit the frequency of, or tendency for, unwanted readthrough.

      We thank the reviewer for this suggestion which is something we did not investigate initially but have now added a short section in the manuscript to address. Many studies in model organisms have demonstrated that UGA is the least robust stop codon and the most prone to read through. As the reviewer alludes to, this is particularly interesting for ciliates with reassigned genetic codes that use only UGA as a stop codon. Experimental data from model organisms have shown that the sequence composition surrounding a stop codon can influence the frequency of read through, with the nucleotide immediately downstream of the stop codon (“+4 position”) being particularly important.

      We have now looked at the sequence composition around stop codons for Oligohymenophorea sp. PL0344 and our results show that cytosine tends to be avoided following the UGA stop codon. From the literature, presence of a cytosine following UGA (i.e., UGAC) leads to a substantial increase in translational read through. Furthermore, when examining the subset of highly expressed genes, there are significantly fewer cases of UGAC when compared to all genes. This trend has previously been reported in Paramecium and Tetrahymena based on EST data (Salim, Ring and Cavalcanti; 2008).

      We have added a short section to the text reporting this and a supplementary figure showing a sequence frequency logo around the stop codon for all genes and for the subset of highly expressed genes. We note with caution, however, that there is a paucity of experimental studies investigating stop codon robustness in ciliates. While several publications hypothesise that read through may happen at higher rates in ciliates due to a combination of factors (e.g., ERF-1 mutations, presence of tandem stop codons, competition from suppressor/near-cognate tRNA genes, etc.), we are careful not to speculate without experimental evidence.
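The "+4 position" analysis described in this response can be sketched in Python as a simple tally of the nucleotide that immediately follows each UGA stop codon. This is an illustrative sketch only: the function names, the input layout (sequence plus 0-based stop position), and the toy transcripts are assumptions, not the authors' code or data.

```python
from collections import Counter


def plus4_counts(transcripts):
    """Count the base immediately downstream of each UGA stop codon.

    transcripts: iterable of (sequence, stop_index), where stop_index is
    the 0-based position of the first base of the annotated stop codon.
    """
    counts = Counter()
    for seq, stop_index in transcripts:
        seq = seq.upper().replace("T", "U")
        stop = seq[stop_index:stop_index + 3]
        nxt = seq[stop_index + 3:stop_index + 4]
        # Skip entries whose stop is not UGA or that lack a downstream base.
        if stop == "UGA" and nxt in {"A", "C", "G", "U"}:
            counts[nxt] += 1
    return counts


def plus4_frequencies(transcripts):
    """Convert +4 counts into relative frequencies per base."""
    counts = plus4_counts(transcripts)
    total = sum(counts.values())
    return {base: counts[base] / total for base in "ACGU"} if total else {}


# Hypothetical toy transcripts with the stop codon starting at index 6.
toy = [
    ("AUGAAAUGAAUU", 6),  # UGA followed by A
    ("AUGGAGUGAUCC", 6),  # UGA followed by U
    ("AUGUAAUGACGG", 6),  # UGA followed by C (the UGAC context)
    ("AUGUAGUGAAAA", 6),  # UGA followed by A
]
print(plus4_frequencies(toy))
```

Running the same tally separately on all genes and on a highly expressed subset would give the two distributions that the response compares; any significance test is left out of this sketch.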

      Reviewer #3 (Significance (Required)):

      Strengths - I found this a straightforward manuscript to read - aside from the interesting and unexpected observation about genetic code use in PL0344, Fig 5 draws together a lot of earlier published information into an easily accessible form - I felt this a particularly useful part of the manuscript.

      I don't feel the absence of proteomics to back up the genome/transcriptome analysis is a notable limitation - it's perhaps frustrating but it's not a limitation. However, the work does perhaps inevitably feel a little bit observational - there's not really a lot of insight or new insight into why the genetic code can be revised in some microbial eukaryotes - in contrast, for instance, to a recently published study of the aptly named Blastocrithidia nonstop. McGowan et al's manuscript, however, will be of interest and should be formally published.

      Descriptions of organisms that have tweaked the standard genetic code are not new; coupled to the limited insight into why the genetic code can be rewritten so readily in ciliates, this limits the general appeal of the work. However, the study executed is rigorous and it should be of interest to a wide variety of protistologists, evolutionary cell biologists, and researchers in the translation field.

      END reviewer 3

    1. Author Response:

      The following is the authors' response to the original reviews.

      eLife assessment

      This study presents important findings regarding the quantification of dynamics in fish communities in changing ecosystems by combining a large-scale environmental DNA metabarcoding time series with novel statistical approaches. The methods are convincing, with controlled experiments, thorough statistical analyses, and a substantial dataset covering two years of detailed observation, which can provide sufficient power to detect fine-scale ecological interactions. This work is relevant for informing future research on assessing community stability under climate change.

      Thank you so much for your careful evaluation of our manuscript. We are very pleased to hear that you found our study important. We have revised our manuscript according to the helpful comments to further improve our manuscript.

      Reviewer #1 (Public Review):

      […] Their work provides a highly relevant approach to perform species-interaction strength analysis based on eDNA biodiversity assessments, and as such provides a research framework to study marine community dynamics by eDNA, which is highly relevant in the study of ecosystem dynamics. The models and analytical methods used are clearly described and made available, enabling application of these methods by anyone interested in applying it to their own site and species group of interest.

      Thank you so much for your time and effort to evaluate our manuscript. We are very pleased to hear that you found our study interesting. We have further revised the manuscript according to your comments and hope that the revised manuscript is now better than the original one.

      Strengths: The authors have a study setup that is suitable to measure the effects of temperature on the eDNA diversity, and have taken a large number of samples and all appropriate controls to be able to accurately measure and describe these dynamics. The applied internal spike-in to enable relative eDNA copy number quantification is convincing.

      We are happy to hear that you found the study design and the method to estimate eDNA copy number are suitable and convincing.

      Weaknesses: The authors aim to study the relationship between species interaction strength and ecosystem complexity, and how temperature will influence this. However, there is only limited ecological context discussed explaining their results, and a link with climate change scenarios is also limited. A further discussion of this would have strengthened the manuscript.

      Thank you so much for the comment. We have added discussion about how our study contributes to understanding fish community assembly process and predicting the community-level response under ongoing climate change. We have added one subsection, "Implications for fish community assembly and the effect of global climate change", at L679. As for the ecological discussion for each specific fish-fish interaction, we provided this in Supplementary file 1c.

      The authors were able to find a correlation between water temperature and interaction strengths observed. However, since water temperature is dependent on many environmental variables that are either directly or indirectly influencing ecosystem dynamics, it is hard to prove a direct correlation between the observed changes in community dynamics and the temperature alone.

      Thank you for pointing this out. We have discussed the possibility of the effects of other environmental variables (e.g., oxygen) and how we could overcome this issue at L661. Some of the sentences were originally in the subsection "Interaction strengths and environmental variables", but were moved to the subsection "Potential limitations of the present study and future perspectives".

      Reviewer #2 (Public Review):

      In this work Ushio et al. combine environmental DNA metabarcoding with novel statistical approaches to demonstrate how fish communities respond to changing sea temperatures over a seasonal cycle. These findings are important due to the need for new techniques that can better measure community stability under climate change. The eDNA metabarcoding dataset of 550 water samples over two years is, I feel, of sufficient scale to provide power to detect fine-scale ecological interactions, the experiments are well controlled, and the statistical analysis is thorough.

      Thank you so much for your time and effort to evaluate our manuscript. We are happy to hear that you found our study technically sound and important. We have revised the manuscript according to your comments to improve our manuscript further.

      The major strengths of the manuscript are: (1) the magnitude of the dataset, which provides densely replicated sampling that can overcome some of the noise associated with eDNA metabarcoding data and scale up the number of data points to make unique inferences; (2) the novel method of transforming the metabarcode reads using endogenous qPCR "spike-in" data from a common reference species to obtain estimates of DNA concentration across other species; and (3) the statistical analysis of time-series and network data and translating it into interaction strengths between species provides a cross-disciplinary dimension to the work.

      Thank you for your positive comments. We are very pleased to hear that (1) our intensive and extensive water sampling, (2) our method for using the common fish species eDNA as "spike-in," and (3) our nonlinear time series analysis were positively evaluated.

      I feel like this kind of study showcases the power of eDNA metabarcoding to answer some really interesting questions that were previously unobtainable due to the complexities and cost of such an exercise. Notwithstanding the problems associated with PCR primer bias and PCR stochasticity, the qPCR "spike-in" method is easy to implement and will likely become a standardised technique in the field. Further studies will examine and improve on it.

      We must admit that our endogenous "spike-in" method does not overcome all problems associated with PCR. However, we agree with you and believe that we are heading in a correct direction. The method does not require the addition of externally supplied internal standard DNAs and enables post-hoc evaluation of eDNA absolute concentrations. Although this approach requires an additional experiment (qPCR), the method may be an alternative for quantifying eDNA concentrations.
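The idea behind the endogenous "spike-in" can be illustrated with a short Python sketch. It assumes the simplest possible conversion, in which the copies-per-read ratio of the qPCR-quantified reference species is applied proportionally to every other species in the same sample; the actual procedure in the manuscript may differ (for example, by fitting a regression). The function name, species labels, and numbers below are hypothetical.

```python
def reads_to_copies(reads_by_species, reference_species, reference_copies):
    """Convert per-species read counts into estimated DNA copy numbers.

    reads_by_species: dict mapping species name to read count (one sample).
    reference_species: the common species that was also quantified by qPCR.
    reference_copies: qPCR-estimated copy number of the reference species.

    Assumption: copies scale linearly with reads within a sample.
    """
    ref_reads = reads_by_species.get(reference_species, 0)
    if ref_reads <= 0:
        raise ValueError("reference species has no reads in this sample")
    copies_per_read = reference_copies / ref_reads
    return {sp: n * copies_per_read for sp, n in reads_by_species.items()}


# Hypothetical sample: the reference species has 2000 reads and 400 copies.
sample_reads = {"reference_fish": 2000, "species_a": 500, "species_b": 100}
estimates = reads_to_copies(sample_reads, "reference_fish", 400.0)
print(estimates)
```

Because the reference species is already present in the water sample, no standard DNA needs to be added before PCR, and the conversion can be applied after sequencing, which is the post-hoc property mentioned in the response.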

      Overall I found the manuscript to be clear and easy to follow for the most part. I did not identify any serious weaknesses or concerns with the study, although I am not able to comment on the more complex statistical procedures such as the "unified information-theoretic causality" method devised by the authors. The section on limitations of the study is important and acknowledges some issues with interpretation that need to be explained. The methods, while brief in parts, are clear. The code used to generate the results has been made available via a GitHub repository. The figures are clear and attractive.

      We are very happy to hear that you found our manuscript clear and not containing any serious weakness.

      Reviewer #1 (Recommendations For The Authors):

      This is a very nice manuscript discussing highly relevant methods to use eDNA analysis to study interactions in marine ecosystems. There are some minor concerns that we will address below:

      - As already mentioned above, based on the statements in the introduction we expected a very elaborate discussion section concerning the ecological interaction observed between species. This is however missing, and a more extensive general discussion of the biological interactions would be appreciated, either based on existing literature, or by suggesting further experiments. Alternatively, the claims made in e.g. line 124-128 (Overcoming these difficulties....) could be amended so this expectation is not raised.

      Thank you so much for the comment. As answered in the response above, we have added discussion about how our study contributes to understanding the fish community assembly process and predicting the community-level response under ongoing climate change at L679.

      Specifically, we argued that our study provides a piece of evidence that temperature exerts influences on fish-fish interactions under field conditions at a relatively short time scale (weeks to months). We suggested that temperature effects on fish community assembly involve effects at different time scales, and thus, integrating results from different temporal (and spatial) scales is necessary to understand the fish community assembly process in nature. As stated above, we provided the detailed ecological discussion for each specific fish-fish interaction in the Supporting Information.

      - A lot of negative controls were taken and described in the material & methods. However, there is no clear mention of what was done with the outcome of these negative controls. How did the results of the negative controls influence your analysis? Or were they all completely negative?

      Thank you for pointing this out. The negative controls produced negligible reads (177 ± 665 reads [mean ± S.D.]), which accounted for ca. 0.1% of the positive sample reads. Moreover, all the reads were assigned to non-target taxa, such as fish species that had never been observed in the study region and freshwater fish species. Therefore, we conclude that any contaminations in our experiments were negligible, and we discarded the sequence reads from the negative control samples. We have explained this in L533–L539 in the main text.

      - Line 423 states: "..suggesting that weak interactions are key to the maintenance of species-rich communities." We are wondering if this can be stated like this, as it seems the other way around would also be true, since in a species rich community it can be expected that most interactions are weak?

      Thank you for pointing this out. We agree that there is a possibility that the high species diversity could be a cause of weak interactions. To clarify this, we have revised the sentence as follows in L568: "...suggesting that understanding the causes and effects of weak interactions is key to understanding the maintenance of species-rich communities."

      - There is a correlation between DNA concentration and temperature (e.g. shown in fig. S2b). We are wondering what could be an argument to not correct for this temperature effect on eDNA concentrations (as now described) or if it would be better to apply a correction factor for this, as it is also shown that there is a correlation between DNA concentration and interaction strengths.

      In the unified information-theoretic causality (UIC) analysis, we took the effect of temperature into account if temperature had statistically clear influence on eDNA dynamics of a particular fish species (L439). This means that temperature was included as a conditional variable in the calculation of TE (i.e., Zt in Eqn. [1]). Other environmental variables were also included if they had statistically clear influence. Similarly, in the MDR S-map, we included temperature or other environmental variables as conditional variables if they had statistically clear influence on eDNA dynamics of a particular fish species. We explained this in L479.

      - The models used for the interaction dynamics calculations are extensively discussed in this manuscript, although these details are also present in the original papers describing these models, and therefore the manuscript could be shortened by removing some of this explanation.

      Thank you for your suggestion. As you understood, the details of the method (S-map and MDR S-map) are available in Sugihara (1994), Chang et al. (2021), and elsewhere. However, we have kept the explanation so that readers who are not familiar with the methods can briefly understand them without needing to read the details of the previous studies.

      Reviewer #2 (Recommendations For The Authors):

      L50-L72: I feel like the abstract could be snappier, i.e. quicker to read with less detail. Consider reducing it a little.

      Thank you for your suggestion. We have deleted some redundant phrases and shortened the abstract a little.

      L173-L176: I don't understand exactly what is suggested here. Perhaps rephrase?

      We have revised the sentence as follows (L165): " As our eDNA time series was taken twice a month, the interactions detected should also have the same time scale (e.g., the interactions detected may cause changes in the population size at the same time scale), which means that we tend to focus on behavior-level interactions (e.g., schooling) rather than birth-death process in the present study (except for predation)."

      L228: How many PCR replicate reactions were undertaken per sample?

      We performed eight technical replicates for the same eDNA template. This information is described in the third paragraph of the section "Paired-end library preparation and MiSeq sequencing." This section has been moved from the previous supplementary methods to the main text in the revision.

      L236: There is no mention later of how these blanks are used to clean up or filter the dataset from the effects of contamination. Consider adding this information.

      Thank you for pointing this out. As in the responses above, we have described the negative controls in L533–L539 in the main text. The negative controls generated negligible reads, so we simply discarded the sequence reads.

      L252-L253: "Primer sequences were removed from merged reads and reads without the primer sequences underwent quality filtering"? Wouldn't all of the reads not have primers after the primers were trimmed off? Or is something else intended here?

      All primer sequences were removed after merging the paired-end reads (see "Sequence analysis"). There is no specific reason for this process, and we think that the primer removal before merging the paired-end reads will generate the same results.

      L264-L265: "To refine the above taxon assignments". I assume because there were lots of assignments to species that were not known from the study area? Explain why this was done.

      At present, the reference sequences are available for about 70% of 4,500 fish species in Japan. However, due to the unknown degree of intraspecific variation, using a uniform threshold of 98.5% to delineate species can result in over-splitting or over-clustering MOTUs. To solve this issue, the manual refinement of the taxon assignments was performed based on the phylogenetic tree. This has been explained in L335.

      L274: More details of the qPCR assay are required, or a citation of previous study or supporting information.

      The details of the qPCR assay are provided in the section "Quantitative PCR and estimation of DNA copy numbers." This section has been moved from the previous supplementary methods to the main text in the revision.

      L327: Explain further how seasonality was treated here? This is an important part of the study, so deserves further attention.

      We included water temperature (if it had statistically clear influence on fish eDNA dynamics) as a conditional variable z(t) in the calculation of TE, and this took the effect of the seasonality in detecting causation into account. We have described this in L436–444.

      L407: Consider giving the code repository a DOI to cite.

      We have archived the analysis codes at Zenodo and provided the DOI in L39 and L521.

      L411: How many MiSeq runs exactly?

      We performed 21 MiSeq runs (often with other eDNA samples). We have described this in the main text (L299).

      L411: What proportion of your total sequencing data were assigned to fishes? This is a useful statistic to compare methods between studies.

      About 98% of the total sequence reads were assigned to fish. We have described this in the main text (L528).

      Figure 2: There does not appear to be a key to the color-coded species ecologies.

      We have added a legend for the fish ecology in Figure 2.

    1. Author Response:

      The following is the authors' response to the original reviews.

      Consolidated response to public comments:

      We are grateful to the reviewers for their careful examination of our manuscript and for their insights for improving our work. We appreciate that they recognize the potential of the TARDIS approach for diverse transgenesis applications.

      We address two primary concerns that the reviewers raise. First is a concern that this approach is not as innovative as stated. We acknowledge that our work builds upon previous studies in the field, such as those by Nonet, Mouridi et al., with Malaiwong coming after our initial preprint. However, we believe that our approach offers a unique contribution, in that prior work does not provide a protocol or process for large-scale multiplexed transgenesis. Specifically, we introduce large sequence library arrays (TARDIS Library Arrays or TLAs). While high throughput multiplexed transgenesis is discussed in Nonet & Mouridi manuscripts, it is never demonstrated. It is the combination of library construction, heritable transmission of the library itself, and then induced transgenesis of library components at a defined location within single individuals that makes this approach particularly useful.

      Second, there were concerns that we have not demonstrated that this approach will work beyond C. elegans. We agree that our discussion of the potential application of TARDIS beyond C. elegans is speculative at this point. Our intention was to highlight the potential for future development and application in other systems. In some cases, large integrations into the genome are possible, such as in the case of H11 locus in mice, which could provide a means to inherit a sequence library. We are hopeful that our success in C. elegans will inspire work in other systems. The motivation for this will naturally depend on the usefulness of actual TARDIS implementations, which will be forthcoming in due course.

      Reviewer #1 (Recommendations For The Authors):

      1. Section titled "Integration from TARDIS array to F1" beginning on line 161 has some missing details that make it difficult to follow. Many of those details are present in the following section titled "Generation and Integration of TARDIS promoter library", but should have been present sooner.
         a. How many barcodes were in the array in line PX786?
         b. Clarify the use of G-418, heat shock, hygromycin, etc. in this paragraph.
         c. Please clarify that the L1 death is due to selection with G-418 - "We found that a portion of the initially plated worms die, likely due to lack of array inheritance." is confusing unless you add that they are selected in this step.
         d. "These results suggest that approx. 100-200 worms need to be heat shocked to obtain an integrated line" - the math actually looks like 200-300, and this would be to get a single integrant.
      2. In general, the barcoding study and results reported here read like a teaser/proof-of-concept but do not really robustly demonstrate the application of the method for barcoding and tracing individual lineages in a population of C. elegans. How many barcodes were in the array, and how many ended up in F1s? Would one need to screen for duplicate barcodes after integration?
      3. The promoter library study is impressive but again, rather limited.
      4. The Discussion section about extending this technology to other systems is fairly balanced, acknowledging the limitations that would need to be overcome. The language in the abstract and introduction is less balanced and oversells the current translation of this approach to systems outside C. elegans.

      Reviewer #2 (Recommendations For The Authors):

      As I mentioned in the Public Review, I appreciate the design of the selection markers for integration. However, I do not see a major advance in the field. The use of barcoding of individuals to address a biological question would change that impression.

      Regarding the integration of promoters, I think this is something that anyone could address in diverse forms using existing knowledge.

      Suggestions:
      - Use one or two more landing pads for barcoding of animals and check numbers, efficacy, enrichments, etc. About 500 sequences overrepresented may be too much for future applications;
      - Increase the number of landing pads for inserting promoters. Genomics context matters and this could help to have a better summary of the real expression patterns driven by the promoter of interest;
      - Other references about landing pads would be Vicencio et al, Genetics 2019, and Nonet microPublication Biology 2021.

      In addition to the general comments, the reviewers provided useful suggestions to the text that we have used to clarify the manuscript.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      1. General Statements [optional]

      Thank you for your letter dated on May 5, 2023 concerning our manuscript (MS# RC-2023-01906) entitled “Activation of Nedd4L Ubiquitin Ligase by FCHO2-generated Membrane Curvature.”

      We thank the reviewers for their constructive comments and suggestions. We have considered all reviewers’ comments and plan to revise our manuscript accordingly.

      We believe that our revision plan will greatly improve the quality of our manuscript.

      1. Description of the planned revisions

      Reviewer #1

      I enjoyed reading the paper by Sakamoto and colleagues, where they show that Nedd4L ubiquitin ligase activity is stimulated by membranes and in particular positive membrane curvature. This paper is a conceptual advance that hopefully will be extended by many other groups where membranes topology participates in the activation of associated enzymes, giving rise to added complexity but also specificity and further compartmentalization. It is an important paper for all cell biologists to understand.

      1. My comments are all relatively minor and I hope can improve the readability of the paper, but will not alter the overall conclusion as this is well backed up. In general I would like to see more/better statistics/quantitation and better figure legends. I found that often one had to read the paper to understand a figure where reading the figure legend should suffice.

      Reply: According to the reviewer’s comment, we will quantify the experiments (Fig. 1C, Fig. 2, Fig. 9B, and Fig. 10B) and add descriptions of statistics (Fig. 5, Fig. 6, B and D, and Fig. 7C). We will also write better figure legends to enable the readers to easily understand experiments.

      1. This paper reminds me of a paper from Gilbert Di Paolo's lab on the activation of synaptojanin PIP2 hydrolysis by high membrane curvature. One would expect that there may be many such proteins whose activities will be dependent on their membrane environment. I find it conceptually rather likely that a protein which interacts with membranes via a C2 domain (which has membrane insertions and will thus likely be curvature sensitive) will likely show some positive curvature sensitivity. Can I suggest this paper is referenced and discussed in the light of the discussion statement "Thus, our findings provide a new concept of signal transduction in which a specific degree of membrane curvature serves as a signal for activation of an enzyme that regulates a number of substrates."

      __Reply: __According to the reviewer’s comment, we will cite the paper entitled “synaptojanin-1-mediated PI(4,5)P2 hydrolysis is modulated by membrane curvature and facilitates membrane fission” by Chang-Ileto et al. (Dev. Cell 20, 206–18, 2011). We will also discuss this paper in the light of the discussion statement.

      1. Where the paper could be improved (or I have not understood fully). In figure 1 there is a robust endocytosis of ENaC that is FCHo2 and Nedd4L sensitive. There is a rescue for FCHo2 in a fluorescence image (unquantified), so it would be good to have the more quantitative approach of rescue with both FCHo2 and Nedd4L in the biochemical assay.

      __Reply: __Although the reviewer suggests a rescue experiment in the biochemical assay, the experiment is difficult because the transfection efficiency is low (about 50%). On the other hand, we agree with the reviewer that a quantitative approach is required in the rescue experiment (Fig. 1C). Therefore, we plan to quantify the rescue experiment for FCHO2 in the immunofluorescence assay. The reviewer also suggests a rescue experiment for Nedd4L as well as FCHO2. However, since the involvement of Nedd4L in ENaC endocytosis is well established, we do not think that the rescue experiment for Nedd4L is further required.

      1. In figure 2 there is nice co-localisation between clathrin/FCHo2 and ENaC but not with Nedd4L. It would be good to have some quantitation of the co-localisation. But also one should use a Nedd4L mutant or a mutant of ENaC and so be able to visualise co-localisation between receptor and ub-ligase. I find it strange that there is no (or much less) Nedd4L-GFP visible in the cells overexpressing ENaC... Is there an explanation? Does overexpression of ENaC lead to more auto-ubiquitination of Nedd4L? Also the Nedd4L-GFP signal in other cells is punctate, while in the next figure Myc-Nedd4L is not.

      __Reply: __According to the reviewer’s comment, we will perform quantitative colocalization analysis in Fig. 2.

      We have found that a catalytically inactive Nedd4L mutant, C922A, co-localizes with cell-surface αENaC and FCHO2 in αβγENaC-HeLa cells. According to the reviewer’s comment, these data will be added in the revised manuscript.

      In Fig. 2C, Nedd4L was transiently transfected in cells stably expressing ENaC. In Nedd4L-transfected cells, overexpression of Nedd4L stimulated ENaC internalization, resulting in the disappearance of ENaC at the cell surface. On the other hand, in non-transfected cells, cell-surface ENaC was detected. Thus, Nedd4L-negative cells are non-transfected cells (cell-surface ENaC positive cells). This explanation will be added in the revised manuscript.

      The staining pattern of Nedd4L depends on what section of the cell a confocal microscope was focused on. Nedd4L-GFP signals were punctate at the bottom section of the cell in Fig. 2, whereas Myc-Nedd4L was diffusely distributed at the upper section (cytoplasm) of the cell (Fig. 3). Thus, Nedd4L shows distribution throughout the cytoplasm and punctate staining at the bottom (cell surface). The staining pattern of Nedd4L is also affected by the expression amount of Nedd4L in cells. When Nedd4L was highly expressed in COS7 and HEK293 cells in Fig. 3, the punctate staining was hardly detected. This localization pattern of Nedd4L will be clearly described in the revised manuscript.

      1. In figure 3 it appears to me that there is co-localization between ENaC and amphiphysin. Is this not a positive piece of information? I am not sure that FBP17 is a good F-BAR domain to use given its oligomerization may well prevent membrane association of Nedd4L. Minor comment: I don't see tubules for amphiphysin in panel B.

      __Reply: __The reviewer states that there is co-localization between Nedd4L and amphiphysin1 (Fig. 3A). However, Nedd4L was not recruited to membrane tubules generated by amphiphysin1. We will clearly show that there is no colocalization between Nedd4L and amphiphysin1.

      The reviewer states that FBP17 may not be a good F-BAR domain to use because its oligomerization may well prevent membrane association of Nedd4L. However, we have shown that FCHO2, like FBP17, forms oligomers (Uezu et al. Genes Cells, 16, 868-878, 2011). Furthermore, we have found that FCHO2 inhibits the membrane binding and catalytic activity of Nedd4L when the PS percentage in liposomes is elevated (unpublished data and Fig. 9C). Thus, since FBP17 and FCHO2 probably have similar properties, we presume that FBP17 is a good F-BAR domain to use.

      As the reviewer pointed out, membrane tubules generated by amphiphysin1 were hardly detected in HEK293 cells (Fig. 3B). Amphiphysin1 showed punctate staining but did not co-localize with Nedd4L. This description will be added in the revised manuscript.

      1. Figure 5: The affinity of Nedd4 C2 domain for calcium is quite high given we normally assume a cytosolic concentration of 100nM (approximate). The authors have rightly buffered the calcium with EGTA. Normally we would check that the buffering is sufficient by varying the protein concentration and making sure the affinity is still the same, so can I suggest the authors use 3 or 4 times the amount of C2 domain and make sure the curve does not change (provided liposomes are not limiting). Minor comment: How many experiments and what are error bars (SD?).

      __Reply: __According to the reviewer’s comment, we will check that the buffering is sufficient by varying the protein concentration (Fig. 5). We will also add a description of statistics to the legend to Fig. 5.

      1. Figure 6: Controls have been performed to ensure that liposomes are pelleted, according to methods. In Figure 6B can the authors show that there is the same amount of liposomes in each sample by showing more of the Coomassie gel so that the reader can see the Neutravidin band is the same in each sample. Also I believe a Student's t-test should not be used in this experiment (but perhaps an ANOVA test), and in panel D there does not appear to be a description of statistics.

      __Reply: __To ensure that the same amounts of liposomes were pelleted, the reviewer suggests that we show more of the Coomassie gel to present the neutravidin bands in Fig. 6B. However, as the molecular weight of neutravidin is about 15 kDa, neutravidin runs off the gel (7% SDS-PAGE gel) where Nedd4L (As the reviewer pointed out, we will use an ANOVA test in Fig. 6B. We will also add a description of statistics in Fig. 6D.

      1. Figure 11: In panel B I note that the FCHo2 BAR domain on small liposomes appears to inhibit Ubiquitination. Is this consistent with the BAR domain not preventing Nedd4L binding?

      __Reply: __The FCHO2 BAR domain enhances the liposome binding and catalytic activity of Nedd4L when the strength of interaction of Nedd4L with liposomes (20% PS) is weak. In contrast, we have also found that the FCHO2 BAR domain inhibits the membrane binding and catalytic activity of Nedd4L when the interaction of Nedd4L with liposomes is increased by elevating the PS percentage in liposomes (unpublished data and Fig. 9C). The reason for the different effects of FCHO2 on Nedd4L is considered as follows: When liposomes (20% PS) are used (the interaction of Nedd4L with PS in liposomes is weak), Nedd4L binds to liposomes mainly through ENaC (Fig. 8F). The liposome binding is hardly mediated by PS. Addition of the FCHO2 BAR domain increases the strength of interaction of Nedd4L with PS by generating membrane curvature. Consequently, the FCHO2 BAR domain newly induces the PS-mediated liposome binding of Nedd4L, resulting in the enhancement of liposome binding and catalytic activity of Nedd4L. On the other hand, when the interaction of Nedd4L with PS in liposomes is increased by elevating the PS percentage in liposomes (50% PS), the liposome binding of Nedd4L is mainly mediated by PS. Addition of the FCHO2 BAR domain inhibits the PS-mediated liposome binding of Nedd4L. Since both FCHO2 and Nedd4L are PS-binding proteins, they compete with each other to bind to PS in liposomes. Therefore, the results in Fig. 11B are consistent, because the interaction of Nedd4L with PS is increased by 0.05 µm pore-size liposomes. This explanation will be added in the revised manuscript.

      __Reviewer #2 __

      The authors have reported the involvement of the BAR domain-containing protein FCHO2 in the Nedd4L-mediated endocytosis of ENaC. They propose a model in which the membrane curvature induced by the BAR domain-FCHO2 relieves the auto-inhibition of E3 ligase causing its activation and recruitment. The paper describes a series of in vitro reconstituted experiments that are interesting but not fully connected with the mechanism of ENaC endocytosis. Additional experiments are needed to fully support the authors' conclusions.

      Major comments:

      1. Although the data reported by the authors regarding FCHO2 and Nedd4L involvement in ENaC endocytosis are convincing, it is suggested that the authors perform the same ENaC endocytosis assay presented in Fig.1B under conditions of FBP17 and amphiphysin1 siRNA to formally prove the selective involvement of FCHO2 in the process among other BAR-containing proteins.

      __Reply: __The reviewer suggests the same ENaC endocytosis assay presented in Fig. 1B under conditions of FBP17 and amphiphysin1 siRNA to prove the selective involvement of FCHO2 in ENaC endocytosis. There seems to be a misunderstanding. Similar to FCHO2, FBP17 and amphiphysin are well known to be involved in clathrin-mediated endocytosis. As ENaC is internalized through clathrin-mediated endocytosis, FBP17 and amphiphysin siRNA presumably inhibit ENaC endocytosis. We cannot understand the significance of FBP17 and amphiphysin1 siRNA in the ENaC endocytosis assay.

      1. According to the previous point, it will be interesting to see not only a snapshot image of the internalisation assay performed by immunofluorescence (Fig.1C) but a more quantitative analysis of the different time points (as in Fig.1B) in condition of FCHO2 siRNA and eventually FBP17 and amphiphysin1 siRNA.

      __Reply: __According to the reviewer’s comment, we will perform a quantitative analysis in Fig. 1C. The reviewer also suggests the immunofluorescence assay at different time points in Fig. 1C. However, we show the time course of ENaC internalization in Fig. 1B. We do not think that the time course in the immunofluorescence assay is further required. As for FBP17 and amphiphysin siRNA, our response is the same as that to comment 1 of this reviewer.

      1. In Fig.2B, overexpression of the catalytically inactive version of Nedd4L (Nedd4L C922A) would help to see Nedd4L-ENaC co-localization.

      __Reply: __This comment is the same as comment 4 of Reviewer #1.

      1. In Fig.4D, the authors need to analyse ENaC ubiquitination in the same experimental setting as Fig. 4A instead of transfecting cells with increasing amounts of Nedd4L in the presence or absence of FCHO2 BAR. It is also recommended to include Nedd4L C922A as an additional control.

      __Reply: __The reviewer requests us to analyse ENaC ubiquitination in the same setting as Fig. 4A. However, an in vivo autoubiquitination assay is widely used to determine the catalytic activity of E3 Ub ligases, because the E3 activity is typically reflected in their autoubiquitination. Therefore, the autoubiquitination assay is sufficient to show that Nedd4L is specifically activated by membrane tubules generated by FCHO2 in cells. Furthermore, we have found it very difficult to compare ENaC ubiquitination among many GFP-BAR proteins (GFP alone, GFP-FCHO2, GFP-FBP17, amphiphysin1-GFP, GFP-FCHO2 mutant) in the same experimental setting as Fig. 4A. In Fig. 4A, three types of cDNAs (HA-Ub, Myc-Nedd4L, and GFP-BAR protein) were transfected in cells. The expression amounts of Myc-Nedd4L were similar among the GFP-BAR proteins. On the other hand, in Fig. 4D, four types of cDNAs (HA-Ub, Myc-Nedd4L, GFP-BAR protein, and FLAG-αENaC) were transfected in cells. Under these conditions, it is very difficult to adjust the expression amounts of Nedd4L and αENaC among many GFP-BAR proteins. Even when comparing two GFP-BAR proteins (GFP alone and GFP-FCHO2), it was necessary to assess the expression amounts of Nedd4L by transfection with various cDNA amounts of Nedd4L (Fig. 4D). Moreover, as shown in Fig. 4D, enhancement of ENaC ubiquitination by FCHO2 is decreased at higher expression of Nedd4L (1.0 and 1.5 μg DNA), although the reason is unknown. Therefore, we are not sure that we will be able to accurately analyse ENaC ubiquitination in the same setting as Fig. 4A instead of transfecting cells with increasing amounts of Nedd4L.

      According to the reviewer’s comment, we will examine the effect of Nedd4L C922A on ENaC ubiquitination.

      1. While discussing the role of hydrophobic residues in Nedd4L C2 domain, the authors never mentioned the publication by Escobedo et al., Structure 2014 (DOI:10.1016/j.str.2014.08.016), which highlighted how I37 and L38 are directly involved in Ca2+ binding. This aspect should be discussed since the authors show the importance of Ca2+ for PS binding in the sedimentation assay.

      __Reply: __According to the reviewer’s comment, we will cite the reference (Escobedo et al.) and discuss the aspect (I37 and L38 are directly involved in Ca2+ binding).

      1. As stated by the authors those two residues I37 and L38 are also involved in E3 enzyme activation by relieving C2-HECT interaction. It is important to further demonstrate the effect of these mutations on ENaC substrate.

      __Reply: __To prove that the I37 and F38 residues are involved in E3 enzyme activation by relieving C2-HECT interaction, the reviewer requests us to further demonstrate the effect of Nedd4L I37A+F38A on ENaC ubiquitination. However, these two residues are critical not only for Nedd4L activation but also for membrane binding and curvature sensing of Nedd4L. We also show that membrane binding of Nedd4L is critical for ENaC ubiquitination. Actually, we have found that the Nedd4L I37A+F38A mutant, which loses membrane binding, shows little ENaC ubiquitination (unpublished data), whereas it enhances autoubiquitination (Fig. 4C). Thus, the effect of the I37A+F38A mutant on ENaC ubiquitination is not appropriate to prove that the two residues are involved in E3 enzyme activation.

      1. There are some concerns regarding the in vitro ubiquitination assay performed in Fig.8 and following figures. The Nedd4L protein used in the assay was produced with a His tag at the C-terminus; it was reported (Maspero et al, Nat Struct Mol Biol 2013 DOI: 10.1038/nsmb.2566), at least for the isolated HECT domain, that modification of the C-terminal residue of the protein affects its activity. It would be important to judge the activity of the purified proteins used in the assay. Moreover, as an additional control, the introduction of a mSA-ENaC PY mutant protein is suggested. The authors claimed the importance of the membrane-localized PY motif for recruitment and activation of Nedd4L; it would be informative to perform the experiment in the presence of PY-mutated ENaC.

      __Reply: __The reviewer states that there are some concerns regarding His-tagged Nedd4L proteins. We have prepared Nedd4L that has no tag at its N- or C-terminus. N-terminal GST-tagged, C-terminal untagged Nedd4L was expressed in E. coli and purified by Glutathione-Sepharose column chromatography. The GST tag was cleaved off and Nedd4L was further purified by Mono Q anion-exchange column chromatography. Using this purified sample, we have examined the catalytic activity of untagged Nedd4L. We have found that concerning Ca2+-dependency, PS-dependency, and curvature-sensing, the properties of untagged Nedd4L are similar to those of C-terminal His-tagged Nedd4L (unpublished data).

      According to the reviewer’s comment, we will perform the experiment in the presence of PY-mutated ENaC.

      1. It is not clear why, upon increasing the concentration of PS (from 20% to 50%), the presence of the BAR domain doesn't allow ENaC ubiquitination (Fig.9C). Is Nedd4L not recruited to the pellet? It would be interesting to see the sedimentation experiment of Fig.9A done in the presence of 50% PS.

      __Reply: __This comment is essentially the same as comment 8 of Reviewer #1. We have found that the FCHO2 BAR domain inhibits the membrane binding of Nedd4L when the PS percentage in liposomes is elevated (~50%) (unpublished data). According to the reviewer’s comment, these data will be added in the revised manuscript.

      1. This reviewer is not an expert in lipid biology, thus the explanations related to the effect of FCHO2 BAR in the presence of PI(4,5)P2 (Fig. 10) or 0.05 µm pore-size liposomes (Fig.11) were not clear. Does FCHO2 BAR have a different effect in inducing membrane tubulation in these two conditions? Is this parameter measurable by tubulation assay?

      __Reply: __According to the reviewer’s comment, we will write more clearly the explanation related to the effect of FCHO2 BAR domain in the presence of PI(4,5)P2 or 0.05 μm pore-size liposomes.

      Minor Comments

      1. It would be appreciated if a nuclei staining panel is included in all immunofluorescence images, as it would help to identify the number of cells in the field of view (e.g., Fig. 1C, Fig. 2B).

      __Reply: __According to the reviewer’s comment, we will show immunofluorescence images to identify the number of cells in Fig. 1C and Fig. 2B.

      1. It would be recommended to include colocalization analysis, such as Pearson's correlation coefficient or Manders coefficient in immunofluorescence images.

      __Reply: __According to the reviewer’s comment, we plan to perform quantitative colocalization analysis in Fig. 2.
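
      As a rough illustration of the two coefficients named in this comment, the sketch below computes Pearson's correlation coefficient and a simple thresholded form of the Manders coefficients from two lists of pixel intensities. This is not the authors' analysis pipeline: the function names, the flat-list representation of the two channels, and the default thresholds of zero are all assumptions made for illustration.

```python
import math


def pearson_coefficient(channel_a, channel_b):
    """Pearson's correlation between two equally sized lists of pixel intensities."""
    n = len(channel_a)
    mean_a = sum(channel_a) / n
    mean_b = sum(channel_b) / n
    covariance = sum((a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b))
    var_a = sum((a - mean_a) ** 2 for a in channel_a)
    var_b = sum((b - mean_b) ** 2 for b in channel_b)
    return covariance / math.sqrt(var_a * var_b)


def manders_coefficients(channel_a, channel_b, threshold_a=0.0, threshold_b=0.0):
    """Thresholded Manders coefficients (M1, M2).

    M1: fraction of total channel-A intensity found in pixels where
        channel B is above threshold_b.
    M2: fraction of total channel-B intensity found in pixels where
        channel A is above threshold_a.
    The thresholds are hypothetical parameters; in practice they would be
    chosen from background measurements.
    """
    total_a = sum(channel_a)
    total_b = sum(channel_b)
    overlap_a = sum(a for a, b in zip(channel_a, channel_b) if b > threshold_b)
    overlap_b = sum(b for a, b in zip(channel_a, channel_b) if a > threshold_a)
    return overlap_a / total_a, overlap_b / total_b


if __name__ == "__main__":
    # Hypothetical intensities for four pixels in two channels.
    enac = [0, 10, 20, 0]
    nedd4l = [0, 5, 0, 40]
    print(pearson_coefficient(enac, nedd4l))
    print(manders_coefficients(enac, nedd4l))
```

      Pearson's coefficient is sensitive to how the two intensities co-vary across all pixels, whereas the Manders coefficients report what fraction of each channel's signal overlaps the other, so the two are usually reported together.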

      1. It is not clear how the quantitation of mSA-ENaC ubiquitination in Fig.8D, 8C, and 9B was performed. Did the authors normalise the detected Ub signal over the amount of unmodified mSA-ENaC?

      __Reply: __We did not normalize the detected Ub signals over the amount of unmodified mSA-ENaC, because the same amount of mSA-ENaC was added in each assay. The chemiluminescence intensity of Ub signals was quantified by scanning using ImageJ. According to the reviewer’s comment, we will clearly describe how the quantification of mSA-ENaC ubiquitination was performed.

      __Reviewer #3 __

      --- Summary ---

      The manuscript by Sakamoto et al. describes how the ubiquitin ligase Nedd4L is activated by membrane curvature generated by the endocytic protein FCHO2. For their experiments, the authors use the epithelial sodium channel (ENaC) as a model Nedd4L target and CME cargo. The authors start their manuscript by showing in cells the importance of FCHo2 and Nedd4L in ENaC internalization. Using a combination of experiments in cells and biochemistry, the authors show that Nedd4L binds preferentially to membranes with the same curvature generated by FCHO2. Next, the authors show that a combination of membrane composition (PS), calcium concentration, PY domain presence and membrane curvature all act in concert to recruit Nedd4L to membranes and fully release its ubiquitination activity. Crucially, the authors show that the role of FCHO2 in Nedd4L recruitment is not direct, with FCHO2 simply generating an optimal membrane curvature for Nedd4L binding. Taken together, the authors suggest a mechanism by which the curvature of early clathrin coated pits, generated by FCHO1/2, defines an optimal environment for the recruitment and activation of the ubiquitin ligase Nedd4L.

      The manuscript convincingly shows the membrane curvature-dependent mechanism of Nedd4L activation. The biochemistry experiments in the manuscript are well designed and the results are clear. The quality of these experiments is very high. The experiments in cells are, however, not of the same level of quality.

      --- Major comments ---

      1) The results do not show convincingly that Nedd4L is recruited to CCPs. There is plenty of indirect evidence, but to support the model shown in the last figure, authors need to show more than the staining in figure 2C. Live-cell imaging showing the post-FCHo2 recruitment of Nedd4L would be required. I understand that the recruitment would possibly occur in a fraction of events and may be difficult to catch. The cmeAnalysis script from the Danuser lab (https://doi.org/10.1016/j.devcel.2013.06.019) can facilitate the identification of these events.

      __Reply: __According to the reviewer’s comment, we plan to examine by live-cell TIRF microscopy whether Nedd4L is recruited to CCPs.

      2) What happens to ENaC in Nedd4L and FCHO2 knockdown cells? One would expect accumulation of the receptor on the surface.

      __Reply: __We have found that upon Nedd4L or FCHO2 knockdown, αENaC accumulates at the cell surface in αβγENaC-HeLa cells. According to the reviewer’s comment, we will show these data in the revised manuscript.

      3) In the experiments in figure 1, it would be important to use a standard CME cargo as an internal control (transferrin). This will serve as a functional confirmation of FCHO2 knockdown and help the reader to put the Nedd4L knockdown experiments into the context of CME.

      __Reply: __According to the reviewer’s comment, we will use a standard CME cargo as an internal control (transferrin).

      4) Quantification for the rescue experiment is required (figure 1C). If not possible, at least a picture where the reader can see transfected and non-transfected cells side-by-side is necessary.

      __Reply: __This comment is the same as those of Reviewer #1 (comment 3) and Reviewer #2 (comment 2). According to the reviewer’s comment, we plan to quantify the rescue experiment (Fig. 1C).

      --- Minor comments ---

      1) The experiments in figure 3 must be presented in the order in which they appear in the text. For example, figure 3E is cited in the text in the context of figure 7. It is very confusing.

      __Reply: __According to the reviewer’s comment, we will present the experiments in Fig. 3 in the order in which they appear in the text.

      2) A better explanation of the assay in 1C would facilitate its understanding for the non-specialist reader. The reader needs to read the methods section to understand how it was done.

      __Reply: __According to the reviewer’s comment, we will write a better explanation of the assay in the Fig. 1C legend to enable the readers to understand how it was done.

    1. non lasciarmi pensare alle mie montagne

      Very often, when we think about ‘Il canto di Ulisse’, we tend to recall only the most famous pages in which Levi tries to remember Dante’s canto. The depth and sense of urgency of the Ulyssean passages are so overwhelming and passionate that they may distract us from other elements in the chapter. However, if we go back to the text and read it closely, we cannot avoid noticing that, after a brief opening in which Levi introduces Pikolo and narrates how he came to be Pikolo’s ‘fortunate’ chaperone to collect the soup for the day, ‘Il canto di Ulisse’ also dwells quite significantly on a moment of domestic memories. While going to the kitchens, Levi writes: ‘Si vedevano i Carpazi coperti di neve. Respirai l’aria fresca, mi sentivo insolitamente leggero’. This is the first moment in the chapter in which Levi refers to the mountains as something that revitalises him and makes him feel fresh and light, both physically and mentally.

      This moment foreshadows another, also in this chapter, when Levi goes back to his mountains, those close to Turin, and compares them to the mountain that the protagonist of Dante’s canto, Ulysses, encounters just before his shipwreck with his companions:

      ... Quando mi apparve una montagna, bruna

      Per la distanza, e parvemi alta tanto

      Che mai veduta non ne avevo alcuna.

      Sì, sì, ‘alta tanto’, non ‘molto alta’, proposizione consecutiva. E le montagne, quando si vedono di lontano... le montagne... oh Pikolo, Pikolo, di’ qualcosa, parla, non lasciarmi pensare alle mie montagne, che comparivano nel bruno della sera quando tornavo in treno da Milano a Torino! Basta, bisogna proseguire, queste sono cose che si pensano ma non si dicono. Pikolo attende e mi guarda. Darei la zuppa di oggi per saper saldare ‘non ne avevo alcuna’ col finale.

      The significance of the mountains in Levi’s narration is confirmed in this passage. For him, the mountains represent his experience of belonging, his youthful years, and his work as a chemist – the job he was doing when he commuted by train from Turin to Milan. At the same time, Levi’s own memories of the mountains intertwine and overlap with another mountain, Dante’s Mount Purgatory. Here, a deep and perhaps not fully conscious intertextual game starts to emerge and to characterise Levi’s writing. The lines that Levi does not remember are these:

      Noi ci allegrammo, e tosto tornò in pianto,

      ché de la nova terra un turbo nacque,

      e percosse del legno il primo canto.

      For Dante’s Ulysses, Mount Purgatory signifies the final moment of his adventure and his desire for knowledge. The marvel and enthusiasm that Ulysses and his company feel when they see the mountain is suddenly transformed into its contrary. From the mountain, a storm originates that will destroy the ship and swallow its crew: ‘Tre volte il fe’ girar con tutte l’acque, | Alla quarta levar la poppa in suso | E la prora ire in giù, come altrui piacque’. Dante’s Mount Purgatory, so majestic and spectacular, represents the end of any desire for knowledge that aims to find new answers to and interpretations of human existence in the world without God’s word.

      Going back to Levi’s text, we find that, instead, in a kind of reverse overlapping between his image and that of Ulysses, the image of the mountain of Purgatory suggests to Levi a very different set of thoughts that, although seemingly and similarly overwhelming, opens up new interpretations: ‘altro ancora, qualcosa di gigantesco che io stesso ho visto ora soltanto, nell’intuizione di un attimo, forse il perché del nostro destino, del nostro essere oggi qui’. For a moment, it is almost as if Levi, a new Dantean Ulysses in a new Inferno, stands in front of Mount Purgatory and forgets the terzine and the shipwreck. Maybe Levi cannot or does not want to remember those terzine because the mountain in Purgatory represents something very different for him than for Dante’s Ulysses. Levi’s view of the mountain does not lead to a moment of recognition of sin, as it does in Dante’s Ulysses. For him, the mountain, like his mountain range, is the gateway to knowledge, enrichment, and illumination and to a world that lies beyond the imposed limits of traditional, constricting, and distorted views and that awaits discovery (‘qualcosa di gigantesco che io stesso ho visto ora soltanto’). Something about and beyond the Lager.

      To better understand how the mountains are central in ‘Il canto di Ulisse’, we have to remember that Levi’s view of the mountains strongly depends on his anti-Fascism, which he expressed particularly vigorously in two moments of his life: during his months in the Resistance, just before he was captured and sent to Fossoli, and, even more intensely, during the adventures of his youth, when he was a free young man who enjoyed climbing the mountains surrounding Turin. As Alberto Papuzzi has suggested, ‘le radici del suo rapporto con la montagna sono ben piantate in quella stagione più lontana: radici intellettuali di cittadino che cercava sulla montagna, nella montagna, suggestioni e risposte che non trovava nella vita, o meglio nell’atmosfera ispessita di quella vita torinese, senza passato e senza futuro’ (OC III, 426-27). Indeed, reports Papuzzi, Levi confirms that:

      Avevo anche provato a quel tempo a scrivere un racconto di montagna […]. C’era tutta l’epica della montagna, e la metafisica dell’alpinismo. La montagna come chiave di tutto. Volevo rappresentare la sensazione che si prova quando si sale avendo di fronte la linea della montagna che chiude l’orizzonte: tu sali, non vedi che questa linea, non vedi altro, poi improvvisamente la valichi, ti trovi dall’altra parte, e in pochi secondi vedi un mondo nuovo, sei in un mondo nuovo. Ecco, avevo cercato di esprimere questo: il valico.

      The heart of that epic story made its way into the chapter ‘Ferro’ in Il sistema periodico. The discovery of this (brave) new world, ‘mondo nuovo’, is an integral part and a direct achievement of Levi’s experience in the mountains. The mountains open a new understanding and a new perspective on the world.

      Something that escapes common understanding is revealed through the experience of the mountains, both in Levi’s memories of his youth and in his literary recounting of Auschwitz. Reciting Dante in ‘Il canto di Ulisse’ is therefore not only an intertextual exercise for Levi. Only by inserting Levi’s literary references in the complexity of his own experience – before, during, and after Auschwitz – can we fully capture the depth of his reflections. Levi mentally and metaphorically brought to Auschwitz not only Dante but also his ‘metafisica dell’alpinismo’. Together, they contributed to his attempt to come to terms with that reality.

      VG

    2. Considerate

      My reflections here build on Lino Pertile’s 2010 essay, ‘L’inferno, il lager, la poesia’. Pertile notes the profound correspondence between the opening poem of the book (OC I, 139) and this chapter. He points out how the main theme of Levi’s book, the dehumanising experience in the Lager, based on the annihilation of people’s identity, is expressed in the poem and resurfaces explicitly again in the chapter dedicated to Dante’s Ulysses. The key term revealing the correspondence of themes and intentions is ‘Considerate [consider]’, used twice in Levi’s poem (‘Consider if this is a man | … | Consider if this is a woman’) and rooted in the memory of Dante’s famous tercet where Ulysses addresses his crew as they sail towards the horizon of their last journey beyond the pillars of Hercules: ‘Considerate la vostra semenza: | fatti non foste a viver come bruti, | ma per seguir virtute e canoscenza’ (Inf. 26, 118-20 and OC I, 228).

      There are many other correspondences between the chapter of Ulysses and the opening poem besides the ‘Considerate’, and they are profound and filtered through the theme of memory, an eminently Dantean theme: the urgency to fix in the memory itself what is or will be necessary to tell, or the urgency to express and recount what is deposited in memory. Indeed, for Levi, the memory of each individual person contains that person’s humanity.

      Memory is immediately activated as Primo and Jean exit the underground gas tank (‘He [Jean] climbed out and I followed him, blinking in the brightness of the day. It was warm [tiepido] outside; the sun drew a faint smell of paint and tar from the greasy earth that made me think of [mi ricordava] a summer beach of my childhood). Temporarily escaping hell by means of a ladder (a sort of Dantesque ‘natural burella’), it is the tiepido sun and a characteristic smell that evoke the childhood memory and that at the same time the reader cannot avoid connecting to the tiepide case of the initial poem (‘You who live safe | in your heated houses [tiepide case]’ [my emphasis]). It is then around the memory ‘of our homes, of Strasbourg and Turin, of the books we had read, of what we had studied, of our mothers’ that another theme in the chapter coalesces, the theme of friendship (‘He and I had been friends for a week’), a theme that had already emerged in a more general connotation in the opening poem (‘visi amici’). Warmth, friendship (visi amici…Jean), the kitchens as destination for Primo and Jean’s walk (the walk from the tank with the empty pot is ‘the ever welcomed opportunity of getting near the kitchens’, not for that hot food [cibo caldo] evoked in the poem, but for the soup of the camp, an alienating incarnation of Dantesque ‘pane altrui’ whose various names are dissonant). During the respite of the one hour walk from the tank to the kitchens, the intermittent memory of Dante’s canto emerges as if from an underground consciousness, the memory of Inferno as a partial and imperfect mirror of the human condition in the Lager, Ulysses as poetic memory, a sudden epiphany of a semenza, a seed, of humanity that the Lager is made to suppress, and Primo’s wondering in the face of this sudden internal revelation of still possessing an intact humanity. 
      Primo’s memory of his home resurfaces as if springing from the memory of Dante’s text: the ‘montagna bruna’ of Purgatory is reflected in the memory of ‘my mountains, which would appear in the evening dusk [nel bruno della sera] when I returned from Milan to Turin!’ But the real, familiar landscape is too heartbreaking a memory of ‘sweet things cruelly distant’, one of those hurtful thoughts, ‘things one thinks but does not say’. There is an epiphanic memory then, the poetic memory that surfaces during the walk and that reveals to Primo that he is still a man, a memory to which he clings despite the sense of his own audacity (‘us two, who dare to talk about these things with the soup poles on our shoulders’); there is also a more intimate memory, equally pulsating with life and humanity - but dangerous, because it makes Primo vulnerable to despair, threatening his own survival in the camp.

      The urgent need to remember Dante’s verses in this chapter develops the theme of memory, which has been central from the opening poem. In Levi’s poem, though, memory is perceived from a different angle: the readers (who live safe…) must honour that memory and transmit it as an imperative testimony of what happened in the concentration camp from generation to generation, testifying to the suffering of the man and the woman ‘considered’ in the poem. This is a memory to be carved in one’s heart, which must accompany those who receive it in every action and in every moment of each day like a prayer. Not coincidentally the poem follows the text of the most fundamental prayer of Judaism, the Shemà Israel, which is read twice a day, a memory to be passed on to one’s own children, a responsibility which is a sign of one’s humanity. The commandment to remember of the opening poem (‘I consign these words to you. | Carve them into your hearts’) issues a potential curse to the reader, threatening the destruction of what most fundamentally characterises their humanity - home, health, children: ‘Or may your house fall down, | May illness make you helpless, | And your children turn their eyes from you’. Finally, Primo’s act of remembering during the walk to the kitchens is submerged by the Babelic soup (‘Kraut und Rüben…cavoli e rape…Choux et navets…Kàposzta és répak…Until the sea again closed – over us’) and yet the memory of it becomes part of his testimony in such a central chapter of the book written after surviving the Shoah. If the memory of Dante’s verses contributed to Primo’s faith in his own humanity and his psychological and physical survival in the camp, he then accomplishes the commandment of memory and his responsibility as a man through his own writing.

      CS

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1:

      Major comments:

        • The relevance of these findings to human biology remains unclear. In Figures 1-4, the authors present data showing that AATBC is enriched in thermogenic fat, and they argue that it regulates thermogenesis and mitochondrial biology. However, in Figures 6-7, where the authors look at AATBC in different human cohorts, they actually find that it is enriched in visceral fat, which is thought of as being the least thermogenic fat depot. The authors do not explain this seeming paradox, and thus, the role of AATBC in fat remains uncertain.

      RESPONSE: We thank the reviewer for this comment and have clarified the discussion to address this point. It has recently been shown (PMID: 28529941) that the pattern of browning genes in human white adipose tissue depots is inverted relative to mice, making visceral adipose tissue in humans more thermogenic than subcutaneous adipose tissue. This aligns well with our finding that AATBC is predominantly expressed in thermogenic adipose tissue.

      • In many of the experiments, insufficient controls are provided, or the data are not at all convincing. For example:

      (a) The first four figures rely on in vitro adipocyte models, but the authors do not present data to show these cells differentiate properly and equally. This is especially relevant for the gain and loss of function studies.

      RESPONSE: We agree with the reviewer that equal differentiation is necessary for in vitro adipocyte models. Therefore, we added Oil-red-O stainings and the corresponding quantifications to Supp. Fig. 4 (see below) for the differentiation of hMADS in the absence of AATBC. We also want to emphasize that the expression level of PLIN1, a surrogate marker for differentiation, was unchanged in our experiments, as already shown in the initial draft of the manuscript. In addition, in all experiments presented in the original draft of the manuscript, AATBC gene expression was altered only in mature adipocytes.

      (b) Some of the experiments in Figure 1 (K-L) seem to only show an N of 1.

      RESPONSE: Figure 1 highlights a screening process to find new lncRNA regulated during thermogenesis. The forskolin sample was included to achieve an additional dimension in the filtering process. The displayed values in K&L demonstrate the validity of the sample. The validation of AATBC as a target was performed with statistical power in the work displayed in the following figures.

      (c) The RNAscope data in Figure 2 is not at all convincing for nuclear localization

      RESPONSE: We respectfully disagree. In our opinion, the RNAscope data are convincing for nuclear localization of the lncRNA. However, we have repeated the experiments with different probes, which strengthens our data (see figure for the reviewer).

      (d) The ASO mediated knockdown of AATBC in Figure 3 only reduced expression slightly. A more complete knockdown or deletion may elicit a stronger phenotype.

      RESPONSE: We thank the reviewer for the feedback. We have repeated the knockdown experiments but were not able to reduce the expression further, even after designing additional ASOs. However, even with the current approach, the reduction in AATBC expression elicited a phenotype, highlighting the importance of AATBC in a dose-dependent manner.

      (e) In Figure 4, OPA1 is shown as a single band in panel E and a doublet in panel N. Based on this, are the authors certain they are detecting OPA1, or could this be a nonspecific band?

      RESPONSE: We thank the reviewer for this comment. Protein extraction has been performed at different research institutes with slightly different buffers. Multiple bands (cleaved/uncleaved) have been described for OPA1 in the past, therefore we are certain that the correct protein has been detected.

      (f) The correlations in Figure 6 I-L and Figure 7 do not include any statistical analysis.

      RESPONSE: For better readability, the statistical analysis is given in the figure legend. The reviewer might have overlooked this information.

      • The gain of function studies in mice are problematic. The authors have performed a large amount of invasive studies in a short period of time. The animals will undoubtedly lose weight after each study and with insufficient time to recover, this could influence the subsequent studies.

      RESPONSE: These general concerns are valid, but all controls are in place and the animals gained weight during the experiments, as would be expected for animals of that age (see below).

      In addition, since the authors present data in Figures 1-4 arguing that AATBC overexpression is associated with increased thermogenesis, it is surprising that the authors never looked at this in Figure 5 (aside from measuring Ucp1 mRNA). It would be interesting to measure energy expenditure by indirect calorimetry and cold tolerance.

      RESPONSE: We agree with the reviewer on this point but, owing to animal protocol limitations in conjunction with the viral approach, we are unable to perform these experiments.

      • The authors do not provide any mechanistic insights into how AATBC may be acting.

      RESPONSE: Certainly, more mechanistic insight into the direct mode of action of AATBC would be interesting. To address this point, over the past year we made multiple attempts to pull down AATBC using the ChIRP technology. However, we were unable to achieve sufficient enrichment, which would have allowed us to give further information about direct interaction partners of AATBC. Nevertheless, we believe that our data regarding mitochondrial dynamics, which we have now also confirmed in in vivo experiments, explain the connection between AATBC and thermogenicity. In the future, we aim to work on this point further, but for multiple reasons we have decided to close this chapter here.

      Minor comments:

      • The introduction is rather long and would benefit from being condensed.

      RESPONSE: We have edited the text for better readability.

      Reviewer #2:

      Major Comments:

        • The key conclusion that AATBC is a novel obesity-linked regulator of adipocyte plasticity is made relatively clear with the comparison between various stages of adipocytes and the loss and gain of function with AATBC. - Figure 1H and J do not seem to be consistent with the data in Figure 1F for the LINC00473 level: there is no difference in Control vs NE in the heatmap, but in Figure 1J the difference seems to be quite obvious. Figure 1K does not seem to be consistent with the AATBC level: the measurement in the Control vs Fsk group showed no difference in AATBC in the heatmap, but in Figure 1K there seems to be a dramatic increase. Therefore, the claim that there is a difference in the expression of these two lncRNAs in these cell groups needs further clarification.

      RESPONSE: To combine the different approaches to identify novel lncRNA into one heatmap the data need to be normalized over experiments. As the fold change of the expression of AATBC in BAT compared to WAT (on average ~100x) is higher than with forskolin (~4x), this will stand out in the heatmap and will to some extent overshadow the smaller fold changes. The same holds true for LINC00473, which is drastically induced with forskolin, which to some extent masks the higher expression in the other approaches. Therefore, we decided to show both the heatmap to represent the general approach and the “zoomed in” versions to show the consistent increases. We are confident this clarifies the issue.
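      The masking effect described in this response can be illustrated with a small numerical sketch. It is not the authors' pipeline: the min-max scaling, the function name and the sample labels are assumptions made for illustration, and only the approximate fold changes (~100x and ~4x) are taken from the response. It shows how a 4-fold change nearly disappears when it shares a linear colour scale with a 100-fold change, and how a log2 transform would make it more visible.

```python
import math


def min_max_scale(values):
    """Rescale values linearly to [0, 1], as a shared heatmap colour scale would."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical fold changes relative to control (control = 1).
# The ~4x and ~100x magnitudes follow the response; the labels are illustrative.
fold_changes = {"control": 1.0, "forskolin": 4.0, "BAT_vs_WAT": 100.0}
names = list(fold_changes)
raw = [fold_changes[n] for n in names]

# One shared linear scale: the 100x value sets the top of the colour range.
linear_scaled = dict(zip(names, min_max_scale(raw)))

# Same data after a log2 transform (an assumed alternative, not the authors' choice).
log_scaled = dict(zip(names, min_max_scale([math.log2(v) for v in raw])))

for n in names:
    print(f"{n:12s} linear={linear_scaled[n]:.3f} log2={log_scaled[n]:.3f}")
```

      On the linear scale the forskolin sample occupies about 3% of the colour range, which is why a separate "zoomed in" panel is needed to see it.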

      • Figure 4H and I, the difference in the representative immunoblot seem to be minimal and inconsistent with the decrease shown in the bar graph.

      RESPONSE: We agree with the reviewer and have removed the claim from our manuscript.

      • In Figure 5, after overexpressing human AATBC in murine adipose tissue, is it possible to look at the mitochondrial changes that were seen before in cell lines? If there are similar changes in murine adipose tissue, then it would prove the changes in vitro hold up with the in vivo model. But if the mitochondrial changes were not seen, then it would indicate the changes in leptin and triglyceride levels may be due to other mechanisms. The length of the suggested experiment to look into the mitochondrial differences in mice may vary depending on whether there are preserved samples from previous experiments. If there are, then the time period would be a couple of weeks for immunoblot and analysis. If there are no samples preserved, then the estimated period for the suggested experiments may be around 1.5 to 2 months at least.

      RESPONSE: We thank the reviewer for the suggestion. We performed Western Blot analysis on the tissues from the in vivo study and have included them in Fig. 5, further strengthening the link between AATBC and mitochondrial dynamics (please see figure on the right).

      • The data are convincing overall in that the replicates are clearly marked with dots in many figures. However, some immunoblots and expression levels are inconsistent with other data showing the same results.

      RESPONSE: We thank the reviewer and have removed the necessary quantifications.

      • Figure 6 and 7 are provocative and significant, reporting strong associations of AATBC with well-known markers of metabolism in adipocytes. The sex difference for adiponectin and AATBC expression is particularly intriguing. Further discussion of this point would be interesting. However, there is no information provided about the medication status of the obese subjects that were consented for samples used in the analysis. Specifically, many of the obese subjects (mean BMI 45 or more with a range going up to 97.3) would be expected also to have metabolic diagnoses and to be treated with numerous medications, including Metformin, GLP1 agonists, Orlistat, Liraglutide, Bupropion/Naltrexone and combinations. It is unreasonable to ignore possible effects of major medications on AATBC expression. Please comment on the strengths and weaknesses of the analysis that ignores medications, or if some annotations of clinical data are available, perhaps to explain outliers in the plots, please discuss.

      RESPONSE: We thank the reviewer for this suggestion. Unfortunately, we are unable to exclude additional diagnoses and medication of our patients due to the points the reviewer stated. However, given the large size of the cohorts we are confident that such effects are being compensated for. We have added a part on weaknesses of the study in the discussion.

      Minor Comments:

      • The labeling of figure 2 A-K is not clear because the use of the same color of bars is easily misunderstood as the same source of cells, but it is in fact not. For example, the grey color that appeared in 2B and 2C are not the same source but can be misunderstood.

      RESPONSE: The coloring of Fig.2A&G has been changed.

      • Figure 3 ASO-AATBC has two repeats #1 and #2, and over-expression of AATBC has one, even though there are enough repeats. It would be less confusing to present all of the repeats in ASO_AATBC together in one bar.

      RESPONSE: The two different ASO target different areas of AATBC. In line with general guidelines for ASO use, those are not pooled but used separately, which is why the results are also split up. As the overexpression is additional genomic information of AATBC, it is impossible to use different variants in this case, therefore only one bar for overexpression is shown.

      • The experimental outline can be a bit more detailed and explain some of the words like Thermo versus Browning.

      RESPONSE: The manuscript has been revised regarding this point.

      • Some of the panels in Figure 7 could be put into supplementary if space is at a premium, and presenting the representative graph would be enough.

      RESPONSE: We think that all our data in Fig. 7 warrant enough attention to be shown in a main figure, but if space is scarce, we are very happy to oblige. We would kindly ask the editors for input on this matter.

      Reviewer #3:

        • Throughout the study, the data provided are mainly correlative and in some cases not robust. In Fig. 2, AATBC expression is described to be elevated in the so-called "thermogenic condition", which contained prolonged PPARg agonist treatment (rosiglitazone) known to promote adipogenesis. Consistent with this notion, adipogenic markers, such as PLIN1 and FABP4, are higher in "thermogenic adipocytes" (Suppl Fig. 2). As such, the result may only suggest that AATBC has higher expression in mature adipocytes vs pre-adipocytes.

      RESPONSE: We thank the reviewer for the suggestion. We have added Oil-Red-O-Stainings to Suppl. Fig. 2 to show unchanged lipid content upon modulation of AATBC gene expression, which can be seen as a surrogate for differentiation. Concerning the use of rosiglitazone as a browning agent, we want to emphasize that rosiglitazone was used during the entirety of differentiation until day 9, where it was removed in the “non-thermogenic” group. At this point we already observe fully differentiated adipocytes. This is an established protocol. Furthermore, the data is in line with using norepinephrine or forskolin as a short-term inducer of browning, making it very likely that the effect seen is due to the “more thermogenic” character of the adipocytes.

      • Along the same vein, whether and how AATBC affects adipogenesis is unclear. Suppl Fig. 3H and 3L (misplaced as Suppl Fig. 4) show the adipocyte differentiation marker FABP4 is down-regulated by both ASO- and AV-AATBC. Since mitochondrial respiration (and other parameters including UCP1 expression) is tightly linked to adipogenic efficiency, the authors need to address whether these manipulations affect adipocyte differentiation.

      RESPONSE: We agree with the reviewer that differences in differentiation capacity would falsify our data on mitochondrial dynamics. We have added Oil-Red-O-Staining to Suppl. Fig. 2 to show that no significant difference in lipid content exists during modulation of AATBC gene expression, which can be seen as a surrogate for differentiation. Furthermore, in all experiments presented in the manuscript, the modulation of AATBC occurs in already fully differentiated adipocytes. Accordingly, we are confident that AATBC does not influence differentiation but mainly acts through the modulation of mitochondrial dynamics.

      • The data in Fig. 4 supporting a role for AATBC in regulating mitochondrial dynamics are superficial and not robust. Fig. 4A/4J do not have high enough resolution to provide accurate assessment of the mitochondrial network.

      RESPONSE: We respectfully disagree with the reviewer on this point. State-of-the-art methods and algorithms were used to image and analyze the mitochondrial network. Furthermore, we have used multiple established markers of mitochondrial dynamics in western blot analysis to further strengthen our assessment of the immunofluorescence. In summary, we feel we have given enough evidence for an accurate assessment of the mitochondrial network.

      • The level of loading control TUBB is clearly lower in siAATBC in Fig. 4H. In addition, OPA1 should have multiple isoforms and Fig. 4E/4N show inconsistent patterns. As such, mitochondrial dynamics is not likely an underlying mechanism.

      RESPONSE: We agree with the reviewer on the assessment of the expression of complex 5 and have removed this claim from the manuscript. Regarding the expression of OPA1, protein extraction has been performed at different research institutes with slightly different buffers. Multiple bands (cleaved/uncleaved) have been described for OPA1 in the past, therefore we are certain that the correct protein has been detected.

      • Notably, RNAseq data in Suppl Fig. 4 (misplaced as Suppl Fig. 3) seem to indicate that AATBC over-expression promotes TG synthesis, while AATBC knockdown modulates cell death. The authors should consider exploring the leads from the RNAseq analysis.

      RESPONSE: We thank the reviewer for the feedback. The small number of altered genes in the RNASeq makes us believe that AATBC has a rather post-transcriptional role. We investigated cell death and oxidative stress response, as these GO terms were highlighted in the analysis, but we were unable to detect any differences in the absence of AATBC, pointing to a minimal effect at the transcriptional level (see figure below for the reviewers).

      • In Fig. 5, the AV-AATBC transduction in WAT/BAT is localized, transient and not homogeneous. Not surprisingly, this manipulation does not produce any robust effects. The difference in circulating leptin/leptin expression appears to be driven by 4-5 mice in the control group (Fig. 5H/5N). The correlation data in Fig. 6 and Fig. 7, although relevant, do not provide additional mechanistic insights. Unfortunately, the efforts in Fig. 5-7 fail to lead to information related to the biological function of adipose AATBC.

      RESPONSE: We agree with the reviewer on the limitations of the AV model, but we have performed these experiments to the highest technical standard. As the reviewer states, the overexpression, especially in WAT, has different magnitudes depending on the individual mouse, but the overexpression is present and consistently high in every animal. We would expect even bigger alterations in a genetic model, which, however, is beyond the scope of this first manuscript on AATBC in adipocytes. We are disappointed that the reviewer does not value the human data presented, as they very strongly hint at a relevant function of our human lncRNA in vivo through robust correlations with established biomarkers, mirroring the effects seen in vitro and in the mouse model. A limitation of human studies is that in virtually every case they are based on correlations, as manipulation of gene expression, which would be necessary to delineate a biological process as requested by the reviewer, is not possible in humans. We do not concur with dismissing our human data on that basis.

    1. Actually, as Davidson argues, multitasking helps us see more and do more, and experience texts and tasks in different ways. There’s no evidence that anyone ever was deeply reading for hours on end with no interruptions. All we have are claims from Plato saying that writing is going to kill our ability to memorize. Our minds have always been wandering; we’ve always been distractible. We’ve always been doodling on the sides of pages, or thinking about our lunch, or stopping to converse with someone. Now we just have distraction that’s more readily available and purposefully attuned to distracting us — like popup ads, notifications; things that quite literally fly across your screen to distract you. But the fact that we have students who have grown up with those and have trained themselves to deal with those in such interesting ways is something that I think we should bring into the classroom and be talking about and critically thinking about

      1) the point that multitasking can offer different experiences with texts and tasks is interesting to me. initially, the comparison between multitasking and single-tasking seems like a clear distinction between what is beneficial (focus) and what is detrimental (distraction)

      2) taking a bold stance, i would venture to say that there exists a significant number of individuals who engage in deep work, which is perhaps one of the most profound pursuits throughout human history. after all, most of us have experienced a state of flow at least once, to some extent, and our brains subconsciously crave this state of heightened focus and productivity

      3) this observation all the more underscores the rarity of deep work in a world that is perpetually plagued by distractions

      here is one of my notes from deep work by cal newport:

      the connection between depth and meaning in human experience is undeniable. whether approached from the perspectives of neuroscience, psychology, or philosophy, there appears to be a profound correlation between engaging in deep, meaningful activities and a sense of fulfillment. this suggests that our species may have evolved to thrive in the realm of deep work and purposeful engagement

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The current manuscript by Shiryaev et al describes their observation of the new function of Zika NS2B-NS3 proteases. They have shown that NS2B-NS3 protease lacking the helicase domain binds to RNA and the interaction can be affected by protease inhibitors. Main two new findings are presented in the manuscript: super open conformation of the protease; RNA binding activity of the protease region. Although the manuscript is interesting, the design of the experiments is not convincing.

      Major issues:

        • the claim of a super open conformation is problematic. Using an artificial construct lacking the C-terminal portion of NS2B will of course generate the open conformation. This is a wrong definition unless you observe such a conformation in living cells.
      1. We understand the skepticism towards the less well-known super-open conformation of the flavivirus NS2B-NS3pro complex. In addition to our own structure of ZIKV NS2B-NS3pro (PDB ID 7M1V), the crystal structure of the orthologous Japanese encephalitis virus (JEV) NS2B-NS3pro (PDB ID 4R8T) was reported in 2015 [1]. However, no functional analysis was provided for this crystal structure, so it received little attention from the research community. We computed the overlay of the ZIKV NS2B-NS3 protease structure in the super-open conformation (PDB ID 7M1V, deposited by us in 2021) with the crystal structure of JEV protease (PDB ID 4R8T) (Rebuttal Figure 1). We observed an almost identical organization of the critical NS3pro C-terminal loop between these two structures (RMSD 0.6 Å). Polypeptides with over 35% identity are very likely to have a similar fold [2]. Given over 50% identity(!) between flaviviral proteases across the family [3,4], we posit that the super-open conformation demonstrated for JEV and ZIKV NS2B-NS3pro is a common feature of the Flaviviridae family. Further, the NS2B peptide is always tightly associated with NS3pro via a three-strand beta-barrel (aa 49-58 of NS2B), which remains intact in all NS3pro conformations. The C-terminal portion of NS2B progressively loses association with NS3pro, being mostly associated in the closed conformation, less so in the open, and even less in the super-open conformation. The G4SG4 linker between NS2B and NS3pro remains unstructured in all conformations. The native C-terminal portion of NS2B (TGKR) is equally unstructured when competed out of the protease active site by another substrate. It is unclear to us why “lacking the C-terminal portion of NS2B will of course generate the open conformation”.

      2. It is odd that the authors made a homology model to generate open conformation structures. The authors did not cite the two papers on eZiPro (Phoo et al 2016 NC) and bZiPro (Zhang et al 2016, Science). These two structures show the closed conformation of the protease in the absence and presence of a natural substrate.

      3. We agree with the reviewer that both constructs of ZIKV NS2B-NS3pro, eZiPro [5] and bZiPro [6], are likely to exist in the closed conformation, as documented by the crystal structures. However, in both cases, the active center of ZIKV NS2B-NS3pro is occupied by a short peptide fragment, which is sufficient to induce the closed conformation of the NS2B-NS3 protease. We superimposed eZiPro (PDB ID 5GJ4) with bZiPro (PDB ID 5GPI) to better demonstrate that the active center in both structures is occupied either by the tetrapeptide TGKR (T127-G128-K129-R130) originating from the NS2B C-terminus (eZiPro) or by the tetrapeptide KKGE (K14-K15-G16-E17) originating from a neighboring NS3 molecule (bZiPro) (Rebuttal Figure 2). Indeed, Zhang et al., 2016 [6] stated that: “the structure (bZiPro) does capture the protease in complex with a reverse peptide. The tetrapeptide K14K15G16E17 folds into a small hairpin loop to occupy the active site.” Further, Phoo et al., 2016 [5] stated that “binding of the ‘TGKR’ peptide to the catalytic site stabilizes the protease (eZiPro)”. To the best of our knowledge, so far there are no crystal structures of flaviviral NS2B-NS3 proteases in the closed conformation without a peptide or inhibitor in the active center. We take this as a hint that the closed conformation is always induced by a substrate present in the active center.

      Finally, we would like to draw the attention of this reviewer to the fact that the 15N R2 NMR signal from NS2B residues 65-85 is missing in bZiPro alone but re-appears when AcKR is added. This is consistent with the idea that without AcKR, bZiPro exists in the open conformation where much of the C-terminal part of NS2B is dissociated from NS3Pro and remains unstructured, thus resulting in the lack of NMR signal.

      • RNA binding is novel, but is it observed in cells? Only one method was used for testing the interactions; no other biophysical methods were used.

      • Given the complex network of protein-RNA interactions and the fact that NS3pro and NS3hel are connected in a single polypeptide, separating the dynamic binding of the 11 kb RNA to NS3pro from that to NS3hel in a native cell is a major technical challenge beyond the scope of this work. We employed a fluorescence polarization assay to demonstrate ssRNA and ssDNA binding to ZIKV NS2B-NS3pro. Subsequently, we employed a proteolytic activity assay with a labeled peptide mimicking the natural protease substrate to demonstrate that the presence of ssRNA and ssDNA can efficiently inhibit proteolytic activity. To the best of our knowledge, this is the first indication that ssRNA or ssDNA could block the proteolytic activity of any serine protease, let alone a viral protease. Therefore, we consider the proteolytic activity assay used in the current work an orthogonal biochemical method supporting ssRNA binding to ZIKV NS2B-NS3pro.

      • binding studies with RNA used artificial construct, how about the one with KTGR present like eZiPro. Keep in mind that the P1-P4 residues are present under native conditions.

      - As mentioned by the reviewer, the TGKR peptide was found in the active center in the eZiPro crystal. Indeed, the junction region between NS2B and the NS3 protease contains a native cleavage site for NS2B-NS3pro and is naturally cleaved by the protease during viral polyprotein processing. However, the TGKR peptide representing the P1-P4 positions has to leave the active center after the cleavage to ensure enzyme processivity, i.e. the cleavage of additional targets (otherwise, the protease would get stuck after the first cleavage). The proteolytic activity assay utilizes a fluorogenic peptide labeled with FAM (such as TGKR-FAM, where FAM is a group representing the P1’ position in this case). The TGKR-FAM peptide will compete with and easily replace the cleaved TGKR peptide in the active center in the proteolytic activity assay. In sum, the C-terminal end of NS2B will be competed out of the protease active center by the next substrate, and there is no evidence that it will naturally return to the active center after each round of proteolytic activity. Indeed, several crystal structures of flaviviral NS2B-NS3pro in the open conformation lack the C-terminal part of NS2B in the active center. Our unpublished NMR studies demonstrated that the C-terminal part of NS2B is unstructured in solution if a substrate peptide or small-molecule inhibitor is not present in the active center of the protease.

      • authors built up nice models, it is great to consider the full length NS2B, but authors haven't taken into account the effect of NS2B on the open or closed conformation of the protease.

      - All crystal structures of flavivirus NS2B-NS3pro in the closed, open, or super-open conformations have NS2B associated with NS3pro via a beta-barrel (Rebuttal Figure 3), which is located on the opposite side from the RNA binding site. The transition from the closed to the open and to the super-open conformation is associated with the progressive dissociation of NS2B from NS3pro. Therefore, the effect of NS2B on NS3pro is progressively diminished. In the closed conformation of NS3pro, the negatively charged C-terminal part of NS2B is associated with the same positively charged groove as the RNA in the open conformation of NS3pro. The C-terminal part of NS2B is dissociated from NS3pro in the open conformation.

      Minor issues:

      This manuscript shows the novel function of Zika protease and concludes that the protease binds to RNA. This is a novel finding, but the conclusion needs to be further confirmed, to avoid misinterpretations by future readers.

      • closed, and super open conformations. But the definition was not carefully compared with current literatures. I am surprised that the two important papers are not cited. It is well known the G4SG4 linker affect the conformation of the protease.*

      • The crystal structures and the proteolytic activities of gZiPro, eZiPro, and bZiPro are rather similar. In fact, the Km values (μM) are 2.86 ± 0.90 for gZiPro and 6.332 ± 2.41 for bZiPro, and the IC50 values of BPTI inhibition for gZiPro, eZiPro and bZiPro are 350, 76 and 12 nM, respectively. NS2B and NS3pro have a large binding area in the closed conformation. Upon changing to the open conformation (and even more so to the super-open conformation), the C-terminal part of NS2B is progressively dissociated from NS3Pro. Therefore, possible minor effects introduced by the G4SG4 linker are unlikely to affect any of the conclusions in our work.

      • Authors need to show super open conformation is present in nature e.g. the model in which full length NS2B and NS3pro.*

      • A full-length NS2B has 2 transmembrane domains, which tether the NS2B-NS3pro complex to the cell membrane (we have modeled the presence of such transmembrane domains to account for the orientation of NS2B-NS3pro with respect to the cell membrane). The full-length complex has never been crystallized or tested in any assay due to the major technical challenges associated with the modeling of complex transmembrane proteins.

      • RNA is a charged molecule under some conditions, NS3 also has charged residues, it is important to show whether the binding between RNA-protease is relevant to the function (Luo, 2010; Chernov, 2008; Xu, 2019), or is this due to the application of the artificial constructs used in this study. Why are so many mutants used? *

      • The requirement of NS3pro for the helicase function was shown by several investigators [7–9]. Given the structural independence of NS3pro and NS3hel, which mostly rules out an allosteric effect, RNA binding by NS3pro is a newly proposed function of NS3pro for the helicase activity. We demonstrated biochemically that RNA bound to NS3pro inhibits its protease function. A variety of mutants were used to constrain the conformations of NS2B-NS3pro (e.g. to enforce the super-open conformation) for crystallization studies.

      • Using a construct close to the native protease, at least the P1-P4 residues should be present. Using a peptide in the assay is also useful.*

      • We were unable to interpret this critique.

      • Test binding of RNA with protease using another method such as biophysical methods, or even gel shift assay*

      • We thank the reviewer for this suggestion. Although the gel-shift assay seems to be a reasonable method to test the binding, given the ease of spontaneous conformational change (i.e. into the super-open conformation), this assay could result in a progressive loss of bound RNA during migration in the gel.

      • I don't know the correlation between Figure 7 and Figure 6. The authors describe poly A binding to protease, while Figure 7 is talking about Helicase binds to dsRNAs. *

      • There is no correlation. Figure 6 describes the models for NS2B-NS3pro binding to ssRNA. Figure 7 describes a separate point, the direction of dsRNA processing by NS3hel.

      • I am glad to see the consideration of full length NS2B, NS3 in the models Figure 8, 9 and 11, but there is no data to support any of the model proposed. *

      • There is no experimental data. We have modeled the N-terminal and C-terminal parts of full NS2B, which are predicted to be inserted into the cell membrane due to their characteristic amphipathic helical structure.

      • Is the linker a poly G not G4SG4? *

      The linker is GGGGSGGGG (G4SG4) as stated in Materials and Methods of the manuscript.

      • Do the mutants sustain their protease activity? *

      • All mutants with intact catalytic centers have protease activity, except the mutants with a disulfide bridge that fixes the polypeptides in the super-open conformation.


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      *The manuscript by Shiryaev et al., submitted to BioRXiv is an exploration of the ability of NS2B-NS3protease to bind RNA and its subsequent role in NS3 helicase processivity. The authors first utilize fluorescence polarization assays to demonstrate that NS2B-NS3protease can bind ssRNA with a strong affinity (and also ssDNA with lower affinity). They subsequently utilize mutational and small molecule inhibitor strategies in these assays to force the NS2B-NS3protease into different conformations, with the associated results inferring that the "open" conformation is responsible for ssRNA binding affinity. Furthermore, they demonstrate that ssRNA binding impairs protease activity, suggesting these roles may be exclusive in the viral life cycle. They also identified a number of small molecule ligands that target the putative ssRNA binding channel, and demonstrate that these ligands inhibit ssRNA binding by NS2B-NS3protease, providing potential inhibitor candidates for ZIKV. Finally, the authors utilized their crystal structures and others for the various conformations of NS2B-NS3protease to model ssRNA binding by the domain and the full NS3 protein, and used these models to propose a reverse inchworm model for NS3 travelling along ssRNA as it unwinds the dsRNA duplex. Overall, the authors utilize a comprehensive approach to demonstrate a number of novel findings (ssRNA binding by NS2B-NS3protease, small molecule ligands that inhibit this interaction) that would be of interest to both virologists and structural biologists. However, there are some important experimental design limitations and viral life cycle considerations that the authors should address before acceptance of the manuscript. Major and minor comments intended to improve the manuscript are outlined in more detail below. *

      Major Comments:

        • While the quantity of indirect data (ruled out closed and super-open, inhibitors of ssRNA binding pocket) suggest that the open conformation of NS2B-NS3protease is associated with ssRNA binding, the argument would be greatly strengthened by direct experimental data. Is there a mutational or small molecule approach to locking the NS2B-NS3 protease in the open conformation? If so, the authors should perform such experiments to strengthen the foundation of their argument.*
      1. Unfortunately, despite significant efforts, mutations or small molecules locking the NS2B-NS3 protease in the open conformation have not been identified for the ZIKV protease. However, several structures of NS2B-NS3 proteases in the open conformation have been documented in other flaviviruses (i.e., DENV PDB IDs 2FOM and 5T1V; WNV PDB ID 2GGV). Polypeptides with over 35% identity are very likely to have a similar fold [2]. Given over 50% identity(!) between flaviviral proteases across the family [3,4], there is little doubt that ZIKV NS2B-NS3 protease adopts an open conformation similar to all flaviviral proteases. Our modeling demonstrated that there are no steric/structural problems in folding NS2B-NS3 protease into the open conformation.

      2. A negative control should be used in Figure 4A to strengthen the claim that ssRNA binding in the open conformation impairs protease activity (ie. include a curve for dsRNA). Such an experiment would lend support to ssRNA inhibition being due to specific binding instead of some other non-specific effect of increasing local nucleic acid concentration.*

      3. To address this critique, we have modeled dsRNA binding to the open conformation of NS2B-NS3Pro. The model revealed that dsRNA could not be accommodated by the open conformation of the NS2B-NS3Pro complex (Rebuttal Figure 4). Indeed, dsRNA has a very different, rigid structure compared to the extended form of the ssRNA chain. The dsRNA is unable to provide continuous interactions between the negatively charged RNA backbone and the positively charged amino acid side chains in NS3pro. The continuous interface on NS2B-NS3 protease interacting with ssRNA is an extension of the exit groove for one of the ssRNA strands exiting the NS3 helicase after unwinding. Therefore, ssRNA, but not dsRNA, is naturally always present in close proximity to the NS2B-NS3Pro complex.

      5. Due to the highly coupled roles of NS5 and NS3 in replication, the authors should include some more consideration of the role of NS5 in their complex. They very briefly address this interplay in the fifth paragraph of the discussion, but then neglect to discuss the implications any further. In particular (perhaps in a brief comparison to an NS3/NS5 modeling approach such as Brands et al., 2017; WIRES), the authors should consider some of the following questions: could the channel on protease domain lead to ssRNA entry site on RdRp?*

      6. Indeed, our model suggests that the negative strand (-)ssRNA exits from NS2B-NS3protease facing the ER membrane in the area where the protease is anchored to the ER membrane via the NS2B transmembrane domains. It is possible that NS3pro interacts with NS5 polymerase and “handles” (-)ssRNA to the NS5 polymerase. This scenario would modify Brands et al., 2017 model to add NS2B-NS3Pro complex between NS3Hel and NS5. However, at present, the NS3-NS5 (or NS2B-NS3-NS5) complex together has not been crystallized. It would be logical for NS5 polymerase to access the (-)ssRNA strand after it is released from NS2B-NS3Pro since the (-)ssRNA strands are used as a template for the (+)ssRNA which is used for polyprotein synthesis and packaging into viral particles.

      7. would NS5 interaction constrain or augment inchworm model of NS2B/NS3 translocation? *

      8. Yes, integrating NS5 interaction with the NS2B-NS3pro handling (-)ssRNA will augment the utility of the suggested reverse inchworm model.

      9. how does increased activity of NS3 when complexed with NS5 (Xu et al. 2019) align with proposed inchworm model? *

      10. We appreciate the reviewer's question. We think that NS2, NS3, NS4, and NS5 work in concert as one coordinated complex where various subunits of NS2 and NS4 may provide anchoring of the entire complex to the ER membrane. Indeed, such a complex has recently been proposed [6]. Also, see our response to the previous reviewer’s point (#4). We have incorporated this discussion into the revised manuscript.

      Minor Comments: 1. Introduction, 4th paragraph, NS3-NS4 should read NS3-NS4A.

      • We corrected this sentence.

      • Throughout the manuscript, the authors should denote some key amino acid residues in each figure to help orient the reader better to the observed structural changes and rotations. Inclusion, at least in the supplement, of the crystal structures of mutants solved herein should also be included.*

      • We annotated the key residues in all figures (e.g. catalytic residues, loop interacting with the membrane, position of NS2B, and other elements) and kept the same orientation of complexes in all figures.

      • Section: RNA binding inhibits the proteolytic activity of ZIKV NS2B-NS3pro, last sentence, NS2N-NS3pro should be NS2B-NS3pro*.

      • We corrected this sentence.

      • Section: Allosteric inhibitors of NS2B-NS3 protease interfere with RNA binding- first sentence: "The open conformation of NS2B-NS3pro is achieved by the rearrangement of NS2B cofactor (its dissociation from the C-terminal half of NS3pro) leading to a loss of proteolytic activity [32]." - the reference is not correct. I could not find the reference the authors refer to here and had not heard before that NS2B cofactor was able to disassociate from the C-terminal half of NS3pro; hence, this really needs to be appropriately referenced. *

      • We have revised this sentence and added additional references. “The open conformation of NS2B-NS3pro is achieved by the rearrangement of NS2B cofactor (partial dissociation from NS3pro), leading to a loss of proteolytic activity [4,11].”

      • Section: Modeling RNA binding to ZIKV NS2B-NS3, first sentence - unwinds should be unwind*.

      • We corrected this sentence.


      • With respect to the results of Figure 3A, the authors should address that adding the linker alone to the NS3 protease may not be an accurate examination of its role/importance. The linker in this scenario is only constrained at its N-terminus, while it is always constrained at both termini during infection (and even more so by the interactions of those two linked domains [protease and helicase] with each other). As such, the authors' statement that "observations suggests that the 12-aa linker region modulates RNA binding to NS2B-NS3pro" should be more strongly qualified to this effect. In addition, it would be interesting to see the effects of the linker mutations on ssRNA binding in the context of the full NS3 protein, albeit admittedly more complex due to the confounding ssRNA binding by the helicase domain.*

      • We agree with this reviewer that the protease-helicase linker is also restrained at both termini. We have rephrased the statement in the revised manuscript. The goal of the experiment shown in Figure 3A was to examine whether a negatively charged linker is able to compete with ssRNA binding, as we expected from the structural model. The mutational analysis of the protease-helicase linker is, indeed, a very interesting subject that is, however, beyond the scope of this work.

      7. The NS#hel should be changed to NS3hel in part (C) of figure legend for Figure 11.

      • We corrected this error.

      • The authors' data in Figure 4A (and even more so the nature of the viral life cycle where 1000s of viral polyproteins are created from the first genome during infection) disputes the depiction in the inchworm model of how NS3 protease cleaves the polyprotein while the helicase binds ssRNA. At minimum, the authors need to discuss this discrepancy, and it is recommended that they modify the cartoon in their model to not include the ssRNA binding on the protease side of the equation (or show as alternative on that side to the existing cartoon).*
      • Indeed, as proposed by our reverse inchworm model, ssRNA is not bound to NS3Pro in the closed conformation, while NS2B-NS3pro has a protein substrate in the active center (Figure 11A). We agree that NS2B-NS3Pro in the closed conformation cannot bind ssRNA, as we demonstrated in the competitive cleavage assay. Only large amounts of ssRNA can shift the balance towards the open conformation, which binds ssRNA. We think that most of the time NS2B-NS3Pro cycles between the open and the super-open conformations handling ssRNA (Figure 11B-D), but as soon as a protein substrate becomes available (typically a loop from a transmembrane viral polypeptide), NS2B-NS3Pro quickly switches to the closed, proteolytically active conformation to act as a protease.

      • In the third paragraph of the discussion, the authors state "An alternative model of coupled transcription and translation where viral RNA is associated with ribosomes right after the release from NS2B-NS3 is also possible". Considering there is abundant evidence that translation and replication are exclusive and that translation does not take place in ROs, it would be prudent to remove such statements from the discussion. Without any supporting evidence, these statements will be misleading to readers by providing a false equivalency. The preceding discussion of RFs would be sufficient to contextualize your inchworm model in the broader viral life cycle (which was done quite well). *

      • We have adjusted the discussion in the revised manuscript to avoid a false equivalency.

      10. There were a number of aspects I appreciated about the manuscript and will briefly list a few here:

      i) the focus on how different non-structural proteins affect the structure and function of each other during the viral life cycle, which forms a more comprehensive and informative model

      ii) the use of structural and functional assays as complementary approaches to studying the intra- and inter-protein relationships of NS3

      iii) the depiction of the forks in Figure 10, which effectively demonstrated the channels and oriented the reader to the conservation data

      iv) the use of small molecule inhibitors to modify structure and function of NS3, which greatly deepened the richness of the story from both a basic and applied science view point

      • We are very grateful to the Reviewer for these kind remarks.

      Reviewer #2 (Significance (Required)):

      Strengths and limitations:

      - provides some experimental and modeling data to provide a new model for RNA interactions with the NS3pro-hel; may help inform models for enzyme function, mostly consistent with previous literature
      - leaves out the NS5 RdRp, known to contribute to NS3 activity.
      - some suggestions are made which might strengthen the conclusions and inclusions of additional controls would improve the data.

      Advance:

      - conceptual, perhaps may provide some insight into mechanism; although limited by the lack of NS5 RdRp which is crucial to helicase activity. It is unclear if the ssRNA would be oriented this way given interactions with NS5 RdRp and MT domains (is the ssRNA routed to NS5 or along NS3, or potentially are both happening?)

      Audience:

      - quite specialist, but may include structural biologists and virologists alike.

      Expertise of the reviewer(s):

      - molecular virologists, RNA viruses - including flaviviruses; replication complex biogenesis, protein-RNA and RNA-RNA interactions. While comfortable with the concepts regarding complex formation, the appropriateness of computational modeling and RNA docking tools as well as structural biology is out of our area of expertise.






      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      *This paper investigates the nucleic acid binding properties of Zika virus protease. In particular the data suggest that single stranded RNAs and DNAs are capable of binding to and inhibiting ZIKV protease at micromolar concentrations. With the use of active site inhibitors and mutants that lock the protease in closed and super-open conformation, the authors concluded that RNA binds to the open conformation. Through extensive modeling of the protease and helicase domains, this manuscript provides a model of how ssRNAs can bind to all conformations of the protease, but the open conformation provides two positively charged forks that should be available to bind RNA. *

      SECTION A - Evidence, reproducibility, and clarity

      Major comments:

      · The main conclusions of this paper rely on the existence of the super-open conformation, however this conformation has not been reported in the scientific literature previously. Structures deposited in the PDB are referenced in this manuscript, however no citation for an accompanying publication is provided. This calls into question the biological relevance of this super-open conformation. This is of particular concern because in other highly-homologous flaviviral proteases, structures that have been observed crystallographically (e.g. the open conformation of dengue virus protease) appear to be only very sparsely populated in solution. What is the evidence that the super-open conformation exists in solution?

      • Please, see our reply to question #1 from Reviewer 1.

      • The activity of each of the constructs used was not reported making it impossible to directly compare the impact of these changes on intrinsic activity. In particular, the NS2B-NS3 long construct is predicted to exist in the super-open conformation. If this is correct, it should show no activity against a peptide substrate. *
      • We appreciate these concerns. The NS2B-NS3pro-long construct is proteolytically active (only NS2B-NS3pro-short construct is proteolytically inactive because its NS3pro C-terminal part is too short to fold into the closed conformation). It is unconstrained and likely capable of adopting all possible conformations (closed, open, super open). As we suspected, the negatively charged linker interferes with RNA binding, potentially via direct competition. Investigating the role of the protease-helicase linker is an exciting subject of a separate manuscript in preparation.

      • This paper reports that the IC50 is much weaker than the Kd for binding of ssRNA to ZIKV NS2B-NS3pro. Are orthogonal assays, such as thermal shift assay, available which could distinguish between the reported IC50 and the Kd? *
      • Binding of ssRNA occurs in an area distinct from the protease active center. We think that there is a constant competition between C-terminal NS2B binding/release and ssRNA binding/release from NS3pro. We think that ssRNA “catches” the moment when the protease has the open conformation and freezes that conformation by blocking the C-terminal part of NS2B from binding to NS3Pro. In terms of the thermal shift assay, the structure of NS3Pro is not changed; only the C-terminal part of NS2B is affected. Note that the 15N R2 NMR signal from NS2B residues 65-85 is missing in bZiPro alone but re-appears when AcKR is added [6]. This is consistent with the idea that without AcKR, bZiPro exists in the open conformation, where much of the C-terminal part of NS2B is dissociated from NS3Pro and remains unstructured, thus resulting in the lack of NMR signal. Taken together, these observations suggest that a thermal shift assay is unlikely to be of much help.

      • *This paper suggests that ssRNA binds to the open conformation of ZIKV NS2B-NS3pro, however no experimental evidence, only modeling has been used to suggest binding to the open conformation. In Dengue virus protease, the M84P variant has been reported to lock the protease into the open conformation. How does the F84P variant of ZIKV NS2B-NS3pro impact ssRNA binding? *

      • We appreciate this question. Indeed, the M84P mutation shifts Dengue NS3Pro to the open conformation, which is proteolytically inactive [12], consistent with our reverse inchworm model. We have not investigated the effect of this mutation on ZIKV NS3pro. We expect this mutation to have a similar effect in ZIKV NS3pro as in Dengue NS3Pro.

      • The relevance of the discussion on the co-crystallization of NSC86314 with the Mut7 was not clear. What point was being made?

      • We provide a proof-of-principle for a novel class of allosteric inhibitors that specifically target newly identified druggable pockets present in the open and super-open conformations of ZIKV NS2B-NS3pro. Our results suggest that such allosteric inhibitors can interfere with the RNA-binding activities of NS2B-NS3pro in addition to blocking the protease activity. The co-crystallization of NSC86314 with the Mut7 confirms a novel pocket bound by NSC86314.

      *- These data show that both active site and allosteric inhibitors block binding of ssRNA to the protease. The paper also suggests that ssRNA only binds to the open conformation. What is the evidence that the allosteric inhibitors do not enable or promote formation of the open conformation? *

      • We thank this reviewer for an interesting question. Indeed, we have no evidence of whether allosteric inhibitors enable or promote the formation of the open conformation. This is formally possible and will need to be investigated.

      • This paper makes two claims about the function of the protease. The title should specify what those dual functions are (proteolytic activity and ssRNA-recruitment).*
      • We appreciate this reviewer's suggestions for the title.

      • The discussions of Figures 6 and 9 are highly similar. The main takeaway points for both figures seem to be nearly identical: the presence of two positively charged pitchforks on the open conformation. The distinction between these two figures should be more significantly and explicitly stated. *
      • Figure 6 presents several models that provide evidence for the open conformation of ZIKV NS2B-NS3pro being uniquely suitable to bind RNA. Figure 9 presents several models of the entire RNA-NS2B-NS3pro-NS3hel complex anchored into the ER membrane. Figure 9 illustrates that the open conformation of NS2B-NS3pro provides two positively charged/polar forks, contiguous with the positively charged groove on NS3hel. Figure 6 does not illustrate that point.

      *- Mention explicitly in the materials and methods if the 12-amino acid linker is present in all the mutants used. *

      • This is mentioned explicitly and shown in Supplementary Figure 2A.

      Minor comments:

      · Figure 1. The rotation that promotes the transitions from orientation in panel A to that in panel B should be drawn.
      · FAM should be defined in the legend of Figure 2.
      · The term Cold should be changed to unlabeled.
      · Please check labels for the supplementary Figure 2. For example one label states 1-1 but it should be 1-170.
      · Figure 1C does not exist and it is referenced in the results section under "NS2B-NS3pro substrate-mimicking inhibitors compete with RNA binding."
      · As discussed above, if the super open conformation is going to be addressed in this paper, then either a reference for the manuscript describing those structures should be included, or this manuscript should include in the materials and methods the procedure on crystallization, data collection, structure determination, refinement, and analysis as well as a table for crystallographic data and refinement statistics.
      · Adjust figure arrangement (ABCED to ABCDE) in Figure 11.

      • We thank this reviewer for all minor comments. We corrected the above-mentioned errors in the manuscript.

      Reviewer #3 (Significance (Required)):

      It is well established that the flaviviral proteases exist in different conformations but most of the structures published are concentrated on the closed conformation which is the one required for effective substrate processing. The open conformation has recently been the subject of increased interest, especially with the discovery of allosteric inhibitors for which modeling suggests that these compounds result in the dissociation of the C-terminal region of NS2B from the NS3. This paper adds important insights into the function of the open conformation and in general implicitly shows the importance of the dynamic nature of ZIKV NS2B-NS3pro. In addition to these insights, this paper aptly demonstrates that ssRNA can bind and inhibit these proteases as has not been shown previously.

      I am a senior graduate student working on characterizing and understanding the mechanism of action of allosteric compounds against viral proteases, specifically proteases from Zika and dengue viruses.

      References.

      1. Weinert T, Olieric V, Waltersperger S, Panepucci E, Chen L, Zhang H, Zhou D, Rose J, Ebihara A, Kuramitsu S, Li D, Howe N, Schnapp G, Pautsch A, Bargsten K, Prota AE, Surana P, Kottur J, Nair DT, Basilico F, Cecatiello V, Pasqualato S, Boland A, Weichenrieder O, Wang BC, Steinmetz MO, Caffrey M, Wang M. Fast native-SAD phasing for routine macromolecular structure determination. Nat Methods. 2015 Feb;12(2):131–133. PMID: 25506719
      2. Solis AD, Rackovsky SR. Fold homology detection using sequence fragment composition profiles of proteins. Proteins. 2010 Oct;78(13):2745–2756. PMCID: PMC2933786
      3. Brinkworth RI, Fairlie DP, Leung D, Young PR. Homology model of the dengue 2 virus NS3 protease: putative interactions with both substrate and NS2B cofactor. J Gen Virol. 1999 May;80(Pt 5):1167–1177. PMID: 10355763
      4. Aleshin AE, Shiryaev SA, Strongin AY, Liddington RC. Structural evidence for regulation and specificity of flaviviral proteases and evolution of the Flaviviridae fold. Protein Sci. 2007 May;16(5):795–806. PMCID: PMC2206648
      5. Phoo WW, Li Y, Zhang Z, Lee MY, Loh YR, Tan YB, Ng EY, Lescar J, Kang C, Luo D. Structure of the NS2B-NS3 protease from Zika virus after self-cleavage. Nat Commun. 2016 Nov 15;7:13410. PMCID: PMC5116066
      6. Zhang Z, Li Y, Loh YR, Phoo WW, Hung AW, Kang C, Luo D. Crystal structure of unlinked NS2B-NS3 protease from Zika virus. Science. 2016 Dec 23;354(6319):1597–1600. PMID: 27940580
      7. Luo D, Wei N, Doan DN, Paradkar PN, Chong Y, Davidson AD, Kotaka M, Lescar J, Vasudevan SG. Flexibility between the protease and helicase domains of the dengue virus NS3 protein conferred by the linker region and its functional implications. J Biol Chem. 2010 Jun 11;285(24):18817–18827. PMCID: PMC2881804
      8. Chernov AV, Shiryaev SA, Aleshin AE, Ratnikov BI, Smith JW, Liddington RC, Strongin AY. The two-component NS2B-NS3 proteinase represses DNA unwinding activity of the West Nile virus NS3 helicase. J Biol Chem. 2008 Jun 20;283(25):17270–17278. PMCID: PMC2427327
      9. Xu S, Ci Y, Wang L, Yang Y, Zhang L, Xu C, Qin C, Shi L. Zika virus NS3 is a canonical RNA helicase stimulated by NS5 RNA polymerase. Nucleic Acids Res. 2019 Sep 19;47(16):8693–8707. PMCID: PMC6895266
      10. Klema VJ, Padmanabhan R, Choi KH. Flaviviral Replication Complex: Coordination between RNA Synthesis and 5’-RNA Capping. Viruses. 2015 Aug 13;7(8):4640–4656. PMCID: PMC4576198
      11. Shiryaev SA, Aleshin AE, Muranaka N, Kukreja M, Routenberg DA, Remacle AG, Liddington RC, Cieplak P, Kozlov IA, Strongin AY. Structural and functional diversity of metalloproteinases encoded by the Bacteroides fragilis pathogenicity island. FEBS J. 2014 Jun;281(11):2487–2502. PMCID: PMC4047133
      12. Lee WHK, Liu W, Fan JS, Yang D. Dengue virus protease activity modulated by dynamics of protease cofactor. Biophys J. 2021 Jun 15;120(12):2444–2453. PMCID: PMC8390872
    1. My children live with an unconscious fear that they may not live out their natural lives. I am not saying that fear is good. I am trying to find a way to deal with that anxiety. An architecture that puts its head in the sand and goes back to neoclassicism, and Schinkel, Lutyens, and Ledoux, does not seem to be a way of dealing with the present anxiety. Most of what my colleagues are doing today does not seem to be the way to go. Equally, I do not believe that the way to go, as you suggest, is to put up structures to make people feel comfortable, to preclude that anxiety. What is a person to do if he cannot react against anxiety or see it pictured in his life? After all, that is what all those evil Struwwel Peter characters are for in German fairy tales. CA: Don't you think there is enough anxiety at present? Do you really think we need to manufacture more anxiety in the form of buildings?

      to manufacture more anxiety in the form of buildings

    1. Author Response:

      The following is the authors' response to the current reviews.

      Reviewer #1 (Public Review):

      This revised manuscript by Walker et al. addresses some of the editorial points and conceptual discussion, but in general, most of my suggestions (as the previous reviewer #1) for additional experimentation or addition were not addressed as discussed below. Therefore, my overall review has not changed.

      In our previous response, we included i) extra experimental data illustrating the reproducibility of our results and ii) added transcription start site data at the request of this reviewer. We included the information because we agreed with the reviewer that these were important points to address. For the points raised again below, we explained why the additional analysis was unlikely to add much in terms of insight or rigour. We have elaborated further below.   

      1) For example, in point 1, the suggested analysis was not performed because it is not trivial. My reason for making this suggestion is that the original manuscript was limited to Vibrio cholerae, and the impact of the manuscript would increase if the findings here were demonstrated to be more broadly applicable. I expect papers published in eLife to have such broad applicability. But no changes were made to the manuscript in this regard. The revised version is still limited to only Vibrio cholerae.

      Our paper is focused on the unexpected co-operative interactions between HapR and CRP. Such co-binding of two transcription factors to the same DNA site is unexpected. Consequently, it is this mode of DNA binding that is likely to be of broad interest. With this in mind, we did provide experimental, and bioinformatic, analyses for other regulatory regions and other vibrio species (Figures S3 and S6). This, in our view, is where the “broad applicability” for papers published in eLife comes from.

      The analysis the reviewer suggests is not related to the main message of our paper. Instead, the reviewer is asking how many HapR binding sites seen here by ChIP-seq are also seen in other vibrio species by ChIP-seq. This is only likely to be of interest to readers with an extremely specific interest in both vibrio species and HapR. The reviewer states above that we did not make the change “because it is not trivial”. This is an oversimplification of the rationale we presented in our response. The analysis is indeed not straightforward. However, much more importantly, the outcome is unlikely to be of interest to many readers, and has no bearing on the rigour of work. With this in mind, we do not think our position is unreasonable. We also stress that, should a reader with this very specific interest want to explore further, all of our data are freely available for them to do so.

      2) For point 2, the activity of FLAG-tag luxO could have been simply validated in a complementation assay. Yes, they demonstrated DNA binding, but that is not the only activity of LuxO.

      DNA binding by LuxO is the only activity of the protein with which we are concerned in our paper. Furthermore, LuxO is very much a side issue; we found binding to only the known targets and potentially, at very low levels, one additional target. No further LuxO experiments were done for this reason. Indeed, even if these data were removed completely, our conclusions would not change or be supported any less vigorously. We are happy to remove the LuxO data if the reviewer would prefer but this would seem like overkill.

      3) For point 7, the transcriptional fusions were not explored at different times or different media, which is also something that was hinted at by other reviewers. In regard to exploring expression at different time points, this seems particularly relevant for QS regulated genes.

      In their previous review, the reviewer did not request that such experiments were done. Similarly, no other reviewer requested these experiments. Instead, this reviewer i) commented that lacZ fusions were not as sensitive as luciferase fusions ii) asked if we had done any time point experiments. We agreed with the first point, whilst also noting that lacZ is not unusual to use as a reporter. For the second point, we responded that we had not done such experiments (which by the reviewer’s own logic would have been complicated using lacZ as a reporter). This seems like a perfectly reasonable way to respond.   

      We should stress that these comments all refer to Figure 2a, which was our initial screening of 23 promoter::lacZ fusions, supported by separate in vitro transcription assays. Only one of these fusions was followed up as the main story in the paper. Given that the other 22 fusions were not investigated further, and do not form part of the main story, there would seem little value in now going back to assay them at different time points.

      4) For point 13, the authors express that doing an additional CHIP-Seq is outside of the scope of this manuscript. Perhaps that is the case, but the point of the comment is to validate the in vitro binding results with an in vivo binding assay. A targeted CHIP-Seq approach specifically analyzing the promoters where cooperative binding was observed in vitro could have addressed this point.

      We did appreciate the original comment, and responded as such, but we do think additional ChIP-seq assays are outside the scope of this paper.

      Reviewer #2 (Public Review):

      This manuscript by Walker et al describes an elegant study that synergizes our knowledge of virulence gene regulation of Vibrio cholerae. The work brings a new element of regulation for CRP, notably that CRP and the high density regulator HapR co-occupy the same site on the DNA but modeling predicts they occupy different faces of the DNA. The DNA binding and structural modeling work is nicely conducted and data of co-occupation are convincing. The work seeks to integrate the findings into our current state of knowledge of HapR and CRP regulated genes at the transition from the environment and infection. The strength of the paper is the nice ChIP-seq analysis and the structural modeling and the integration of their work with other studies.

      We thank the reviewer for the positive comments.

      The weakness is that it is not clear how representative these data are of multiple hapR/CRP binding sites

      This comment does not consider all data in our paper. We did test our model experimentally at multiple HapR and CRP binding sites. These data are shown in Figure S6 and confirm the co-operative interaction between HapR and CRP at 4 of a further 5 shared binding sites tested. We also used bioinformatics to show the same juxtaposition of CRP and HapR sites in other vibrio species (Figure S3). Hence, the model seems representative of most sites shared by HapR and CRP.

      or how the work integrates as a whole with the entire transcriptome that would include genes discovered by others.

      At the request of the reviewers, our revision integrated our ChIP-seq data with dRNA-seq data. No other suggestions to integrate transcriptome data were made by the reviewers.

      Overall this is a solid work that provides an understanding of integrated gene regulation in response to multiple environmental cues.

      We thank the reviewer for the positive comment.

      —————

      The following is the authors' response to the original reviews.

      Reviewer #1 (Public Review):

      This manuscript by Walker et al. explores the interplay between the global regulators HapR (the QS master high cell density (HCD) regulator) and CRP. Using ChIP-Seq, the authors find that at several sites, the HapR and CRP binding sites overlap. A detailed exploration of the murPQ promoter finds that CRP binding promotes HapR binding, which leads to repression of murPQ. The authors have a comprehensive set of experiments that paints a nice story providing a mechanistic explanation for converging global regulation.

      We thank the reviewer for their positive evaluation.

      I did feel there are some weak points though, in particular the lack of integration of previously identified transcription start sites

      For completeness, we have now added the position and orientation of the nearest TSSs to each HapR or LuxO binding peak in Table 1 (based on Papenfort et al.).

      the lack of replication (at least replication presented in the manuscript) for many figures,

      We assume that the reviewer is referring to gel images rather than any other type of assay output (where error bars, derived from replicates, are shown). As is standard, we show representative gel images. All associated DNA binding and in vitro transcription experiments have been done multiple times. Indeed, comparison between figures reveals several instances of such replication (e.g. Figures 4b & 5d, Figures 4d & 5e). We have added details of repeats done to the methods section.

      some oddities in the growth curve

      We do not know why cells lacking hapR have a growth curve that appears biphasic. We can only assume that this is due to some regulatory effect of HapR, distinct from the murQP locus. Despite the unusual shape of the growth curve, the data are consistent with our conclusions.

      and not reexamining their HapR/CRP cooperative binding model in vivo using ChIP-Seq.

      We agree that these would be interesting experiments and, in the future, we may well do such work. Even without these data, our current model is well supported by the data presented (and the reviewer seems to agree with this above).

      Reviewer #2 (Public Review):

      This manuscript by Walker et al describes an elegant study that synergizes our knowledge of virulence gene regulation of Vibrio cholerae. The work brings a new element of regulation for CRP, notably that CRP and the high density regulator HapR co-occupy the same site on the DNA but modeling predicts they occupy different faces of the DNA. The DNA binding and structural modeling work is nicely conducted and data of co-occupation are convincing. The work could benefit from doing a better job in the manuscript preparation to integrate the findings into our current state of knowledge of HapR and CRP regulated genes and to elevate the impact of the work to address how bacteria are responding to the nutritional environment. Importantly, the focus of the work is heavily based on the impact of use of GlcNAc as a carbon source when bacteria bind to chitin in the environment, but absent the impact during infection when CRP and HapR have known roles. Further, the impact on biological events controlled by HapR integration with the utilization of carbon sources (including biofilm formation) is not explored.

      We thank the reviewer for their overall positive evaluation.

      The rigor and reproducibility of the work needs to be better conveyed.

      Reviewer 1 made a similar comment (see above) and we have modified the manuscript accordingly.

      Specific comments to address:

      1)  Abstract. A comment on the impact of this work should be included in the last sentence. Specifically, how the integration of CRP with QS for gene expression under specific environments impacts the lifestyle of Vc is needed. The discussion includes comments regarding the impact of CRP regulation as a sensor of carbon source and nutrition and these could be quickly summarized as part of the abstract.

      We have added an extra sentence. However, we have used cautious language as we do not show impacts on lifestyle (beyond MurNAc utilisation) directly. These can only be inferred.

      2)  Line 74. This paper examines the overlap of HapR with CRP, but ignores entirely AphA. HapR is repressed by Qrrs (downstream of LuxO-P) while AphA is activated by Qrrs. With LuxO activating AphA, it has a significant sized "regulon" of genes turned on at low density. It seems reasonable that there is a possibility of overlap also between CRP and AphA. While doing an AphA CHIP-seq is likely outside the scope of this work, some bioinformatic or simply a visual analysis of the promoters of known AphA regulated genes would be of interest to comment on with speculation in the discussion and/or supplement.

      In short, everything that the reviewer suggests here has already been done and was covered in our original submission (see text towards the end of the Discussion). Also, we would like to point the referee to our earlier publication (Haycocks et al. 2019. The quorum sensing transcription factor AphA directly regulates natural competence in Vibrio cholerae. PLoS Genet. 15:e1008362).

      3)  Line 100. Accordingly with the above statement, the focus here on HapR indicates that the focus is on gene expression via LuxO and HapR, at high density. Thus the sentence should read "we sought to map the binding of LuxO and HapR of V. cholerae genome at high density".

      Note that expression of LuxO and HapR is ectopic in these experiments (i.e. uncoupled from culture density).

      4)  Line 109. The identification of minor LuxO binding site in the intergenic region between VC1142 and VC1143 raises whether there may be a previously unrecognized sRNA here. As another panel in figure S1, can you provide a map of the intergenic region showing the start codons and putative -10 to -35 sites. Is there room here for an sRNA? Is there one known from the many sRNA predictions / identifications previously done? Some additional analysis would be helpful.

      We have added an extra panel to Figure S1 showing the position of TSSs relative to the location of LuxO binding. We have altered the main text to accommodate this addition.

      5)  Line 117. This sentence states that the CHIP seq analysis in this study includes previously identified HapR regulated genes, but does not reveal that many known HapR regulated genes are absent from Table 1 and thus were missed in this study. Of 24 HapR regulated genes investigated by Tsou et al, only 1 is found in Table 1 of this study. A few are commented in the discussion and Figure S7. It might be useful to add a Venn Diagram to Figure 1 (and list table in supplement) for results of Tsou et al, Waters et al, Lin et al, Nielson et al, and any others. A major question is whether the trend found here for genes identified by CHIP-seq in this study hold up across the entire HapR regulon. There should also be comments in the discussion on perhaps how different methods (including growth state and carbon sources of media) may have impacted the complexity of the regulon identified by the different authors and different methods.

      We have added a list of known sites to the supplementary material (new Table S1). We were unsure what was meant by the comment “A major question is whether the trend found here for genes identified by CHIP-seq in this study hold up across the entire HapR regulon”. We have added the extra comment to the discussion re growth conditions, also noting that most previous studies relied on in vitro, rather than in vivo, DNA binding assays.

      6)  The transcription data are generally well performed. In all figures, add comments to the figure legends that the experiments are representative gels from n=# (the number of replicate experiments for the gel based assays). Statements to the rigor of the work are currently missing.

      See responses above. We have added a comment on numbers of repeats to the methods section.

      7)  Line 357-360. The demonstration of lack of growth on MurNAc is nice for the impact of the work. However, more detailed comments are needed for M9 plus glucose for the uninformed reader to be reminded that growth in glucose is also impaired due to lack of cAMP in glucose replete conditions and thus minimal CRP is active. But why is this now dependent of hapR? A reminder also that in LB oligopeptides from tryptone are the main carbon source and thus CRP would be active.

      We find this point a little confusing and, maybe, two issues (murQP regulation, and growth in general) are being conflated. In particular, we do not understand the comment “growth in glucose is also impaired due to lack of cAMP in glucose replete conditions and thus minimal CRP is active”.

      Growth in glucose should indeed result in lower cAMP levels*, and hence less active CRP, but this does not impair growth. This is simply the cell’s strategy for using its preferred carbon source. If the reviewer were instead referring to some aspect of P_murQP_ regulation then yes, we would expect promoter activity to be lower because less active CRP would be available in the presence of glucose. The reviewer also comments “why is this now dependent of hapR?”. We assume that they are referring to some aspect of growth in minimal media with glucose. If so, the only hapR effect is the change in growth rate as cells enter mid-late log-phase (i.e. the growth curve looks somewhat biphasic). A similar effect is seen in all conditions. We do not know why this happens and can only conclude this is due to some unknown regulatory activity of HapR. Overall, the key point from these experiments is that loss of luxO, which results in constitutive hapR expression, lengthens lag phase only for growth with MurNAc as the sole carbon source.

      *Although in V. fischeri (PMID: 26062003) cAMP levels increase in the presence of glucose.

      8)  A great final experiment to demonstrate the model would have been to show co-localization of the promoter by CRP and HapR from bacteria grown in LB media but not in LB+glucose or in M9+glycerol and M9+MurNAc but not M9+glucose. This would enhance the model by linking more directly to the carbon sources (currently only indirect via growth curves)

      This is unlikely to be as straightforward as suggested. The sensitivity of CRP binding to growth conditions is not uniform across different binding sites. For instance, the CRP dependence of the E. coli melAB promoter is only evident in minimal media (PMID: 11742992) whilst the role of CRP at the acs promoter is evident in tryptone broth (PMID: 14651625). Similarly, as noted above, in Vibrio fischeri glucose causes an increase in cAMP levels (PMID: 26062003).

      9) Discussion. Comments and model focus heavily on GlcNAc-6P but HapR has a regulator role also during late infection (high density). How does CRP co-operativity impact during the in vivo conditions?

      We really can’t answer this question with any certainty; we have not done any infection experiments in this work.

      Does the Biphasic role of CRP play a role here (PMID: 20862321)?

      Again, we cannot answer this question with any confidence as experimentation would be required. However, the suggestion is certainly plausible.

      Reviewer #3 (Public Review):

      Bacteria sense and respond to multiple signals and cues to regulate gene expression. To define the complex network of signaling that ultimately controls transcription of many genes in cells requires an understanding of how multiple signaling systems can converge to effect gene expression and ensuing bacterial behaviors. The global transcription factor CRP has been studied for decades as a regulator of genes in response to glucose availability. Its direct and indirect effects on gene expression have been documented in E. coli and other bacteria, including pathogens such as Vibrio cholerae. Likewise, the master regulator of quorum sensing (QS), HapR, is a well-studied transcription factor that directly controls many genes in Vibrio cholerae and other Vibrios in response to autoinducer molecules that accumulate at high cell density. By contrast, low cell density gene expression is governed by another regulator AphA. It has not yet been described how HapR and CRP may together work to directly control transcription and what genes are under such direct dual control.

      We thank the reviewer for their assessment of our work.

      Using both in vivo methods with gene fusions to lacZ and in vitro transcription assays, the authors proceed to identify the smaller subset of genes whose transcription is directly repressed (7) and activated (2) by HapR. Prior work from this group identified the direct CRP binding sites in the V. cholerae genome as well as promoters with overlapping binding sites for AphA and CRP, thus it appears a logical extension of these prior studies is to explore here promoters for potential integration of HapR and CRP. This rationale was not included when CRP protein was introduced to the in vitro experiments.

      We understand the reviewer’s comment. However, the rationale for adding CRP was not that we had previously seen interplay between AphA and CRP (although this is a relevant discussion point, which we did make). Rather, we had noticed that there was an almost perfect CRP site perfectly positioned to activate PmurQP. Hence, CRP was added.

      Seven genes are found to be repressed by HapR in vivo, the promoter regions of only six are repressed in vitro with purified HapR protein alone. The authors propose and then present evidence that the seventh promoter, which controls murPQ, requires CRP to be repressed by HapR both using in vivo and in vitro methods. This is a critical insight that drives the rest of the manuscript's focus. The DNase protection assay conducted supports the emerging model that both CRP and HapR bind at the same region of the murPQ promoter, but interpretation is difficult due to the poor quality of the blot.

      There are areas of apparent protection at positions +1 to +15 that are not discussed, and the areas highlighted are difficult to observe with the blot provided.

      We disagree on this point. The region between +1 and +15 is inherently resistant to attack by DNAseI and there are only ever very weak bands in this region (lane 1). Other than seeing small fluctuations in overall lane intensity (e.g. lanes 7-12 have a slightly lower signal throughout) the +1 to +15 banding pattern does not change. Conversely, there are dramatic changes in the banding pattern between around -30 and -60 (again, compare lane 1 to all other lanes). That CRP and HapR bind the same region is extremely clear. Also note that this is backed up by mutagenesis of the shared binding site (Figure 4c).

      The model proposed at the end of the manuscript proposes physiological changes in cells that occur at transitions from the low to high cell density. Experiments in the paper that could strengthen this argument are incomplete. For example, in Fig. 4e it is unclear at what cell density the experiment is conducted.

      Such details have been added to the figure legends and methods section.

      The results with the wild type strain are intermediate relative to the other strains tested.

      This is correct, and exactly what we would expect to see based on our model.

      Cell density should affect the result here since HapR is produced at high density but not low density. This experiment would provide important additional insights supporting their model, by measuring activity at both cell densities and also in a luxO mutant locked at the high cell density. Conducting this experiment in conditions lacking and containing glucose would also reveal whether high glucose conditions mimic the crp results.

      We agree with this idea in principle but note that the output from our reporter gene, β-galactosidase, is stable within cells and tends to accumulate. This is likely to obscure the reduction in expression as cells transition from low to high cell density. Since we have demonstrated the regulatory effects of HapR and CRP both in vivo using gene knockouts, and in vitro with purified proteins, we think that our overall model is very well supported. Further experimental additions may provide an incremental advance but will not alter our overall story. Also note the unexpected increase in intracellular cAMP due to addition of glucose, in Vibrio fischeri (PMID: 26062003).

      Throughout the paper it was challenging to account for the number of genes selected, the rationale for their selection, and how they were prioritized. For example, the authors acknowledged toward the end of the Results section that in their prior work, CRP and HapR binding sites were identified (line 321-22).

      This is not quite what we say, and maybe the reviewer misunderstood, which is our fault. The prior work identified CRP sites whilst the current work identified HapR sites. We have made a slight alteration to the text to avoid confusion.

      It is unclear whether the loci indicated in Table 1 are all from this prior study. It would be useful to denote in this table the seven genes characterized in Figure 2 and to provide the locus tag for murPQ.

      Again, we are unsure if we have confused the reviewer. The results in Table 1 are all HapR sites from the current work, not a prior study. However, some of these also correspond to CRP binding regions found in prior work.

      The reviewer mentions “the seven genes characterised in Figure 2” but 23 targets were characterised in Figure 2a and 9 in Figure 2b. The “VC” numbers used in Figure 2 are the same as used in Table 1 so it is easy to cross reference between the two. We have added a footnote to Table 1, also referred to in the Figure 2 legend, to allow cross referencing between gene names and locus tags (including for murQP and hapR).

      Of the 32 loci shown in Table 1, five were selected for further study using EMSA (line 322), but no rationale is given for studying these five and not others in the table.

      This is not quite correct, we did not select 5 from the 32 targets listed in Table 1. We selected 5 targets from Table 1 that were also targets for CRP in our prior paper. This was the rationale.

      Since prior work identified a consensus CRP binding motif, the authors identify the DNA sequence to which HapR binds overlaps with a sequence also predicted to bind CRP. Genome analysis identified a total of seven sites where the CRP and HapR binding sites were offset by one nucleotide as seen with murPQ. Lines 327-8 describe EMSA results with several of these DNA sequences but provide no data to support this statement. Are these loci in Table 1?

      This comment is a little difficult to follow, and we may have misunderstood, but we think that the reviewer is asking where the EMSA data referred to on lines 327-328 resides. We can see that the text could be confusing in this regard. We had referred to the relevant figure (Figure S6) on line 324 but did not again include this information further down in the description of the result. We have changed the text accordingly.

      Using structural models, the authors predict that HapR repression requires protein-protein interactions with CRP. Electromobility shift assays (EMSA) with purified promoter DNA, CRP and HapR (Fig 5d) and in vitro transcription using purified RNAP with these factors (Figure 5e) support this hypothesis. However, the model purports that HapR "bound tightly" and that it also had a "lower affinity" when CRP protein was used that had mutations in a putative interaction interface. These claims can be bolstered if the authors calculate the dissociation constant (Kd) value of each protein to the DNA. This provides a quantitative assessment of the binding properties of the proteins.

      The reviewer is correct that we do not explicitly provide a Kd. However, in both Figures 5d and 5e, we do provide very similar quantification. In 5d, our quantification is the % of the CRP-DNA complex bound by HapR (using either wild type or E55A CRP). Since the % of DNA bound is shown, and the protein concentrations are provided in the figure legend, information regarding Kd is essentially already present. In 5e, we show the % of maximal promoter activity. This is a reasonable way of quantifying the result. Furthermore, Kd is not a metric we can measure directly in this experiment, which is not a DNA binding assay.
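
      To make explicit how a percentage of bound complex relates to a dissociation constant, the sketch below rearranges the standard single-site binding isotherm, fraction bound = [P] / (Kd + [P]). It is only an illustration and rests on assumptions that are not established in the paper: a simple 1:1 equilibrium, and HapR in large excess over the CRP-DNA complex so that free and total HapR are approximately equal. The function name and the concentrations in the loop are hypothetical, not values from Figure 5d.

```python
def apparent_kd_nM(protein_nM, fraction_bound):
    """Estimate an apparent Kd (nM) from one titration point.

    Assumes single-site binding with protein in large excess, so that
    fraction_bound = P / (Kd + P), which rearranges to
    Kd = P * (1 - fraction_bound) / fraction_bound.
    """
    if not 0.0 < fraction_bound < 1.0:
        raise ValueError("fraction_bound must lie strictly between 0 and 1")
    return protein_nM * (1.0 - fraction_bound) / fraction_bound


# Hypothetical titration points (protein concentration in nM, fraction bound),
# chosen so that every pair gives an apparent Kd of 300 nM.
for protein_nM, fraction_bound in [(100, 0.25), (300, 0.50), (900, 0.75)]:
    kd = apparent_kd_nM(protein_nM, fraction_bound)
    print(f"{protein_nM} nM protein, {fraction_bound:.0%} bound: apparent Kd = {kd:.0f} nM")
```

      Under these assumptions, the protein concentration at which half of the complex is bound approximates the apparent Kd, which is why reporting % bound alongside protein concentrations conveys similar information.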

      The concentrations of each protein are not indicated in panels of the in vitro analysis, but only the geometric shapes denoting increasing protein levels.

      The protein concentrations are all provided in the figure legend. It is usual to indicate relative concentrations in the body of the figure using geometric shapes.

      Panel 5e appears to indicate that an intermediate level of CRP was used in the presence of HapR, which presumably coincides with levels used in lane 4, but rationale is not provided.

      There was no particular rationale for this, it was simply a reasonable way to do the experiment.

      How well the levels of protein used in vitro compare to levels observed in vivo is not mentioned.

      The protein concentrations that we use (in the nM to low μM range) are very typical for this type of work and consistent with hundreds of prior studies of protein-DNA interactions. The general rule of thumb is that 1000 molecules of a protein per bacterial cell equates to a concentration of around 1 μM. However, molecular crowding is likely to increase the effective concentration. Additionally, the DNA concentration is higher in vitro.
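
      As a rough check of this rule of thumb, the sketch below converts a copy number per cell into a molar concentration. The cell volume of about 1 fL is an assumption (a commonly quoted order of magnitude for a small rod-shaped bacterium), not a value measured in this work, and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_to_micromolar(n_molecules, cell_volume_litres=1e-15):
    """Convert molecules per cell to a concentration in micromolar.

    cell_volume_litres defaults to 1 fL, an assumed order-of-magnitude
    volume for a bacterial cell.
    """
    moles = n_molecules / AVOGADRO
    molar = moles / cell_volume_litres
    return molar * 1e6


print(f"{molecules_to_micromolar(1000):.2f} uM")  # prints "1.66 uM"
```

      With the assumed volume, 1000 molecules correspond to roughly 1.7 μM, in line with the "around 1 μM" estimate; a larger cell volume would lower the figure proportionally.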

      The authors are commended for seeking to connect the in vitro and in vivo results obtained under lab conditions with conditions experienced by V. cholerae in niches it may occupy, such as aquatic systems. The authors briefly review the role of MurPQ in recycling of the cell wall of V. cholerae by degrading MurNAc into GlcNAc, although no references are provided (lines 146-50). Based on this physiology and results reported, the authors propose that murPQ gene expression by these two signal transduction pathways has relevance in the environment, where Vibrios, including V. cholerae, form biofilms on exoskeleton composed of GlcNAc.

      We have added a citation to the section mentioned.

      The conclusions of that work are supported by the Results presented but additional details in the text regarding the characteristics of the proteins used (Kd, concentrations) would strengthen the conclusions drawn. This work provides a roadmap for the methods and analysis required to develop the regulatory networks that converge to control gene expression in microbes. The study has the potential to inform beyond the sub-field of Vibrios, QS and CRP regulation.

      As noted above, quantification essentially equivalent to Kd is already shown (% of bound substrate is indicated in figures and all protein concentrations are given in the figure legends).

      Reviewer #1 (Recommendations For The Authors):

      1.  As similar experiments have been performed in other Vibrios, it would be interesting to do a more detailed analysis of the similarities and differences between the species. Perhaps a Venn diagram showing how many targets were found in all studies versus how many are species specific.

      We appreciate this suggestion but would prefer not to make this change. A cross-species analysis would be very time consuming and is not trivial. The presence and absence of each target gene, for all combinations of organisms, would first need to be determined. Then, the presence and absence of binding signals for HapR, or its equivalent, would need to be determined taking this into account. For most readers, we feel that this analysis is unlikely to add much to the overall story. Given the amount of effort involved, this seems a “non-essential” change to make.

      2.  Line 101-Are the FLAG tagged versions of LuxO and HapR completely functional? Can they complement a luxO or hapR deletion mutant?

      The activity of FLAG tagged HapR (LuxR in other Vibrio species) has been shown previously (e.g. PMIDs 33693882 and 23839217). Similarly, N-terminal HapR tags are routinely used for affinity purification of the protein without ill effect. We have not tested LuxO-3xFLAG for “full” activity, though this fusion is clearly active for DNA binding, the only activity that we have measured here, since all known targets are pulled down.

      3.  Line 106-As the authors state later that there are additional smaller peaks for HapR that could be other direct targets, I think a brief mention of the methodology used to determine the cutoff for the 5 and 32 peaks for LuxO and HapR, respectively, would be informative here.

      We have added a little more text to the methods section. The added text states “Note that our cut-off was selected to identify only completely unambiguous binding peaks. Hence, weak or less reproducible binding signals, even if representing known targets, were excluded (see Discussion for further details)”.

      4.  Line 118-Need a reference here to the prior HapR binding site.

      This has been added.

      5.  Figs. 1e-What do the numbers on the x-axis refer to? Why not just present these data as bases? The authors also refer to distance to the nearest start codon, but this is irrelevant for 4/5 of the luxO targets as they are sRNAs. They should really refer to the distance to the transcription start site. Likewise, for HapR, distance to the nearest start codon is not as informative as distance to the nearest transcription start site. A recent paper used transcriptomics to map all the transcription start sites of V. cholerae, and these results should be integrated into the author's study rather than just using the nearest start codon (PMID: 25646441).

      The numbers are kilo base pairs, this has been added to the axis label. We have also changed “start codon” to “gene start” (since “gene start” is also suitable for genes that encode untranslated RNAs).

      Re comparing binding peak positions to transcription start sites (TSSs) rather than gene starts, this analysis would be useful if TSSs could be detected for all genes. However, some genes are not expressed under the conditions tested by PMID: 25646441, so no TSS is found. Consequently, for HapR or LuxO bound at such locations, we would not be able to calculate a meaningful position relative to the TSS. We stress that the point of the analysis is to determine how peaks are positioned with respect to genes (i.e. that sites cluster near gene 5’ ends). Also note that nearest TSSs are now shown in the revised Table 1. In some cases, these are unlikely to be the TSS actually subject to regulation (e.g. because the regulated gene is switched off).
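      The positional analysis discussed in this response (the distance from each binding peak to the nearest gene start, reported in kilobase pairs) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name and the coordinates are hypothetical, and the sketch assumes a linear coordinate system and ignores strand and chromosome circularity.

```python
from bisect import bisect_left


def nearest_gene_start_distance(peak_centre, sorted_gene_starts):
    """Absolute distance (bp) from a peak centre to the closest gene start.

    sorted_gene_starts must be sorted in ascending order.
    Strand and chromosome circularity are ignored (simplifying assumptions).
    """
    if not sorted_gene_starts:
        raise ValueError("at least one gene start is required")
    # Index of the first gene start at or after the peak centre.
    i = bisect_left(sorted_gene_starts, peak_centre)
    # Only the neighbours on either side of that index can be the nearest.
    candidates = sorted_gene_starts[max(i - 1, 0):i + 1]
    return min(abs(peak_centre - start) for start in candidates)


# Hypothetical coordinates, for illustration only.
gene_starts = [1000, 5000, 9000]
distance_bp = nearest_gene_start_distance(5200, gene_starts)
distance_kb = distance_bp / 1000  # axis units used in Figure 1e are kb
```

      The same function could be applied to a list of TSS coordinates instead of gene starts, with the caveat raised above that a TSS is not available for every gene.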

      6.  Fig. 1e-Is there directionality to the site? In other words, if a HapR binding site is located between two genes that are transcribed in opposite directions, is there a way to predict which gene is regulated? It looks like this might be the case with the list presented in Table 1, but how such determination is made and what the various symbol in Table 1 mean are not clear to me. This also has ramifications for Fig. 2a as the direction to construct the fusion is critical for the experiment.

      The site is a palindrome so lacks directionality. The best prediction re regulation is likely to be positioning with respect to the nearest TSS (which is now included in Table 1). However, this would remain just a prediction and, where TSSs are in odd locations with respect to binding sites (taking into account the caveats above) predictions would be unreliable.

      We are unsure which symbol the reviewer refers to in Table 1, a full explanation of any symbols used is provided in the table footnotes.

      With respect to Figure 2a, if sites were between divergent genes, and met our other criteria, we tested for regulation in both directions. For example, see the divergent genes VCA0662 (classified inactive) and VCA0663 (classified repressed).

      7.  Fig. 2a-It is a little disappointing that the authors use LacZ fusions to measure transcription as this is not the most sensitive reporter gene. Luciferase gene fusions would have been much more sensitive. Also, did the authors examine multiple time points? The methods only describe "mid-log phase" but some of the inactive promoters could be expressed at other time points. Also, it would be simple to repeat this experiment in different media, such as minimal with glucose or another non-CRP carbon source, to expand which promoters are expressed.

      The reviewer is correct regarding the sensitivity of β-galactosidase, which is very stable and so accumulates as cells grow. Even so, this reporter has been used very successfully, across thousands of studies, for decades. We did not examine multiple timepoints. We appreciate that the 23 promoter::lacZ fusions could be re-examined using varying growth conditions but this is unlikely to impact the overall conclusions, though it could generate some new leads for future work.

      8.  Fig. 2a legend-typos

      This has been corrected.

      9.  Line 138-I think you mean Fig. 2a here.

      This has been corrected.

      10.  Fig. 2b and many additional figures quantify band intensity but do not show any replication or error. Therefore, it is impossible to gauge reproducibility of these experiments.

      We have added a reproducibility statement (all experiments were done multiple times with similar results) as is standard throughout the literature. Also note that there is a lot of internal replication between figures. Figure 4d and Figure 5e lanes 1-9 show essentially the same experiment (albeit with slightly different protein concentrations) and very similar results. To the same effect, Figure 5e lanes 10-18 and lanes 19-27 show the same experiment for two different mutations of the same CRP residue. Again, the results are very similar. Also see the response to your comment 15 below.

      11.  Fig. 4a-lanes 2-4-the footprint does not change with additional CRP. In other words, it looks the same at the lowest concentration of CRP versus the highest concentration of CRP. The footprints for HapR look similar. This is somewhat troubling as in these types of experiments one would like to observe a dose dependent change in the footprint correlating with more DNA occupancy.

      For CRP we agree but are not concerned at all by this. The site is simply fully occupied at the lowest protein concentration tested. Given that the footprint exactly coincides with a near consensus CRP site (which, when mutated, abolishes CRP binding in EMSAs, and regulation by CRP in vivo) all our results are perfectly consistent. Note that i) our only aim in this experiment was to determine the positions of CRP and HapR binding ii) our conclusions are independently backed up using gel shifts and by making promoter mutations. With respect to HapR, there are changes at the periphery of the main footprint.

      12.  Fig. 4e-Why does the transcriptional activation of murQP decrease with increasing concentrations of CRP? This is also seen in Fig. 5e.

      In our experience, this often does happen when doing in vitro transcription assays (with CRP and many other activators). The anecdotal explanation is that, at higher concentrations, the regulator can start to bind the DNA non-specifically and so interfere with transcription.

      13. The authors demonstrate in vitro that HapR requires binding of CRP to bind the murQP promoter. It would strengthen their model if they demonstrated this in vivo. To do this, the authors only need to repeat their ChIP-Seq experiment in a delta CRP mutant and the HapR signal at murQP would be lost. In fact, such an experiment would experimentally confirm which of the in vivo HapR binding sites are CRP dependent.

      We agree, appreciate the comment, and do plan to do such experiments in the future as a wider assessment of interactions between transcription factors. However, doing this does have substantial time and resource implications that we cannot devote to the project at present. We feel that our overall conclusions, regarding co-operative interactions between HapR and CRP at PmurQP, are well supported by the data already provided. This also seems the overall opinion of the reviewers.

      14.  Fig. 5b-I am confused by the Venn diagram. The text states that "In all cases, the CRP and HapR targets were offset by 1 bp", but the diagram only shows 7 overlapping sites. The authors need to better describe these data.

      We mean that, in all cases where sites overlap, sites are offset by 1 bp (i.e. we didn’t find any sites overlapping but offset by 2, 3, 4 bp etc).

      15. Line 287-288 and Fig. 5d-The authors state that HapR binds with less affinity to the CRP E55A mutant protein bound to DNA. There does seem to be a difference in the amount of shifted bands at the equivalent concentrations of HapR, but the difference is subtle. In order to make such a conclusion, the authors should show replication of the data and calculate the variability in the results. The authors should also use these data to determine the actual binding affinities of HapR to WT and the E55A mutant CRP, along with error or confidence intervals.

      All of these experiments have been run multiple times and we are absolutely confident of the result. With respect to Figure 5d, this was done many times. We note that not all experiments were exact repeats. E.g. some of the first attempts had fewer HapR concentrations. Even so, the defect in HapR binding to the CRP E55A complex was always evident. The two gels to the left show the final two iterations of this experiment (these are exact repeats). The top image is that shown in Figure 5d. The lower image is an equivalent experiment run a day or so previously. Both clearly show a defect in HapR binding to the CRP E55A complex. We appreciate that our conclusion re these experiments is somewhat qualitative (i.e. that HapR binds the CRP E55A complex less readily) but this is not out of kilter with the vast majority of similar literature and our results are clearly reproducible.

      16.  Fig. 6a-It is odd that the locked low cell density mutants have such a growth defect in MurNAc, minimal glucose, and LB. To my knowledge, such a growth defect is not common with these strains. Perhaps this has to do with the specific growth conditions used here, but I can't find that information in the manuscript (it should be there). Furthermore, the growth rate of the luxO and hapR mutants appears to be similar up to the branch point (i.e. slope of the curve), but the lag phase of the luxO mutant is much longer. The authors need to address these issues in relationship to previous published literature and specify their growth conditions because the results are not consistent with their simple model described in Fig 6b.

      This comment is a little difficult to pick apart as it covers several different issues. We’ll try and answer these individually.

      a)     The unusual “biphasic” growth curve with hapR and hapRluxO cells: We do not know why cells lacking hapR have a growth curve that appears biphasic. We can only assume that this is due to some regulatory effect of HapR, distinct from the murQP locus. Despite the unusual shape of the growth curve, the data are consistent with our conclusions.

      b)     The extended lag phase of the luxO mutant in minimal media + MurNAc: We appreciate this comment and had considered possible explanations prior to submission. In the end, we left out this speculation but are happy to include it as part of our response. The extended lag phase might be expected if CRP/HapR regulation is largely critical for controlling the basal transcription of murQP. The locus is likely also regulated by the upstream repressor MurR (VC0204) as in E. coli. So, if derepression of MurR overwhelms the effect of HapR on murQP, we think you would expect that once the cells start growing on MurNAc, the growth rates are unchanged. But the extended lag is due to the fact that it took longer for those cells to achieve the critical threshold of intracellular MurNAc-6-P necessary to drive murR derepression. Obviously, we cannot provide a definitive answer.

      c)     We have added further details regarding growth conditions to the methods section and the Figure 6a legend.

      17.  Fig. S6-The data to this point with murPQ suggested a model in which CRP binding then enabled HapR binding. But these EMSAs suggest that both situations occur as in some cases, such as VCA0691, HapR binding promotes CRP binding. How does such a result fit with the structural model presented in Fig. 5?

      This is to be expected and is fully consistent with the model. Cooperativity is a two-way street, and each protein will stabilise binding of the other. Clearly, it will not always be the case that the shared DNA site will have a higher affinity for CRP than HapR (as at PmurQP). Depending on the shared site sequence, it is expected that sometimes HapR will bind “first” and then stabilise binding of CRP.

      18. Line 354-356-The HCD state of V. cholerae occurs in mid-exponential phase and several cell divisions occur before stationary phase and the cessation of growth, at least in normal laboratory conditions. Therefore, there is not support for the argument that QS is a mechanism to redirect cell wall components at HCD because cell wall synthesis is no longer needed.

      We did not intend to suggest cell wall synthesis is not needed at all, rather that there is a reduced need. We made a slight change to the discussion to reflect this.

      19. Line 357-360-Again, as stated in point 16, the statement that cells locked in the HCD are "defective for growth" is an oversimplification. The luxO mutants have a longer lag phase, but they actually outgrow the hapR mutants at higher cell densities and reach the maximum yield much faster.

      In fairness, we do go on to specify that the defect is an extended lag phase. Also see our response above.

      Reviewer #2 (Recommendations For The Authors):

      Comments to improve the text

      1)  Line 103-106, line 130, line 136, etc. Details of the methods and the text directing to presentations of figures should be in the methods and/or figure legends with (Figure x) in citation after the statement. The sentences in lines indicated can be deleted from the results. Although several lines are noted specifically here, this comment should be applied throughout the entire results section.

      We appreciate this comment but would prefer not to make this change (it seems mainly an issue of personal stylistic choice). It is sometimes helpful for the reader to include such information as it avoids them having to cross reference between different parts of the manuscript.

      2)  Line 115. Recommend a paragraph between content on LuxO and HapR (before "Of the 32 peaks for HapR binding")

      We agree and have made this change.

      3)  Line 138 and Figure 1a. I am not convinced this gel shows that VC1375 is activated by HapR. Is the arrow pointing to the wrong band? There does seem to be an induced band lower down.

      We understand this comment as it is a little difficult to see the induced band. This is because this is a compressed area of the gel and the transcript is near to an additional band.

      4)  Line 147. Add the VC0206-VC0207 next to murQP (and the gene name murQP into Table 1).

      We have added the gene name to the figure foot note. The text has been changed as requested.

      5) Methods. It is essential for this paper to have detailed methods on the bacterial growth conditions. Referring to prior paper, bacteria were grown in LB (add composition...is this high salt LB often used for vibrios or low salt LB often used for E. coli). Growth is to "mid log". Please provide the OD at collection. Is mid log really considered "high density"? Provide a reference regarding HapR activity at mid log to support the method. Could the earlier collection of bacteria account for missing known HapR regulated genes? In preparing the requested Venn diagram, include growth conditions for other experiments in the legends.

      Note that we have included a new supplementary table, rather than a Venn diagram. We have also added further details of growth conditions as mentioned above. Also note that, for the ChIP-seq, HapR and LuxO were expressed ectopically and so uncoupled from the switch between low and high cell density.

      6)  Content of Table 1, HapR ChIP-seq peaks, needs to be closely double checked against the collected data as there seem to be some errors. Specifically, VC0880 and VC0882 listed under Chromosome I are most likely VCA0880 (MakD) and VCA0882 (MakB), both known HapR induced genes on Chromosome II with VCA0880 previously validated by EMSA. This notable error suggests the table may have other errors and thus requires a very detailed check to assure its accuracy.

      We appreciate the attention to detail! We have double checked, thankfully this is not an error, the table is correct (even so, we have also checked all other entries in the table). As an aside, VCA0880 is one of the locations for which we see a weak HapR binding signal below our cut-off (included in the new Table S1). In cross checking between Table 1 and all other data in the paper we noticed that we had erroneously included assay data for VC0620 in Figure 2A. This was not one of our ChIP-seq targets but had been assayed at the same time several years ago. This datapoint, which wasn’t related to any other part of the manuscript, has been removed.

      If VCA0880 and VCA0882 are correctly placed on Chr. I, then add comment to text that the Mak toxin genomic island found on Chromosome II in N16961 is on Chr. I in E7946. (See recent references PMID: 30271941, 35435721, 36194176, 34799450).

      See above, this is not an error.

      7)  Alternatively for both comments 8 & 9, are these problems of present/missing genes or misannotations the result of the annotation of E7946 gene names not aligning with gene names of N16961? (if so, in Table 1, please give the gene name as in E7946 but include a separate column with the N16961 name for cross study comparison)

      See above and below, this is not an issue.

      8)  Line 126-127. Also regarding Table 1, please add a column with function gene annotation. For example, VC0916 needs to be identified as vpsU. If function is unknown, type unknown in the column. This will help validate the approach of selecting "HapR target promoters where adjacent coding sequence could be used to predict protein function."

      We added an extra column to Table 1 in response to a separate reviewer request (TSS locations). This leaves no space for any additional columns. Instead, to accommodate the reviewer’s request, we have added alternative gene names to the footnote.

      Not following up on VCA0880 (promoter for the mak operon) is a sad missed opportunity here as it is one of the most strongly upregulated genes by HapR (PMC2677876)

      As noted above, this was not an error and VCA0880 was not one of our 32 HapR targets. As such, we would not have followed this up.

      9)  Figure Legends. Add a unit to the bar graphs in Figure 1e (should be kb??)

      This has been corrected.

      10) The yellow color text labels in figures 3c, 4a, 4c are difficult to read. Can you use an alternative darker color for CRP.

      We have made this slightly darker (although to our eye it is easily readable). We haven’t changed the colour too much, for consistency with colour coding elsewhere.

      11) Figure S3. Binding is misspelled. Add units to the x-axis

      This has been corrected.

      12) Figure S7. The text in this figure is too small to read. Figure could be enlarged to full page or text enlarged. Are these 4 the only other known regulated promoters? Could all the known alternative promoters linked to HapR be similarly probed?

      We have increased the font size and included a new Table S1 for all previously proposed HapR sites.

      13) Figure S8. Original images..are any of these the replicate gels (see public comment 6)

      We have added a statement regarding reproducibility, and also note the internal reproducibility between different figures in our reviewer response. The gels in Figure S8 are full uncropped versions of those shown in the main figures.

      Reviewer #3 (Recommendations For The Authors):

      None

      Whilst there were no specific recommendations from this reviewer, we have still responded to the public review and made changes if required.

  3. May 2023
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Dear Editor and reviewers,

      Thank you very much for the thorough assessment of our manuscript. We have carefully considered the comments and reflected most of them in the new version. We recognized the need to shorten and clarify the manuscript. Therefore, we have omitted particularly the less important passages concerning metabolism and the loss of genes encoding mitochondrial proteins, which cut the text by six pages in the current layout. We have also removed the text relating this model to eukaryogenesis. Finally, we have slightly changed the structure and linked the different sections to improve the flow of the story and to emphasize the key messages, which are the absence of mitochondria in a large proportion of oxymonads and the impact of this loss, loss of Golgi stacking and transformation to endobiotic lifestyle on selected gene inventories. We hope the manuscript is now clear and more concise and will be of interest to a broad readership interested in the evolution of eukaryotes, mitochondria and protists.

      1. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity):

      This is a very interesting paper that investigates through detailed comparative genomics the tempo and mode of the evolution of microbial eukaryotes/protists members of the Metamonada with a focus on Preaxostyla, currently the only known lineage among eukaryotes to have species that have lost, by all accounts, the mitochondria organelle all together. Notably, it includes a free-living representative of the lineage allowing potential interesting comparison between lifestyles among the Preaxostyla. This is a generally nicely crafted manuscript that presents well supported conclusions based on good quality genome sequence assemblies and careful annotations. The manuscript presents in particular (i) additional evidence for the common role of LGT from various bacterial sources into eukaryotic lineages and (ii) more details on the transition from a free-living lifestyle to an endobiotic one and (iii) the related evolution of MROs and associated metabolism.

      Thank you very much for the positive assessment.

      I have some comments to improve a few details:

      In the introduction, lines 42-43, the last sentence should be more conservative by replacing "whole Oxymonadida" with "...all known/investigated Oxymonadida".

      The sentence has been changed to: "Our results provide insights into the metabolic and endomembrane evolution, but most strikingly the data confirm the complete loss of mitochondria and every protein that has ever participated in the mitochondrion function for all three oxymonad species (M. exilis, B. nauphoetae, and Streblomastix strix) extending the amitochondriate status to all investigated Oxymonadida."

      Similarly on line 62, the sentence could state "... contain 140 described...".

      The sentence has been changed to: "Oxymonadida contain approximately 140 described species of morphologically divergent and diverse flagellates exclusively inhabiting digestive tracts of metazoans, of which none has been shown to possess a mitochondrion by cytological investigations (Hampl 2017)."

      When the estimated completeness of the genomes is discussed (lines 117-120) and contrasted with the values for Trypanosoma brucei and other genomes, the authors should explicitly state that these genomes are considered complete, which seems to be what they imply. Is that the case? If so, please provide more details to support this idea.

      We have elaborated on this part also in reaction to comments of other reviewers. The text now reads: "It should be noted that, despite their wide usage, BUSCO values are not expected to reach 100% in lineages distant from model eukaryotes simply due to the true absence (or high sequence divergence) of some of the assessed marker genes. For example, various Euglenozoa representatives with highly complete genome sequences, including Trypanosoma brucei, have BUSCO completeness estimates in the range of 71-88% (Butenko et al. 2020), and representatives of Metamonada fall within the range of 60-91% (Salas-Leiva et al. 2021). Specifically in the case of oxymonad M. exilis, the improvement of the genome assembly using long-read resequencing from 2092 scaffolds to 101 contigs led to only a marginal increase of BUSCO value from 75.3 to 77.5 (Treitli et al. 2021). "

      Also please see the detailed table prepared in response to reviewers 2 and 3 summarizing the presence/absence of genes from BUSCO set in the selected representatives of Metamonada and Trypanosoma brucei. The table is commented in the answer to Reviewer 3 comment (page 18)

      The supplementary file named "132671_0_supp_2540708_rmsn23" is listed as a Table SX? (note: I found it rather difficult to establish exactly what file corresponds to what document referred in the main text)

      We apologize for this mistake. We have checked and corrected references to tables, figures and supplementary material throughout the manuscript and hope it now does not contain any errors.

      Lines 243-245, where 46 LGTs are discussed, it is relevant that the authors investigate their functional annotations. Indeed, it is suggested that these could have adaptive values, hence investigating their functional annotation will allow the authors to comment on this possibility in more details and precision. When discussing LGTs it would also be very useful to cite relevant reviews on the topic - covering their origins, functional relevance when known, distribution among eukaryotes. This is done when discussing the evolution and characteristics of MROs but not when discussing LGTs, with several reviews cited and integrated in the discussion of the data and their interpretation.

      Available annotations of all putative LGT genes are provided in Supplementary_file_3 and also in the Supplementary_file_6 if the gene belongs to a manually annotated cellular system. Although we agree with the reviewer that the discussion of 46 species-specific LGTs might be interesting, for the sake of conciseness and brevity of the manuscript, we have decided not to expand the discussion further. However, note that we discuss selected cases of P. pyriformis-specific LGTs in the part “P. pyriformis possesses unexpected metabolic capacities” which follows right after the lines reviewer is referring to.

      The sentence, lines 263-265, where the distribution of some LGTs is discussed, needs to be made more precise. When using the word "close", do the authors refer to shared/similar habitats, or something else? Entamoeba is not a close relative of the other listed taxa.

      The “close relatives” mentioned in the text were meant as close relatives of all p-cresol-synthesizing taxa discussed in the paragraph, including Mastigamoeba, i.e. a specific relative of Entamoeba. We have modified the text such as to make the intended meaning easier to follow.

      Lines 346-348, that sentence needs to end with a citation (e.g. Carlton et al. 2007).

      The citation proposed by the reviewer has been added. The sentence was changed to: " The most gene-rich group of membrane transporters identified in Preaxostyla is the ATP-binding cassette (ABC) superfamily represented by MRP and pATPase families, just like in T. vaginalis (Carlton et al. 2007). "

      In the paragraph (line 580-585) discussing ATP transporters, note that Major et al. (2017) did not describe NTTs but distantly related members of MSF transporter, shared across a broader range of organisms than the NTTs. Did the authors check if the genome of interest encoded homologues of these transporters too?

      The citation has been removed; we admit that it was not the most appropriate one in the given context. Concerning the NTT-like transporters, encouraged by the reviewer we searched for them in the Preaxostyla genome and transcriptome assemblies and found no candidates. This is not explicitly stated in the revised manuscript. The paragraph now reads: “MROs export or import ATP and other metabolites typically using transporters from the mitochondrial carrier family (MCF) or sporadically by the bacterial-type (NTT-like) nucleotide transporters (Tsaousis et al. 2008). We did not identify any homolog of genes encoding proteins from these two families in any of the three oxymonads investigated. In contrast, MCF carriers, but not NTT-like nucleotide transporters, were recovered in the number of four for each P. pyriformis and T. marina (Supplementary file 6).”

      Line 920-921, I don't understand how the number 30 relates to "guarantee" inferring the directionality of LGTs events. This will be very much dataset dependent, 100 sequences might still not allow to infer directionality of LGT events. The authors probably meant to "increase the possibility to infer directionality".

      We agree that the original wording was not particularly fortunate, so the sentence has been changed to: "Files with 30 sequences or fewer were discarded, as the chance directionality of the transfer can be determined with any confidence is low when the gene family is represented by a small number of representatives."
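      The filtering rule quoted above (discarding files with 30 sequences or fewer) can be sketched as follows. This is a minimal illustration rather than the script used in the study; it assumes the files are in FASTA format, and the function names and the `max_discarded` parameter are hypothetical.

```python
def count_fasta_records(fasta_text):
    """Count sequences in FASTA-formatted text (one '>' header per record)."""
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def keep_gene_family(fasta_text, max_discarded=30):
    """Return True only if the file holds more than max_discarded sequences.

    With the default, files with 30 sequences or fewer are discarded,
    matching the threshold quoted in the response above.
    """
    return count_fasta_records(fasta_text) > max_discarded


# Toy inputs, for illustration only: 30 records versus 31 records.
small_family = "".join(f">seq{i}\nMKV\n" for i in range(30))
large_family = "".join(f">seq{i}\nMKV\n" for i in range(31))
```

      The threshold is a pragmatic cut-off and, as the reviewer points out, a larger family does not by itself ensure that the direction of a transfer can be resolved.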

      Reviewer #2 (Evidence, reproducibility and clarity):

      Using draft genome sequencing of the free-living Paratrimastix pyriformis and the sister lineage oxymonad Blattamonas nauphoetae, Novack et al. infer the metabolic potential of the two protists using comparative genomics. The authors conclude that the common oxymonad ancestor lost the mitochondrion/mitosome and discuss general strategies for adapting to commensal/symbiotic life-style employed by this taxon. Some elaborations on pathways go on for several paragraphs and feel unnecessarily stretched, which made those sections of the paper rather difficult to digest.

      Having seen reflections on the manuscript by three reviewers we carefully reconsidered its content and attempted to make it shorter and more compact by removing some of the less substantial material. Namely, we have dispensed completely with the original last section of Results and Discussion (“No evidence for subcellular retargeting of ancestral mitochondrial proteins in oxymonads”) and made various cuts throughout other sections. We hope that the revised version does a substantially better job of delivering the key messages of our study to the readers compared to the original submission.

      This might also be because the work, and all conclusions drawn, depend entirely on incomplete (ca. 70-80%) genome data and simple similarity searches, and e.g. no kind of biochemistry or imaging is presented to underpin the manuscript's discussion.

      This is a very crude and superficial assessment of our data. We have actually good reasons to believe that the genome assemblies are close to complete. Please see the discussion on this topic below and an answer to a particular comment from reviewer 3 (page 18).

      This is noteworthy in light of other protist genome reports published in the last few years that differ in this respect, including previous work by this group. And for sequencing-only data, this paper - https://doi.org/10.1016/j.dib.2023.108990 - might offer an example of where we are at in 2023.

      Frankly, we do not think it is fair or relevant to compare our study to the paper pointed to by the reviewer, as that paper reports on a metagenomic study that delivers a set of metagenomically assembled genomes (MAGs) of varying quality retrieved from environmental DNA samples without providing any in-depth analysis of the gene content. Our study is very different in its scope and aims, and we are not certain what lesson we should take from this reviewer’s point. We have good reasons to believe that the datasets are close to complete. Please see the discussion on this topic below and answer to comment of reviewer 3 (page 18).

      With respect to previous work of the group (Karnkowska et al. 2016 and 2019), this submission is very similar (analysis pattern, even some figures and more or less the conclusion), i.e. to say, the overall progress for the broader audience is rather incremental. Then there are also some incidents, where the data presented conflicts with the author‘s own interpretation.

      It was our intention to use the previous analytical experiences and approaches, which at the same time makes the new results comparable with those published before. Although the format is intentionally similar, this work is a substantial step forward because only with our present study the amitochondrial status of the large part of Oxymonadida group can be considered solidly established. This in turn allows us to estimate the timing of the loss of mitochondrion (more than 100 MYA) demonstrating that the absence of mitochondrion in this group is not an episodic transient state but a long-established status. We do not understand what exactly the reviewer had in mind when pointing to “incidents, where the data presented conflicts with the author‘s own interpretation” – we are not aware of such cases.

      The text (including spelling and grammar) needs some attention and the choice of words is sometimes awkward. The overuse of quotation marks ("classical", "simple", "fused", "hits", "candidate") is confusing (e.g. was the BLAST result a hit or a "hit").

      The whole text has been carefully checked and the language corrected wherever necessary by one of the co-authors, who is a native English speaker. The use of quotation marks has been restricted as per the reviewer’s recommendation.

      In its current form, the manuscript is, unfortunately, very difficult to review. This reviewer had to make considerable efforts to go through this very large manuscript, mainly because of issues affecting the presentation and the lack of clarity and conciseness of the text. It would be greatly appreciated if the authors would make more efforts upfront, before submission, to make their work more easily accessible to readers and to facilitate the task of the reviewers.

      We admit that the story we are trying to tell is a complex one, consisting of multiple pieces whose integration into a coherent whole is a challenging task. As stated above, the reviewers' reports gave us an important stimulus, leading us to substantially modify the manuscript to make it more concise, less ambiguous when it comes to particular claims, and easier to read. We hope this intention has been fulfilled to a large degree.

      About a fifth of the two genomes is missing according to the authors' prediction (Table 1). Early on they explain the (estimated) incompleteness of the genomes to be a result of core genes being highly divergent. In light of this already suspected high divergence, using (the simplest NCBI) sequence similarity approach to call out the absence of proteins (for any given lineage) may need lineage-specific optimization. The use of more structural motif-guided approaches such as hidden Markov models could help, but it is not clear whether it was used throughout or only for the search for (missing) mitochondrial import and maturation machinery. The authors state that the low completeness numbers are common among protists, which, if true, raises several questions: how useful are such tools/estimates to begin with, and does this not render some core conclusions problematic? The reader is just left with this speculation in the absence of any plausible explanation except for some references on other species for which, again, no context is provided. Do they have similar issues, such as GC-content, the same core genes missing, phylogenetic relevance, etc.? No info is provided; the reader is expected to simply accept this as a fact and then also accept that, despite this flaw, all conclusions of the paper that rest on the presence/absence of genes are fine. This is all odd and further skews the interpretations and the comparative nature of the paper.

      The question of the completeness of the data sets was also raised by reviewer 3, and we would like to provide an explanation at this point. First, it should be stated that there is no ideal and objective way to measure the completeness of a eukaryotic genome assembly. In the manuscript, we have used the best-established method, adopted by the community at large, which is based on the search for a set of “core eukaryotic genes” using a standardized pipeline, BUSCO or the previously popular CEGMA. The pipeline uses its own tools to identify the homologues of genes/proteins, which ensures standardization of the procedure. This answers the question of reviewer 2 as to why we have not used more sensitive tools for these searches. We did not use them because we followed the procedure that is the gold standard for such assessments, for comparability with other genomes and to make this as clear to the reader as possible. Although the result of the pipeline is usually interpreted as the completeness of the assembly, this is a simplification. Strictly speaking, the result is the percentage of genes from the set of 303 core eukaryotic genes (in our case) that were detected in the assembly by the pipeline. Even in complete assemblies, the value is usually below 100% because some of the genes are not present in the organism and some have diverged beyond recognition. We do not see any other way to deal with this drawback than to compare with related complete genome assemblies acting as standards. This we have done in Supplementary file 11, where we list the presence/absence of each gene for Preaxostyla species and three highly complete assemblies of Trypanosoma brucei, Giardia intestinalis and Trichomonas vaginalis. T. brucei and G. intestinalis are assembled into chromosomes. As you can see, in these three “standards” 63, 148 and 77 genes from the core were not detected, resulting in BUSCO completeness values of 79%, 51% and 75%, respectively. 18 of the non-detected genes function in mitochondria (shown in red), which are highly reduced in some of these species, so the absence of the respective genes is expected. Simply not considering these genes would increase the “completeness measure” for oxymonads by 6%. The values for our standards are not higher than the values for Preaxostyla (69-82%). In summary, the BUSCO completeness measure is far from ideal, particularly in these obscure groups of eukaryotes. The values obtained for Preaxostyla give no reason for concern about their incompleteness. See also our answer to reviewer 3 (page 18).
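The percentages quoted in this reply can be reproduced with simple arithmetic. The following is only a minimal sketch: the counts (303 core genes; 63, 148 and 77 undetected genes; 18 mitochondrial genes) are taken from the text above, while the function and variable names are illustrative assumptions and are not part of BUSCO or of the authors' pipeline. Reading the “6%” figure as 18/303 is likewise an assumption about how that number was derived.

```python
# Size of the core gene set quoted in the response (303 genes).
CORE_SET_SIZE = 303

# Core genes NOT detected in each reference assembly, as quoted in the response.
MISSING = {
    "Trypanosoma brucei": 63,
    "Giardia intestinalis": 148,
    "Trichomonas vaginalis": 77,
}

# Non-detected core genes with a mitochondrial function, as quoted in the response.
MITOCHONDRIAL_GENES = 18


def detected_percent(missing: int, total: int = CORE_SET_SIZE) -> float:
    """Percentage of the core gene set that was detected in an assembly."""
    return 100.0 * (total - missing) / total


def share_of_core_set(n_genes: int, total: int = CORE_SET_SIZE) -> float:
    """Percentage of the core gene set represented by n_genes (hypothetical helper)."""
    return 100.0 * n_genes / total


if __name__ == "__main__":
    for species, missing in MISSING.items():
        print(f"{species}: {detected_percent(missing):.1f}% of core genes detected")
    share = share_of_core_set(MITOCHONDRIAL_GENES)
    print(f"Mitochondrial genes: {share:.1f}% of the core set")
```

Rounded to whole percentages, these values agree with the 79%, 51%, 75% and 6% given in the reply.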

      At the same time, we admit that the BUSCO values do not confirm the high completeness of our assemblies. So, why do we think they are highly complete? One reason is that we do not see suspicious gaps in any of the many pathways that we annotated, but the main reason is the high contiguity of the assemblies. Thanks to Nanopore long-read sequencing, the assemblies of P. pyriformis and B. nauphoetae consist of 633 and 879 scaffolds, respectively, suggesting that there are “only” hundreds of gaps. Although this may still sound like a lot, it is a relatively good achievement for genomes of this size, and experience shows that a further decrease in the number of scaffolds would allow the detection of additional genes, but not in huge numbers. As we have shown for M. exilis (Treitli et al. 2021, doi:10.1099/mgen.0.000745), the decrease from 2 092 scaffolds to 101 contigs, i.e., filling almost 2 000 gaps, allowed the prediction of an additional 1 829 complete gene models, of which 1 714 were already present in the previous assembly but only partially, and just 115 were completely new. None of these newly predicted genes was functionally related to the mitochondrion. Thus, we infer that the chance that all mitochondrion-related genes are hidden in the gaps of the assemblies is very low.
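The M. exilis figures in this paragraph can be checked in the same way. This is a sketch under stated assumptions: the counts are copied from the reply, the names are illustrative, and treating the drop in the number of sequences as the number of gaps closed is only an approximation of the authors' “almost 2 000 gaps”.

```python
# Figures quoted in the reply for the M. exilis reassembly (Treitli et al. 2021).
SCAFFOLDS_BEFORE = 2092      # scaffolds in the earlier assembly
CONTIGS_AFTER = 101          # contigs in the long-read assembly
NEW_COMPLETE_MODELS = 1829   # additional complete gene models predicted
PREVIOUSLY_PARTIAL = 1714    # of those, present before but only partially
ENTIRELY_NEW = 115           # of those, not present before at all


def gaps_closed(before: int, after: int) -> int:
    """Approximate number of gaps closed, taken as the drop in sequence count."""
    return before - after


def percent_entirely_new(new: int, total: int) -> float:
    """Percentage of the additional gene models that are entirely new."""
    return 100.0 * new / total


if __name__ == "__main__":
    closed = gaps_closed(SCAFFOLDS_BEFORE, CONTIGS_AFTER)
    share = percent_entirely_new(ENTIRELY_NEW, NEW_COMPLETE_MODELS)
    print(f"Gaps closed (approx.): {closed}")
    print(f"Entirely new gene models: {share:.1f}% of the additional models")
```

On these numbers, closing roughly 2 000 gaps yields only a small fraction of genuinely new gene models, which is the point the reply relies on.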

      We have provided these arguments in a condensed form in the text following the description of genome assemblies: “It should be noted that, despite their wide usage, BUSCO values are not expected to reach 100% in lineages distant from model eukaryotes simply due to the true absence (or high sequence divergence) of some of the assessed marker genes. For example, various Euglenozoa representatives with highly complete genome sequences, including Trypanosoma brucei, have BUSCO completeness estimates in the range of 71-88% (Butenko et al. 2020), and representatives of Metamonada fall within the range of 60-91% (Salas-Leiva et al. 2021). Specifically in the case of oxymonad M. exilis, the improvement of the genome assembly using long-read resequencing from 2092 scaffolds to 101 contigs led to only a marginal increase of BUSCO value from 75.3 to 77.5 (Treitli et al. 2021).”

      As a side note, this will also influence the number of proteins absent in other lineages and as such has consequences for LGT calls versus de novo invention. For the cases with LGT as an explanation, it would help to briefly discuss the candidate donors and some details of the proteins in the eco-physiological context (e.g. lines 263-268 suggest that HPAD may have been acquired by EGT, which was facilitated by a shared anaerobic habitat, and also comment on adaptive values for acquiring this gene). Exchanging metabolic genes via LGT (Line 163) blurs the differences between the roles and extent of LGT in prokaryotes vs eukaryotes, and therefore is exciting and could use support/arguments other than phylogenies. I guess the number of reported LGTs among protists (whatever the source) over the last decade has by now deflated the novelty of the issue more generally; a report of the numbers is expected, but they alone won't get you far anymore in the absence of a good story (such as e.g. work on plant cell wall degrading enzymes in beetles).

      We agree with the reviewer that the cases of LGT involving Preaxostyla would deserve more discussion in the manuscript. On the other hand, we also agree that none of them provides such a “cool” story that would deserve a special chapter or even a separate paper. Therefore, we have decided, also with regard to keeping the text in a reasonable dimension, not to expand the discussion of LGTs with the exception of HgcAB, where some new information has been included and the phylogeny of the genes updated. Please note that we had discussed in the original manuscript the donor lineages and ecological/biochemical context in the cases of GCS-L2, HPAD, UbiE, and NAD+ synthesis and this material has been kept also in the revised version.

      It would help to clarify which parts of the mitochondrial ancestor were reduced during the process of reductive evolution, and at what time in their hypothesized trajectory. For instance, losing enzymes of anaerobic metabolism conflicts with the argued case of an aerobic (as opposed to facultative anaerobic) mitochondrial ancestor followed by gains of anaerobic metabolism in the rest of the eukaryotes via LGT, and with some papers the authors themselves cite (e.g. the series by Stairs et al.). There is no coherent picture of LGT and anaerobic metabolism, although a reader is right to expect one.

      These are very interesting questions that would fill a separate article. In the manuscript, we focus on the Preaxostyla lineage only, and there the trajectory seems relatively simple: replacement of the mitochondrial ISC by cytosolic SUF in the common ancestor of Preaxostyla, loss of the methionine cycle and, in consequence, of the mitochondrial GCS and the mitochondrion itself. We have modified the first conclusion paragraph in this sense and it now reads as follows:

      The switch to the SUF pathway in these species has apparently not affected the number of Fe-S-containing proteins but led to a decrease in the usage of 2Fe-2S clusters. The loss of MRO impacted particularly the pathways of amino acid metabolism and might relate also to the loss of large hydrogenases in oxymonads.

      It is not clear to us how to understand the reviewer’s remark concerning the conflict between loss of enzymes of anaerobic metabolism and the (presumed) aerobic nature of the mitochondrial ancestor. Provided that we read the reviewer’s rationale correctly, is it really so implausible that the anaerobic metabolism gained laterally by a particular lineage was then secondarily lost in specific descendant lineages? As a clear example demonstrating the feasibility of such an evolutionary pattern, consider the evolution of plastids. There is no doubt these organelles move across eukaryotes by secondary or higher-order endosymbiosis or kleptoplastidy, establishing themselves in lineages where there was no plastid before. Secondary simplification of such plastids, e.g. by the loss of photosynthesis, in its extreme form culminating in the complete loss of the organelle, has been robustly documented from several lineages, such as Myzozoa (e.g., https://pubmed.ncbi.nlm.nih.gov/36610734/). Hence, we see absolutely no reason to rule out the possibility that the ancestral mitochondrion was obligately aerobic and enzymes of anaerobic metabolism spread secondarily by eukaryote-to-eukaryote LGT, with their secondary loss in particular lineages. We really do not see any conflict here and we do not agree with the interpretation provided by the reviewer. That said, we admit that the discussion on the earliest stages of mitochondrial evolution is not an essential ingredient of the story we try to tell in our manuscript, so to avoid any unnecessary misunderstanding we have removed the original last sentence of Conclusions (“Thorough searches revealed …”) from the revised manuscript.

      In light of their data the authors also discuss the importance of the mitochondrion with respect to the origin of eukaryotes:

      First, the mitochondrion brought thousands of genes into the marriage with an archaeon, surely hundreds of which provided the material to invent novel gene families through fusions and exon shuffling and some of which likely went back and forth over the >billion years of evolution with respect to localizations. The authors look at a minor subset of proteins (pretty much only those of protein import, Fig. 6) to conclude, in the abstract no less: “most strikingly the data confirm the complete loss of mitochondria and every protein that has ever participated in the mitochondrion function for all three oxymonad species.” I do not question the lack of a mitochondrion here, but this abstract sentence is theatrical in nature, nothing that data on an extant species could ever prove in the absence of a time machine, and is evolutionarily pretty much impossible. A puzzling sentence to read in an abstract on endosymbiont-associated evolution.

      We feel that the reviewer is putting too much emphasis on an aspect of our original manuscript that is rather peripheral to its major message. Indeed, the manuscript is not, and has never been thought to be, primarily about eukaryogenesis and the exact role the mitochondrion played in it. We are, therefore, somewhat reluctant to react in full to the very long and complex argument the reviewer has raised in his/her report, so we keep our reaction at the necessary minimum. Concerning the criticized sentence in the original version of the abstract, it alluded to a section of the manuscript (“No evidence for subcellular retargeting of ancestral mitochondrial proteins in oxymonads”) that we have removed from the revised version, and hence we have modified also the abstract accordingly by removing the sentence. We still think our original arguments were valid, but apparently, much more space and more detailed analyses are required to deliver a truly convincing case, for which there is no space in the manuscript.

      Second, using oxymonads as an example that a lineage can present eukaryotic complexity in the absence of mitochondria and conflating it with eukaryogenesis is a logical fallacy. This issue already affected the 2019 study by Hampl et al. We have known that a eukaryote can survive without an ATP-synthesizing electron transport chain ever since Giardia and other similar examples, and the loss of Fe-S biosynthesis and the last bit of mitosome (secondary loss) doesn't make a difference to how we think about eukaryogenesis. It confuses the need and cost to invent XYZ with the need and cost of maintenance. How can the authors write "... and undergo pronounced morphological evolution", when they evidently observe the opposite and show so in their Fig. 1? The authors only present evidence for reductive evolution of cellular complexity with the loss of a stacked Golgi. What morphological complexity did oxymonads evolve that is absent in other protists? A cytosolic metabolic pathway doesn't count in this respect, because it is neither morphological, nor was it invented but likely gained through LGT according to the authors. This is quite confusing to say the least. A recent paper (https://doi.org/10.7554/eLife.81033) that refers to Hampl et al. 2019 has picked this up already, and I quote: "Such parasites or commensals have engaged an evolutionary path characterized by energetic dependency. Their complexity might diminish over evolutionary timescale, should they not go extinct with their hosts first." Here the authors raise a red flag with respect to using only parasites and commensals that rely on other eukaryotes with canonical mitochondria as examples. If we now look at Fig. 1 of this submission, Novak et al. underpin this point perfectly, as the origin of oxymonads is apparently connected to the strict dependency on another eukaryote (or am I wrong?), and they support the prediction with respect to complexity reducing after the loss of mitochondria - mitosome gone, Golgi almost gone. What's next? This is a good time to remember that extant oxymonads are only a single picture frame in the movie that is evolution, and their evolution might be a dead-end or result in a prokaryote-like state should they survive 100,000s to millions of years to come.

      It seems that in this point the reviewer is particularly concerned with the following sentence that is part of the Introduction and which relates to the existence of the amitochondrial eukaryotes we are studying: “The existence of such an organism implies that mitochondria are not necessary for the thriving of complex eukaryotic organisms, which also has important bearings to our thinking about the origin of eukaryotes (Hampl et al. 2018).” Even after re-reading the sentence, we confess we stand by it and find it perfectly logical. Nevertheless, we decided to omit it from the text so as not to distract from the main topic of the study.

      Next, when mentioning “… pronounced morphological evolution” we mean the evolution of four oxymonad families (Streblomastigidae, Oxymonadidae, Pyrsonymphidae and Saccinobaculidae) comprising almost a hundred described species with often giant and morphologically elaborate cells that evolved from a simple Trimastix-like ancestor (Hampl 2017, Handbook of Protists, 10.1007/978-3-319-32669-6_8-1). This is a fact that can hardly be dismissed. Also, given the current oxymonad phylogenies (Treitli et al. 2018, doi.org/10.1016/j.protis.2018.06.005) and the reported absence of a mitochondrion in M. exilis, B. nauphoetae, and S. strix, we can infer that the mitochondrion was lost in the common ancestor of the three species at the latest. This organism must have lived more than 100 MYA, as at that time oxymonads were clearly diversified into the families (Poinar 2009, 10.1186/1756-3305-2-12). So, these organisms indeed have lived without mitochondria for at least 100 MY. We think that these facts and our inferences based on them are solid enough to keep in the conclusion the following statement: “This fact moves this unique loss to at least 100 MYA deep past, when oxymonads had been already diversified (Poinar 2009), and shows that a eukaryotic lineage without mitochondria can thrive for eons and undergo pronounced morphological evolution, as is apparent from the range of shapes and specialized cellular structures exhibited by extant oxymonads (Hampl 2017).” Furthermore, as documented in Karnkowska et al. 2019 (https://pubmed.ncbi.nlm.nih.gov/31387118/), apart from the loss of the mitochondrion oxymonads are surprisingly “normal” and complex eukaryotes, in fact much less reduced than, e.g., Giardia, Microsporidia, or even S. cerevisiae (in terms of the number of genes, introns, etc.). We strongly disagree with the claim that “Golgi is almost gone” in oxymonads, and our manuscript shows exactly the opposite. Viewing oxymonads as a lineage heading towards a prokaryote-like simplicity is dogmatic and ignores the known biology of these organisms.

      Some more thoughts: Lines 47-52: Hydrogenosome or mitosome is a biological and established label like (m)any other and I find the use of the word "artificial" in this context strange. While the authors are correct to note that there is an (evolutionary) continuum in the reduction - obviously it is step by step - they exaggerate by referring to the existing labels as "artificial". You make Fe-S clusters but produce no ATP? Well, then you're a mitosome. It's a nomenclature that was defined decades ago and has proven correct and works. If the authors think they have a better scheme and definition, then please present one. Using the authors' logic, terms such as amyloplast or the TxSS nomenclature for bacterial secretion systems are just as artificial. As is, this comes across as grumbling for no good reason.

      We agree that the original wording sounded like unwarranted grumbling and we have changed the sentence in the following way: "However, exploration of a broader diversity of MRO-containing lineages makes it clear that MROs of various organisms form a functional continuum (Stairs et al. 2015; Klinger et al. 2016; Leger et al. 2017; Brännström et al. 2022)."

      Line 158: A duplication-divergence may also explain this since sequence similarity-based searches will miss the ancestral homologues.

      We do not disagree about this. In fact, the gene that the reviewer’s point concerns is certainly a result of duplication and divergence, as it belongs to a broader gene family (major facilitator superfamily, as stated in the manuscript) together with other distant homologs. Nevertheless, this is not in conflict with our conclusion that it “may represent an innovation arising in the common ancestor of Metamonada”.

      Lines 201-202: Presence of GCS-L in amitochondriate should be explained in light of this group once having a mitochondrion, which then makes ancestral derivation and differential loss (as invoked for Rsg1) also a likely explanation along with eukaryote-to-eukaryote LGT.

      Yes, this most likely holds for the standard paralogue GCS-L1 (in P. pyriformis PAPYR_5544), which has the expected distribution and phylogenetic relationships and is absent in oxymonads. The discussion is, however, mainly about the rare, divergent and until now overlooked paralogue GCS-L2 (in P. pyriformis PAPYR_1328), which we found only in three distantly related eukaryote groups, Preaxostyla, Breviatea, and Archamoebae, which strongly suggests inter-eukaryotic LGT.

      Lines 356-392: Describes plenty of genomic signal for Golgi bodies but simultaneously cites literature suggesting the absence of a morphologically identifiable Golgi in oxymonads. An explicit prediction regarding what to observe in TEM for the mentioned species might be nice to stimulate further work.

      We thank the reviewer for their suggestion and are glad that they are enthusiastic about this aspect of the manuscript. Unfortunately, the morphology of unstacked Golgi ranges from single cisternae (yeast, Entamoeba) through vesicles (Mastigamoeba) to a “tubular membranous structure” in Naegleria. Therefore, no strong prediction is possible of what the oxymonad Golgi might look like under light microscopy or TEM. However, the data that we have provided should lead to molecular cell biological analyses aimed at identifying the organelle, giving target proteins to tag or against which to create antibodies as Golgi markers. An additional sentence to this effect has been added to the manuscript: “They also set the stage for molecular cell biological investigations of Golgi morphological variation, once robust tools for tagging in this lineage are developed.”

      Line 414: The preceding paragraphs in this result section describe only the distribution, without mentioning origins - a sweeping one-line summary that proclaims different origins needs some context and support. Furthermore, the distribution of glycolytic enzymes might indeed be patchy, but to suggest it represents an 'evolutionary mosaic composed of enzymes of different origins' without discussing the alternative of a singular origin and different evolutionary paths (including a stronger divergence in one vs. another species) discredits existing literature and the authors' own claim with respect to why BUSCO might fail in protists.

      The part of the text about glycolysis the reviewer alluded to has been removed while shortening the manuscript.

      Line 486: How uncommon are ADI and OTC in lineages sister to Metamonada?

      This is an interesting but difficult question. Firstly, we are uncertain what the sister lineage to Metamonada is. Discoba, maybe, but a recent unpublished rooting of the eukaryotic tree does not support it (https://pubmed.ncbi.nlm.nih.gov/37115919/). Generally, the individual genes of the pathway (ADI, OTC and CK) are quite common in eukaryotes, but the combination of all three is rare (Metamonada, the heterolobosean Harpagon, the green algae Coccomyxa and Chlorella, the amoebozoan Mastigamoeba, and the breviate Pygsuia); see figure 1 in Novak et al. 2016, doi: 10.1186/s12862-016-0771-4.

      Line 504: It might help an outside reader to include a few lines on the consequences and importance of having 2Fe-2S vs 4Fe-4S clusters and set an expectation (if any) in oxymonads.

      We apologize for omitting this explanation. The 2Fe-2S proteins are more common in mitochondria, where 2Fe-2S clusters are synthesized in the early pathway of FeS cluster assembly, while the cytosolic CIA pathways produce 4Fe-4S clusters (https://pubmed.ncbi.nlm.nih.gov/33007329/). The original expectation therefore is that species without mitochondria should not have 2Fe-2S cluster proteins. Obviously, the switch to the SUF pathway affects this expectation, as we do not know what type of cluster this pathway produces in oxymonads (https://www.biorxiv.org/content/10.1101/2023.03.30.534840v1). For the sake of brevity, we have included a short statement at the beginning of the sentence in question, which now reads as follows: “As 2Fe-2S clusters are more frequent in mitochondrial proteins, the higher number of 2Fe-2S proteins in P. pyriformis compared to the oxymonads may reflect the presence of the MRO in this organism.”

      Any explanations on what unique selection pressures and gene acquisition mechanisms may be operating in P. pyriformis which might allow for the unique metabolic potential?

      Every species exhibits a unique combination of traits that results from changing selection pressures imposed on historical contingency (including neutral evolutionary processes such as genetic drift). We lack real understanding of these factors for a majority of taxa, including the familiar ones, so we should not expect to have a good answer to the reviewer’s question. In fact, we do not know how unique the particular combination of P. pyriformis traits discussed in our manuscript is, as there has been no comprehensive comparative analysis that would include ecologically and evolutionarily comparable taxa. We note that Paratrimastix represents only the third free-living metamonad with a sequenced genome (together with Kipferlia and Carpediemonas), so more data and additional analyses are needed before we can start hoping that answers to questions like the one posed by the reviewer are within reach.

      ** Referees cross-commenting** To R3: Hampl et al. 2019, to which Novak et al. refer, is about eukaryogenesis and that is exactly the context in which this is discussed again and what Raval et al. 2022 had decided to touch upon. If the authors do not bring this up in light of the ability to evolve (novel) eukaryote complexity, then what else? Maybe they can elaborate, especially with respect to energetics, to which they explicitly refer in 2019 (and here). And with respect to text-book eukaryotic traits (and the evolution of new morphological ones), I do not see any new ones evolving in any oxymonad, but reduction as Novak et al. themselves picture it in this submission. Is a change in the number of flagella pronounced morphological evolution? Maybe for some, but I believe this needs to be seen in light of the context of how they discuss it. I see a reduction of eukaryotic complexity and not a gain. They have an elaborate section on the loss of Golgi characteristics (and a figure), but I fail to read something along the same lines with respect to the gain of new morphological traits. Again, novel LGT-based biochemistry does not equal the invention of a new morphology such as a new compartment. Oxymonads depend on mitochondria-bearing eukaryotes for their survival, or don't they? This is the main point, and if evidence shows that I am wrong, then I will be the first to adapt my view to the data presented.

      While we do see the logic of the reviewer’s point, a good reply would have to be too elaborate and certainly beyond the scope of the current manuscript. As the reviewers’ reports led us to reconsider the structure of the manuscript and to make it more focused and concise, we decided to simplify the matter by removing the allusions to eukaryogenesis, realizing that it is perhaps more suitable for a different type of paper (opinion, review). The comment on the evolution of complex morphology has been answered previously (see above).

      I have concerns with the presentation of a narrative that in my opinion is too one-sided and that has been publicly questioned in the community (in press, at meetings, personally). For the benefit of science and of the young authors on this study, this reviewer feels strongly that these issues should be taken very seriously and discussed openly in a more balanced way. We only truly move forward on such complex topics if we allow an open and transparent discussion.

      We agree that opinions on specific details of eukaryogenesis are divided in the community and that the topic requires a nuanced discussion for which there is perhaps no place in the current manuscript. As stated in the reply to the previous point, we have removed the discussion of the implications of our current study to eukaryogenesis from the revised manuscript.

      Having said that, I am happy that R3 has picked up exactly the same major concerns as I did with respect to e.g. the phrasing on mito (gene) loss and the BUSCO controversy.

      We appreciate these comments and hopefully have resolved the concern in the previous answers.

      Reviewer #2 (Significance):

      Using draft genome sequencing of the free-living Paratrimastix pyriformis and the sister lineage oxymonad Blattamonas nauphoetae, Novak et al. infer the metabolic potential of the two protists using comparative genomics. The authors conclude that the common oxymonad ancestor lost the mitochondrion/mitosome and discuss general strategies for adapting to the commensal/symbiotic life-style employed by this taxon. Some elaborations on pathways go on for several paragraphs and feel unnecessarily stretched, which made those sections of the paper rather difficult to digest. This might also be because the work, and all conclusions drawn, depend entirely on incomplete (ca. 70-80%) genome data and simple similarity searches, and e.g. no kind of biochemistry or imaging is presented to underpin the manuscript's discussion.

      We have addressed the concern about the possible incompleteness of our genome data above, demonstrating that it is not substantiated and stems from an inadequate interpretation of the quality measures we provide in the manuscript. We hope that the revised manuscript, which is streamlined and more concise compared to the initial submission, conveys the key messages in a substantially more persuasive way and will be appreciated by a broad community of readers.

      Reviewer #3 (Evidence, reproducibility and clarity):

      Summary: The genome sequences of two members of the protist group Preaxostyla are presented in this manuscript: Paratrimastix pyriformis and Blattamonas nauphoetae. The authors use comparative genomic and phylogenetic approaches and compare the new genome datasets with three previously available genomes and transcriptomes from the group. The availability of genome-scale data from five Preaxostyla species is powerful for addressing interesting basic evolutionary questions. A substantial part of the manuscript is spent on testing the hypothesis of mitochondrial loss in the oxymonad lineage, which turns out to be supported. The datasets are also explored regarding the role of lateral gene transfer in the group, metabolic diversification and the evolution of the Golgi.

      Major comments: I find the manuscript very interesting with many different fascinating results presented. However, the manuscript is very long. Two genome sequences are presented and it is not clear to me what the main question was when this project was initiated and why these two species were selected to answer this question. I do not see an obvious reason for sequencing the P. pyriformis genome if the mitochondrial loss was the main question (given that a transcriptome was already available). Why not spend the time and resources on a member of Preaxostyla that lacked previous data? The authors should more clearly state why these organisms were chosen to answer the main question or questions of the study.

      We are sorry for having done a poor job when explaining the choice of the taxa for the comparison. The idea was to sample an outgroup of oxymonads (P. pyriformis) and a representative of other clades of oxymonads than M. exilis (B. nauphoetae and S. strix) for which it was feasible to obtain the data, or the data were already available. Obviously, more representatives of the morphologically and probably also genetically diverse oxymonads should be investigated (e.g. Pyrsonympha, Oxymonas, Saccinobaculus) and we have such a plan, but these organisms are difficult to work with. We considered it necessary to sequence the genome of P. pyriformis, and not rely on the transcriptome only, to avoid the issue of data set incompleteness (raised also by R2). Transcriptomes by nature provide an incomplete coverage of the full gene complement of the species, while our genome assemblies are close to complete, as we explain elsewhere.

      The evolution of MROs has received substantial attention from the protist research community since the 1990s. During this period the mitochondrial organelle has been considered essential for eukaryotes. Therefore, the result presented in the manuscript has a high significance. However, I am not convinced that it is appropriate to use the term "evolutionary transition" for the mitochondrial loss. The loss of MRO is the endpoint of a gradual change of the internal organisation of the cell that probably started when the ancestor of these organisms adapted to an anaerobic lifestyle. The last step described in the manuscript probably had little impact on how these organisms interacted with their environment. The presence or absence of biosynthesis of p-cresol by some, but not all, Preaxostyla probably is much more significant from an ecological point of view. My point is that the authors need to consider how they use the term evolutionary transition and be explicit about that.

      We appreciate the comment concerning the use of the term “evolutionary transition”. Nevertheless, we believe there is no real consensus in the literature on what is and what is not an “evolutionary transition”, and the application of the term to specific cases is more or less arbitrary. For a lack of a standardized or better terminology, we have kept the term to refer to three evolutionary changes in the evolution of the Preaxostyla lineage that are particularly important from the cytological or ecological perspective, i.e. dispensing with the mitochondrion, reorganizing the Golgi apparatus by losing the stacked arrangement of the cisternae, and gaining the endobiotic life style.

      In the abstract the main finding is described as "the data confirm the complete loss of mitochondria and every protein that has ever participated in the mitochondrion function for all three oxymonad species (M. exilis, B. nauphoetae, and Streblomastix strix) extending the amitochondriate status to the whole Oxymonadida.". I find this a really interesting observation, but I do find the wording a bit too bold for several reasons: • Not every protein that has participated in the mitochondrial function is known. • Mitochondrial proteins could be present in oxymonads, but divergent beyond the detection limit for existing methods. • Genes for one or several mitochondrial proteins could be present in one or more oxymonad genomes, but remain undetected due to the incomplete nature of the datasets.

      Although I do think that the authors' claim very well could be true, I don't think their data fully support it. Therefore, it needs to be rephrased.

      As a result of our decision to streamline the manuscript by removing the final part of Results and Discussion (“No evidence for subcellular retargeting of ancestral mitochondrial proteins in oxymonads”), the revised manuscript no longer supports the statement “the data confirm the complete loss of … every protein that has ever participated in the mitochondrion function for all three oxymonad species” that is criticized by the reviewer, and hence the statement has been removed from the abstract. This addresses bullet point 1. As for bullet points 2 and 3, the proof of absence is in principle impossible to deliver, and we have been struggling with this already in the Karnkowska et al. 2016 paper. Although our certainty will never reach 100% (this is in fact impossible for a scientific, i.e., falsifiable, hypothesis), the mounting evidence across studies gives the hypothesis of the amitochondriate status of oxymonads more and more credit. The genes for mitochondrial marker proteins have not been detected by the most sensitive methods available in the first genome assembly of M. exilis (Karnkowska et al. 2016), in the improved M. exilis genome assembly composed of only 101 contigs (Treitli et al. 2021), or in either of the other two oxymonad species investigated here. On the other hand, they were readily detected in the data sets of P. pyriformis and T. marina. What is the probability that these genes always hide in the assembly gaps, or that they have all escaped recognition? Obviously, this probability is not zero, but we believe it is approaching values so low that it is reasonably safe to conclude that these species are amitochondriate.

      The sentence was changed to: "Our results provide insights into the metabolic and endomembrane evolution, but most strikingly the data confirm the complete loss of mitochondria for all three oxymonad species investigated (M. exilis, B. nauphoetae, and Streblomastix strix), suggesting the amitochondriate status may be common to Oxymonadida."

      The third point maybe could be analysed further. BUSCO scores are reported, but also argued not being reliable for this group of organisms (which is true). Would it, for example, be useful to analyse how large a fraction of the BUSCO proteins found in all non-Preaxostyla Metamonada genomes are present in the various Preaxostyla datasets?

      We provide a comprehensive answer to a similar comment of reviewer 2 above (page 6-8). We performed the requested analysis and provide the result in Supplementary file 11. In this table, we record presence/absence of each gene from the BUSCO set for our data sets and the highly complete “standard” datasets of Trypanosoma brucei, Giardia intestinalis and Trichomonas vaginalis. Of the 303 genes, 117 were present in all data sets and 17 in none (see column I). 20 were present only in Trypanosoma and not in metamonads. 6 were present in all Preaxostyla and absent in other metamonads (Trichomonas and Giardia), 44 were present in all Preaxostyla and Trichomonas and absent in Giardia, suggesting high divergence of this species. Only 23 (marked by *) were present in the three “standard” genomes and absent in one or more Preaxostyla species. Of those 8 and 8 were absent specifically in S. strix and P. pyriformis, respectively, but only 1 was absent specifically in M. exilis and no such case was observed in B. nauphoetae. We conclude that this non-random pattern argues for lineage-specific divergence rather than incomplete data sets, particularly in the case of M. exilis and B. nauphoetae.
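
      The tally described above can be illustrated with a minimal sketch. It assumes a hypothetical mapping from each BUSCO gene to the set of data sets in which it was detected; the function name, the data structure and the choice of species labels are illustrative assumptions and do not reproduce the layout of Supplementary file 11.

```python
from collections import Counter

# Reference ("standard") genomes and Preaxostyla data sets; labels are assumptions.
STANDARD = {"T. brucei", "G. intestinalis", "T. vaginalis"}
PREAXOSTYLA = {"M. exilis", "B. nauphoetae", "S. strix", "P. pyriformis", "T. marina"}


def missing_in_single_species(presence):
    """Count genes present in all standard genomes but absent from exactly one
    Preaxostyla data set, keyed by that data set.

    `presence` is a hypothetical dict: gene name -> set of data sets with a hit.
    """
    counts = Counter()
    for found_in in presence.values():
        if not STANDARD <= found_in:
            continue  # gene is not universal among the reference genomes
        absent = PREAXOSTYLA - found_in
        if len(absent) == 1:
            counts[next(iter(absent))] += 1
    return counts


# Toy input, not real data.
toy = {
    "geneA": STANDARD | PREAXOSTYLA,
    "geneB": STANDARD | (PREAXOSTYLA - {"S. strix"}),
    "geneC": STANDARD | (PREAXOSTYLA - {"S. strix", "P. pyriformis"}),
    "geneD": set(PREAXOSTYLA),
}
print(dict(missing_in_single_species(toy)))
```

      A strongly uneven distribution of such species-specific absences is the kind of non-random pattern the response refers to; the sketch only counts, it does not test significance.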

      Line 160-161: 15 LGT events specific for the Preaxostyla+Fornicata clade are reported. This is an exciting finding because it supports a phylogenetic relationship between these two groups. But such an argument is only valid if the observed pattern is more common than the alternative hypotheses (Preaxostyla+Parabasalids and Fornicata+Parabasalids). How many LGT events support each of these groupings? How are these observations affected by the current taxon sampling with the highest number of datasets from Fornicata? How were putative metamonada-to-metamonada LGTs treated in this context?

      19 LGTs are uniquely shared between Preaxostyla+Parabasalids, which is more than the number of LGTs shared between Preaxostyla and Fornicata. No common LGT was unique to Fornicata+Parabasalids. However, the latter is a direct consequence of our investigation method, which involved reconstructing phylogenies of genes present in Preaxostyla, and not across all metamonads. So, we do not have a way to investigate LGT gene families uniquely shared between Fornicata and parabasalids.

      When it comes to the effect of taxon sampling, we agree that it is possible that the number of genes of horizontal origin shared between parabasalids and Preaxostyla is underestimated because of the lower taxon sampling in parabasalids. However, it is still larger (19) than the number of LGTs shared uniquely between fornicates and Preaxostyla (15). In addition, while the taxon sampling is larger in fornicates, it also contains some representatives of closely related lineages (e.g., Chilomastix caulleryi and Chilomastix cuspidata) which, while they increase the number of fornicate representatives, do not increase the detection of shared genes between fornicates and Preaxostyla. Altogether, it is difficult to estimate how the current taxon sampling is biasing the detection of LGTs one way or another.

      Regarding putative metamonad-to-metamonad LGTs: we did not consider this possibility, so as not to overestimate the number of gene transfers, for several reasons. First of all, our LGT detection relies on the incongruence between species tree and gene tree. The closer the lineages are in the species tree, the more difficult it is to interpret any incongruence in the gene tree, as single-protein phylogenies are notoriously poorly resolved because they rely on the little phylogenetic signal contained in few amino-acid positions. Because of this, small incongruences with the species tree could either reflect recent LGT events between metamonads, or simply blurry phylogenetic signal. Second, we can certainly use the argument that a limited taxonomic distribution among metamonads favors an LGT event between them. However, here again, the closer the lineages involved are, the more difficult it is to distinguish a scenario where one lineage was the recipient of an LGT from a prokaryote before donating it to another metamonad from a scenario involving a single ancestral LGT from prokaryotes to metamonads, followed by differential loss, leading to a patchy taxonomic distribution. Finally, we are working with both limited taxon sampling and incomplete genomic/transcriptomic data, which makes it more difficult to identify true absences. For all these reasons, we chose to be conservative and invoke the smallest number of LGT events.

      The authors have used a large-scale approach to make single-gene trees for inferences of LGT. In other parts of the manuscript inferences of evolutionary origins of single genes are made without support of phylogenetic trees. I find this inconsistent and argue that the hypothesis of the origin of a specific protein should be tested with the same rigor whether it is a putative LGT, gene duplication, gene loss or an ancestral member of LECA. Specific cases where I think a phylogenetic analysis is needed include: • Line 222-223: It is concluded that Rsg1 is a component of LECA. • Line 307: HgcAB are argued to be acquired by LGT of a whole operon. • Lines 350-355: It is unclear how the different numbers of transporters are interpreted (loss or expansion by duplication). This could be addressed with phylogenetics. • Lines 407-408: A tree should support the claim of LGT origin. (PFP) • Lines 414-415: The different origins of glycolytic enzymes should be supported by data or references. • Line 486: Trees or a reference (if available) should support the claim for LGT.

      As requested, trees were constructed for HgcA, HgcB, PFP and the transporters AAAP, CTL, ENT, pATPase, and SP. Citations were added for the glycolytic enzymes and the ADI pathway. No tree for Rsg1 is needed, as this is a eukaryote-specific protein lacking any close prokaryotic relatives. The inference on its presence in the LECA is based on its phylogenetically wide, albeit patchy, distribution across the eukaryote phylogeny. Testing possible eukaryote-eukaryote LGTs is hampered by the limited phylogenetic signal in the short and rapidly evolving Rsg1 sequences, resulting in very poorly resolved relationships among Rsg1 sequences in a tree we attempted to make (data not shown). For this reason, we opted not to present any phylogenetic analysis for Rsg1.

      Lines 530-531 and 773-774: "The switch to the SUF pathway in these species has apparently not affected the number of Fe-S-containing proteins but led to a decrease in the usage of 2Fe-2S clusters." I find it difficult to evaluate if the data support this because no exact numbers or identities are given for 2Fe-2S and 4Fe-4S proteins in the various genomes in Suppl. Fig. S4 or Supplementary file 4.

      The functional annotation of all detected FeS clusters containing proteins is provided in Supplementary Table S8 including the types of predicted clusters (columns G or F). Basically, the only putative 2Fe2S cluster containing proteins in species of oxymonad is xanthine dehydrogenase, while Paratrimastix and Trimastix contain also 2Fe2S cluster-containing ferredoxins and hydrogenases.

      The methods used vary between the different parts of the paper. One example is single gene phylogenies, which are described three times in the method section [lines 959-973, lines 1011-1034, lines 1093-1101], in addition to the automated approach within the LGT detection pipeline [lines 923-926]. The approaches are slightly different with, for example, different procedures for trimming. This makes it difficult to know how the different presented analyses were done in detail. No rationale for using different approaches is given. At the least, it should be clear in the method section which approach was used for which analysis.

      The reviewer is correct, and we apologize for the inconsistency. The reason is only historical: the analyses were performed by different laboratories in different periods of time. We believe this fact does not make our results less robust, although it does not “look” nice and makes the description of the methods employed longer. We have double-checked the description and introduced slight changes so as to make it maximally clear which method has been used for particular analyses presented in the Results and Discussion.

      Specific comments on single gene phylogenies:

      • Line 966-967: Why max 10 target sequences?

      The limit of 10 was applied in order to keep the datasets in manageable dimensions. The sentence has been changed to: "In order to detect potential LGT from prokaryotes while keeping the number of included sequences manageable, prokaryotic homologues were gathered by a BLASTp search with each eukaryotic sequence against the NCBI nr database with an e-value cutoff of 10^-10 and max. 10 target sequences."

      • Lines 996-998: Is it a problem that these are rather old datasets?

      Although the publications are slightly older the set of queries is absolutely sufficient for the purpose.

      Minor comments: I appreciate that much data is included as supplementary material. However, the organisation of the data could be improved. The numbering of the files is not included in their names or within the files, as far as I could find. Descriptions of the files are often missing and information on the annotation such as colour coding is not always included. These aspects of the supplementary material need to be strengthened in order to make it more useful. Specific comments: • Supplementary file 1, Table 1: accession numbers are missing. Kipferlia bialata appears to have a much smaller number of sequences than reported in the publication. The file consists of three tables and it would be very helpful if the reference in the main manuscript indicated the table number. • Supplementary file 4: The trees lack proper species names and a documented colour coding. There are multiple trees in the file, which makes it difficult to find the correct tree. I would appreciate if the different trees were labelled A, B, C, etc., and if these were used in the main text.

      Supplementary file 1: Accession numbers were added.

      Supplementary file 4: Species names and alphabetical labelling were added. Colour coding was explained in the text at the first mention of the file: "(Supplementary file 4 H; Preaxostyla sequences in red)."

      o There is no HPAD-AE tree (as indicated on line 258), but a HPAD tree. Which part of the tree contain the described fusion protein?

      Thank you for spotting the mistake. There should have been “HPAD” instead of “HPAD-AE” indicated in the text. The sentence has been changed to:" The P. pyriformis HPAD sequence is closely related to its homolog in the free-living archamoebid M. balamuthi (Supplementary file 4 K), the only eukaryote reported so far to be able to produce p-cresol (Nývltová et al. 2017)."

      o Line 280-281: "UbiE homologs occur also in some additional metamonads, including the oxymonad B. nauphoetae and certain fornicates." These sequences should be clearly highlighted in the tree.

      We discovered these additional UbiE homologs only after the tree presented in the supplement had been constructed, so these sequences are missing from it. To ensure consistency we have decided to remove the remark on the presence of UbiE homologs in metamonads other than P. pyriformis, so it is no longer part of the revised manuscript.

      o Lines 538-544: A three-gene system is mentioned, but only two AmmoMemoRadiSam trees are found.

      This part has been removed while streamlining the manuscript.

      • Supplementary file 6: I find it difficult to find the proteins discussed in the text, for example "the biosynthesis of p-cresol from tyrosine (line 254-255)".

      Abbreviations identifying the different enzymes have now been added to all mentions in the text, facilitating their localization in the supplementary file: "P. pyriformis encodes a complete pathway required for the biosynthesis of p-cresol from tyrosine (Supplementary file 6), only the second reported eukaryote with such capability. This pathway consists of three steps of the Ehrlich pathway (Hazelwood et al. 2008) converting tyrosine to 4-hydroxyphenyl-acetate (AAT, HPPD, ALDH) and the final step catalyzed by a fusion protein comprised of 4-hydroxyphenylacetate decarboxylase (HPAD) and its activating enzyme (HPAD-AE)."

      • Supplementary file 11: Which group of species are highlighted in red? How do I know from which species these sequences are (I can make educated guesses, but prefer full species names). I do not find any reference to this file in the main manuscript.

      We apologise for this inconvenience. The taxon labels in the trees in this supplementary file have been corrected to contain full species names.

      Line 227-228: "630 OGs seem to be oxymonad-specific or divergent, without close BLAST hits". It is unclear if BLAST searches includes only a representative of each 630 OGs, or every single protein in these OGs.

      The BLAST searches include every single protein in the investigated OGs. We clarified it in the text: “Of these, 630 OGs seem to be oxymonad novelties or divergent ancestral genes, without close BLAST hits (e-value < 10^-15) to any of these sequences.”

      Line 243: I think it is five LGTs mapped to internal nodes of Preaxostyla in Figure 1 (1+3+1).

      You are correct, we apologize for the mistake. The sentence has been changed to: "Also, 46 LGT events were mapped to the terminal branches and 5 to internal nodes of Preaxostyla, suggesting that the acquisition of genes is an ongoing phenomenon, and it might be adaptive to particular lifestyles of the species."

      Lines 325-331: The argument would be stronger with a figure showing the fusion and the alignment indicating the conserved amino acids mentioned in the text.

      We agree with the reviewer but for the sake of space, we finally decided not to include a new figure.

      Line 425: "none of the species encoded" should be replaced by something like "none of the enzymes could be detected in any of the species" (the datasets are incomplete).

      The sentence has been changed to: "None of the alternative enzymes mediating the conversion of pyruvate to acetyl-CoA, pyruvate:NADP+ oxidoreductase (PNO) and pyruvate formate lyase (PFL), could be detected in any of the studied species."

      Line 455: "suggesting a cytosolic localization of these enzymes in Preaxostyla." The absence of a phylogenetic affiliation with the S. salmonicida homolog does not preclude a MRO localisation.

      The sentence was changed to: "Phylogenetic analysis of Preaxostyla ACSs (Supplementary file 4 B) shows four unrelated clades, none in close relationship to the S. salmonicida MRO homolog, consistent with our assumption that these enzymes are cytosolic in Preaxostyla."

      Lines 570-571: "Manual verification indicated that all the candidates recovered in oxymonad data sets are false positives" Using which criteria?

      The manual verification was based on the annotation of predicted proteins by BLAST and InterProScan. If the annotations did not correspond to the suggested function, they were considered false positives. For example, the protein BLNAU_15573 of Blattamonas nauphoetae was detected by Sam50 HMM profile and thus was considered a candidate for Sam50 proteins. Its functional annotation from BLAST was, however, unrelated to Sam50 (“putative phospholipase B”). Therefore, this candidate was concluded as a false positive hit of the HMM search resulting from the very high sensitivity of this method.

      We clarified this in the Results:

      “Reciprocal BLASTs indicated that all the candidates recovered in oxymonad data sets are very likely to be false positives based on the annotations of their top BLAST hits (mainly vaguely annotated kinases, peptidases and chaperones) (Fig. 6, Supplementary file 9).”

      and in Material and Methods:

      Any hits received by the methods described above were considered candidates and were further inspected as follows. All candidates were BLAST-searched against NCBI-nr and the best hits with the descriptions not including the terms 'low quality protein', 'hypothetical', 'unknown', etc. were kept. For each hit, the Gene Ontology categories were assigned using InterProScan-5.36-75.0. If the annotations received from BLAST or InterProScan corresponded to the originally suggested function, the candidates were considered as verified. Otherwise, they were considered as false positives.
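
      The verification logic described above could be sketched roughly as follows. This is an illustration only: the actual check was a manual comparison of annotations, whereas the sketch substitutes a simple keyword match, and all function names, the keyword lists and the input format are assumptions.

```python
# Terms that mark a BLAST description as uninformative (from the text above;
# the "etc." in the original is not expanded here).
UNINFORMATIVE = ("low quality protein", "hypothetical", "unknown")


def best_informative_hit(descriptions):
    """Return the first (best-ranked) description without an uninformative term,
    or None if every description is uninformative."""
    for desc in descriptions:
        if not any(term in desc.lower() for term in UNINFORMATIVE):
            return desc
    return None


def classify_candidate(expected_keywords, blast_descriptions, interpro_annotations):
    """Label a candidate 'verified' if any annotation mentions the expected
    function, otherwise 'false positive'.

    Keyword matching is an assumption standing in for manual inspection.
    """
    annotations = list(interpro_annotations)
    best = best_informative_hit(blast_descriptions)
    if best is not None:
        annotations.append(best)
    text = " ".join(annotations).lower()
    if any(keyword.lower() in text for keyword in expected_keywords):
        return "verified"
    return "false positive"


# The Sam50 candidate mentioned in the response, with its reported BLAST annotation.
print(classify_candidate(["Sam50"], ["hypothetical protein", "putative phospholipase B"], []))
```

      In this toy run the uninformative first hit is skipped and the remaining annotation does not match the expected function, so the candidate is labelled a false positive, mirroring the BLNAU_15573 example.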

      Lines 743-755: "Similar observations were made in other protists with highly reduced mitochondria, such as G. intestinalis or E. histolytica,..." References are needed.

      This part of the manuscript has been removed while streamlining the text.

      Line 849: How was the manual curation done for the gene models in the training set?

      The sentence has been changed to: "For de novo prediction of genes, Augustus was first re-trained using a set of gene models manually curated with regard to mapped transcriptomic sequences and homology with known protein-coding genes."

      Lines 853-856: It is a bit unclear which dataset was used for BUSCO and downstream analysis. Was it the Augustus-predicted proteins, or the EVM polished?

      The sentence has been changed to: "The genome completeness for each genome was estimated using BUSCO v3 with the Eukaryota odb9 dataset and the genome completeness was estimated on the sets of EVM-polished protein sequences as the input."

      Line 858: What is meant by KEGG and similarity searches being used in parallel (what if both gave a functional annotation)?

      A sentence has been added for clarity: "KEGG annotations were given priority in cases of conflict."

      Lines 861-862 and 1007-1008: Which genes or sub-projects does this apply to? How many genes were detected in this procedure?

      The sentence has been changed to make this clear: "Targeted analyses of genes and gene families of specific interest were performed by manual searches of the predicted proteomes using BLASTp and HMMER (Eddy 2011), and complemented by tBLASTn searches of the genome and transcriptome assemblies to check for the presence of individual genes of interest that were potentially missed in the predicted protein sets (single digits of cases per set). Gene models were manually refined for genes of interest when necessary and possible."

      Lines 878-879: It is not clear to me why the sum of the two described numbers should be as high as possible and would appreciate an argument or a reference.

      When optimizing the inflation parameter of OrthoMCL, we reasoned that the optimal level of grouping/splitting for our purpose should result in the highest number of orthogroups containing all representatives of the group of interest (i.e. Preaxostyla) but no other species (pan-Preaxostyla orthogroups). With decreasing inflation values, more and more pan-Preaxostyla OGs are merged with other groups (an indication of overgrouping); with increasing values, pan-Preaxostyla OGs are split apart (an indication of oversplitting). Because we were optimizing the inflation parameter for Preaxostyla and Oxymonadida at the same time, we maximized the sum of pan-Preaxostyla and pan-Oxymonadida groups.
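
      The selection criterion described above can be written down as a short sketch. It assumes that the clustering has already been run at several inflation values and that each orthogroup is summarized by the set of species it contains; the function names and the input format are illustrative assumptions, not part of OrthoMCL.

```python
def count_pan_groups(orthogroups, clade):
    """Number of orthogroups whose species set equals the clade exactly,
    i.e. all clade members are present and no other species is."""
    return sum(1 for species in orthogroups if set(species) == clade)


def best_inflation(clusterings, preaxostyla, oxymonadida):
    """Pick the inflation value maximizing the sum of pan-Preaxostyla and
    pan-Oxymonadida orthogroups.

    `clusterings` is a hypothetical dict: inflation value -> list of species sets.
    Ties are resolved in favour of the lowest inflation value.
    """
    def score(inflation):
        groups = clusterings[inflation]
        return count_pan_groups(groups, preaxostyla) + count_pan_groups(groups, oxymonadida)

    return max(sorted(clusterings), key=score)


# Toy input with abbreviated species labels, not real data.
prea = {"Me", "Bn", "Ss", "Pp", "Tm"}
oxy = {"Me", "Bn", "Ss"}
toy = {
    1.5: [prea | {"Gi"}, oxy],         # overgrouping: an outsider joins the clade group
    2.0: [prea, oxy, oxy],
    4.0: [{"Me", "Bn"}, {"Ss"}, prea]  # oversplitting: the oxymonad group falls apart
}
print(best_inflation(toy, prea, oxy))
```

      The score drops on both sides of the optimum, which is what makes the sum usable as a criterion: overgrouping and oversplitting each destroy clade-exact orthogroups.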

      Lines 879-881: "Proteins belonging to the thus defined OGs were automatically annotated using BLASTp searches against the NCBI nr protein database (Supplementary file 1)." Why were these annotated in a different way (compare lines 857-859).

      This little inconsistency resulted from the fact that these parts of the analyses were performed by different researchers who did not cross-standardize the procedures. This inconsistency has no effect on the downstream analyses and conclusions as the annotations from Supplementary file 1 were not used in any further analyses.

      Lines 894-957: "Detection of lateral gene transfer candidates": • It is not clear which sequences were tested in the procedure. All Preaxostyla, or all metamonada? I think I am confused because in the result sections you only report numbers for Preaxostyla, but in the method section metamonada is mentioned repeatedly.

      Thank you for noticing. There was indeed some inconsistency in our writing.

      We did an all-against-all search using all metamonads. However, we filtered out all homologous families in which Preaxostyla were not present or that had no hit against GTDB. So in the end, the LGT search was restricted to protein families containing Preaxostyla homologues. We corrected the wording in our method section.

      • It would be easier to follow the procedure if numbers are provided for the different steps.

      We are not sure what numbers the reviewer refers to here.

      • Why were only small oxymonad proteins discarded (line 900)?

      This is indeed a mistake. We meant “Preaxostyla proteins”. This is because we only considered Preaxostyla sequences with significant hits against GTDB as a starting point, so we aimed to first remove those that might be too short to yield reliable phylogenies.

      • Line 911: How many sequences were collected?

      Up to 10,000 hits were retained. We have added that information to the text.

      • Lines 916-919: What is the difference between the protein superfamilies (line 916) and the OGs (line 919)? Are the OGs the same orthogroups that are described earlier in the method section? How is the redundancy of NCBI nr entries retrieved in different searches dealt with?

      We understand the confusion here. It primarily stemmed from two different ways of establishing homologous families across the manuscript, because different researchers were responsible for different parts. Protein superfamilies that were used for reconstructing the single protein trees used for the LGT analyses were assembled based on the procedure described in lines 916-919 (“Protein superfamilies were assembled by first running DIAMOND searches of all metamonad sequences against all (-e 1e-20 --id 25 --query-cover 50 --subject-cover 50). Reciprocal hits were gathered into a single FASTA file, as well as their NCBI nr homologues.”). However, this was a somewhat stricter procedure than the one used to establish the OGs that are discussed in the rest of the manuscript (because of the e-value and identity cut-off used), so we eventually enriched the datasets with the putatively missing metamonad sequences that were present in the OGs but not in the initial superfamily assembly. However, since these were often more divergent sequences, we did not use these as queries for our BLAST searches against prokaryotes.
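
      The thresholds quoted above can be illustrated with a small sketch that filters all-against-all hits and keeps reciprocal pairs. The input tuple layout, the function names, and in particular the grouping of reciprocal pairs into families by connected components are assumptions made for illustration; the response does not state how families were delineated.

```python
from collections import defaultdict


def passes(hit, max_evalue=1e-20, min_id=25.0, min_qcov=50.0, min_scov=50.0):
    """Apply the quoted cut-offs to one hit given as
    (query, subject, evalue, percent_identity, query_cover, subject_cover)."""
    _, _, evalue, pident, qcov, scov = hit
    return evalue <= max_evalue and pident >= min_id and qcov >= min_qcov and scov >= min_scov


def reciprocal_pairs(hits):
    """Return unordered pairs of sequences that hit each other in both directions."""
    kept = {(h[0], h[1]) for h in hits if h[0] != h[1] and passes(h)}
    return {frozenset(pair) for pair in kept if (pair[1], pair[0]) in kept}


def families(pairs):
    """Group sequences into families as connected components of the
    reciprocal-hit graph (single linkage; an assumption of this sketch)."""
    graph = defaultdict(set)
    for pair in pairs:
        a, b = tuple(pair)
        graph[a].add(b)
        graph[b].add(a)
    seen, result = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result
```

      Because a pair is kept only when both directions pass every cut-off, divergent homologues are easily lost, which is consistent with the response's remark that the superfamilies were later enriched with sequences from the OGs.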

      Line 987-989: "...was facilitated by Rsg1 being rather divergent from other Ras superfamily members" This statement is vague. What does it mean in practise?

      The sentence has been changed to: "The discrimination was facilitated by Rsg1 having low sequence similarity to other Ras superfamily members (such as Rab GTPases)."

      Lines 1037-1038: Why were these proteins re-annotated?

      They were not. We are sorry for this mistake, which has been fixed in the revised manuscript.

      Figures: The figures would be easier to follow if the colour coding for the five different species were consistent between the figures.

      This is a good point, the colour coding has been unified across all figures.

      Figure 1: It appears that the Venn diagram in C only shows the Preaxostyla-specific proteins in B, not all OGs that contain Preaxostyla proteins. This is not clear from the legend or from the figure itself. The same comment applies to D.

      The interpretation of the figure by the reviewer is correct; we have modified the legend to make the meaning of the figure easier to understand.

      Figures 2 and 6: It would be clearer with panel labels A, B, etc, instead of "upper" and "lower" panel, as in the other figures.

      This is a fair point, we have added the alphabetical labels proposed by the reviewer to the figures.

      Figure 6: What is the colour code in the figure? The numbers within the boxes are not aligned.

      We have added an explanation of the color code to the legend and edited the figure to make it aesthetically more pleasing.

      Supplementary figures 1-3: What do green and magenta indicate in the figure?

      As with the previous figure, the color code is now explained in the revised legend.

      ** Referees cross-commenting** I agree with the other reviewers that the discussion of the functional and ecological implications of the LGTs could be developed.

      We understand the reviewers but as already explained in response to Reviewer 1, we have decided not to extend the already rather long manuscript further. We believe that the several exemplar LGT cases that we do discuss in detail provide a good impression of the significance of LGT in the evolution of Preaxostyla.

      In contrast to reviewer 2, I do not see that the authors discuss their result in the context of eukaryogenesis in this manuscript. Maybe the reference reviewer 2 mention could be cited in the introduction together with Hampl et al. 2018 to acknowledge that there are different views about the importance of secondarily amitochondrial eukaryotes on our thinking about the origin of eukaryotes. I disagree with reviewer 2's objection against the wording "... and undergo pronounced morphological evolution" because I think Fig. 4 in Hampl 2017 shows a large morphological diversity among oxymonads.

      We are glad to see that our perspective is shared by other colleagues in the field. Nevertheless, having carefully considered the case, we have decided to remove any mention of eukaryogenesis from the revised manuscript, as we admit this topic is peripheral to the key message of our present study. On the other hand, we very much appreciate the reviewer's note on the large morphological diversity among oxymonads – we have now added a similar remark to the revised manuscript (the last sentence of Conclusions).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Using draft genome sequencing of the free-living Paratrimastix pyriformis and the sister lineage oxymonad Blattamonas nauphoetae, Novak et al. infer the metabolic potential of the two protists using comparative genomics. The authors conclude that the common oxymonad ancestor lost the mitochondrion/mitosome and discuss general strategies for adapting to a commensal/symbiotic lifestyle employed by this taxon. Some elaborations on pathways go on for several paragraphs and feel unnecessarily stretched, which made those sections of the paper rather difficult to digest. This might also be because the work, and all conclusions drawn, depend entirely on incomplete (ca. 70-80%) genome data and simple similarity searches, and e.g. no kind of biochemistry or imaging is presented to underpin the manuscript's discussion. This is noteworthy in light of other protist genome reports published in the last few years that differ in this respect, including previous work by this group. And for sequencing-only data, this paper - https://doi.org/10.1016/j.dib.2023.108990 - might offer an example of where we are at in 2023. With respect to previous work of the group (Karnkowska et al. 2016 and 2019), this submission is very similar (analysis pattern, even some figures and more or less the conclusion); that is to say, the overall progress for the broader audience is rather incremental. Then there are also some instances where the data presented conflict with the authors' own interpretation. The text (including spelling and grammar) needs some attention and the choice of words is sometimes awkward. The overuse of quotation marks ("classical", "simple", "fused", "hits", "candidate") is confusing (e.g. was the BLAST result a hit or a "hit"?).

      In its current form the manuscript is, unfortunately, very difficult to review. This reviewer had to make considerable efforts to go through this very large manuscript, mainly because of issues affecting the presentation and the lack of clarity and conciseness of the text. It would be greatly appreciated if the authors would make more effort upfront, before submission, to make their work more easily accessible to readers and to facilitate the task of the reviewers.

      About a fifth of each of the two genomes is missing according to the authors' prediction (Table 1). Early on they explain the (estimated) incompleteness of the genomes as a result of core genes being highly divergent. In light of this already suspected high divergence, using (the simplest NCBI) sequence similarity approach to call out the absence of proteins (for any given lineage) may need lineage-specific optimization. The use of more structural motif-guided approaches such as hidden Markov models could help, but it is not clear whether these were used throughout or only for the search for (missing) mitochondrial import and maturation machinery. The authors state that the low completeness numbers are common among protists, which, if true, raises several questions: how useful are such tools/estimates to begin with, and does this not render some core conclusions problematic? The reader is just left with this speculation in the absence of any plausible explanation except for some references on other species for which, again, no context is provided. Do they have similar issues, such as GC content, the same core genes missing, or phylogenetic relevance? No information is provided; the reader is expected to simply accept this as a fact and then also accept that, despite this flaw, all conclusions of the paper that rest on the presence/absence of genes are fine. This is all odd and further skews the interpretations and the comparative nature of the paper.

      As a side note, this will also influence the number of proteins absent in other lineages and as such has consequences for LGT calls versus de novo invention. For the cases with LGT as an explanation, it would help to briefly discuss the candidate donors and some details of the proteins in the eco-physiological context (e.g. lines 263-268 suggest that HPAD may have been acquired by EGT which was facilitated by a shared anaerobic habitat and also comment on adaptive values for acquiring this gene). Exchanging metabolic genes via LGT (Line 163) blurs the differences between the roles and extent of LGT in prokaryotes vs eukaryotes, and therefore is exciting and could use support/arguments other than phylogenies. I guess the number of reported LGTs among protists (whatever the source) over the last decade has by now deflated the novelty of the issue more generally; a report of the numbers is expected but they alone won't get you far anymore in the absence of a good story (such as e.g. work on plant cell wall degrading enzymes in beetles). It would help to clarify which parts of the mitochondrial ancestor were reduced during the process of reductive evolution at what time in their hypothesized trajectory. For instance, losing enzymes of anaerobic metabolism conflicts with the argued case of an aerobic (as opposed to facultative anaerobic) mitochondrial ancestor followed by gains of anaerobic metabolism in the rest of the eukaryotes via LGT, and with some papers the authors themselves cite (e.g. the series by Stairs et al.). There is no coherent picture on LGT and anaerobic metabolism, although a reader is right to expect one.

      In light of their data the authors also discuss the importance of the mitochondrion with respect to the origin of eukaryotes:

      First, the mitochondrion brought thousands of genes into the marriage with an archaeon, surely hundreds of which provided the material to invent novel gene families through fusions and exon shuffling and some of which likely went back and forth over the >billion years of evolution with respect to localizations. The authors look at a minor subset of proteins (pretty much only those of protein import, Fig. 6) to conclude, in the abstract no less: "most strikingly the data confirm the complete loss of mitochondria and every protein that has ever participated in the mitochondrion function for all three oxymonad species." I do not question the lack of a mitochondrion here, but this abstract sentence is theatrical in nature, nothing that data on an extant species could ever prove in the absence of a time machine, and is evolutionarily pretty much impossible. A puzzling sentence to read in an abstract and in the context of endosymbiont-associated evolution.

      Second, using oxymonads as an example that a lineage can present eukaryotic complexity in the absence of mitochondria and conflating it with eukaryogenesis is a logical fallacy. This issue already affected the 2019 study by Hampl et al. We have known that a eukaryote can survive without an ATP-synthesizing electron transport chain ever since Giardia and other similar examples, and the loss of Fe-S biosynthesis and the last bit of mitosome (secondary loss) doesn't make a difference to how we think about eukaryogenesis. It confuses the need and cost to invent XYZ with the need and cost of maintenance. How can the authors write "... and undergo pronounced morphological evolution", when they evidently observe the opposite and show so in their Fig. 1? The authors only present evidence for reductive evolution of cellular complexity with the loss of a stacked Golgi. What morphological complexity did oxymonads evolve that is absent in other protists? A cytosolic metabolic pathway doesn't count in this respect, because it is neither morphological, nor was it invented but likely gained through LGT according to the authors. This is quite confusing to say the least. A recent paper (https://doi.org/10.7554/eLife.81033) that refers to Hampl et al. 2019 has picked this up already, and I quote: "Such parasites or commensals have engaged an evolutionary path characterized by energetic dependency. Their complexity might diminish over evolutionary timescale, should they not go extinct with their hosts first." Here the authors raise a red flag with respect to using only parasites and commensals that rely on other eukaryotes with canonical mitochondria as examples. If we now look at Fig. 1 of this submission, Novak et al. underpin this point perfectly, as the origin of oxymonads is apparently connected to the strict dependency on another eukaryote (or am I wrong?), and they support the prediction with respect to complexity reducing after the loss of mitochondria - mitosome gone, Golgi almost gone. What's next? This is a good time to remember that extant oxymonads are only a single picture frame in the movie that is evolution, and their evolution might be a dead end or result in a prokaryote-like state should they survive 100,000s to millions of years to come.

      Some more thoughts:

      Lines 47-52: Hydrogenosome or mitosome is a biological and established label like (m)any others and I find the use of the word "artificial" in this context strange. While the authors are correct to note that there is an (evolutionary) continuum in the reduction - obviously it is step by step - they exaggerate by referring to the existing labels as "artificial". You make Fe-S clusters but produce no ATP? Well, then you're a mitosome. It's a nomenclature that was defined decades ago and has proven correct and works. If the authors think they have a better scheme and definition, then please present one. Using the authors' logic, terms such as amyloplast or the TxSS nomenclature for bacterial secretion systems are just as artificial. As is, this comes across as grumbling for no good reason.

      Line 158: A duplication-divergence may also explain this since sequence similarity-based searches will miss the ancestral homologues.

      Lines 201-202: The presence of GCS-L in amitochondriates should be explained in light of this group once having had a mitochondrion, which then makes ancestral derivation and differential loss (as invoked for Rsg1) also a likely explanation along with eukaryote-to-eukaryote LGT.

      Lines 356-392: Describes plenty of genomic signal for Golgi bodies but simultaneously cites literature suggesting the absence of a morphologically identifiable Golgi in oxymonads. An explicit prediction regarding what to observe in TEM for the mentioned species might be nice to stimulate further work.

      Line 414: The preceding paragraphs in this results section describe only the distribution, without mentioning origins - a sweeping one-line summary that proclaims different origins needs some context and support. Furthermore, the distribution of glycolytic enzymes might indeed be patchy, but to suggest it represents an 'evolutionary mosaic composed of enzymes of different origins' without discussing the alternative of a singular origin and different evolutionary paths (including a stronger divergence in one vs. another species) discredits existing literature and the authors' own claim with respect to why BUSCO might fail in protists.

      Line 486: How uncommon are ADI and OTC in lineages sister to Metamonada?

      Line 504: It might help an outside reader to include a few lines on the consequences and importance of having 2Fe-S vs 4Fe-S clusters and to set an expectation (if any) for oxymonads.

      Any explanations on what unique selection pressures and gene acquisition mechanisms may be operating in P. pyriformis which might allow for the unique metabolic potential?

      **Referees cross-commenting**

      To R3: Hampl et al. 2019, to which Novak et al. refer, is about eukaryogenesis and that is exactly the context in which this is discussed again and what Raval et al. 2022 had decided to touch upon. If the authors do not bring this up in light of the ability to evolve (novel) eukaryote complexity, then what else? Maybe they can elaborate, especially with respect to energetics, to which they explicitly refer in 2019 (and here). And with respect to textbook eukaryotic traits (and the evolution of new morphological ones), I do not see any new ones evolving in any oxymonad, but reduction as Novak et al. themselves picture it in this submission. Is a change in the number of flagella pronounced morphological evolution? Maybe for some, but I believe this needs to be seen in light of the context of how they discuss it. I see a reduction of eukaryotic complexity and not a gain. They have an elaborate section on the loss of Golgi characteristics (and a figure), but I fail to read something along the same lines with respect to the gain of new morphological traits. Again, novel LGT-based biochemistry does not equal the invention of a new morphology such as a new compartment. Oxymonads depend on mitochondria-bearing eukaryotes for their survival, or don't they? This is the main point, and if the evidence shows that I am wrong, then I will be the first to adapt my view to the data presented.

      I have concerns with the presentation of a narrative that in my opinion is too one-sided and that has been publicly questioned in the community (in press, at meetings, personally). For the benefit of science and of the young authors on this study, this reviewer feels strongly that these issues should be taken very seriously and discussed openly in a more balanced way. We only truly move forward on such complex topics if we allow an open and transparent discussion.

      Having said that, I am happy that R3 has picked up exactly the same major concerns as I did with respect to e.g. the phrasing on mito (gene) loss and the BUSCO controversy.

      Significance

      Using draft genome sequencing of the free-living Paratrimastix pyriformis and the sister lineage oxymonad Blattamonas nauphoetae, Novak et al. infer the metabolic potential of the two protists using comparative genomics. The authors conclude that the common oxymonad ancestor lost the mitochondrion/mitosome and discuss general strategies for adapting to a commensal/symbiotic lifestyle employed by this taxon. Some elaborations on pathways go on for several paragraphs and feel unnecessarily stretched, which made those sections of the paper rather difficult to digest. This might also be because the work, and all conclusions drawn, depend entirely on incomplete (ca. 70-80%) genome data and simple similarity searches, and e.g. no kind of biochemistry or imaging is presented to underpin the manuscript's discussion.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)): **

      This study evaluates the effect of fungal toxin candidalysin on neutrophils. The authors show that candidalysin induces NETosis when secreted by hyphae, but when candidalysin is added on its own, NLS are formed instead which are distinct from NETs. The authors have done lots of carefully controlled experiments, and delineated key components of the pathway inducing NLS, including the role of ROS and histone modifications. The data provided is high quality and well presented in figures.

      Reviewer #1 (Significance (Required)):

      Strengths are the depth of analysis - many different aspects of NETosis are assessed and robustly tested.

      Comments: 1. I was a bit confused by what should be the main message of the paper - is it that candidalysin on its own doesn't induce NETosis but only NLS? The answer to this question wasn't well addressed in my opinion, but the paper switches between using live fungi and purified candidalysin so it became confusing at times.

      *

      Responses:

      Thank you for this important comment. We have clarified our narrative on candidalysin throughout the manuscript to provide a common thread for the readership. Our message is that candidalysin alone does not have the capacity to induce the full cycle of signalling events that results in canonical NET formation. Our data show that candidalysin alone falls short and can only produce NLS. On the other hand, our data show that in the context of growing C. albicans cells candidalysin is able to promote the release of NETs. This is important, since the hyphal form of C. albicans has previously been reported to be a formidable inducer of NETs, whereas the yeast form was not. Our data put candidalysin at the centre of this observation, showing that it indeed is a major contributor to NET formation when present with growing C. albicans cells. Since candidalysin expression and release are strictly connected to hyphal growth, our new data agree with previous assessments and provide new insight into how this hyphae-specific inductive effect is accomplished.

      In the revised manuscript, at first, we describe the difference of neutrophil stimulation when using strains expressing and lacking candidalysin as compared to candidalysin stimulation alone. We added or modified the following phrases:

      • Line (163) “As candidalysin-expressing C. albicans strains induced more NETs than candidalysin-deficient strains, we investigated the role of the toxin alone in stimulating neutrophil extracellular trap release”.
      • Line (173) “In order to ensure consistency in NET/NLS quantification, NLS were quantified with the same criteria as previously described for NETs.”
      • Line (182) “In summary, candidalysin alone triggers morphologically distinct NLS in a time- and dose-dependent manner, whereas candidalysin-producing C. albicans hyphae induce canonical NETs (Fig. 1a).”

      Next, we describe the different morphology of NLS triggered by candidalysin alone in comparison to canonical NETs triggered by C. albicans strains expressing candidalysin. We added or modified the following phrases:

      • Line (198) “To investigate candidalysin-triggered NLS in further detail, we used scanning electron microscopy (SEM) that allows a more detailed view of the neutrophil-derived structures (Fig 3a).”

      Furthermore, to prevent switching between experiments using candidalysin alone and experiments with different Candida strains, we have moved the next paragraph “Candidalysin-expressing strains induce more NETs and higher citrullination levels than candidalysin-deficient strains” to the end of the results section (old Fig. 4 is now new Fig. 8). In doing so, we focus on the direct morphological, signalling and functional effects of candidalysin alone on neutrophils and, towards the end, we analyse how the strains are affected by different neutrophil killing mechanisms (phagocytosis and NETs). Subsequently, we synthesize our findings by showing that candidalysin is the main driver of histone citrullination, quantifying this histone modification in the context of NET induction and comparing candidalysin-expressing and -deficient strains. We conclude that citrullination-induced chromatin decondensation in combination with candidalysin-induced ROS production are most probably the main contributors to the increased NET formation stimulated by C. albicans hyphae expressing candidalysin. This is also a good conclusion of the manuscript, showing that candidalysin alone is not enough but that together with growing C. albicans cells it contributes to NET induction, and that increased NETs in turn inhibit growth and limit spreading of C. albicans.

      • We modified and added the following sentences to the discussion section: (line 457) “The data suggests that candidalysin is the key driver of histone citrullination in neutrophils infected with C. albicans and that addition of evenly distributed, external candidalysin in high concentration (15 µM) drives neutrophils towards NLS despite the presence of C. albicans cells. We conclude that, during infection, candidalysin-triggered Ca2+ influx and histone hypercitrullination amplify processes in neutrophils which are induced by C. albicans hyphae. These amplified processes culminate in a strongly increased release of NETs that in turn are formidable weapons to control hyphal filaments.”

        *2. If candidalysin on its own only induces NLS - what is the relevance of this for disease? A lot of work has been provided on the pathway driving NLS formation, but it wasn't clear to me why this is important. More in discussion needed or evidence of disease relevance. *

      Responses:

      Thank you for giving us the opportunity to clarify this issue. Candidalysin expression strongly increases with and is restricted to hyphal growth, which is the adhesive growth form of C. albicans. Given that epithelial cells expunge candidalysin for their own protection while hyphae remain attached, it is possible that neutrophils are exposed to candidalysin before they encounter C. albicans cells. Therefore, it is relevant to understand how candidalysin per se shapes neutrophil responses. We have added the following sentence to the discussion section: (line 527) “As epithelial cells are able to expunge candidalysin for protection while C. albicans hyphae remain adherent [46], recruited neutrophils may encounter candidalysin before direct contact with hyphae.”

      With regard to relevance for candidiasis, the observation that candidalysin-deficient strains are poor inducers of NETs is most important. Since candidalysin expression is entirely restricted to hyphal growth, this finding gives crucial new insight into the previous observation that hyphae of C. albicans are better NET inducers than the yeast form. In this context, we wanted to make it very clear to the reader that this effect only works when C. albicans cells and candidalysin are combined and that candidalysin alone does not lead to full-blown NET formation. Therefore, we have included a thorough investigation of the effects of candidalysin on neutrophils to be able to better contextualize our findings comparing candidalysin-expressing and candidalysin-deficient strains.

      To make this point clearer, we added the following sentence to the summary at the end of the discussion section: (line 565) “Neutrophils encountering candidalysin-expressing hyphae are able to adequately respond by releasing increased amounts of NETs whereas secretion of candidalysin does not allow hyphae to evade neutrophil attack.”

      In addition, we are convinced that moving former Fig. 4 to the end of the results section (now Fig. 8) helps the reader to better understand why it is important to first delineate the effect of candidalysin alone on neutrophils and then to conclude the manuscript with experiments using different C. albicans strains to put the findings into context.

      To add more substance to our conclusions, we wrap up the new version of the manuscript with data comparing wild-type and candidalysin-deficient strains in neutrophil antimicrobial assays and quantification of histone citrullination. With the newly added antimicrobial assays we demonstrate that candidalysin expression does not affect phagocytic killing (Fig. 7d and 7e) as assessed by plating assays and that candidalysin does not affect inhibition by PMA-induced NETs (Fig. 7f and 7g). Thus, as stated above, during the interaction of hyphae and neutrophils candidalysin promotes the release of more NETs, but does not otherwise affect anti-Candida activity by neutrophils. Increased NETs in turn, however, inhibit growth and limit spreading of C. albicans. The manuscript now ends with the data on differences in histone citrullination when using wild-type and candidalysin-deficient strains, indicating that citrullination-induced chromatin decondensation in combination with C. albicans cells ultimately leads to increased NET release.

      We added the following text to the manuscript: (Line 416) “To assess whether candidalysin deficiency affects C. albicans’ susceptibility to neutrophil attack, we performed two antimicrobial assays. In the first assay we determined NET-mediated anti-Candida activity by preformed NETs comparing wild-type and candidalysin-deficient strains. We used the same image-based analysis with calcofluor white staining. To be able to better observe differences in susceptibility of the different strains we used a slightly higher MOI than for the previous NET inhibition assays, which explains the higher survival percentage (Fig. 7c, black bars on the right side). As expected, candidalysin did not affect the inhibitory effect on C. albicans imposed by NETs (Fig. 7c). In the second assay, we determined short-term anti-Candida activity of intact neutrophils, which is predominantly phagocytic elimination, by serial dilution and plating for colony counts. Candidalysin-deficient and wild-type strains are killed similarly over the time of 1 to 4 h, both at MOI 1 and 3 (Fig. 7d and 7e). This indicates that candidalysin expression does not enable evasion from neutrophil phagocytic attack and this result agrees well with our previous finding that wild-type C. albicans engulfed by human neutrophils are unable to escape by hyphal outgrowth [16]. In conclusion, while candidalysin strongly increases the NET-inductive capacity of C. albicans hyphae, the toxin affects neither the anti-Candida effect of intact neutrophils nor that of NETs.”

      Notably, it is not informative to use C. albicans as inducer of NETs and as target of anti-Candida activity by NETs in the same assay, since both induction and anti-Candida activity are dependent on the amount of C. albicans cells. We therefore chose to show two separate assays where we (i) quantify short-term killing by plating (mainly phagocytosis) and (ii) quantify growth inhibition of C. albicans by pre-stimulated NETs.

      *3. In Figure 2, it would be helpful to include images of ionomycin-stim neutrophils for comparison of the NLS structures across different stim conditions. *

      Response:

      This is a very good point. We supply a structural comparison between NLS and NETs induced by PMA, ionomycin and candidalysin in Figure 3. Additionally, the time-dependent changes for ionomycin are now included in the supplementary Figure S1.

      4. There are a few places where the reference manager has failed (see bottom of page 10, line 190 for example)

      Response:

      We have fixed this issue, thank you for pointing it out.

      *5. Lines 191-198 - I was confused here by the text. I thought the point was that candidalysin induced NLS similar to ionomycin, but here the point is being made that the two are different? This led me to being confused as to the point of all the comparisons made between ionomycin NLS and candidalysin NLS... this could be made clearer. *

      Responses:

      Thank you for highlighting this. According to previous literature ionomycin, a bacterial peptide toxin, was the most prominent example of induction of leukotoxic hypercitrullination. Therefore, we used ionomycin to put our findings with candidalysin, a fungal peptide toxin, into context. We find that candidalysin shares similarities with ionomycin but also shows some striking differences. We could not investigate the nature of these differences in more detail, which could be the basis of a follow-up study, but we think it is important to give the reader the comparison in order to better understand how candidalysin shapes neutrophil responses. One clear difference which we show in the manuscript is that candidalysin induces some ROS whereas ionomycin induces none at all (Fig. 4).

      We changed the text in the result section accordingly to make our point clearer: (line 203) “PMA exposure generated widespread chromatin fibers in the extracellular space (Fig. 3a, left panels) whereas ionomycin exposure resulted in more compact, patchy areas occasionally dispersed with long, thin chromatin fibres (Fig. 3b, middle panels). With regard to morphological changes, candidalysin treatment resulted in compact, fibrous structures resembling those stemming from ionomycin treatment, however long, thread-like structures were absent in candidalysin-treated neutrophil samples (Fig. 3a right panels, for 7 h treatment see Fig. S1c).”

      We did so as well in the discussion section: (Line 513) “While ionomycin- and candidalysin-induced NLS shared similar key features, such as increased histone citrullination, our study revealed striking differences between the two toxins. In contrast to ionomycin, candidalysin stimulation led to ROS production in neutrophils.”

      *6. Could the authors include some unstim neutrophil control images in Fig 3 for the SEM? Can the SEM sample processing affect neutrophil structure in any way? Feels like an important control although I don't have much experience with SEM personally *

      Response:

      This is of course a relevant control image. We have included an image showing unstimulated neutrophils from similar time points, but without exposure to candidalysin (Fig. 3). The unstimulated neutrophils are spherical and morphologically distinctly different from candidalysin-treated neutrophils.

      *7. I was very intrigued by the experiments where the authors added candidalysin in to neutrophils infected with ece1-null strain. Those experiments showed that candidalysin addition still drove NLS instead of NETosis. Can the authors investigate why this is? Is membrane intercalation different when candidalysin is delivered by hyphae vs added on its own? Could that explain some of the differences they have seen? *

      Responses:

      Thank you for this comment. Yes, there is a clear difference, since we add candidalysin to the medium such that the peptide is evenly distributed and reaches membranes rather evenly from the extracellular space. When released from growing C. albicans hyphae, candidalysin is predominantly released at hyphal tips, as demonstrated in the referenced article (doi.org/10.1111/cmi.13378). Hyphal tips in turn are readily attacked by human neutrophils (doi.org/10.1189/jlb.0213063). Hence, we can safely assume according to these previous publications that there will be a more uneven distribution of candidalysin concentrations over neutrophil membranes when the sole source of the toxin is growing hyphae interacting with neutrophils. It would of course be very interesting to know exactly how the toxin intercalates into membranes and which morphologies potential pores may have. These questions are currently under investigation in the laboratories of Profs Hube and Naglik. To include these findings here would certainly be far beyond the scope of this study.

      We added and modified the following sentences in the discussion of this manuscript to clarify the issue: (Line 541) “One of the main goals of the study was to delineate the contribution of candidalysin to neutrophil responses either as a factor released by C. albicans hyphae or as a singular peptide toxin. Our data demonstrate that candidalysin is the main driver of histone citrullination in neutrophils infected with C. albicans (Fig. 8). Lack of candidalysin production in C. albicans results in significantly reduced histone citrullination, accompanied by decreased NET formation. However, citrullination is not required for NET release, but rather governs the formation of NLS, which is dominant when candidalysin is added exogenously with even distribution throughout the cell suspension. With regard to C. albicans hyphae secreting candidalysin, local concentrations of the toxin are likely to vary to a large degree, particularly when the candidalysin-secreting hypha is engulfed by a neutrophil. Therefore, it may be difficult to discriminate NLS from NETs during the interaction of neutrophils and C. albicans, as both structures may be induced concurrently [10]. It seems logical that the pore-forming activity of candidalysin augments the release of NET fibres during C. albicans infection, where PRRs will additionally be triggered on neutrophils, resulting in combinatorial activation of downstream pathways. In line with this notion, candidalysin drives histone citrullination, which contributes to chromatin decondensation.”

      *8. Is phagocytosis needed for NETosis induction by candidalysin? What happens if you add beads or beta-glucan particles with candidalysin stimulation? Do you get NLS or NETs? *

      Responses:

      This is an interesting question. Physical contact is required for the induction of NET formation (10.1111/j.1462-5822.2005.00659.x, 10.1371/journal.ppat.1000639) and physical contact leads to pattern recognition unequivocally followed by phagocytic events in neutrophils. Hence, at the least indirectly, phagocytosis and NET formation are connected, but may not be so causally.

      While glucan-covered particles have been shown to induce NETs (10.1159/000365249), we show that C. albicans cells devoid of candidalysin induce NETs, but to a much lesser extent than wild-type C. albicans. In addition, the experiment shown in Fig. 8 shows exactly that. Instead of glucan-covered beads we used C. albicans cells (Fig. 8f), which are by nature glucan-covered.

      *9. Please confirm what the n numbers refer to in the figure legends - are these biological or technical replicates? How many experiments are the representative images representing? *

      Response:

      Thank you very much for pointing this out. We adapted our figure legends accordingly and added the number of biological and technical replicates (n=x(y), x=biological replicates, y=technical replicates). Each experiment has been performed with at least three biological replicates which includes the use of different neutrophil donors.


      *Reviewer #2 (Significance (Required)):

      *

      *The advantage of this work is the presentation of the mechanism associated with NLS formation in contact with candidalysin, where activation of NADPH oxidase and calcium influx have been documented to be important. This toxin can trigger ROS production and activate downstream signaling that is important for morphological changes and NLS formation. The important finding is also that NLS are resistant to nuclease treatment and increase the ability of neutrophils to control C. albicans hyphae formation and fungal cell growth. These findings provide a better understanding of the role of neutrophils in the treatment of infections caused by these microorganisms. Below I present minor suggestions that, in my opinion, will improve the text and correct the presentation of the results, making this set of results a valuable source for explaining such a complex problem.

      *

      Response:

      Thank you for this assessment. In cases which we have identified as crucial for our message we have decided to include additional experiments to better convey our message (Fig. 6e-f and Fig. 7d-g). We also included a time course for ionomycin stimulation of neutrophils in Fig. S1. We appreciate that the overall assessment was that no additional experiments were required.

      1/ The authors should decide what thesis about NLS they want to prove: 100 NLS are less fibrous and ....... than canonical NETs and are triggered in an NADPH oxidase-independent fashion.

      * 121 NLS were dependent on NADPH oxidase-mediated reactive oxygen species (ROS) production

      *

      Response:

      This was indeed imprecisely formulated from our side. NLS were previously described as NADPH-independent processes stimulated by toxins (see ionomycin). Candidalysin seems to trigger NADPH-dependent and NADPH-independent pathways. However, the main differentiation criteria were described through the hypercitrullination which we could observe for candidalysin. To clarify, we have modified the following sentence: (line 121) ”In contrast to previously described stimuli of NLS, candidalysin induced NLS in partial dependence on NADPH oxidase-mediated reactive oxygen species (ROS) production, whereas PAD4-mediated histone citrullination could be observed as well. Notably, candidalysin alone failed to induce NETs as indicated by a lack of cell cycle activation determined via lamin A/C phosphorylation assays.”

      *2/ for the experiment described in the line below, MOI 2 was chosen; did the authors conduct an analysis of the response/eventual change in it, depending on the MOI?

      *

      Response:

      Yes, from our experience in in vitro experiments with human neutrophils MOI3 C. albicans overgrows too quickly. This is why an MOI 1-3 is the best option to analyse NET induction capacities.

      131 we infected neutrophils with wild-type C. albicans, ECE1-deficient (ece1ΔΔ), and corresponding revertant (ece1ΔΔ*+ECE1) strains,

      3/ Has the effect of deletion of ECE1 on other aspects of virulence, such as adhesion, virulence factor production, or biofilm formation, been analyzed? *

      Response:

      Yes indeed, the effect of candidalysin on other aspects has been studied. Candidalysin has no effect on adhesion and is expressed during biofilm formation. It has a broad effect on virulence in general and promotes neutrophil recruitment indirectly by a robust induction of damage responses. To clarify the number of studies investigating these other aspects and to pinpoint the knowledge gap for direct interaction of neutrophils and candidalysin we include the following sentence: (line 132) “C. albicans hyphae release candidalysin and while the effects of the toxin for instance on virulence in general and on adhesion to host cells have been widely studied 17,18,23,28,30, the direct impact of candidalysin on the neutrophil immune response towards C. albicans, remains poorly understood. To investigate the role of candidalysin, we infected neutrophils with wild-type C. albicans,…”

      *137 the ECE1- and candidalysin-deficient strains triggered reduced levels

      4/ Fig.1 - How were C. albicans cells stained? Does 100%NET mean the number of cells netting after PMA treatment? This information should be given.

      *

      Response:

      Thank you for pointing this out. We were a bit unclear here. We added details in the respective figure legend and method section. C. albicans cells were visualised with anti-Candida antibody (1 µg/mL, ProSci, Cat#35-645). Furthermore, C. albicans nuclei are stained by DAPI, too. 100% NETs would mean that every single neutrophil (an image event which stains for neutrophil markers) in the analysed microscopic picture shows NET or NET-like morphology. We did not normalize to PMA treated cells.

      5/ 168 dependent effect with increased NLS formation from 3 μM to 15 μ*M. However, the reduced NLS

      How was determined the limiting concentration value of the toxin, for which an increase in NLS was observed? Was a wide range of concentrations used in the analysis or was the determination made only for these three selected values? A complete concentration analysis should be performed. *

      Response:

      This is of course a valid point. We showed data on these concentrations as established from previous studies of our collaborators (10.1111/cmi.13378; 10.1038/nature17625; 10.1038/s41467-019-09915-2). Below 3 µM we did not observe measurable effects and therefore omitted these concentrations. Concentrations above 70 µM did not change the outcome any more than 70 µM did, so higher concentrations were omitted. We thus show 3 µM, at which we see mild effects, 15 µM (a 5-fold increase compared to 3 µM), at which we see profound effects, and 70 µM (again approximately a 5-fold increase compared to 15 µM), at which we see an overwhelming effect. Additional concentrations in between the applied concentration values would not add much new information.
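
      As a quick check of the step sizes quoted above, the fold change between consecutive concentrations can be computed directly. This is only an illustrative sketch: the helper name `fold_steps` is hypothetical, and the only inputs are the three concentrations stated in the response.

```python
# Illustrative only: spacing of the three candidalysin concentrations named above.
concentrations_um = [3.0, 15.0, 70.0]  # µM, taken from the response text


def fold_steps(values):
    """Return the fold change between each consecutive pair of values."""
    return [later / earlier for earlier, later in zip(values, values[1:])]


steps = fold_steps(concentrations_um)
for low, high, fold in zip(concentrations_um, concentrations_um[1:], steps):
    # Report each step with two decimals so "approximately 5-fold" is visible.
    print(f"{low:g} µM -> {high:g} µM: {fold:.2f}-fold")
```

      The first step is exactly 5-fold and the second is slightly below 5-fold, which matches the "approximately" wording used for the 15 µM to 70 µM step.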

      6/ 169 formation was observed at 70 μ*M (Fig. 2b), which can be explained by neutrophil cell death induced by the toxin as determined by a DNA Sytox Green assay (Fig. S1a).

      Was another viability test conducted? AnnexinV? Caspase 3/7? Sytox is not a specific staining in this regard. Furthermore, in Fig. S1a you state the kinetics of cell death, also after PMA treatment. On the one hand, you say that the production of candidalysin of NLS above 70 uM is reduced due to cell death, but at the same time you define as cell death the changes under PMA, which induce netosis. Please explain this reasoning better. *

      Responses:

      Thank you for pointing this out. We have no indication that candidalysin stimulates apoptosis in neutrophils. Therefore, no AnnexinV/Caspase 3/7 stain was performed. What we wanted to emphasize is that at 70 µM candidalysin the cytotoxic character of candidalysin is overwhelming leading to rather quick cell death, as assessed by the Sytox assay. Sytox is specific in the regard that it determines whether the plasma membrane is permeable and gives the stain access to the nuclear DNA to result in a positive signal. We use this assay to quantify NET formation, since it is a quantitative assay and less laborious than microscopy. However, we always back up NET assays with microscopic, image-based analyses and do not use the Sytox assay as standalone experiment for NET quantification, since the Sytox assay is not specifically staining netting cells, but it also stains other types of cell death.

      We clarify this in the text as follows: (line 659) “Neutrophil cell death or the presence of extracellular DNA was quantified using a Sytox Green-based (Invitrogen) fluorescence assay similar to previous descriptions 2,35. To ultimately quantify NETs or NLS we always used image-based assays; the cell death assay was only used as complementation.”

      *7/ 175 mixing of granular and nuclear components at ~120 min after stimulation (Fig. 2d and Fig. S2).

      Figure S2 does not show mixing with the content of the granules. You are not labeling any granule component, only histones. You cannot draw that conclusion from these results. *

      Response:

      We respectfully disagree. As indicated in the figure legend for Figure 2d we were labelling for neutrophil elastase (red) which is located in azurophilic granules and thereby presents a marker for granular content. Since we wrongfully referred to Figure S2 here, we removed this from the text. The latter reference probably remained erroneously from a previous version.

      *8/ Fig. 2. What concentration of PMA was used? What does 100% NLS mean? How is it different from 100% NET, since you are using PMA in both cases. Please explain. *

      Responses:

      We have now defined PMA concentration in the respective figure legend (100nM). The criteria for image-based assessment of NLS and NET quantification are the same for reason of comparison. PMA is included in each of the experiments as a positive control to show that the used neutrophils react upon stimulation. To clarify, we now specify at the y-axis %NETs or NLS. As stated above, 100% NLS means that each cell event in the image has increased in diameter such that it is considered as a NET or NLS. Hence, we use a common coordinate system to quantify extracellular events (NETs and NLS) based on size.

      We have adjusted the figure legend as follows: (line 186) “Fig 2. Candidalysin induces NLS in human neutrophils. Candidalysin, but not scrambled candidalysin or pep2, another Ece1p-derived peptide (all 15 µM), induces (a) DNA decondensation in human neutrophils after 4 h (n = 4(10-14)) in a (b) dose-dependent manner (n = 3(10-14)). To allow comparability, NLS were quantified with the same criteria as previously described for NETs. Data shown as mean ± SEM. Confocal images (c) of immunostained cells display morphological changes involving nuclear and granular proteins after 4 h compared to unstimulated cells or 100 nM PMA, or cells exposed to scrambled candidalysin and pep2. The morphological changes evoked by PMA considerably deviate from morphological changes evoked by candidalysin and, hence, are defined as NETs (for PMA) and NLS (for candidalysin). Time-dependent progression of morphological changes (d) in neutrophils induced by candidalysin over the course of 5 h (all images are with 60X magnification).”
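
      The size-based readout described in this response can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in the study: the function name, the example areas, and the use of a 100 µm² cutoff (the value the reviewer cites in comment 9) are assumptions made for the example.

```python
def percent_extracellular_events(areas_um2, cutoff_um2=100.0):
    """Percentage of cell events whose DNA-stained area exceeds the cutoff.

    areas_um2: one DNA area (in µm²) per neutrophil event in the image.
    cutoff_um2: assumed size threshold above which an event is counted
                as a NET or NLS (the readout does not distinguish the two).
    """
    if not areas_um2:
        raise ValueError("at least one cell event is required")
    n_large = sum(1 for area in areas_um2 if area > cutoff_um2)
    return 100.0 * n_large / len(areas_um2)


# Hypothetical per-event areas from one image, for illustration only.
example_areas = [50.0, 150.0, 200.0, 80.0]
print(percent_extracellular_events(example_areas))
```

      Under this definition, 100% means that every event in the analysed image exceeds the cutoff, which is why the same scale can be shared by NETs and NLS while their distinction rests on morphology and citrullination.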

      *9/ 181 NLS were quantified with the same criteria as previous described for NETs.

      The criterion for NETs was an area above 100um2, so what is the criterion for NLS? If we assume that this is the same as for NETs, then what is the difference between NLS and NETs? The criteria adopted do not differentiate between the two forms and appear to be subjective. *

      Responses:

      As stated above, for us it was very important to find a common coordinate system to quantify NETs and NLS, since we wanted to deliver comparable and solid quantitative data. Hence, the quantification method does not discriminate between NETs and NLS. The notable morphological differences of NETs and NLS are thoroughly described with Figure 2 and Figure 3 and defined by differences in their structure. In addition, we present differences and similarities of induced pathways leading to canonical NETs or candidalysin-induced NLS in Figure 6 and Figure 7. We are convinced that, since NETs and NLS vary in size (DNA area covered), it will not be accurate for quantification purposes to include an additional size cut-off in the attempt to discriminate NLS and NETs. Instead we have established that candidalysin alone induces morphologically distinct NLS, whereas Candida albicans hyphae induce morphologically distinct NETs. By combination of quantitative data and image-based assessment, both structures can be discriminated from each other. In addition, we have established that during neutrophil and C. albicans interaction, citrullination of histone mainly stems from candidalysin. We show here and others have shown previously (10.3389/fimmu.2018.01573) that citrullination of histone occurs during but is not required for NET formation. But histone citrullination is promoted mainly by candidalysin and is also required for formation of NLS. Thus, histone citrullination constitutes another important discriminatory factor between NETs and NLS.

      We modified and added text to the respective figure legend: (line 188) ”To allow comparability, NLS were quantified with the same criteria as previously described for NETs. Data shown as mean ± SEM. Confocal images (c) of immunostained cells display morphological changes involving nuclear and granular proteins after 4 h compared to unstimulated cells or 100 nM PMA, or cells exposed to scrambled candidalysin and pep2. The morphological changes evoked by PMA considerably deviate from morphological changes evoked by candidalysin and, hence, are defined as NETs (for PMA) and NLS (for candidalysin).”

      *10/ 190 allows a more detailed view of the neutrophil-derived structures (Error! Reference source not Please, eliminate this error. *

      Response:

      Thank you for pointing this out to us. We have fixed this error.

      *11/ 193 Ionomycin has been previously reported to induce NLS, also... 194 Both, PMA and ionomycin generated widespread chromatin fibers in the extracellular space 197 In addition, C. albicans hyphae induced NETs with observable fibers and 198 threads similar to PMA- and ionomycin-stimulated neutrophils (Fig. 3b). 199 Image-based quantification of NLS events (candidalysin and ionomycin)

      In a sentence earlier (193) you mentioned that the action of PMA leads to classical netosis and ionomycin leads to NLS. You pointed out earlier that NLS are poorly developed NETs (line 100), and here you write that PMA and ionomycin generate the same developed structures. You again differentiate between these structures depending on the stimulating factors. Pointing out the differences between the two forms, you should be more precise and consistent in your descriptions. This comment applies to the entire manuscript. *

      Responses:

      Thank you, we agree that consistency and clarity is required to describe the observed phenomena. We therefore modified or included the following sentences to the manuscript:

      • (line 203) ”PMA exposure generated widespread chromatin fibres in the extracellular space (Fig. 3a, left panels) whereas ionomycin exposure resulted in more compact, patchy areas occasionally dispersed with long, thin chromatin fibres (Fig. 3b, middle panels). With regard to morphological changes, candidalysin treatment resulted in compact, fibrous structures resembling those stemming from ionomycin treatment, however long, thread-like structures were absent in candidalysin-treated neutrophil samples (Fig. 3a right panels, for 7 h treatment see Fig. S1c)”
      • (Line 513) ”While ionomycin- and candidalysin-induced NLS shared similar key features, such as increased histone citrullination, our study revealed striking differences between the two toxins. In contrast to ionomycin, candidalysin stimulation led to ROS production in neutrophils.”

        12/ 203 NLS after 3 h and 5 h, respectively, and led to overall fewer NLS events. This was confirmed by observation. 204 area-based analysis of the events (Fig. 3d). The average area per event that exceeded 100 μ*m2 was 205 determined using the images from the DNA stain. What is the accepted criterion for distinguishing between NLS and NETs? *

      Response:

      The main criteria distinguishing canonical NETs from NLS are a higher compactness for NLS and an increased citrullination of histones, the latter being absent in canonical NETs (10.3389/fimmu.2016.00461; 10.1016/j.mib.2020.09.011). Please see our comment above (regarding reviewer comment 9). Comparing candidalysin and ionomycin as stimuli for NLS they share key similarities, such as increased citrullination of histone (Fig. 3) and more compact structures than NETs (Fig. 3) with an average size of 151 µm2 for candidalysin-induced and 149 µm2 for ionomycin-induced NLS compared to 262 µm2 for PMA-induced and 231 µm2 for C. albicans-induced NETs (for clarification these average sizes are stated in the text). However, the NLS triggered by candidalysin and ionomycin also show differences. Ionomycin occasionally results in extended chromatin threads, whereas candidalysin does not. Ionomycin induces no ROS at all, whereas candidalysin does to some extent. By consistent usage of the definitions for NETs and NLS and by pinpointing the differences between ionomycin and candidalysin in terms of NLS induction (which were previously unknown) we hope we have sufficiently addressed this comment.

      *13/ line 218, 243 - reference error *

      Response:

      Thank you, we have fixed this error

      14/ What form are we actually talking about? Are we focusing on the effect of a natural agent or a synthetic one in relation to NLS/NET? Perhaps it is more important to focus on the citrullination process.

      • 247 synthetic candidalysin only induces NLS, we concluded that candidalysin augments NET release when the toxin is secreted by C. albicans hyphae. 256 This confirmed that candidalysin promotes C. albicans-triggered NET release. 262 Interestingly, the addition of synthetic candidalysin resulted in a shift to NLS, 274 External addition of synthetic candidalysin resulted in a shift to NLS structures rather than NETs as visualized by microscopy after 5 h incubation (20X).*

      Response:

      We used the adjective “synthetic” here to make clear that this is a synthesized peptide and not candidalysin isolated from growing C. albicans. Having said that, we fully agree that the synthesized peptide and the one released by C. albicans cells are essentially identical on the molecular level and thus it is irrelevant and confusing to state in this context here. Therefore, we removed the adjective “synthetic” throughout the study and refer the reader to the method section for information on the origin of candidalysin used in the study. At times, we state “candidalysin alone” when we want to emphasize that candidalysin was the sole trigger used for the respective assay.

      15/ Has there been any method to track candidalysin production during contact of C. albicans with neutrophils?

      Responses:

      Thank you for this comment. Yes, there is a QVQ nanobody that can be used which is currently not at our disposal (doi.org/10.1111/cmi.13378). However, we already know from this publication that candidalysin concentrations vary when released naturally. The concentrations are particularly high in invasion pockets or dense biofilms. We also know that if we add candidalysin to the medium we have even distribution throughout and this is by definition different from concentration spikes at host cell-fungal interaction sites. As we have stated above, hyphal tips in turn are readily attacked by human neutrophils (doi.org/10.1189/jlb.0213063). Hence, we can safely assume, according to these previous publications, that there will be a more uneven distribution of candidalysin concentrations over neutrophil membranes, when the sole source of the toxin stems from growing hyphae interacting with neutrophils. It would of course be very interesting to know how the toxin exactly intercalates into membranes and which morphologies potential pores may have. These questions are currently under investigation in the laboratories of B. Hube and J. Naglik. To incorporate these findings here would certainly be far beyond the scope of this study.

      We include and modify the following sentences to the discussion of this manuscript to clarify the issue: (Line 544). ”Lack of candidalysin production in C. albicans results in significantly reduced histone citrullination, accompanied with decreased NET formation. However, citrullination is not required for NET release, but rather governs the formation of NLS, which is dominant when candidalysin is added exogenously with even distribution throughout the cell suspension. With regard to C. albicans hyphae secreting candidalysin, local concentrations of the toxin are likely to vary to a large degree, particularly when the candidalysin-secreting hypha is engulfed by a neutrophil. Therefore, it may be difficult to discriminate NLS from NETs during the interaction of neutrophils and C. albicans, as both structures may be induced concurrently 10. It seems logical that the pore-forming activity of candidalysin augments the release of NET fibres during C. albicans infection, where PRRs will additionally be triggered on neutrophils, resulting in combinatorial activation of downstream pathways. In line with this notion, candidalysin drives histone citrullination, which contributes to chromatin decondensation.”

      *16/ In Figure 4f-the given information indicates 1,2 hour incubation, in the caption of the figure there is information about 5 hour incubation - please clarify. The description of the stains used is lacking. *

      Response:

      Microscopic analysis was performed after 5 h incubation time, whereas candidalysin was added at the different time points indicated in the Figure (in the new version this is now Figure 8f). We clarified in the legend as follows: (line 472) “(f) Neutrophils were infected with C. albicans and 15 µM candidalysin was added 0 h, 1 h or 2 h after the infection. Addition of candidalysin at the different time points after C. albicans infection resulted in a shift to NLS structures rather than NETs as visualized by microscopy after 5 h total incubation (20X).” The description of the strains is depicted directly in the Figure, next to the microscopic images.

      *17/ Fig. 5 - result for 15 uM MitoTEMPO - adds nothing to the results and introduces image information noise - should be removed. No information on the concentration of the peptide used. *

      Responses:

      We would like to keep the 15 µM MitoTEMPO concentration, since it is the more reasonable concentration at which we do not observe an effect. This argues that ROS is more likely derived from NADPH oxidase and not mitochondrial ROS. We show TEMPOL effects at 15 µM and at 100 µM to document the dose dependency and for the sake of comparability, we would like to keep both concentrations also for MitoTEMPO.

      The indicated peptide concentration was added to the figure legend. Thank you for pointing this out.

      *18 / Fig. 5, line 309: and cell-permeable Sytox Green DNA dye (250310 nM) to determine the total number of cells".

      Please correct the information on the use of both dyes, according to the manufacturer's description: "SYTOX® Green nucleic acid stain is an excellent green-fluorescent nuclear and chromosome counterstain that is impermeant to live cells, making it a useful indicator of dead cells within a population." *

      Response:

      Thank you for highlighting this error. Indeed, we used Syto Green for this particular staining, a dye which stains both live and dead cells since the dye is cell-permeable. We corrected the error at this section of the text.

      *19/ 324 At later time points, BAPTA-AM led to an increase in NLS, probably due to toxic effects as indicated by higher background levels of NLS formation in non-stimulated, BAPTA-AM-treated neutrophils (Fig. 6d).

      If such an assumption is made, the toxic effect should also be observed for the control. *

      Response:

      The toxic effect was observed while conducting the experiments, but cannot be seen in the size-based quantification which is the readout for this particular experiment. We have performed a cytotoxicity assay using flow cytometry and PI staining to confirm the effect. The results are added as supplemental Figure (Fig. S3b).

      *20/Fig. 6C PAD inhibitor should affect PMA-induced netosis, but the figure presents NLS existence - how was this change found? *

      Responses:

      We are grateful for the opportunity to explain this more thoroughly. PMA does not trigger histone citrullination (10.3389/fimmu.2016.00461) and thereby there is no effect of the PAD inhibitor on PMA-induced NETs. Notably, some level of histone citrullination can also be observed in unstimulated neutrophils (see Fig. 3, 5 and 8), since histone modification is not exclusively dependent on stimulation. However, upon PMA stimulation we observe a decrease (Fig. S1b), not an increase, of histone citrullination consistent with previous reports.

      We adjusted the text as follows: (line 235) “Expectedly, citH3 levels upon PMA stimulation did not increase, but rather decreased which is consistent with previous reports 10 (Fig. 3d and Suppl. Fig. S1b). While citrullination levels in unstimulated neutrophils decreased over time, ionomycin stimulation sustained high levels over 5 h.”

      *21/ line320 "This indicates that candidalysin most probably causes Ca2+ influx via pore formation and not via direct receptor stimulation" And: line 358. As C.albicans hyphae bind to pathogen recognition receptors (PRRs), activate neutrophils and ultimately promote the release of NETs, we aimed to elucidate whether candidalysin alone leads to the activation of similar pathways in neutrophils. Hence, we stimulated neutrophils with candidalysin in the presence or absence of specific inhibitors for SYK, PI3K, and Akt.

      Lack of consistency in conclusion. *

      Response:

      Thank you for pointing this out. We adjusted the paragraph (line 331) as follows: “As C. albicans hyphae bind to pathogen recognition receptors (PRRs), activate neutrophils and ultimately promote the release of NETs, we aimed to elucidate whether candidalysin alone can trigger similar pathways in neutrophils via signalling cross talk induced by Ca2+ influx. Hence, we stimulated neutrophils with candidalysin in the presence or absence of specific inhibitors for SYK, PI3K, and Akt (Fig. 6b).”

      *22/ Fig. 7 It would be good to verify these results with experiments using mutants. Figures 7b, 7c, and 7d can be combined to make the whole drawing clearer. *

      Response:

      We thought this is very relevant and included additional experiments showing that the mutant strains also induce phosphorylation of lamin A/C independent of the expression of candidalysin (new Fig. 6e and 6f).

      *23/ line 603 'The percentage of dead cells was calculated using TritonX-100 lysed neutrophils as 100% control' - maybe use " treated or permeabilized" *

      Response:

      Thank you, we changed the phrasing accordingly.
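
      For readers unfamiliar with this normalisation, the calculation referred to in comment 23 can be sketched as below. This is an assumption-laden illustration rather than the authors' exact procedure: the function name, the optional blank subtraction, and the example fluorescence values are all hypothetical.

```python
def percent_dead(sample_signal, triton_signal, blank_signal=0.0):
    """Express a fluorescence reading relative to Triton X-100-treated cells.

    sample_signal: fluorescence of the treated well.
    triton_signal: fluorescence of Triton X-100-permeabilized neutrophils,
                   taken as the 100% control.
    blank_signal:  optional background reading; subtracting it is an
                   assumption of this sketch, not stated in the source.
    """
    span = triton_signal - blank_signal
    if span <= 0:
        raise ValueError("Triton X-100 control must exceed the blank")
    return 100.0 * (sample_signal - blank_signal) / span


# Hypothetical readings in arbitrary fluorescence units.
print(percent_dead(sample_signal=600.0, triton_signal=1100.0, blank_signal=100.0))
```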

    1. Reviewer #1 (Public Review):

      The authors aimed to contrast the effects of pharmacologically enhanced catecholamine and acetylcholine levels versus the effects of voluntary spatial attention on decision making in a standard spatial cueing paradigm. Meticulously reported, the authors show that atomoxetine, a norepinephrine reuptake inhibitor, and cue validity both enhance model-based evidence accumulation rate, but have several distinct effects on EEG signatures of pre-stimulus cortical excitability, evoked sensory EEG potentials and perceptual evidence accumulation. The results are based on a reasonable sample size (N=28) and state-of-the art modeling and EEG methods.

      Although the authors draw a few partial conclusions that are not fully supported by the data (see below), I think that the authors' EEG findings provide sufficient support for the overall conclusion that "selective attention and neuromodulatory systems shape perception largely independently and in qualitatively different ways". This is an important conclusion because neuromodulatory systems and selective spatial attention are both known to regulate the neural gain of task-relevant single neurons and neural networks. Apparently, these effects on neural gain affect decision making in dissociable ways.

      The effects of donepezil, a cholinesterase inhibitor, were generally less strong than those of atomoxetine, and in various analyses went in the opposite direction. The authors fairly conclude that more work is necessary to determine the effects of cholinergic neuromodulation on perceptual decision making.

      1) I believe that the following partial conclusions are not fully supported by the data:

      a) In the results section on page 6, the authors conclude that "Attention and ATX both enhanced the rate of evidence accumulation towards a decision threshold, whereas cholinergic effects were negligible." I believe "negligible" is wrong here: the corresponding effects of donepezil had p-values of .09 (effect of donepezil on drift rate), .07 (effect of donepezil on the cue validity effect on drift rate) and .09 (effect of donepezil on non-decision time), and were all in the same direction as the effects of atomoxetine, and would presumably have been significant with a somewhat larger sample size. I would say the effects of donepezil were "in the same direction but less robust" (or at the very least "less robust") instead of "negligible".

      b) In the results section on page 8, the authors conclude that "Summarizing, we show that drug condition and cue validity both affect the CPP, but they do so by affecting different features of this component (i.e. peak amplitude and slope, respectively)." This conclusion is a bit problematic for two reasons. First, drug condition had a significant effect not only on peak amplitude but also on slope. Second, cue validity had a significant effect not only on slope but also on peak amplitude. It may well be that some effects were more significant than others, but I think this does not warrant the authors' conclusion.

      c) In the discussion section on page 11, the authors conclude that "First, although both attention and catecholaminergic enhancement affected centro-parietal decision signals in the EEG related to evidence accumulation (O'Connell et al., 2012; Twomey et al., 2015), attention mainly affected the build-up rate (slope) whereas ATX increased the amplitude of the CPP component (Figure 3D-F)." As I wrote above, I believe it is not correct that "attention mainly affected the build-up rate or slope", given that the effect of cue-validity on CPP slope was also significant. Also, while the authors' data do support the conclusion that ATX increased the amplitude and not the slope of the CPP component, a previous study in humans found the opposite: ATX increased the slope but did not affect the peak amplitude of the CPP (Loughnane et al 2019, JoCN, https://pubmed.ncbi.nlm.nih.gov/30883291/). Although the authors cite this study (as from 2018 instead of 2019), they do not draw attention to this important discrepancy between the two studies. I encourage the authors to dedicate some discussion to these conflicting findings.

      2) On page 12 and page 14 the authors suggest a selective effect of ATX on *tonic* catecholamine activity, but to my knowledge the exact effects of ATX on phasic vs. tonic catecholamine activity are unknown. Although microdialysis studies have shown that a single dose of atomoxetine increases catecholamine concentrations in rodents, it is unknown whether this reflects an increase in tonic and/or phasic activity, due to the limited temporal resolution of microdialysis. Thus, atomoxetine may affect tonic and/or phasic catecholamine activity, and which of these two effects dominates is still unknown, I think.

    1. Inventors ignoring the ethical consequences of their creations is nothing new as well, and gets critiqued regularly:

      I think that this is true. We are so busy with creating things that are technologically advanced and never thinking about whether or not the invention is good and if this invention may lead to something negative. For example, ChatGPT has been viewed with many different opinions, it can be very helpful to us, but in some ways or in further applications, it may actually be really harmful to us.

    1. Winnicott once said you know there's no such thing as a baby there's only a baby and someone
      • "gestation rewires your brain in fundamental ways um you it rewire it primes you for caretaking as a as a mother in a way which is far more visceral and far it's it's pre-rational it's it's immensely transformative experience and it's permanent you know once you've been rewired for mummy brain you'd never really go back um and that from the point of view of raising a child that matters um because when after a baby is born it's you know as Winnicott once said you know there's no such thing as a baby there's only a baby and someone there's a a baby doesn't exist as an independent entity until it's some years some years into its life arguably quite a few years into its life um and what I would say about artificial wombs is that you may be you may think that what you're doing is creating a baby without the misery of gestation but what you're doing in practice is creating a baby without creating a mother because a pregnancy doesn't just create a baby it also creates a mother"

      • Comment

    2. when people do die it is almost like I think a colleague of mine Anders Sandberg says that when somebody died the library burns because all of that wisdom that they're carrying around in their minds that it took decades and decades to build up inside of them gets extinguished
      • comment
        • I think many of us have had this thought!
          • that when we die, vast amounts of wisdom are extinguished along with that person
          • As our digital tools become more sophisticated, however,
        • we are uploading our libraries to the digital collective intelligence network
          • the internet may well evolve to become the epitome and master repository of human cumulative cultural evolution.
          • even AI could not exist if it did not mine a training set of billions of humans and their shared ideas
          • Perhaps it is the internet which is the vehicle for collective hybridized human-cyborg immortality?
          • If knowledge is preserved this way, then this flavor of immortality is only meaningful for our species
    1. @chrisaldrich I think this is an underrated idea more broadly. I would love to see this done with other authors' books that use an index card system, like Robert Greene's. I think it would be a useful illustration to help people better understand the research and writing process. I've been wanting to, and have created a few experimental vaults where I do a similar thing, except for a podcast (all of Sean Carroll's Mindscape transcripts are free) or a textbook (Introduction to Psychology). But I never followed through on the projects just because of how much work it takes to do it right. This also makes me wish for a social media type zettelkasten, where a community can keep a shared vault, creating a social cognition of sorts. I know this was kind of happening with the shared vaults Dan Allosso was experimenting with, but his seemed more focused than random/chaotic. I'm also not sure if he continued it for later books.

      Reply to Nick at https://forum.zettelkasten.de/discussion/comment/17926/#Comment_17926

      Some pieces of social media come close to the sort of sense making and cognition you're talking about, but none does it in a pointed or necessarily collaborative way. The Hypothes.is social annotation tool comes about as close to it as I've seen or experienced beyond Wikipedia and variations which are usually a much slower boil process. As an example of Hypothes.is, here's a link to some public notes I've been taking on the "zettelkasten output problem" which I made a call for examples for a while back. The comments on the call for examples post have some rich fodder some may appreciate. Some of the best examples there include videos by Victor Margolin, Ryan Holiday (Robert Greene's protégé), and Dustin Lance Black along with a few other useful examples that are primarily text-based and require some work to "see".

      For those interested, I've collected a handful of fascinating examples of published note collections, published zettelkasten, and some digitized examples (that go beyond just Luhmann) which one can view and read to look into others' practices, but it takes some serious and painstaking work. Note taking archaeology could be an intriguing field.

      Dan Allosso's Obsidian book club has kept up with additional books (they're just finishing Raworth's Doughnut Economics and about to start Simon Winchester's new book Knowing What We Know, which just came out this month). Their group Obsidian vault isn't as dense as it was when they started out, but it's still an intriguing shared space. For those interested in ZK and knowledge development, this upcoming Winchester book looks pretty promising. I'd invite everyone to join if they'd like to.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We would like to thank the reviewers for their professional comments and constructive suggestions. Our current plan is to revise the manuscript and supplemental materials in response to the reviewers’ requests and suggestions. Toward this goal we began experiments to obtain new data requested by the reviewers and anticipate the outlined experiments can be completed within the next three months.

      2. Description of the planned revisions:

      Reviewer #1:

      The results of experiments where Arp2/3 is blocked (Fig.2) should be confirmed by Arp2/3 knock-down and with an independent Arp2/3 inhibitor. Several are available (CK-869, Benproperine, Pimozide). For Fig.3 and 4, that would not be necessary, but to establish the specificity of the effect in fig.2 this is absolutely required.

      Response: As requested, we will include new data with CK-869, the indicated Arp2/3 complex inhibitor. We purchased the inhibitor and are currently confirming its efficacy before testing whether it inhibits transition to the hESC naïve state. However, we respectfully disagree with generating an Arp2/3 knock-down hESC line. Arp2/3 complex genes are known to be essential genes in both mouse and human embryonic stem cells (PMID: 29662178 and PMID: 31649057). Furthermore, reports on successful knockout of complex subunits indicate that additional genetic manipulations are needed to maintain cell survival, including knockout of INK4A/ARF to bypass apoptosis associated with Arp2 shRNA knockdown (PMID: 22385962) and genetic manipulations in mouse models (PMID: 22492726). Thus, knock-down of Arp2/3 complex members in our cells is beyond the scope of this manuscript.

      Yilmaz A, Peretz M, Aharony A, Sagi I, Benvenisty N. Defining essential genes for human pluripotent stem cells by CRISPR-Cas9 screening in haploid cells. Nat Cell Biol. 2018 May;20(5):610-619. doi: 10.1038/s41556-018-0088-1. Epub 2018 Apr 16. PMID: 29662178.

      Shohat S, Shifman S. Genes essential for embryonic stem cells are associated with neurodevelopmental disorders. Genome Res. 2019 Nov;29(11):1910-1918. doi: 10.1101/gr.250019.119. Epub 2019 Oct 24. PMID: 31649057; PMCID: PMC6836742.

      Wu C, Asokan SB, Berginski ME, Haynes EM, Sharpless NE, Griffith JD, Gomez SM, Bear JE. Arp2/3 is critical for lamellipodia and response to extracellular matrix cues but is dispensable for chemotaxis. Cell. 2012 Mar 2;148(5):973-87. doi: 10.1016/j.cell.2011.12.034. PMID: 22385962; PMCID: PMC3707508.

      I believe that the status of the actin cytoskeleton in both states is not well enough characterized. This is especially obvious for branched actin networks themselves that depend on the Arp2/3. To this end, the authors may localize Arp2/3 or cortactin, a useful surrogate that often gives a better staining. This point is particularly important since contractile fibers are not made of branched actin. Myosin cannot walk or pull along branched actin networks because of steric hindrance. It might well be that branched actin networks are debranched after Arp2/3 polymerization. I suggest staining tropomyosins that would indicate where the transition between branched and unbranched actin would be. Along this line, phosphoERMs should be localized and revealed by Western blots (we expect an increase from primed to naive state) because they cannot perform the proposed function of linker between the membrane and actin filaments if they are not phosphorylated.

      Response: As requested we will include new data with cortactin immunolabeling, which we already completed. These new data, shown below, confirm that cortactin, which binds to branched actin filaments, co-localizes with the F-actin fence around hESC naïve colonies, suggesting that the fence includes branched F-actin. Also as requested, we are currently immunoblotting for phosphorylated ERMs to more thoroughly assess if they may serve as a linker between the membrane and actin filaments.

      Branched actin is required for cell cycle progression and cell proliferation in normal cells. This requirement is lost in most cancer cells (Wu et al., Cell 2012; Molinie et al., Cell Res 2019). It would be really important to know whether ESCs stop proliferating upon CK-666 treatment. In other words, do they behave like normal cells or transformed cells? Proliferation is a major function that depends on the YAP pathway. Cell counts and EdU incorporation can easily provide answers to this important question.

      Response: As requested, we will include new data on proliferation. We anticipate that these new data will complement data we already have showing that CK-666 does not impair proliferation compared with hESC controls. We also note that the role of the actin cytoskeleton in proliferation is well established and an increase in proliferation is a hallmark of acquisition of the naïve state of pluripotency (PMID: 35005567).

      Chen C, Zhang X, Wang Y, Chen X, Chen W, Dan S, She S, Hu W, Dai J, Hu J, Cao Q, Liu Q, Huang Y, Qin B, Kang B, Wang YJ. Translational and post-translational control of human naïve versus primed pluripotency. iScience. 2021 Dec 17;25(1):103645. doi: 10.1016/j.isci.2021.103645. PMID: 35005567; PMCID: PMC8718978.

      Minor Comments:

      What about the rescue of cell morphology? Does active YAP restore the intercellular contractile bundle?

      Response: As requested, we obtained these data, as shown below. Expression of the YAP-S127A mutant does rescue the formation of the actin ring architecture in the presence of CK666. We are currently performing additional dedifferentiation assays to immunolabel for pMLC and address the question of whether expression of YAP-S127A restores the contractile bundle.

      Reviewer #2:

      The authors found that a ring of actin filaments at the colony periphery was characteristic of the naive hESCs. However, because all the data are presented as an image of a single confocal section, the 3D organization of the actin filaments is not clear. Although the authors drew a scheme for this actin ring being located in the apical domain of polarized cells, such data have not been provided in the manuscript. Since naive hESCs form dome-like colonies, it is important to show the 3D organization of actin filaments in the colony. 3D reconstruction of confocal microscopy images of the naive hESC colonies is required to show the relationship between actin filaments, adherens junctions, and the nuclei (as a reference for the Z axis). If 3D reconstruction is not technically possible, confocal images at different Z levels and maximum projection images should be obtained and provided.

      Response: As requested, we are currently generating 3D images of the actin fence by using Imaris software, which we previously used to show 3D images of mitochondrial morphology (PMID: 34038242).

      Manoli SS, Kisor K, Webb BA, Barber DL. Ethyl isopropyl amiloride decreases oxidative phosphorylation and increases mitochondrial fusion in clonal untransformed and cancer cells. Am J Physiol Cell Physiol. 2021 Jul 1;321(1):C147-C157. doi: 10.1152/ajpcell.00001.2021. Epub 2021 May 26. PMID: 34038242; PMCID: PMC8321791.

      Some of the statistical analyses were inappropriate. The authors have used Student's t-test for all analyses; however, one-way ANOVA and post-hoc analysis must be used to compare three or more groups (Figs. 2B, D, E, 3G, 4B, D, E).

      Response: As requested, we will re-evaluate our statistical analysis. We note that our submission reports comparisons between two groups, and hence, Student’s t-test is appropriate. For example, we compared primed and naïve to demonstrate successful acquisition of naïve pluripotency, and then we compared the naïve condition to the CK666-treated conditions to demonstrate the impact of CK666-treatment. As Reviewer 2 suggests, we will reanalyze all quantifications using one-way ANOVA with post-hoc analysis in the full revision, and we will also discuss with Stuart Gansky, a statistician at UCSF whom we previously consulted, the most appropriate statistical analysis of our studies.
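
      To illustrate the statistical point under discussion: a one-way ANOVA tests all group means at once, whereas several pairwise t-tests at a per-test level of 0.05 inflate the chance of at least one false positive (if three tests were independent, 1 - 0.95^3, roughly 0.14). The sketch below computes only the F statistic in plain Python. The function name, the group labels, and every number are hypothetical and are not data from the manuscript.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical measurements for three conditions (illustrative only).
primed = [1.0, 2.0, 3.0]
naive = [5.0, 6.0, 7.0]
naive_ck666 = [2.0, 3.0, 4.0]
f_stat, df_b, df_w = one_way_anova_f([primed, naive, naive_ck666])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

      The F statistic alone does not say which groups differ; a p-value from the F distribution and a post-hoc test would still be needed, and a statistics package would normally supply both.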

      Minor Comments:

      Page 9, second paragraph. In the discussion section, authors have written that "Cells within the ICM of mouse blastocysts exclude YAP from the nucleus whereas cells within the ICM of human blastocysts maintain nuclear YAP." However, a recent study has reported that the ICM/epiblast of mouse late blastocysts also express nuclear YAP. Epiblast Formation by TEAD-YAP-Dependent Expression of Pluripotency Factors and Competitive Elimination of Unspecified Cells. Hashimoto M, Sasaki H. Dev Cell. 2019, 50:139-154.e5. doi: 10.1016/j.devcel.2019.05.024.

      Response: As requested, we will revise our Discussion section to include findings from the indicated new publication.

      Reviewer #3:

      Many of their conclusions seem to be based on the qualitative analysis of a single image (e.g. Figures 1D-G, Fig 2G, Supplementary Figure 2). The authors should provide quantitative information regarding these analyses and indicate the number of cells/replicas collected for each experiment.

      Response: As requested, our revision will add quantitative data where feasible. We note that in the field, traction force microscopy isn’t commonly quantified beyond including scale bars, which our original manuscript shows. Moreover, pluripotency is not typically quantified because it is a binary switch: cells are either double positive or they are not. We show 100% double positive, and rtPCR data with known stage-specific markers.

      The actin ring surrounding hESCs colonies was previously described by Närvä et al. Although the authors cited this previous work, they do not discuss in depth the differences and similarities with their observations.

      Response: As requested, our revised manuscript will include additional detail comparing our results with those from Närvä et al. In brief, we observe the formation of this actin ring only in the naïve state of pluripotency, whereas Närvä et al. observe an actin architecture in the primed state. One possible source of difference between their study and ours is the cells used for analysis. Närvä et al. utilize induced pluripotent stem cells, long since proposed to be closer to naïve pluripotency than primed stem cells as conventionally isolated and maintained (see PMID: 27424783 and PMID: 19497275). Additionally, we observe that the contractile actin ring in naïve pluripotent stem cells is in a higher z-plane than reported by Närvä et al., although a direct comparison is difficult to make.

      Theunissen TW, Friedli M, He Y, Planet E, O'Neil RC, Markoulaki S, Pontis J, Wang H, Iouranova A, Imbeault M, Duc J, Cohen MA, Wert KJ, Castanon R, Zhang Z, Huang Y, Nery JR, Drotar J, Lungjangwa T, Trono D, Ecker JR, Jaenisch R. Molecular Criteria for Defining the Naive Human Pluripotent State. Cell Stem Cell. 2016 Oct 6;19(4):502-515. doi: 10.1016/j.stem.2016.06.011. Epub 2016 Jul 14. PMID: 27424783; PMCID: PMC5065525.

      Nichols J, Smith A. Naive and primed pluripotent states. Cell Stem Cell. 2009 Jun 5;4(6):487-92. doi: 10.1016/j.stem.2009.05.015. PMID: 19497275.

      The qualitative observation of Figure 3F suggests lower overall YAP levels in primed and +CK666 cells in comparison to naive cells. Could the authors check if this is correct and, if this is the case, explain the observation?

      Response: As requested, our revision will include new data on YAP protein expression by immunoblotting.

      The authors should discuss in more depth the rationale of the pan-ERM immunostaining experiments (since they used the individual antibodies afterwards) and provide a brief discussion of their results and, in particular, the colocalization with moesin but not with ezrin or radixin.

      Response: As requested, our revised manuscript will include a more detailed discussion of our results with ERM immunolabeling.

      2. Description of the revisions that have already been incorporated in the transferred manuscript:

      Reviewer #1:

      Minor Comments:

      Fig2F: non-representative pictures or wrong quantification of the CK666 condition.

      Response: We thank the reviewer for alerting us to this error. The CK666 Primed and Naïve condition images were swapped. We have edited the figure to correct this.

      Fig3A: Y-axis? What is it? How is it adjusted? -Log P?

      Response: Please see the Methods section. Differential expression analysis was performed using the DESeq2 R package. The resulting P values were adjusted (padj) using the Benjamini and Hochberg approach for controlling the False Discovery Rate (FDR). Genes with a padj
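
      For readers unfamiliar with the adjustment named in this response, the following is a minimal plain-Python sketch of the Benjamini-Hochberg step-up procedure. It is not DESeq2's code; the function name and the example p-values are hypothetical, and the 0.05 cutoff in the last line is an illustrative assumption only, since the threshold used by the authors is not given in the text above.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four genes (illustrative only).
raw = [0.01, 0.04, 0.03, 0.005]
padj = benjamini_hochberg(raw)
# Assumed cutoff of 0.05, chosen only for this example.
print([round(p, 3) for p in padj], [p < 0.05 for p in padj])
```

      On a volcano plot, the y-axis is then typically the negative log10 of such a p-value, which is what the reviewer's question about "-Log P" refers to.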

      Colors of dots not really visible (in reference to Figure 3A).

      Response: We thank the reviewer for this comment and have updated the figure to use more standard, colorblind-friendly color choices (see the above figure). Additionally, we fixed a drawing error in the figures when creating the volcano plots.

      Typos: Apr2/3 in the abstract, Hoeschst in Fig.S1B.

      Response: We thank the reviewer for alerting us to these errors. We have edited the manuscript to correct them.

      Reviewer #3:

      There are many experimental details missing that are extremely relevant to fully understand the experiments and evaluate the robustness of the analyses (e.g., microscopy setup, fluorescent probes used for immunostaining, incubation conditions with the inhibitors SMIFH2 and CK666).

      Response: As requested, we have updated the Materials and Methods section with more detailed information on procedures and reagents.

      Minor Comments:

      The Introduction makes the reader think that actin is the only cytoskeletal network involved in embryo development and stem cell properties. They should also include a brief discussion on the relevance of the other cytoskeletal networks in mechanotransduction and cell fate decisions.

      Response: As requested, we will revise our Introduction. We note, however, that in the field additional cytoskeleton components, including intermediate filaments and microtubules, have mostly been shown to interact with the nucleus, with limited evidence for roles in differentiation.

      Many of the images seem to require a flat-field correction. Could the authors check that the illumination is homogeneous? This artifact could affect the data analysis.

      Response: As we indicate in the Methods section, the spinning disc confocal microscopes used in our study are equipped with a Borealis to mitigate uneven illumination across the field of view. Additionally, the quantifications in Figures 2C-E, Figures 3F-G, and Figures 4A-D compare measurements to a local background (i.e. cytoplasm nearby) in order to normalize for any uneven illumination effects.
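
      The reasoning behind normalizing to a local background can be sketched as follows, assuming (this is an assumption, not stated in the response) that the comparison is a ratio and that uneven illumination acts as a multiplicative gain shared by a structure and its nearby cytoplasm. Under that assumption the gain cancels. The function name and all intensity values are hypothetical.

```python
def ratio_to_local_background(signal, local_background):
    """Return signal intensity relative to a nearby background region."""
    if local_background <= 0:
        raise ValueError("local background must be positive")
    return signal / local_background

# Hypothetical true intensities (arbitrary units).
true_signal, true_background = 200.0, 100.0

# The same structure imaged at the field centre (gain 1.0) and near the
# edge (gain 0.6): raw intensities differ, the local ratio does not.
for gain in (1.0, 0.6):
    ratio = ratio_to_local_background(true_signal * gain, true_background * gain)
    print(f"illumination gain {gain}: raw = {true_signal * gain:.0f}, ratio = {ratio:.2f}")
```

      An additive offset such as camera dark current would not cancel in this way, which is one reason a flat-field check of the kind the reviewer asks about can still be informative.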

      There are many abbreviations that are not defined in the text and are extremely specific to the field.

      Response: As requested, we have expanded the definition of many abbreviations in the text and any additional abbreviation changes will be clearly defined in our revised manuscript.

      Could the authors explain the selection of the pluripotency markers studied by qPCR? Specifically, why they studied DNMT3L, DPPA3, KLF2, and KLF4 (Fig. 1B) and the different set PECAM1, ESRRB, KLF4, and DNMT3L in Fig. 2B.

      Response: Defining the exact molecular and cell behavioral characteristics of naïve pluripotency remains an evolving point of development within the field. The pluripotency markers used in both original panels are known and established markers of naïve pluripotency. The original panel of DNMT3L, DPPA3, KLF2, and KLF4 was established based upon publicly available RNAseq datasets, whereas the secondary panel of PECAM1, ESRRB, KLF4, and DNMT3L was a more targeted analysis of genes found in the literature which have been interrogated in more detail for potential roles in naïve pluripotency. For clarity, we have updated Fig. 1B to match Fig. 2B so that a single transcriptional hallmark of naïve pluripotency is used throughout this manuscript.

      Figures 1G and 2G, please include the images of the colonies.

      Response: As requested, our revised manuscript will include phase contrast images, which we already have, as shown below. These images will be included in Supplemental Figure 1 and Supplemental Figure 2 for the colonies used to show representative tractions in Figure 1G and 2G, respectively.

      3. Description of analyses that authors prefer not to carry out

      Reviewer #1:

      The results of experiments where Arp2/3 is blocked (Fig.2) should be confirmed by Arp2/3 knock-down and with an independent Arp2/3 inhibitor. Several are available (CK-869, Benproperine, Pimozide). For Fig.3 and 4, that would not be necessary, but to establish the specificity of the effect in fig.2 this is absolutely required.

      Response: As requested, we will include new data with CK-869, the indicated Arp2/3 complex inhibitor. We purchased the inhibitor and are currently confirming its efficacy before testing whether it inhibits transition to the hESC naïve state. However, we respectfully disagree with generating an Arp2/3 knock-down hESC line. Arp2/3 complex genes are known to be essential genes in both mouse and human embryonic stem cells (PMID: 29662178 and PMID: 31649057). Furthermore, reports on successful knockout of complex subunits indicate that additional genetic manipulations are needed to maintain cell survival, including knockout of INK4A/ARF to bypass apoptosis associated with Arp2 shRNA knockdown (PMID: 22385962) and genetic manipulations in mouse models (PMID: 22492726). Thus, knock-down of Arp2/3 complex members in our cells is beyond the scope of this manuscript.

      Yilmaz A, Peretz M, Aharony A, Sagi I, Benvenisty N. Defining essential genes for human pluripotent stem cells by CRISPR-Cas9 screening in haploid cells. Nat Cell Biol. 2018 May;20(5):610-619. doi: 10.1038/s41556-018-0088-1. Epub 2018 Apr 16. PMID: 29662178.

      Shohat S, Shifman S. Genes essential for embryonic stem cells are associated with neurodevelopmental disorders. Genome Res. 2019 Nov;29(11):1910-1918. doi: 10.1101/gr.250019.119. Epub 2019 Oct 24. PMID: 31649057; PMCID: PMC6836742.

      Wu C, Asokan SB, Berginski ME, Haynes EM, Sharpless NE, Griffith JD, Gomez SM, Bear JE. Arp2/3 is critical for lamellipodia and response to extracellular matrix cues but is dispensable for chemotaxis. Cell. 2012 Mar 2;148(5):973-87. doi: 10.1016/j.cell.2011.12.034. PMID: 22385962; PMCID: PMC3707508.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We thank the reviewers for their constructive feedback, which has helped us improve the manuscript considerably (no comment on whether the improvements are “significant”). Below are our point-by-point responses. We have also highlighted all changes in the manuscript.

      2. Point-by-point description of the revisions

      Reviewer 1

      Summary

      In this study, Obodo et al. present a new iteration of their popular rhythm analysis tool LimoRhyde. The conceptual advancement in this new iteration is the focus on effect sizes (in the form of point estimates of amplitude and their prediction intervals) rather than the p-values, which has been the predominant form of statistical testing for rhythm analysis. Therefore, compared to a well-established non-parametric method for rhythm testing, LimoRhyde2 selects genomic features with larger amplitudes (effect-sizes) as it is designed to do.

      Major Comments

      1. (LimoRhyde2 algorithm, Page 2-) It is unclear what exactly the contributions/advancements of the authors are? Is it a novel statistical method, the combination of well-established tools in a novel workflow, or is it a novel application to a new field (rhythms)? I am afraid the sentence "LimoRhyde2 builds on previous work by our group and others to rigorously analyze data from genomic experiments [9,16,17], capture non-sinusoidal rhythms [18], and accurately estimate effect sizes [14,19]." is rather ambiguous.

      We have revised this sentence in the last paragraph of the Introduction to clarify LimoRhyde2’s contributions.

      2. (Moderate model coefficients, Page 3-) The authors implement empirical Bayes shrinkage on the coefficients. But the state-of-the-art methods used in LimoRhyde2 for linear model fitting, such as DESeq2/limma-voom/limma-trend, already implement shrinkage for the coefficients. Does the algorithm implement a second round of Bayes shrinkage on the rhythm effect-sizes? How or why is this a statistically valid procedure? If not, how does LimoRhyde2 add to the shrinkage already implemented in DESeq2/limma-voom/limma-trend? Please elaborate.

      To our understanding, the two shrinkage procedures work at different levels and serve different purposes. Limma applies shrinkage on residual variances to account for any technical variation and to give a higher power to detect effects for data with smaller sample sizes within each condition; it does not shrink coefficients. In practice, limma’s shrinkage has little effect given the relatively large sample sizes of most circadian experiments. LimoRhyde2, on the other hand, uses mashr to apply shrinkage to the coefficients themselves to account for shared patterns of effects and variation across both features and conditions. We see no reason this approach is invalid, and in our conversations with Matthew Stephens, the author of ashr and mashr, he felt the same. We elaborate on each method’s contributions in the Discussion (paragraph 2).
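
      As an intuition for what "shrinkage of the coefficients themselves" means, here is a deliberately simplified univariate sketch. It is not mashr, which fits multivariate mixture priors across features and conditions; it only shows the normal-normal posterior mean, in which an estimate is pulled toward zero more strongly when its standard error is large relative to the prior spread. The function name, the fixed prior standard deviation, and the numbers are all hypothetical; in an empirical Bayes setting the prior would be estimated from the data rather than supplied.

```python
def shrink_toward_zero(estimate, std_error, prior_sd):
    """Posterior mean of a coefficient under a zero-centred normal prior."""
    # Weight on the observed estimate: near 1 when the estimate is precise,
    # near 0 when it is noisy relative to the prior spread.
    weight = prior_sd ** 2 / (prior_sd ** 2 + std_error ** 2)
    return weight * estimate

# Two hypothetical genes with the same raw coefficient but different noise.
prior_sd = 1.0
for label, estimate, std_error in [("precise", 2.0, 0.1), ("noisy", 2.0, 1.0)]:
    shrunk = shrink_toward_zero(estimate, std_error, prior_sd)
    print(f"{label}: raw = {estimate:.2f}, shrunk = {shrunk:.2f}")
```

      Moderating residual variances, as described for limma above, changes the standard errors; shrinking coefficients changes the estimates themselves, which is why the two steps are not redundant.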

      3. I think the goal to move to effect-sizes which lead to more reproducible results and better biological significance is sound and highly appreciated. However, to make the community switch to a completely different way of viewing their genomic analysis requires more convincing example(s)/use-cases on why they should abandon the old method that they are used to. As it stands, the results section merely shows that this algorithm performs as designed (to find large amplitude rhythms).

      We appreciate the comment and acknowledge that some readers may be particularly attached to p-values and our current analysis may not wholly convince them of the value of effect sizes. We believe the manuscript stands on its own, however, and are using LimoRhyde2 to guide experiments whose conclusions we hope to describe in future work. Nonetheless, we have revised the Discussion (paragraph 4) to clarify that some known relevant genes highly ranked by LimoRhyde2 were underappreciated by BooteJTK.

      4. Related to point 3, others have previously proposed using amplitude (effect-size) thresholds in addition to the p-value cutoffs (Lück & Westermark, 2016, Pelikan et al, 2022), how would the results of LimoRhyde2 compare in a fairer contrast where both p-value and amplitude thresholds are implemented? Does the proposed sound method outperform the two-step approach? The authors may perform this analysis on their chosen datasets as well.

      Thank you for raising this point. Indeed, one way to view LimoRhyde2 is as a data-driven balancing of raw effect size and p-value. However, the approach of considering both raw amplitude and p-value is uncommon and requires yet another arbitrary cutoff, which complicates any genewise ranking and side-by-side comparison with other methods. Thus, we have decided to not perform this analysis, and instead mention what we see as the advantage of LimoRhyde2 in the Discussion (paragraph 2).

      5. I am also not completely convinced of the authors' approach to compare their tool against BooteJTK. P-values only show ordering when the alternative hypothesis is true. P-values under the null hypothesis are uniformly distributed in [0,1] so would be meaningless for the purpose of ordering. Without knowing the ground-truth, ordering by p-values is rather risky. I understand the authors' difficulty. But maybe point 4 above yields a better evaluation strategy for LimoRhyde2.

      If one accepts that these datasets have a non-zero number of “true” rhythmic genes, which to us seems more than reasonable, then we don’t see this as a large issue. Ranking by (adjusted) p-value is also the standard in differential expression analyses.
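
      The reviewer's premise in this exchange, that p-values are uniformly distributed when the null hypothesis holds, can be checked with a small simulation. The sketch below uses a two-sided z-test with known unit variance purely for simplicity; it is unrelated to BooteJTK or LimoRhyde2, and the function names, seed, and sample sizes are hypothetical choices.

```python
import math
import random

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def null_p_values(n_features, n_samples, seed=0):
    """Simulate p-values for features whose true mean is exactly zero."""
    rng = random.Random(seed)
    pvals = []
    for _ in range(n_features):
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n_samples)) / n_samples
        # With unit variance, sqrt(n) * mean is standard normal under the null.
        z = sample_mean * math.sqrt(n_samples)
        pvals.append(two_sided_p(z))
    return pvals

pvals = null_p_values(n_features=20000, n_samples=8)
# Under uniformity, about 5% of null p-values fall below 0.05.
print(sum(p < 0.05 for p in pvals) / len(pvals))
```

      Among purely null features the ranking by p-value is therefore arbitrary; the authors' reply rests on the datasets also containing truly rhythmic features, whose p-values concentrate near zero.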

      6. (OPTIONAL) LimoRhyde2 orders results by the point estimates of the effect-sizes (amplitudes). Is this biologically the most meaningful? Should the effect-size CIs be ordered at all? Maybe we only care about whether the lower limit of the CI is greater than a chosen threshold without any ordering. A discussion of this would be valuable to a user.

      We discussed this issue amongst ourselves as well, and ultimately elected for simplicity in ranking by only the point estimate and not the credible interval. We have now mentioned this issue in the penultimate paragraph of the Discussion.

      7. (OPTIONAL) If indeed the authors want to move away from p-values, one could argue that most of the insights from p-value analysis are or could be biased. So why compare against ordering by p-values at all in the results?

      We are not arguing that results from p-value-based analyses are biased. We seek to show the differences on real data between an analysis based on p-values, the dominant approach in the field, and one based on estimated effect sizes. We believe this has greater potential to promote thoughtful progress than does outright rejection of p-values based on a purely theoretical argument.

      Minor Comments

      1. In page 3, it is unclear why averaging the three fits is the best thing to do? How bad would the performance be if m = 1 was chosen compared to m=3.

      We have elaborated the relevant section of the Methods. For most genes in most datasets, the difference between m=1 and m=3 wasn’t much. However, m=1 tended to go noticeably sideways for some of the most rhythmic genes, depending on the relative locations of timepoints and spline knots, whereas m=3 did not.

      1. In page 4, "To account for this uncertainty, LimoRhyde2 constructs..." was difficult to understand and sounded arbitrary. Please explain further.

      We have revised this sentence.

      1. Lachmann et al. (2021) also use bootstrap confidence intervals rather than p-values to quantify rhythmicity that ought to be mentioned.

      We have now cited this paper in the Introduction.

      Significance Comments

      1. General assessment: The authors present an exciting new way of viewing results of high-throughput data analysis in the context of biological rhythms using a Bayesian-like approach. Previous work has revealed the flaws in focusing on p-values and how focusing on effect-sizes (in this context amplitudes) can yield more robust, reproducible results. Although this promises to also yield more biologically meaningful results, it is unclear from this study how this might be.

      See reply to Major Comment 3 above.

      1. Advance: This study presents the first tool in the context of the rhythm analysis to provide prediction intervals for different rhythm parameters to facilitate a move away from the hypothesis testing framework of p-values. This is a technical advance in the field of rhythm analysis, but it is unclear what insights this could yield.

      See reply to Major Comment 6 above.

      Reviewer 2

      Major Comments

      1. The manuscript introduces a new tool to select rhythmic genes and to quantify amplitudes and phases. The authors combine splines, linear regression, Bayes sampling, and Mash. They focus on amplitudes instead of p-values as in other packages. The performance and independence of JTK methods are illustrated using selected circadian expression profiles from different mammalian tissues. The paper is clearly written and provides a valuable extension of existing tools. I miss, however, an intuitive explanation of Mash.

      Thank you.

      1. I agree with their claim that amplitudes are quite important for physiological regulations. However, p-values are also helpful to explore, e.g., transcription factor binding sites. Moreover, amplitudes are taken into account in many studies (see e.g. papers of Naef, Korencic, Westermark, Ananthasubramaniam...). Since JTK or RAIN are non-parametric methods amplitudes are not in focus. The authors should discuss the biological relevance of amplitudes more clearly.

      Thanks for raising this point. We are careful to limit our claims to bulk transcriptome data, and have tried to cite the relevant prior work. We have revised the Discussion to clarify what we see as the potential value of amplitudes, as illustrated by our analysis.

      1. The selection of the 3 data sets and of specific genes seems reasonable since a range of technologies (microarrays versus RNA-seq), of durations (1 day versus 2 days), and of gene amplitudes are represented. Still the authors should comment on their selections of data sets and genes.

      We have added justification for our choices.

      1. I find also the tissue-dependent phase distributions of clock-controlled genes of interest. However, a comparison with other studies (Zhang, GTEx from Talamanca et al.) and a discussion how amplitude thresholds such as 10%, 25%, 50% affect the phase distributions would be valuable.

      Thank you for the suggestion. We initially explored several values of the amplitude threshold for those histograms (Figure S4C) before selecting the top 25%; all led to the same conclusion. We consider this a minor issue and tangential to the main point of the paper, so we have left the figure as is. We invite any interested reader to explore the publicly available results.

      Reviewer 3

      Summary

      The authors developed LimoRhyde2, a method for quantifying rhythmicity in genomic data, and applied it to mouse transcriptome data from liver, lung, and suprachiasmatic nucleus (SCN) tissues. The method uses periodic spline-based linear models and an Empirical Bayes procedure (Mash) to produce posterior fits and rhythm statistics. LimoRhyde2 prioritizes high-amplitude rhythms of various shapes rather than monotonic rhythms with high signal-to-noise ratios, which contrasts with previous methods like BooteJTK. The authors demonstrated the value of LimoRhyde2 in quantifying rhythmicity and highlighted some of its advantages over traditional methods. However, they also acknowledged limitations, such as the inability to compare rhythmicity between conditions and the assumption of fixed rhythms.

      Major Comments

      1. The key conclusions are convincing, as the authors demonstrated LimoRhyde2's ability to fit non-sinusoidal rhythms and prioritize high-amplitude rhythms over monotonic rhythms with high signal-to-noise ratios. This is shown by the comparison with BooteJTK, a popular method in the field, and by the analysis of real circadian transcriptome data from mouse tissues. However, the authors acknowledged some limitations that could impact the method's broader applicability.

      Thank you.

      1. Data and methods are presented in a reproducible manner, with detailed descriptions of the periodic spline-based linear models, the use of Mash for moderating raw fits, and the calculation of rhythm statistics. This information is sufficient for other researchers to replicate the study and apply the LimoRhyde2 method to their own datasets. The code is available already.

      Thank you.

      1. Adequate replication and statistical analysis are provided, with the authors analyzing the same datasets using both LimoRhyde2 and BooteJTK to compare their performance. The use of Spearman correlation to assess the relationship between the adjusted p-values from BooteJTK and the amplitudes from LimoRhyde2 further supports the statistical rigor of the study.

      Thank you.

      Minor Comments

      1. Addressing LimoRhyde2's limitations would help improve the study.

      We have extensively addressed the method’s limitations to the best of our knowledge in Discussion paragraphs 6 and 7.

      1. Authors could provide more details on how LimoRhyde2 could be applied to single-cell RNA-seq data to improve the presentation. Single-cell quantification over time would be a challenging task, so some insight into this would be appreciated, rather than a brief comment at the end of the paper.

      Thank you for your interest in this topic. To do it justice, however, requires its own project and paper, so scRNA-seq is beyond the scope of the current paper.

      Significance Comments

      1. This study represents a technical advance in the field of genomic analysis of biological rhythms by introducing LimoRhyde2, a method that prioritizes high-amplitude rhythms and directly estimates biological rhythms and their uncertainty. The method's ability to capture non-monotonic rhythms and account for uncertainty makes it a valuable tool for researchers interested in understanding circadian systems and their physiological impact.

      1. The work is placed in the context of existing literature, as the authors compare LimoRhyde2 with BooteJTK, a refinement of the popular JTK_CYCLE method. The comparison highlights the differences in output, prioritization, and runtime, demonstrating LimoRhyde2's potential advantages over traditional methods in the field.

      2. However, BooteJTK is relatively underused compared to many other methods, partly because of the difficulty and time required to run the analysis. The paper would be improved by comparing LimoRhyde2 to JTK_Cycle itself, as well as RAIN and ARSER. The latter are the most commonly used methods for rhythm detection, and thus the value of the paper's findings would be far greater by comparing to these methods. Like LimoRhyde2, they are also not resource-intensive to run.

      Thanks for your feedback on this point, which is one we discussed at length amongst ourselves. In the end, we decided on BooteJTK because it seems to be the best performing version of the most common method. ARSER and RAIN are simply not the standard, and based on our interpretation of the evidence, not generally superior to JTK. If we had selected the vanilla JTK_Cycle, we felt a reviewer could discard our results by saying "well, they're comparing their method to a version of a method known to be flawed". Given our objective to highlight the differences between prioritization based on estimated effect size and prioritization based on p-value, we do not see the value of including additional methods in the analysis.

      1. LimoRhyde2's ability to efficiently prioritize large effects with functional significance in the circadian system can provide valuable insights for these researchers and advance the understanding of biological rhythms. The LimoRhyde2 approach is different to conventional reliance on arbitrary p- or q-values, which are taken as almost sacrosanct in the field as a measure of a dataset's worth. LimoRhyde2 could thus help to change this false perception of how to rate a circadian rhythm, which has particularly been ushered in by a reliance on JTK_Cycle p- and q-values as the method of choice for assigning meaningfulness to rhythms. Unfortunately, JTK_Cycle is very conservative and is limited to detecting sinusoidal-type rhythms. LimoRhyde2 could overcome these limitations (as RAIN does too) if widely adopted. However, to do this, it must be compared to things like JTK_Cycle directly.

      See reply to Significance Comment 3 above.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this study, Obodo et al. present a new iteration of their popular rhythm analysis tool LimoRhyde. The conceptual advancement in this new iteration is the focus on effect sizes (in the form of point estimates of amplitude and their prediction intervals) rather than the p-values, which has been the predominant form of statistical testing for rhythm analysis. Therefore, compared to a well-established non-parametric method for rhythm testing, LimoRhyde2 selects genomic features with larger amplitudes (effect-sizes) as it is designed to do.

      Major Comments:

      1. (LimoRhyde2 algorithm, Page 2-) It is unclear what exactly the contributions/advancements of the authors are? Is it a novel statistical method, the combination of well-established tools in a novel workflow, or is it a novel application to a new field (rhythms)? I am afraid the sentence "LimoRhyde2 builds on previous work by our group and others to rigorously analyze data from genomic experiments [9,16,17], capture non-sinusoidal rhythms [18], and accurately estimate effect sizes [14,19]." is rather ambiguous.
      2. (Moderate model coefficients, Page 3-) The authors implement empirical Bayes shrinkage on the coefficients. But the state-of-the-art methods used in LimoRhyde2 for linear model fitting, such as DESeq2/limma-voom/limma-trend, already implement shrinkage for the coefficients. Does the algorithm implement a second round of Bayes shrinkage on the rhythm effect-sizes? How or why is this a statistically valid procedure? If not, how does LimoRhyde2 add to the shrinkage already implemented in DESeq2/limma-voom/limma-trend? Please elaborate.
      3. I think the goal to move to effect-sizes which lead to more reproducible results and better biological significance is sound and highly appreciated. However, to make the community switch to a completely different way of viewing their genomic analysis requires more convincing example(s)/use-cases on why they should abandon the old method that they are used to. As it stands, the results section merely shows that this algorithm performs as designed (to find large amplitude rhythms).
      4. Related to point 3, others have previously proposed using amplitude (effect-size) thresholds in addition to the p-value cutoffs (Lück & Westermark, 2016, Pelikan et al, 2022). How would the results of LimoRhyde2 compare in a fairer contrast where both p-value and amplitude thresholds are implemented? Does the proposed sound method outperform the two-step approach? The authors may perform this analysis on their chosen datasets as well.
      5. I am also not completely convinced of the authors' approach to compare their tool against BooteJTK. P-values only show ordering when the alternative hypothesis is true. P-values under the null hypothesis are uniformly distributed in [0,1] so would be meaningless for the purpose of ordering. Without knowing the ground-truth, ordering by p-values is rather risky. I understand the authors' difficulty. But maybe point 4 above yields a better evaluation strategy for LimoRhyde2.
      6. (OPTIONAL) LimoRhyde2 orders results by the point estimates of the effect-sizes (amplitudes). Is this biologically the most meaningful? Should the effect-size CIs be ordered at all? Maybe we only care about whether the lower limit of the CI is greater than a chosen threshold without any ordering. A discussion of this would be valuable to a user.
      7. (OPTIONAL) If indeed the authors want to move away from p-values, one could argue that most of the insights from p-value analysis are or could be biased. So why compare against ordering by p-values at all in the results?

      Minor Comments:

      1. In page 3, it is unclear why averaging the three fits is the best thing to do? How bad would the performance be if m = 1 was chosen compared to m=3.
      2. In page 4, "To account for this uncertainty, LimoRhyde2 constructs..." was difficult to understand and sounded arbitrary. Please explain further.
      3. Lachmann et al. (2021) also use bootstrap confidence intervals rather than p-values to quantify rhythmicity that ought to be mentioned.

      Significance

      General assessment:

      The authors present an exciting new way of viewing results of high-throughput data analysis in the context of biological rhythms using a Bayesian-like approach. Previous work has revealed the flaws in focusing on p-values and how focusing on effect-sizes (in this context amplitudes) can yield more robust, reproducible results. Although this promises to also yield more biologically meaningful results, it is unclear from this study how this might be.

      Advance:

      This study presents the first tool in the context of the rhythm analysis to provide prediction intervals for different rhythm parameters to facilitate a move away from the hypothesis testing framework of p-values. This is a technical advance in the field of rhythm analysis, but it is unclear what insights this could yield.

      Audience:

      This will be useful to all chronobiologists (clinical and basic research) who use high-throughput genomic assays. Since this is an open R-package, I suspect most of those who want to will be able to easily use it. My expertise is in chronobiology, data science and systems biology.

    1. Author Response

      Reviewer #2 (Public Review):

      The paper by Arribas et al. examines the coding properties of adult-born granule cells in the hippocampus at both single cell and network level. To address this question, the authors combine electrophysiology and modeling. The main findings are:

      Noisy stimulus patterns produce unreliable spiking in adult-born granule cells, but more reliable responses in mature granule cells.

      Analysis of spike patterns with a spike response model (SRM) demonstrates that adult-born and mature GCs show different coding properties.

      Whereas mature GCs are better decoders on the single cell level, heterogeneous networks comprised of both mature and adult-born cells are better encoders at the network level.

      Based on these results, the authors conclude that granule cell heterogeneity confers enhanced encoding capabilities to the dentate gyrus network.

      Although the manuscript contains interesting ideas and initial data, several major points need to be addressed.

      Major points:

      1) The authors use a noisy stimulation paradigm to activate granule cells at a relatively high frequency. However, in the intact network in vivo, granule cells fire much more sparsely. Furthermore, granule cells often fire in bursts. How these properties affect the coding properties of granule cells proposed in the present paper remains unclear. At the very least, this point needs to be better discussed.

      In vivo whole cell recordings of granule cells are very scarce. In our study, we based the design of our stimulus on recordings from the intact network in vivo (Pernia-Andrade and Jonas 2014), which show that granule cells receive a wide range of frequencies, with a power spectrum that exhibits a power law decay. These properties are built into our noisy stimuli. These in vivo recordings have also reported the presence of theta oscillations, showing a peak in the spectrum. However, in our approach we deliberately removed these oscillations from our stimuli because it is best to fit GLMs using white noise or noise with an exponentially decaying autocorrelation (Paninski et al. 2004).

      Thus, our choice of the stimuli is far from arbitrary, but rooted in experimental evidence from intact network in vivo recordings, together with previous knowledge about GLM/SRM fitting. This comment reveals to us that we did not clarify this enough in the manuscript. We are grateful to the reviewer for revealing this omission, since this is in fact an important aspect of the study strategy. In the revised manuscript, we brought these points up front in the results section when we introduce the stimulus for the first time, and more thoroughly discussed it in the Methods section that describes the stimulus.

      Still, the bursts observed in granule cells are an important feature and they have been observed to be phase locked to the theta-gamma oscillations in vivo (Pernia-Andrade and Jonas 2014). In the revised version of the manuscript we included new experiments and simulations with stimuli that include a peak in theta frequency. We found that immature neurons also improve decoding performance with these theta modulated stimuli.

      2) The authors induce spiking in granule cells by injection of current waveforms. However, in the intact network, neurons are activated by synaptic conductances. As current and conductance have been shown to affect spike output differently, controls with conductance stimuli need to be provided. Dynamic clamp is not a miracle anymore these days.

      The use of dynamic clamp sounds in principle like a good suggestion. However, in the manuscript we have taken a different approach to enable the use of a single neuron GLM that uses currents as inputs. To control for the differences between mature and immature neurons we used currents with amplitude normalized by the input resistance, and both types of neurons were measured with the same technique to allow for the comparison.

      Importantly, the GLM type model that we use assumes that the membrane potential is a linear convolution of the input, which permits a straightforward and robust fitting approach. We argue that this is not a minor issue, since using dynamic clamp would require a drastic modification of the model. Furthermore, the use of conductance stimuli would not allow for the straightforward model fitting we perform with our approach. The key point here is that the membrane potential would not be correctly approximated as a linear function of the conductance stimulus, precluding the fitting strategy.

      Finally, at the moment we do not have the equipment to perform the suggested experiment, so this suggestion would require a big amount of time to acquire the equipment and set up the experiments in mature and immature neurons. In addition, we would have to change the model and develop a different fitting strategy. With the controls that we already have in the manuscript, we do not think dynamic clamp experiments would fundamentally change the conclusions of the manuscript. Thus, we argue that this is beyond a reasonable timeframe for this revision, but could be something to further explore in future. We now mention this possibility in the discussion.

      3) The greedy procedure is a good idea, but there are several issues with its implementation. First, it is unclear how the results depend on the starting value. Would we end up with the same mixed network if we started with adult-born cells? Second, the size of the greedy network is very small. It is unclear whether the main conclusion holds in larger networks, up to the level of biological network size (1 million). Finally, the fraction of adult-born granule cells in the optimal network comes out very large. This is different from the biological network, where clearly four or five-week-old granule cells cannot represent the majority. Much more work is needed to address these issues.

      The reviewer approves the greedy procedure that we apply in our manuscript and poses three issues for consideration.

      First, the reviewer queries what would be the result of starting the procedure with a different pool of simulated neurons, and whether we would obtain “the same mixed network if we would start with adult-born cells”. Let us remark that the outcome of the greedy procedure is not always the same mixed population of neurons. For each different mature neuron that we use to start the procedure, the trajectory (see Fig. 4A) of selected neurons will be different. Thus, the final population (network) will be different, and this is reflected in the error bars that we obtain in Fig. 4. Presumably, starting with adult-born cells will change the outcome of the greedy procedure. However, note that this is not the point of the approach. The motivation to start with mature neurons is to ask whether adult-born cells can contribute something to decoding, given that mature cells on their own perform better.

      Second, the reviewer questions the size of the population that we reach with the greedy procedure. Note that for the population sizes that we show in the manuscript the decoding performance already begins to saturate (Fig. 4F-H). Furthermore, it is unfeasible to construct a 1M-neuron population due to the computational cost (the time it takes to run the algorithm). These two facts motivated us to stop at 12 neurons, as it strikes a good balance between computational time and saturation. Importantly, as we expand below, the aim of the greedy procedure simulation is not reconstructing the actual network of the dentate gyrus. Rather, we seek to understand whether immature neurons could improve coding in a population.

      Third, the reviewer observes that the fraction of adult-born cells in the reconstructed populations using the greedy procedure is large compared to the biological network. Again, here note that the aim of the whole in-silico experiment is not to recover the biological network, where other aspects are at play. More simply, we query the possible contribution of adult-born cells to coding. In fact, if we obtained the same proportion it would be by chance, since we do not think that adult-born cells in the dentate gyrus are chosen according to the greedy algorithm.

      Still, this comment from the reviewer motivated us to include further simulations of the greedy procedure with constraints. In the revised manuscript we show new results using the greedy procedure, but constraining the fraction of immature neurons in the resulting populations, see Figure 4-figure supplement 2.

      More generally, we think that these comments reveal a possible misunderstanding about the approach, its purpose and the interpretation of the results. The point of the greedy procedure is to show that immature neurons do in fact contribute to improve the decoding, despite being generally worse individually. We do not claim that the population obtained with the greedy procedure faithfully reflects the actual shape of the in vivo network. We are aware that it does not. We see that this may not have been clear in the original version. In the revised version, we now explain the purpose of the greedy procedure when we introduce it. Additionally, we comment on the proportion of immature neurons in the same paragraph.
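For readers unfamiliar with greedy forward selection, the mechanics described above (seed the population with one mature neuron, then repeatedly add whichever candidate most improves decoding) can be sketched as follows. This is a toy illustration and not the authors' implementation: the averaging decoder, the r² score, the noise levels, and the names `decode`, `greedy_select`, `mature_*`, and `immature_*` are all assumptions made for the example.

```python
import random

def r_squared(true, decoded):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(true) / len(true)
    ss_res = sum((t - d) ** 2 for t, d in zip(true, decoded))
    ss_tot = sum((t - mean_true) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot

def decode(responses):
    """Stand-in decoder (assumption): average the selected signals sample by sample."""
    n = len(responses)
    return [sum(vals) / n for vals in zip(*responses)]

def greedy_select(stimulus, candidates, seed, target_size):
    """Grow a population from `seed`, adding at each step the candidate
    that gives the highest decoding score for the enlarged population."""
    selected = [seed]
    history = [r_squared(stimulus, decode([candidates[seed]]))]
    while len(selected) < target_size:
        remaining = [name for name in candidates if name not in selected]
        if not remaining:
            break
        best_name, best_score = None, float("-inf")
        for name in remaining:
            trial = [candidates[s] for s in selected] + [candidates[name]]
            score = r_squared(stimulus, decode(trial))
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
        history.append(best_score)
    return selected, history

# Toy candidate pool: noisy copies of the stimulus. The noise levels are
# arbitrary and do not model real mature or immature granule cells.
rng = random.Random(1)
stimulus = [rng.gauss(0.0, 1.0) for _ in range(400)]
candidates = {}
for i in range(6):
    candidates[f"mature_{i}"] = [s + rng.gauss(0.0, 0.5) for s in stimulus]
for i in range(6):
    candidates[f"immature_{i}"] = [s + rng.gauss(0.0, 1.0) for s in stimulus]

population, scores = greedy_select(stimulus, candidates, seed="mature_0", target_size=4)
print(population, scores)
```

Because each step evaluates every remaining candidate with the full decoder, the cost grows with both pool size and population size, which is consistent with the computational argument given above for stopping at a small population. A different seed generally yields a different selection trajectory.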

      4) Likewise, the idea of dynamic pattern separation seems quite nice. However, the authors focus on the differences between mixed and pure networks, which are extremely small. Furthermore, the correlation coefficients of "low", "medium", and "high" correlation groups are chosen completely arbitrarily. A correlation coefficient of 0.99, considered low here, would seem extremely high in other contexts. Whether dynamic pattern separation is possible over a wider range of input correlation coefficients is unclear (see O'Reilly and McClelland, 1995, Hippocampus, for a possible relationship). Finally, aren't code expansion and lateral inhibition the key mechanisms underlying pattern separation? None of these potential mechanisms are incorporated here.

      The reviewer positively appreciates the idea of the pattern separation task that we propose in the manuscript, and poses some questions concerning the extent of the contribution of adult-born neurons.

      We agree that code expansion and lateral inhibition are key mechanisms for pattern separation in the DG, and we do not claim that adult-born neurogenesis is the key mechanism behind pattern separation. Rather, in our work we explore the role of adult-born immature neurons in coding in general, and in pattern separation in particular, given that it’s a function commonly attributed to the DG.

      We note that the correlation in O'Reilly and McClelland 1994 (actually, what they call pattern overlap) is of a very different nature than the one we compute in our work. They compute the overlap between different patterns of activation in a population of neurons, that is the probability that a single neuron is active in two different patterns of activation. In our manuscript we compute the correlation between different continuous time-varying stimuli that stimulate single neurons.

      Importantly, previous work has shown that ablating neurogenesis particularly affects fine spatial discrimination, that is when the separation between patterns is small, but not when it is large (Clelland 2009, Science). Hence, we were actually expecting the impact of adult-born neurons to be important only for relatively large correlation coefficient values.

      In the revised manuscript, we now explain the rationale for the choice of correlation values, both in the main text when we introduce the task, and in the Methods when we set the values for the low, medium and high correlation classes. We also added a sentence to the discussion on pattern separation, bringing in the importance of the ideas of lateral inhibition, code expansion, and the work of O’Reilly 1994.

      5) A main conclusion of the paper is that while mature GCs are better decoders on the single cell level, heterogeneity in mixtures improves coding in neuronal networks. However, this seems to be true only for r^2 as a readout criterion (Fig. 4F). For information, the result is less clear (Fig. 4G). The results must be discussed in a more objective way. Furthermore, intuitive explanations for this paradoxical observation are not provided. Saying that "this is an interesting open question for future work" is not enough.

      This is an interesting point raised by the Reviewer. While r^2 is quantified by comparing the decoded stimuli with the true stimuli, mutual information is related to the uncertainty about the decoding. That is, it quantifies the correspondence between decoded and true stimuli, but does not tell us whether it is a good approximation to it. For example, a decoder could achieve perfect mutual information but result in a poor reconstruction by performing a perfectly scrambled one-to-one mapping of the true stimulus [Schneidman et al. 2003], see also our reply to point [5] by Reviewer #1 above.
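The scrambled-mapping argument can be made concrete with a toy example. The sketch below is an illustration under assumptions, not the authors' analysis: the stimulus is discretized into eight levels, mutual information is a simple plug-in estimate, and `scramble` is an arbitrary fixed permutation chosen for the example.

```python
import math
import random
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

def r_squared(true, decoded):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(true) / len(true)
    ss_res = sum((t - d) ** 2 for t, d in zip(true, decoded))
    ss_tot = sum((t - mean_true) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot

rng = random.Random(0)
stimulus = [rng.choice(range(8)) for _ in range(5000)]

# A fixed one-to-one mapping that scrambles the stimulus levels (assumption).
scramble = {0: 5, 1: 2, 2: 7, 3: 0, 4: 6, 5: 1, 6: 4, 7: 3}
decoded = [scramble[s] for s in stimulus]

mi = mutual_information(stimulus, decoded)
entropy = mutual_information(stimulus, stimulus)  # I(X; X) equals H(X)
r2 = r_squared(stimulus, decoded)
print(mi, entropy, r2)
```

Because the mapping is one-to-one, the decoded signal retains all the information in the stimulus, so `mi` matches the stimulus entropy, while `r2` is poor because the decoded values are far from the true ones. The two readouts therefore need not rank decoders in the same order.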

      We agree that this is an important point and we realize that it was not clear in the original version of the manuscript. In the revised manuscript we added some sentences to clarify this point.

      6) The authors ignore possible differences in the output of mature and adult-born granule cells in their thinking. If mature and adult-born granule cells had different outputs, this could affect their contributions to the code (either positively or negatively). At the very least, this possibility should be discussed.

      Newborn neurons contact the same targets as mature neurons, born during development: pyramidal cells in CA3, and interneurons in CA3 and the DG. During maturation, there is a sequence of connectivity with CA3 and within the DG (Toni et al. 2008). At 4 weeks, newborn cells are already contacting their postsynaptic targets. Still, there may be subtle differences in the strength of these connections compared to mature neurons.

      So, although the targets are the same, there may be quantitative differences in the way they contribute to the code. Thus the point raised by the reviewer is interesting, so we decided to discuss it further in the revision.

    1. Author Response

      Reviewer #1 (Public Review):

      This study used intersectional genetic approaches to stimulate a specific brainstem region while recording swallow/laryngeal motor responses. These results, coupled with histology, demonstrate that the PiCo region of the IRt mediates swallow/laryngeal behaviors, and their coordination with breathing. The data were gathered using solid methods and difficult electrophysiological techniques. This study and its findings are interesting and relevant. The analysis (and/or the presentation of the analysis) is incomplete, as there are analyses that need to be added to the manuscript. The interpretation of the data is mostly valid, but there are claims that are too speculative and are not well-supported by the results. The introduction and discussion would benefit from more citations and a deeper exploration of how this study relates to other work - especially a thorough accounting of and comparison to other studies concerning putative swallow gates.

      General/major concerns:

      The field of respiratory control is far from unified regarding the role of PiCo in breathing or any other laryngeal behaviors. If anything, the current consensus does not support the triple-oscillator hypothesis (in which PiCo is one of 3 essential respiratory oscillators). The name "PiCo", short for "post-inspiratory complex", suggests a function that has not been well-supported by data - it is a putative post-inspiratory complex, at best. I suggest putting this area in context with other discussions, e.g. of the IRt (such as in Toor et al., 2019), or with Dhingra et al. 2020, who showed broad activation of many brainstem sites at the post-I period (including pons, BotC, NTS).

      The reviewer’s comment refers to our previous publication and not the present one. With all due respect to the reviewer, the submitted study investigates PiCo’s involvement in swallow and laryngeal activation and its coordination with breathing.

      We did not feel that it is appropriate for us to critique the Dhingra paper in the present study. However, since this seems to be important to this reviewer, we would like to clarify: Because of filter characteristics, and the low temporal and spatial resolution of these field recordings, the approach used by Dhingra is inappropriate for providing insights into the presence or absence of PiCo. We therefore developed an alternative approach, which provides more detailed insights into population activity, the Neuropixel approach. This Neuropixel recording from PiCo (black trace) exemplifies how field recordings (yellow) fail to pick up post-I activity. We could provide many more examples, but as stated above, addressing the study by Dhingra is tangential to the present study.

      We would also emphasize that the study by Dhingra was never designed to provide negative evidence, and Dhingra et al. never claimed that their study demonstrates the absence of PiCo. Unfortunately, the data by Dhingra were misinterpreted by Swen Hülsmann in his Journal of Physiology editorial, which created considerable confusion, but also a sensation, in the field. Objectively, Toor et al. reproduced the Anderson study in rats, as we will elaborate below. Unfortunately, Toor et al. added to the confusion by renaming the PiCo area as IRt. The field of respiration would also have been confused if the first study reproducing the Smith et al. 1991 study in a different rodent species had refused to call this area preBötC and instead had called it e.g. the ventrolateral reticular field.

      Did you perform control experiments in which the opto stimulations were done on animals without the genetic channels (for example, WT or uncrossed ChAT-ires-cre, etc.), or in mice with the genetic channels that weren't crossed (uncrossed Ai32 mice)? If so, please include. If not, why?

      Yes, we performed many control experiments. Aside from many recordings in which viral injections were targeted outside PiCo, we also performed optogenetic stimulations in mice lacking channelrhodopsin. We have now added the following statements and supplemental figure.

      Optogenetic stimulation in mice lacking channelrhodopsin

      Stimulation of PiCo, across all stimulation durations, in 3 Ai32+/+ mice and 4 ChATcre:Vglut2FlpO:ChR2 mice where the ChR2 did not transfect ChATcre:Vglut2FlpO, as confirmed by a post-hoc histological analysis, resulted in no response (Fig. S3).

      How do you know that your opto activations simulate physiological activation? First, the intensive optical activation at the stim site does not occur in those neurons naturally.

      This seems like a generic critique of the optogenetic approach. In none of the 10,000+ published optogenetic studies is it known to what extent optogenetic activation stimulates exactly the same neurons and the same degree of activity as during a natural behavior. What we know is that PiCo neurons are activated during postinspiration (Anderson et al. 2016), that optogenetic activation stimulates these neurons, and that this activation evokes the same muscles in the same temporal sequence as a water-evoked swallow. We assume that the reviewer’s comment does not intend to imply that “swallows” evoked by nonspecifically stimulating the SLN are more physiological than the optogenetically evoked swallows of a specific neuron population. From the reviewer’s other comments, it is obvious that the reviewer has no problems with the results of the Toor study, which used exclusively SLN stimulations, an approach known to be very non-specific.

      Doing a natural (water) stim for comparison is good, but it cannot necessarily be directly compared to the opto stim. The water stim would activate many other brainstem regions in addition to PiCo.

      Can the reviewer provide any hard evidence that “many other brainstem regions” are activated by water stimulation in comparison to optogenetic stimulation?

      A caveat is that opto PiCo stim =/= water stim (in terms of underlying mechanisms) should be included. Second, in looking at the differences between water vs opto swallows in Table S2: it appears that the ChAT animals (S2A) have something weaker than a swallow with opto stim. For the Vglut2 and ChAT/Vglut2 (S2B&C), the opto swallows also aren't as "strong" as the water swallows (the X and EMG amplitudes are smaller). The interpretation/discussion attributes this to the lack of sensory input during opto stim, but does not mention the strong possibility that there is a difference in central mechanisms occurring. It also seems to be dismissed with the characterization of the swallow as "all-or-none" (see note on Fig 3 results).

      With all due respect, we are somewhat surprised that the reviewer dismisses the entire paragraph in the discussion that specifically addresses the comparison between water-swallows and PiCo-stimulated swallows. We discussed the possibility that PiCo stimulated swallows may not activate the full pathway/mechanism as does the water swallow. We carefully compared and confirmed that PiCo-stimulated swallows have the same temporal motor sequence of the same muscles as those activated in water swallows. As already stated, it is surprising that the reviewer has no problem with accepting the validity of previously published methods like electrical non-specific stimulations of the cNTS or SLN, a frequently used and accepted model to produce and study swallow.

      The writing needs extensive copy editing to improve clarity and precision, and to fix errors.

      Thank you for this comment. We have revised and reviewed the writing.

      Results/Fig 1: What proportion had no/other motor response (non-swallow, non-laryngeal) to the opto stim? I can extrapolate by subtraction, but it would be nice to see the "no/other response" on the plot.

      With all due respect to this reviewer, it is not possible to address this question. Specifically, it is not possible to know whether a “no response” (meaning that no behavioral output occurred in response to PiCo stimulation) would have resulted in a swallow or laryngeal activation. However, figure 2 contains responses other than swallows, i.e. “non-swallows”, which include both laryngeal activation and “no responses”, meaning no behavioral response to PiCo stimulation. This was determined to assess how the respiratory rhythm is affected when a swallow is not produced by PiCo stimulation.

      The explanation of genetics is too spread out and confusing. There needs to be more detail about all the genetic tools used, using the standard language for such tools, in one spot. Please also provide a clear explanation of what those tools accomplish. Include a figure if necessary.

      We apologize for creating confusion. We added more explanations to the text.

      Pick a conventional designator/abbreviation for the different strains, define them in the methods and in the first paragraph of the results section, and use those abbreviations throughout. I think that using ChAT as an abbreviation for your ChAT-ires-cre x Ai32 mice is confusing because it makes it sound like you're talking about the enzyme rather than the specific strain/neurons. Saying "ChAT stimulated swallows... swallows evoked by water or ChAT" makes it sound like the enzyme choline acetyltransferase itself is stimulating swallow. As is convention, pick a more precise abbreviation like ChAT-cre/Ai32 or ChAT:Ai32 or ChAT-ChR2 or ChAT/EYFP. This goes for the other strains as well.

      Thank you for pointing this out. To avoid confusion, the strains/neurons are now referred to as ChATcre:Ai32, Vglut2cre:Ai32, and ChATcre:Vglut2FlpO:ChR2.

      For Fig S2C&D, why does it say mCherry? Isn't it tdTomato? Is it just an anti-ChAT antibody and then the tdTomato Ai65 is only labeling Vglut2? I don't see this in the methods section.

      Thank you for pointing this out. We apologize for our mistake, and we have corrected the manuscript to say tdTomato.

      I also don't see methods for all the staining in Fig S3. The photomicrograph says Vglut2-cre Ai6, but there's no mention of Ai6 anywhere else. Which mice are these? Did you cross Vglut2-cre with an Ai6 reporter mouse? How can you image an Ai6 mouse (which I assume expresses ZsGreen? and that you excited at 488?) and a 488 anti-goat in the same section (that's the only secondary listed in the methods that would work with your goat anti-ChAT)? Is there an error in listing the fluorophores in the methods? Please give more details on the microscopy including which filters were used for the triple staining.

      We have decided to remove the CTb data from the manuscript.

      Regarding the staining: I would expect the staining/maps for the 2 different ChAT/Vglut2 intersectional strains to be similar (Fig 5A/B and S2C/D). The photomicrographs look very different to me, while the heat maps (this goes for all the heat maps in the paper) have barely distinguishable differences. In Fig 5, the staining looks much stronger than in Fig S2C. Why does it look like there are so many more transfected neurons in Fig 5A2 than there are red neurons in the corresponding panel Fig S2C2? And for Fig 5A4 and Fig S2C4? The plot and results text for Fig 5 says the avg number of neurons was 123 ± 11. The plot for Fig S2D says 112 ± 15, but the results text says 242 ± 12 (not sure which is the correct number).

      Thank you for your comments. Previously, the heat maps in Fig 5A/B and S2C/D (now Fig S4C/D) had different scale bars. We have changed the heat maps so that all of them use the same scale. Regarding the representative photomicrographs, Fig 5A/B and Fig S4C/D show the same cluster of cells (PiCo ChAT/Vglut2+). Figure S4D states 242 ± 12 neurons (also stated in the results section).

      However, we want to point out that there are several technical differences between the two: 1) figure 5A represents the transfection produced by the virus injection, which affects the number of cells stained/transfected (133 ± 16 neurons); 2) figure S4C/D represents an intersectional mouse, ChATcre:Vglut2FlpO:Ai65 (242 ± 12 neurons). In this case, we have more tdTomato-positive cells because this genetic approach is able to detect most of the ChAT and Vglut2 cells. Such a difference between figures is considered normal for anatomical studies; in some studies the same bregma level can show different numbers of cells. Thus, the differences are due to the different types of approach (viral expression vs. intersectional approach).

      We have also added additional experiments to figure 5 (now N=7) which has been reflected in the text and figures.

      The results text for Fig S2C also says the staining is "similar to the previous ChAT staining...", which I assume refers to S2A/B. The plot and results text for Fig S2B reports 403 ± 39 neurons, while S2D is either 112 or 242 (not sure?). The plots have different Y scales, which should be changed to be the same. But why do the photomicrographs and the heat maps look so similar? I would expect far fewer neurons to be stained in the intersectional mice (Fig 5 and Fig S2C/D) than in the ChAT staining (Fig S2A/B). I am having trouble reconciling the different presentations/quantifications and making sense of the data in these histology figures.

      We removed “similar to the previous ChAT staining” and we have reviewed the heat maps. Since the original submission, we have performed more experiments and added more animals to the analysis (now N=7). Each heat map now represents the correct number of neurons in PiCo for its respective experiment.

      The Y scales have been adjusted to better compare the ChAT staining with the triple-conditioned intersectional mice.

      How can you distinguish PiCo from non-PiCo in the histology, especially in the ChAT-only staining? It seems that you have arbitrarily defined the PiCo region, and only counted neurons within that very constrained area.

      Even in ChAT-only staining, the N. ambiguus is very distinct from the cholinergic neurons located more medial to the N. ambiguus. This can be unambiguously confirmed by combining ChAT with glutamatergic in situ staining, as done in the Anderson et al. study, or unambiguously demonstrated with the viral approach, as done in the present study. Thus, we don’t see why it is arbitrary to define the distribution of PiCo neurons. What is arbitrary is the definition of the preBötC, yet the field of respiration seems to have no problem with this. We assume that the reviewer knows that Dbx1 neurons are spread along the entire ventral respiratory column and the dorsal portion of the PreBötzinger Complex up to the level of the XII nucleus. Yet it is commonly accepted for authors to refer to the PreBötzinger Complex by counting Dbx1 neurons within a constrained area of what is believed to be the PreBötzinger Complex, even though the borders are arbitrary. It is known, for example, that some of the ventrally located preBötC neurons are presumed rhythmogenic while the more dorsally located Dbx1 neurons are premotor. The transition from rhythmogenic to premotor is gradual. Similarly, NK1 or SST staining is not restricted to the preBötC, and it is arbitrary to define where the preBötC begins and what to include. Indeed, our PNAS paper indicates that inspiratory bursts can be generated by optogenetically stimulating Dbx1 neurons along the entire VRC column, so it is not clear where the rhythmogenic portion of the preBötC begins rostrocaudally and dorsoventrally, or where the rhythmogenic portion and the preBötC itself end. Thus, we want to reiterate and emphasize that for the present study we developed a method using the cre/FlpO approach to unambiguously define the PiCo region. It is surprising that this reviewer does not acknowledge this technical advance, which added significantly more specificity to the anatomical and physiological characterization of PiCo than the Toor et al. study, and also the Anderson et al. study.

      I can see stained neurons in the area immediately outside of PiCo, and I'd like to see lower-magnification images that show the staining distribution in a broader region surrounding PiCo as well, especially in the rest of the reticular formation.

      We characterized the PiCo area based on the histological phenotype and the in vitro and in vivo experiments performed by Anderson et al., 2016. PiCo is an area located close to the NAmb, presenting the same ChATcre phenotype. As stated above, the distribution and agglomeration of the NAmb is clearly very compact, and different from that of the ChATcre:Vglut2FlpO:Ai65 neurons observed outside of the NAmb. It is also important to emphasize that, as is the case for the preBötC, neurons with other transmitter phenotypes are also present in the PiCo region (i.e. GABA or Dbx1). However, the Anderson et al. 2016 study described only the functions of cholinergic neurons located in PiCo, and we always planned to publish a paper on the other neurons within PiCo; this area contains, for example, pacemaker neurons. But we hope that the reviewer acknowledges that many investigators have studied the preBötC for the past 30 years. Hence, much more information has been accumulated on this region (which, by the way, was at least as controversial at the beginning), and it will likely take at least another 30 years to fully identify and characterize PiCo.

      Similarly, how can you be sure you're stereotaxically targeting PiCo precisely (600um in diameter?) with your opto fiber (200um in diameter). Wouldn't small variations in anatomy put the fiber outside the tiny PiCo area?

      We assume the reviewer means “stereotactically”. And yes, the reviewer is correct, it is necessary to position the laser at a consistent anatomical location. Placement of the optical fibers outside of this area does not result in activation of PiCo. We have added an additional supplemental figure (Figure S6) to address this.

      Please put N's and stats results in Table S1 for both swallow and laryngeal activity. I took what I assume to be the Ns (10, 11, and 4) and did some stats like the ones you presented for the laryngeal duration. The differences between vagus duration for 40 and 200 ms pulse durations are all significant for each strain, by my calculations. Also, I think there must be an error in the orange swallow plot in Fig 3A. The orange dots don't correspond to the table values. I plotted all the Table S1 values for each strain. Each line looks similar to the blue laryngeal activation plot in Fig 3A. The slopes of the Vglut2 were less than the other strains, and the slopes for the swallow behavior were less than the laryngeal behavior for all strains. Otherwise, they all look similar. Please double-check your values/stats to address these discrepancies. If it is indeed true that the stim pulse duration affects swallow duration, revise the interpretations and manuscript accordingly.

      We thank the reviewer for the diligence in reviewing our manuscript. But, with all due respect, the reviewer is incorrect and misunderstood the data. To clarify: Table S1 presents data only for laryngeal activation; swallow data are presented in Table S2. The orange data points in Fig 3A are not detailed in Table S1 or S2. Table S2 is the average of all swallows across all laser pulse durations, since the laser pulse duration does not affect swallow behavior duration. All data will be publicly available after publication of the manuscript.

      Figure 3A represents only the ChATcre:Vglut2FlpO:ChR2 column of Table S1.

      The N’s have been added to Table S1.

      Please add more details on stats in general, including the specific tests that were performed, F values and degrees of freedom, etc.

      Thank you. This has been added throughout the results section; please refer to the results section for this addition. We provide one example below.

      An example: A two-way ANOVA revealed a significant interaction between time and behavior (p < 0.0001, df = 4, F = 23.31) in ChATcre:Vglut2FlpO:ChR2 mice (N=7).

      How do you know that you're not just activating motoneurons in the NA when you stimulate your ChAT animals, especially given the results in Fig 1B? In this case, the phase-specific results could be explained by inhibitory inputs (during inspiration) to motoneurons in the region of the opto stim.

      As stated in this paper as well as in the Anderson et al. 2016 paper (and, for that matter, also the Toor et al. study), this is a caveat. This major caveat motivated the development and use of the ChATcre:Vglut2FlpO:ChR2 experiments (specifically targeting the PiCo neurons that co-express ChAT and Vglut2, not laryngeal motor neurons), which yield mostly the same response as the ChATcre:Ai32 mice. We cannot say this is due to inhibitory inputs to laryngeal motoneurons, since the cre/FlpO-specific experiments do not directly activate laryngeal motoneurons. But we do not want to entirely exclude that some premotor mechanisms may also occur in PiCo. The reviewer may know that there is overlap of rhythmogenic and premotor functions for the Dbx1 neurons in the PreBötC. But addressing this issue is beyond the scope of this study. In fact, we are working on a separate connectivity study using novel, still unpublished anterograde and retrograde vectors that do not reveal any direct connections to laryngeal motoneurons. Hence, we expect that the connectivity from PiCo to laryngeal motoneurons is more complex, and addressing this question cannot be done as a simple add-on to an already complex study. Again, we would refer to the PreBötzinger complex, where nobody expects a single study to resolve all the physiological and anatomical characterizations that have been accumulated over 30 years. We would argue that in some ways our cre/FlpO approach is more specific than the Dbx1 stimulations, which activate not only rhythmogenic PreBötzinger complex neurons, but also premotoneurons as well as glial cells, and many neurons rostral and caudal to the PreBötzinger complex. We are aware of these caveats, and we have discussed this in the original submission, and also in the revision.

      While the study from Toor et al is cited, there needs to be a much more thorough discussion of how their findings relate to the current study.

      Many thanks for asking for a more thorough discussion of Toor et al., which we are happy to provide here. Perhaps we were too polite in our original manuscript to emphasize all the problems in that study.

      They demonstrated that PiCo isn't necessary for the apneic portion of swallow. Inhibiting this region also didn't affect TI.

      Please note that the fact that Toor et al. did not find an effect on TI confirms Anderson et al. 2016: in Figure 3G, 3F of the Nature paper, the reviewer will find that injections of DAMGO and SST into PiCo inhibited post-I activity without affecting inspiratory duration. This figure also shows that the inspiratory burst can terminate in the absence of postinspiratory activity.

      The reviewer states: “They demonstrated that PiCo isn't necessary for the apneic portion of swallow”. With all due respect to this reviewer, this is NOT correct. Toor et al. showed that inhibiting PiCo did block SLN-evoked fictive swallows but not the apnea caused by SLN stimulation. This is not the apnea caused by swallows (which was never studied by Toor), but the apnea caused by the SLN stimulation. The apnea evoked by SLN stimulation most likely has nothing to do with the apnea caused by swallows. Unfortunately, Toor et al. make the same misleading claim as the reviewer.

      PiCo cannot be the sole source of post-I timing, and the evidence overwhelmingly favors the major involvement of other regions such as the pons.

      This comment seems to be unrelated to the main thrust of this paper, which studies PiCo’s involvement in swallow and laryngeal activation in coordination with breathing. However, since this comment seems to discredit the Ramirez lab in general, we would like to clarify that inhibiting PiCo with DAMGO and SST inhibits post-I activity (Anderson et al 2016, Fig. 3G, 3F). Thus, we don’t understand the rationale or actual data for the reviewer’s conclusion that PiCo cannot be the sole source of post-I timing. We also don’t understand the basis for the reviewer’s conclusion that “the evidence overwhelmingly favors the major involvement of other regions such as the pons”. We also want to add that nowhere in the Anderson et al. study did we state that the pons plays NO role. Indeed, we specifically stated: “In this context it will be interesting to resolve the role of the PiCo in specific postinspiratory behaviors and to identify how the PiCo interacts with other neural networks such as the Kolliker-Fuse nucleus, a pontine structure that has been hypothesized to gate postinspiratory activity and the periaqueductal grey a structure involved in vocalization and the control of postinspiration”.

      They also showed that inhibition of all neurons (not just ChAT/Vglut) in the PiCo region suppresses post-I activity in eupnea. This suppression was overcome by the increased respiratory drive during hypoxia.

      Before comparisons are made with Toor et al., it is important to note the species and methodological differences. Toor et al. used an anesthetized, vagotomized, paralyzed and artificially ventilated rat model to evaluate fictive swallows (deafferented and paralyzed). By contrast, this study uses an anesthetized, vagal-intact, freely breathing mouse model and evaluates natural physiologic swallows evoked via water and central stimulation. It seems that the reviewer does not acknowledge one of the main innovations of this study. For this study we introduced a genetic approach to specifically target and activate ChATcre/Vglut2FlpO PiCo neurons. This has never been done before, and developing this approach took more than 4 years of breeding, crossing and testing different options.

      As for Toor et al., these authors pharmacologically and bilaterally inhibited neurons in the area of PiCo with isoguvacine, a specific GABA-A agonist. Even though this pharmacological intervention does not specifically inhibit cholinergic/glutamatergic neurons in PiCo, these authors essentially confirm the study by Anderson et al. We do not find this finding controversial. Perhaps the reviewer finds the definition of PiCo “controversial” because Toor et al. called the identical area IRt instead of PiCo, even though they exactly reproduced the finding by Anderson. Toor et al. not only arrive at the same conclusion as Anderson, but they also added more details, none of which contradicts the results by Anderson et al. Here are excerpts from the Toor study: “We therefore conclude that the ongoing activity of neurons in the IRt contributes to eupneic respiratory and sympathetic post-I activities without exerting significant control on other respiratory or cardiovascular parameters”; “IRt significantly inhibited the post-I components of VNA”; “IRt inhibition was also associated with a reduction in PNA”; “increase in respiratory cycle frequency”; “due to a reduction in TE”; “with no effect on TI observed”; “Bilateral microinjection of isoguvacine selectively reduced the magnitudes of post-I VNA and rSNA, but not PNA responses to acute hypoxemia”.

      In this statement the reviewer probably refers to one particular aspect, i.e. the fact that Toor et al. did not significantly block some of the post-I activity. They state: “had no significant effect on the AUC of post-I rSNA (305+/- 24 vs 230+/- 28,p=0.16,n=6)”. Please note that there is a tendency, a reduction from 305 to 230. Perhaps the Toor study was not sufficiently powered to detect the effect; perhaps the drug did not inhibit the entire PiCo. These are all open questions that a critical reader should know. The reviewer will agree that it is as difficult, if not more difficult, to demonstrate the absence of an effect as to demonstrate its presence. To arrive at a negative conclusion, experiments should be done with the same scrutiny as those demonstrating a positive result. We also assume that the reviewer is familiar with animal experiments and will understand that pharmacological injections are often difficult to interpret, in particular in the case of local in vivo injections. It is possible that Toor et al. also inhibited e.g. parts of the Bötzinger complex.

      We have added to the manuscript the following statement: It is important to note that SLN stimulation does not only trigger swallows, but also changes in the overall stiffness and tension of the vocal cords (Chhetri et al., 2013) as well as prolonged hypoglossal activation independent of swallowing (Jiang, Mitchell, & Lipski, 1991). It has been hypothesized that inhibition of the IRt blocks fictive swallow but not swallow-related apnea. Yet this apnea was generated by SLN stimulation and not by a natural swallow stimulation (Ain Summan Toor et al., 2019). It is known that SLN stimulation causes endogenous release of adenosine that activates 2A receptors on GABAergic neurons resulting in the release of GABA on inspiratory neurons and subsequent inspiratory inhibition (Abu-Shaweesh, 2007), suggesting that the SLN evoked apnea may not be the same as a swallow related apnea. Moreover, microinjections of isoguvacine into the Bötzinger complex attenuated the apneic response but not the ELM burst activity (Sun, Bautista, Berkowitz, Zhao, & Pilowsky, 2011), suggesting the Bötzinger complex, not PiCo, could be involved in modulating apnea.

      We would also like to add that our current study characterized swallow-related specific muscles and nerves in both water-triggered and PiCo-triggered swallows to better characterize the physiological properties of this swallow behavior. By contrast, Toor et al. only characterized nerve activities that are involved in multiple upper airway activities and breathing. It is somewhat surprising that the reviewer did not consider the fact that Toor et al. characterized putative swallows that were triggered by SLN stimulation, and that Toor et al. were content with nerve recordings and failed to confirm that the behavior they evoked is actually a physiological swallow. According to the comments from this reviewer (see above), this indicates the possibility of differences in central mechanisms between fictive and physiological swallows.

      While we have cited Toor et al. and their truly excellent work in the broad IRt, we did not feel it was appropriate to critique them for confusing the field by using a different anatomical term for the area that was clearly defined by us as an area containing cholinergic-glutamatergic neurons. We also did not feel it was appropriate to discuss results when doing so amounts to comparing apples and oranges. Toor et al. never specifically manipulated glutamatergic-cholinergic neurons; thus their entire results rest on indirect stimulation affecting this general area, which will unavoidably also include laryngeal motoneurons. We don’t want to criticize this approach, since PiCo is heterogeneous, which is another misunderstanding that we find in the reviewer’s critique. We used cholinergic-glutamatergic neurons to define this area. However, like the preBötC, PiCo is also heterogeneous. This region contains inhibitory neurons; it also contains glutamatergic neurons that are not cholinergic, and cholinergic neurons that are not glutamatergic. Because of this heterogeneity we compared the effects of stimulating glutamatergic neurons and cholinergic neurons as well as cholinergic-glutamatergic neurons. This is an approach that is generally accepted in the field. As already stated, there is not a single marker that uniquely characterizes the PreBötC. Thus, stimulating Dbx1 neurons, glutamatergic neurons, or somatostatin neurons captures only subpopulations of this region. The recently published study by Menuet et al. in eLife used even more indirect methods to inhibit the preBötC. They used a pan-neuronal CBA promoter that targets neurons irrespective of phenotype. It is not our intention to discredit this very elegant study, but we object to the statement that we “have arbitrarily defined the PiCo region”.

      This study has not demonstrated some of the things that are depicted in Fig 7 and included in the discussion. While swallow can inhibit inspiration, there are many mechanisms by which this can happen other than a direct inhibitory connection from the DGS to PreBotC. You cite Sun et al., 2011 findings of "a group of neurons that inhibits inspiration" during SLN stim, but don't mention that it is the BotC and that the paper shows that swallow apnea is dependent on BotC. That is also supported by the Toor study. I don't understand how post-I (aka E) can be discussed without discussion of the BotC - this is a glaring omission.

      We have removed figure 7, which was only meant as a hypothetical schematic.

      Why is it necessary for PiCo to innervate the cNTS?

      This was a hypothesis based on CTb data that we have now removed.

      That is true if the conjecture that PiCo gates swallowing is true, as the cNTS is the only known region for central swallow gating. However, PiCo could influence afferent input to the NTS less directly, and therefore not function as a gating hub per se. The experimental evidence does not warrant the claim that PiCo gates swallowing. The definition of a swallow gate(s) is a topic of much debate and no conclusive experimental evidence has emerged for swallow gating regions to exist anywhere except in the NTS. The current study's evidence also does not meet the criteria necessary to conclusively call PiCo a swallow gate. The authors should soften this claim and language throughout the manuscript.

      Although we do not know of any studies that have optogenetically gated swallow in the cNTS, it seems the reviewer objects to our use of the word “gate”. We have revised the manuscript and removed any wording stating that PiCo is a swallow “gate”. It would be interesting to know whether the reviewer has the same objections to the use of the word “relay”, as done by Toor et al.

      It is also unclear that PiCo acts directly on the swallow pattern generator to gate swallowing. It is not just "conceivable that the gating mechanism involves" the pons, but nearly certain. Swallow gating by respiratory activity may not be able to be ascribed to one particular location. At a minimum, it likely involves the NTS/DSG, pons, and possibly IRt (inclusive of PiCo). The authors are correct that "further studies are necessary to understand the interaction between PiCo and the pontine respiratory group on the gating swallow and other airway protective behaviors." This is why it shouldn't be stated that "this small brainstem microcircuits acts as a central gating mechanism for airway protective behaviors."

      We have removed all language stating PiCo is a swallow gate.

      PiCo is likely part of the VSG (and thus the swallow pattern generator). PiCo, as part of the IRt/VSG, could indeed be surveilling afferent information and providing output that affects swallow or other laryngeal activation and the coordination of these behaviors with breathing. However, this is not the responsibility of PiCo alone. This role is likely shared by other parts of the SPG, and may require distributed SPG network participation to be functional. If one were to stimulate other regions of the distributed SPG, similar results might be expected. When Toor et al. silenced the PiCo area (and locations that lie at least slightly beyond the borders of what the present study defines as PiCo), stim-evoked fictive swallows were greatly suppressed. However, swallow-related apnea was unaffected. This supports the role of PiCo as a premotor relay for swallow motor activation, but not as the site that terminates inspiration. Therefore, it cannot be called a gate.

      We already addressed the issue that Toor et al. never demonstrated that the “swallow-related” apnea was unaffected. Toor et al. only demonstrated that the SLN-evoked apneas were unaffected, and their conclusions were based only on nerve recordings under fictive conditions (deafferented and paralyzed). Also, to the best of our knowledge, many aspects of the putative swallow pattern generator that this reviewer mentions are purely hypothetical. However, to avoid further arguments, we have removed the word gate and Figure 7 from this manuscript.

      Similarly, Fig 7 does not accurately depict things that are already well-supported by evidence. PiCo should be included as part of the swallow pattern generator (VSG), not as a separate entity acting on it. The BotC and pons are glaring omissions. This study has not demonstrated the labeled inhibitory connection from DSG to PreBotC. The legend states speculations as fact and needs to be dialed way back to either include statements with solid experimental evidence or to clearly mark things as putative/speculative.

      We have removed figure 7.

      The discussion of expiratory laryngeal motoneurons needs to be expanded and integrated better into the discussion of swallow, post-I, and other laryngeal motor activation. Why can't PiCo just be premotor to ELMs?

      If PiCo were “only” or “just” premotor to ELMs, it would not be expected to trigger an all-or-none swallow response with a temporal activity pattern similar to that of a water-evoked swallow. We would also not expect the evoked activity pattern to be independent of the laser stimulation duration, as demonstrated in Figure 3. This was our reasoning for originally calling PiCo a “gate”: at the correct phase, it will gate/trigger a complex swallow sequence. But, as stated above, we avoid the word gate in the revised manuscript.

      Concerning the discussion of "PiCo's influence as a gate for airway protective behaviors is blurred...": The incomplete swallow motor sequence didn't seem super different in timing or duration compared to the fully transfected animals (comparing plots from Fig 6 to Fig S1, and Table S2 to Table S3). The values for swallow durations (XII and X) for each group for water and opto seem within similar ranges, as do the differences between water & opto-evoked swallows between strains. While the motor pattern is distinctive from the normal swallow, with laryngeal activity rather than submental activity leading, one might not even be able to call that a swallow. Is it evidence against a classic all-or-nothing swallow behavior any more than the graded swallow results from (fully transfected) Table S1?

      We fully agree that it is possible that this unidentified behavior may not be a swallow. We have changed the name of this behavior to “upper airway motor activity.” However, we also cannot rule out the possibility that this is some portion of a graded swallow, in which case a graded swallow response would be evidence against the classic all-or-nothing swallow behavior.

      Please expand on this point and put it into context with others' results: "This brings into question whether this is the first evidence against the classic dogma of swallow as an "all or nothing" behavior, and/or whether this is an indication that activating the cholinergic/glutamatergic neurons in PiCo is not only gating the SPG, but is actually involved in assembling the swallow motor pattern itself."

      This point has been expanded and now includes citations of other studies. The following paragraph can be found in the Discussion:

      Swallow has been thought of as an “all or nothing” response since as early as 1883 (Meltzer, 1883). Whether the manipulation was modulation of spinal or vagal feedback (Huff A, 2020b), modulation of central drive for swallow/breathing (Huff, Karlen-Amarante, Pitts, & Ramirez, 2022), or lesions in swallow-related areas of the brainstem (Car, 1979; Robert W Doty, Richmond, & Storey, 1967; Wang & Bieger, 1991), swallow either occurred or did not. Swallows are thought to be a fixed action pattern, with duration of stimulation having no effect on behavior duration (Fig. 3) (Dick, Oku, Romaniuk, & Cherniack, 1993). Thus, it was particularly interesting that in instances when few PiCo neurons were transfected, either unilaterally or bilaterally, an unidentified pattern of upper airway activity occurred. Motor activity no longer outlasted laser stimulation but was instead contained within it, and the timing of the motor sequence was reversed in comparison to a water- or PiCo-evoked swallow (Fig. 6). Thus, if insufficient numbers of neurons are activated, PiCo’s ability to specifically activate swallow or laryngeal activity is blurred, resulting in the uncoordinated activation of muscles involved in both behaviors. This provides possible evidence against the classic dogma of swallow as an “all or nothing” behavior, or indicates the presence of an entirely different behavior. We are not the first to present possible evidence against the classic dogma: “small swallows” have been described, but it was never determined whether these were in fact partial or incomplete swallows (Miller & Sherrington, 1915). The SPG is thought to consist of bilateral circuits (hemi-CPGs) that govern ipsilateral motor activities, but receive crossing inputs from contralateral swallow interneurons in the reticular formation, thought to coordinate synchrony of swallow movements (Kinoshita et al., 2021; Sugimoto, Umezaki, Takagi, Narikawa, & Shin, 1998; Sugiyama et al., 2011). 
Incomplete activation of PiCo activates the muscular components of a swallow, without establishing the coordinated timing and sequence of the pattern. It is possible that PiCo is involved in assembling the swallow motor pattern itself and that unilateral activation of PiCo could either desynchronize swallow interneurons or activate only one side of the SPG. Since we did not record swallow-related muscles and nerves bilaterally, this question needs to be further examined.

      Reviewer #3 (Public Review):

      Huff et al. further characterise the anatomy and function of a population of excitatory medullary neurons, the Post-inspiratory Complex (PiCo), which they first described in 2016 as the origin of the laryngeal adduction that occurs in the post-inspiratory phase of quiet breathing. They propose an additional role for the glutamatergic and cholinergic PiCo interneurons in coordinating swallowing and protective airway reflexes with breathing, a critical function of the central respiratory apparatus, the neural mechanics of which have remained enigmatic. Using single allelic and intersectional allelic recombinase transgenic approaches, Huff et al. selectively excited choline acetyltransferase (ChAT) and vesicular glutamate transporter-2 (VGluT2) expressing neurons in the intermediate reticular nucleus of anesthetised mice using an optogenetic approach, evoking a stereotyped swallowing motor pattern (indistinguishable from a water-induced swallow) during the early phase of the breathing cycle (within the first 10% of the cycle) or tonic laryngeal adduction (which tracked tetanically with stimulus length) during the later phase of the breathing cycle (after 70% of the cycle).

      They further refine the anatomical demarcation of the PiCo using a combination of ChAT immunohistochemistry and an intersectional transgenic strategy by which fluorescent reporter expression (tdTomato) is regulated by a combinatorial flippase and cre recombinase-dependent cassette in triple allelic mice (Vglut2-ires2-FLPO; ChAT-ires-cre; Ai65).

      Lastly, they demonstrate that the PiCo is anatomically positioned to influence the induction of swallowing through a series of neuroanatomical experiments in which the retrograde tracer Cholera Toxin B (CTB) was transported from the proposed location of the putative swallowing pattern generator within the caudal nucleus of the solitary tract (NTS) to glutamatergic ChAT neurons located within the PiCo.

      We would like to thank the reviewer for acknowledging the technical advances of the present study and for the positive statements in general.

      Methods and Results

      The experimental approach is appropriate and at the cutting edge for the field: advanced neuroscience techniques for neuronal stimulation (virally driven opsin expression within a genetically intersecting subset of neurons) applied within a sophisticated in vivo preparation in the anaesthetized mouse with electrophysiological recordings from functionally discrete respiratory and swallowing muscles. This approach permits selective stimulation of target cell types and simultaneous assessment of gain-of-function on multiple respiratory and swallowing outputs. This intersectional method ensures PiCo activation occurs in isolation from surrounding glutamatergic IRt interneurons, which serve a diverse range of homeostatic and locomotor functions, and immediately adjacent cholinergic laryngeal motor neurons within the nucleus ambiguus (seen by some as a limitation of the original study that first described the PiCo and its role in post-I rhythm generation; Anderson et al., 2016, Nature 536, 76-80). These experiments are technically demanding and have been expertly performed.

      Again, we would like to thank the reviewer for these positive comments acknowledging the advances of the present study.

      The supplemental tracing experiments are of a lower standard. CTB is a robust retrograde tracer with some inherent limitations, paramount of which is the inadvertent labelling of neurons whose axons pass through the site of tracer deposition, commonly leading to false positives. In the context of labelling promiscuity by CTB, the small number of PiCo neurons labelled from the NTS (maybe 5 or 6 at most in an optical plane that features 20 or more PiCo neurons) is a concern. Even assuming that only a small subset of PiCo neurons makes this connection with the presumed swallowing CPG within the cNTS, interpretation is not helped by the low contrast of the tracer labelling (relative to the background) and the poor quality of the image itself. The connection the authors are trying to demonstrate between PiCo and the cNTS could be solidified using anterograde tracing data the authors should already have at hand (i.e. EYFP labelling driven by the con-fon AAV vectors from PiCo neurons (shown in Fig5), which should robustly label any projections to the cNTS).

      We fully agree with the reviewer that the CTB staining is of a lower standard and have removed this approach.

      The retrograde labelling from laryngeal muscles seems unnecessary: the laryngeal motor pool is well established (within the nAmb and ventral medulla), and it would be unprecedented for a population of glutamatergic neurons to form direct connections with muscles (beyond the sensory pool).

      The authors support their claim that PiCo neurons gate laryngeal activity with breathing through the demonstration that selective activation of glutamatergic and cholinergic PiCo neurons is sufficient to drive oral/pharyngeal/laryngeal motor responses under anaesthesia and that such responses are qualitatively shaped by the phase of the respiratory cycle within which stimulation occurs. Optical stimulation within the first 10% of the respiratory cycle was sufficient to evoke a complete, stereotyped swallow that reset the breathing cycle, while stimuli within the later 70% of the cycle evoked discharge of the laryngeal muscles in a stimulus length-dependent manner. Induced swallows were qualitatively indistinguishable from naturalistic swallows induced by the introduction of water into the oral cavity. The authors note that a detailed interpretation of induced laryngeal activity is probably beyond the technical limits of their recordings, but they speculate that this activity may represent the laryngeal adductor reflex. This seems like a reasonable conclusion.

      We thank the reviewer for this comment. Unfortunately, we felt compelled to remove the word “gating” based on the statements by reviewer 1.

      The authors propose a model whereby the PiCo impinges upon the swallowing CPG (itself a poorly resolved structure) to explain their physiological data. The anatomical data presented in this study (retrograde transport of CTB from cNTS to PiCo) are insufficient to support this claim. As suggested above, complementary, high-quality, anterograde tracing data demonstrating connectivity between these structures as well as other brain regions would help to support this claim and broaden the impact of the study.

      We fully agree with this reviewer. We have been working on a thorough anatomical characterization for more than 3 years using cutting edge anterograde and retrograde viruses in collaboration with vector experts at the University of Irvine. But these are partly novel, unpublished techniques that are in development, and require many careful controls and characterization. We feel that this is a separate study as it doesn’t relate to swallowing coordination and also includes partly different authors. We hope to submit this as a separate study later this year.

      This study proposes that the PiCo in addition to serving as the site of generation of the post-I rhythm also gates swallowing and respiration. The scope of the study is small, and limited to the subfields of swallowing and respiratory neuroscience, however, this is an important basic biological question within these fields. The basic biological mechanisms that link these two behaviors, breathing and swallowing, are elusive and are critical in understanding how the brain achieves robust regulation of motor patterning of the aerodigestive tract, a mechanism that prevents aspiration of food and drink during ingestion. This study pushes the PiCo as a key candidate and supports this claim with solid functional data. A more comprehensive study demonstrating the necessity of the PiCo for swallow/breathing coordination through loss of function experiments (inhibitory optogenetics applied in the same transgenic context) along with robust connectivity data would solidify this claim.

      Thanks again for the positive assessment of our study.

    1. Author Response

      Reviewer #1 (Public Review)

      Using in vitro assays that take advantage of thymic slices, with or without the ability to present pMHC antigens, the authors define an early period in which CCR4 expression is induced, which induces thymocyte migration to the medulla and a likely encounter with cDC2 and other APCs. Notably, the timing for CCR4 expression precedes that of CCR7 and illustrates the potential role for this early expression to initiate the movement of post-positive selection thymocytes to the medulla. The evidence supporting a role for CCR4, as well as CCR7, in sequential tolerance induction is provided using multiple approaches, and although the observed changes amount to small percentage changes, the significance is clear and likely biologically relevant over the lifespan of a developing T cell repertoire. Overall, the model provides a holistic view of how tolerance to self-antigens is likely induced during T cell development, which makes this work highly topical and influential to the field.

      We thank the reviewer for their comments and for highlighting the significance of identifying distinct roles for CCR4 and CCR7 in promoting medullary localization and inducing self-tolerance of thymocytes at different stages of T-cell development.

      Reviewer #2 (Public Review)

      This manuscript describes that CCR4 and CCR7 differentially regulate thymocyte localization with distinct outcomes for central tolerance. Overall, the data are presented clearly. The distinct roles of CCR4 and CCR7 at different phases of thymocyte deletion (shown in Figure 6C) are novel and important. However, the conclusion that expression profiles of CCR4 and CCR7 are different during DP to SP thymocyte development was documented previously. More importantly, the data presented in this manuscript do not support the conclusion that CCR7 is uncoupled from medullary entry. Moreover, it is unclear how the short-term thymus slice culture experiments reflect thymocyte migration from the cortex to the medulla.

      We thank the reviewer for pointing out the significance of our finding that CCR4 and CCR7 regulate different phases of thymocyte deletion. We agree that prior reports, including our own (Cowan et al., 2014; Hu et al., 2015), have shown that CCR4 and CCR7 are expressed by different post-positive selection thymocytes. However, the expression data we present here provide a higher resolution perspective on the specific thymocyte subsets that express these two receptors, as well as the different timing with which the receptors are expressed after positive selection. These data, coupled with chemotaxis assays of the granular thymocyte subsets responding to CCR4 versus CCR7 ligands, and 2-photon imaging data showing that CCR4 and CCR7 are required for medullary accumulation of distinct thymocyte subsets, are critical for delineating the unexpectedly distinct roles of these two chemokine receptors in promoting medullary entry and central tolerance.

      The reviewer raises an important question about our conclusion that CCR7 is “uncoupled” from medullary entry. We think there was likely a misunderstanding of our intended meaning, as we did not mean to imply that CCR7 does not promote medullary entry of thymocyte subsets; we have modified the wording of the abstract to replace “uncoupled” for clarity. As we detail in the Introduction, the role of CCR7 in directing chemotaxis of single-positive thymocytes towards the medulla and inducing their medullary accumulation is well established (Ehrlich et al., 2009; Kurobe et al., 2006; Kwan & Killeen, 2004; Nitta et al., 2009; Ueno et al., 2004). Instead, our data demonstrate that 1) the most immediate post-positive selection thymocyte subset (DP CD3loCD69+) does not require CCR7 for medullary entry, and 2) the next stage of post-positive selection thymocytes (CD4SP SM) express CCR7, but CCR7 recruits these cells only modestly into the medulla. In contrast, CCR7 promotes robust medullary accumulation of more mature thymocyte subsets (CD4SP M1+M2), in keeping with the well-known role of CCR7 in promoting thymocyte medullary localization. We think these findings are highly significant for the field because currently, there is a widely held assumption that post-positive selection thymocytes that do not express CCR7 are located in the cortex, while those that express CCR7 are located in the medulla. Our data show that neither of these assumptions is true: CCR4 drives medullary accumulation of cells that do not yet express CCR7, and the earliest post-positive selection cells that express CCR7 continue to migrate in both the cortex and medulla. These findings form the basis of our statement that CCR7 expression is “not synonymous with” medullary localization. 
The finding that thymocytes do not robustly accumulate in the medulla in a CCR7-dependent manner until the more mature SP stages has important implications for central tolerance, as localization of thymocytes in the cortex versus medulla will impact which APCs and self-antigens they encounter when testing their TCRs for self-reactivity.

      The reviewer also raised concerns about whether short-term thymus slice cultures reflect physiological thymocyte migration. Short-term live thymic slice cultures have been widely used to investigate the development, localization, migration, and positive and negative selection of thymocytes, as they have been shown to faithfully reflect these in vivo processes, including confirming the role of CCR7 in inducing chemotaxis of mature thymocytes from the cortex into the medulla (Au-Yeung et al., 2014; Dzhagalov et al., 2013; Ehrlich et al., 2009; Lancaster et al., 2019; Melichar et al., 2013; Ross et al., 2014). However, we acknowledge that thymic slices are not equivalent to intact thymuses and have now discussed limitations of this system in our revised Discussion.

      Comment 1: Differential profiles in the expression of chemokine receptors, including CCR4, CCR7, and CXCR4, during DP to SP thymocyte development were well documented. Previous papers reported an early and transient expression of CCR4, a subsequent and persistent expression of CCR7, and an inverse reduction of CXCR4 (Campbell et al., 1999; Cowan et al., 2014; Kadakia et al., 2019). The data shown in Figures 1, 2, and 3 are repetitive of previously published data.

      The expression profile of CCR4, CCR7 and CXCR4 on thymocytes has been documented previously in the studies cited above and in our prior publication (Hu et al., 2015). Campbell et al. (Campbell, Haraldsen, et al., 1999) investigated chemotactic effects of chemokines, but did not directly address expression of chemokine receptors by thymocyte subsets. Cowan et al. (Cowan et al., 2014) examined the expression of CCR4 versus CCR7 on DP and CD4SP thymocytes. However, our data provide a more detailed analysis of expression of these distinct chemokine receptors by DP, CD4SP, and CD8SP thymocyte subsets along the trajectory of differentiation after positive selection, using a gating scheme inspired by a study published after the above-cited papers (Breed et al., 2019). Our more nuanced evaluation of CCR4 versus CCR7 expression sets the stage for finding that they play distinct roles in promoting medullary entry and central tolerance of early- versus late-stage post-positive selection thymocytes. Without examining CCR4 and CCR7 expression patterns by distinct thymocyte subsets in detail, we would not have made the unexpected observation that although CCR7 is expressed at high levels by many CD4SP SM thymocytes, it does not induce strong chemotaxis or medullary accumulation of this subset, relative to its role in more mature SP thymocyte subsets. This finding has important implications for which APCs thymocytes encounter as they are tested for self-reactivity to enforce central tolerance. As we were working on these studies, Kadakia et al. reported that extinguishing CXCR4 expression was important for enabling medullary entry (Kadakia et al., 2019). 
Thus, we thought it was important to place CXCR4 in the context of CCR4 and CCR7 expression on thymocyte subsets in our study, and in doing so found another example of asynchronous chemokine receptor expression and function, further indicating that expression of a chemokine receptor alone is not a reliable marker of functional activity or thymocyte localization, as cells migrate dynamically between the cortex and medulla.

      Through more extensive gating and simultaneous investigation of chemokine receptor expression and function, our data have provided new insights into how thymocytes respond to chemokine cues at different time points during their post-positive selection development. Moreover, our refined gating scheme (Figure 1) can be used to distinguish thymocyte subsets at different developmental stages without relying on chemokine receptor expression, thus providing an unbiased way of investigating chemokine receptor expression across those stages.

      Comment 2: The manuscript describes the lack of CCR7 at early stages during DP to SP thymocyte development (Figure 1-3). However, CCR7 expression is detected insensitively in this study. Unlike CCR4 detection with a wide fluorescence range between 0 and 2x10^4 on the horizontal axis, CCR7 detection has a narrow range between 0 and 2x10^3 on the vertical axis (Figure 1C, 1D, 4B, 4C, 6B, S2, S3), so that flow cytometric CCR7 detection in this study is 10-times less sensitive than CCR4 detection. It is therefore likely that the "CCR7-negative" cells described in this manuscript actually include "CCR7-low/intermediate" thymocytes described previously (for example, Figure S5A in Van Laethem, et al. Cell 2013 and Figure 6 in Kadakia, et al. J Exp Med 2019).

      We provide new data to address the possibility that we were failing to detect low levels of CCR7 expression on early post-positive selection DPs (CD3loCD69+). We agree that CCR7 immunostaining of mouse cells is known to be more challenging than immunostaining of other chemokine receptors, including CCR4 and CXCR4. CCR7 immunostaining needs to be carried out at 37°C, which we did throughout our studies. We provide new data comparing CCR7 expression by Ccr7+/+ versus Ccr7-/- thymocyte subsets (Figure 1—figure supplement 2A-B), which confirm that CCR7 is not expressed at detectable levels by CD3loCD69+ DP cells above the background seen in CCR7-deficient cells. As thymocytes transition to the CD4SP SM stage, low/intermediate to high expression of CCR7 can be detected (Figure 1—figure supplement 2A). To further test whether we were failing to detect low levels of CCR7 by post-positive selection DPs, we incubated thymocytes at 37°C for up to 2 hours prior to immunostaining for CCR4 and CCR7, as a prior study indicated in vitro culture would enable increased cell surface expression of CCR7 by alleviating ligand-mediated CCR7 internalization (Britschgi et al., 2008). However, we did not observe increased CCR7 (or CCR4) expression by any thymocyte subset incubated at 37°C (Figure 1—figure supplement 2C-D). Lack of expression of CCR7 by CD3loCD69+ DP cells is consistent with their failure to undergo chemotaxis to CCR7 ligands in vitro, and initial expression of CCR7 by CD4SP SM is consistent with their chemotaxis towards CCR7 ligands in vitro (now shown in greater detail in Figure 2—figure supplement 1), albeit at a much lower migration index than subsequent thymocyte subsets.

      Comment 3: Low levels of CCR7 expression could be functionally evaluated by the chemotactic assay as shown in Figure 2. However, the data in Figure 2 are unequally interpreted for CCR4 and CCR7; CCR4 assays are sensitive where a migration index at less than 1.5 is described as positive (Figure 2A and 2B), whereas CCR7 assays are dismissive of such a small migration index and are only judged positive when the migration index exceeds 10 or 20 (Figure 2C and 2D). CCR7 chemotaxis assays should be carried out more sensitively, to equivalently evaluate the chemotactic function of CCR4 and CCR7 during thymocyte development.

      We thank the reviewer for their insight about the possibility that we could have overlooked CCR7-mediated chemotaxis at lower migration indexes. When data from the chemotaxis assays were evaluated separately for each thymocyte subset, CCR7-mediated chemotaxis of CD4SP SM and subsequent DP CD3+CD69+ co-receptor reversing thymocytes could be detected. However, DP CD3loCD69+ thymocytes still did not undergo CCR7-mediated chemotaxis, but were responsive to the CCR4 ligand CCL22 (Figure 2—figure supplement 1).

      We did not detect CCR7-mediated chemotaxis of CD4SP SM and DP CD3+CD69+ subsets in our previous analysis because their lower-level chemotactic index relative to mature thymocytes did not reach statistical significance when chemotaxis of all subsets was compared simultaneously (Figure 2D). We note that the magnitude of difference in the responsiveness of CD4SP SM cells compared to mature CD4SP and CD8SP M1 & M2 thymocytes (Figure 2D) is likely physiologically important, as CCR7 deficiency results in severely reduced medullary accumulation of CD4SP M1+M2 cells, but only a very mild reduction in medullary accumulation of CD4SP SM cells, which is only detected with our new paired analyses in Figure 5C. We feel these new analyses provide important new insights and thank the reviewer for this suggestion.

      Comment 4: Together, this manuscript suffers from poor sensitivity for CCR7 detection in both flow cytometric analysis and chemotactic functional analysis. Conclusions that CCR7 is absent at early stages of DP to SP thymocyte development and that CCR7 is uncoupled from medullary entry are overinterpretations of results obtained with this poor sensitivity for CCR7. The oversimplified scheme in Figure 3D is misleading.

      We agree that the scheme in Figure 3D, as previously constructed, did not ideally display the difference in scale between thymocyte responses to CCR7 ligands versus CCR4 and CXCR4 ligands (as detected in vitro). Thus, we have now modified the schematic to include the mild response to CCR7 ligands that we observed in CD4SP SM thymocytes (comment 3) and to emphasize the higher chemotactic response of mature thymocytes to CCR7 ligands than of DPs and CD4SP SM to CCR4 ligands. Likewise, we have modified the manuscript to clarify the importance of CCR7 expression in the medullary entry and accumulation of mature thymocyte subsets.

      We respectfully disagree that the sensitivity of CCR7 detection was poor in our flow cytometry and chemotactic analyses. Our CCR7 stains identified a range of CCR7 expression levels, from no expression by pre- and post-positive selection DP cells to high expression by CD4SP M1 cells, and we now provide new data confirming our ability to detect CCR7 expression (Figure 1—figure supplement 2), as described in response to Comment 3. Our chemotaxis assays detected CCR7 responses over a range of migration indexes from ~ 2 up to 100, showing our sensitive ability to detect CCR7-mediated chemotaxis in vitro (Figure 2 and Figure 2—figure supplement 1). In live thymic slices, we were also able to capture a range of biologic activities of CCR7, from mediating modest medullary accumulation of CD4SP SM cells to robust medullary accumulation of CD4SP M1+M2 cells (Figure 5A-C). Importantly, our results demonstrate that CCR7 is not the only chemokine receptor responsible for medullary entry and accumulation of thymocytes. Complex spatiotemporal regulation of thymocytes at distinct stages of development is achieved through tight orchestration of expression and signaling through multiple chemokine receptors, including CCR4, as shown by our data. However, our study does not negate an important role for CCR7 in mediating medullary entry of thymocytes, which we have clarified in the text.

      Comment 5: The short-term thymus slice culture experiments should be described more carefully in terms of selection events during DP to SP thymocyte development, which takes at least 2 days for CD4 lineage T cells and approximately 4 days for CD8 lineage T cells (Saini, et al. Sci Signal 2010 and Kimura, et al. Nat Immunol 2016). The slice culture experiments in this manuscript examined cellular localization within 12 hours and chemokine receptor expression within 24 hours (Figures 4, 5) even for the development of CD8 lineage T cells (Figure S2), which are too short to examine entire events during DP to SP thymocyte development and are designed to only detect early phase events of thymocyte selection.

      Experiments in Figures 4 and 5 were indeed designed to capture behaviors of thymocytes relatively early after introduction onto thymic slices. Figure 4 (and Figure 4—figure supplement 1) shows that the timing of CCR4 versus CCR7 expression after positive selection is dramatically different: CCR4 is expressed within hours of positive selection, concomitant with medullary entry, while CCR7 expression takes several days in the slices (sufficient time for CD8SP development, Figure 4—figure supplement 1). Figure 5 shows that medullary accumulation of CD4SP M1+M2 cells occurs robustly in the medulla of thymic slices within a couple of hours after introduction into the slices, and this localization is CCR7 dependent, while CCR4 induces more mild medullary accumulation of post-positive selection DPs. As indicated by the reviewer, it has been shown that it takes days for DP thymocytes to develop into mature CD4SP and CD8SP cells (Kimura et al., 2016; Lutes et al., 2021; Saini et al., 2010), as recapitulated in the thymus slice system (Figure 4—figure supplement 1) (Lutes et al., 2021). The relatively short time frame of our time-course experiments (up to 12 hours after addition of pre-positive selection thymocytes to positively selecting thymic slices) allowed us to detect expression of CCR4 within a few hours after positive selection and to determine that this timing correlated with medullary entry. Thus, the 12-hour time-course was important for temporal resolution of chemokine receptor expression and medullary localization after initial stages of positive selection.

      Comment 6: It is unclear what the medullary density alteration measured in the thymus slice culture experiments represents. Although the manuscript describes that the increase in the medullary density reflects the entry of cortical thymocytes to the medulla (Figure 4E and S2E), this medullary density can be affected by other mechanisms, including different survival of the cells seeded on the top of different thymus microenvironments. Thymocytes seeded on the medulla may be more resistant to cell death than thymocytes seeded in the cortex, for example, because of the rich supply of cytokines by the medullary cells. So, the detected alterations in the medullary density may be affected by the differential survival of thymocytes seeded in the cortex and the medulla. Also, the medullary density is measured only within a short period of up to 12 hours. The use of MHC-II-negative slices and CCR4- or CCR7-deficient thymocytes in the thymus slice cultures may verify whether the detected alteration in the medullary density is dependent on TCR-initiated and chemokine-dependent cortex-to-medulla migration.

      We thank the reviewer for pointing out these possibilities. The purpose of the positive selection timing experiment (Figure 4) was to establish the early correlation between receiving a positive selection signal, upregulating CCR4, and migrating into the medulla. In this system, cells enter only the cortex in the first hour after migration into the slice, consistent with prior studies of localization of pre-positive selection thymocytes to the cortex (Ehrlich et al., 2009; Ross et al., 2014); subsequently, they move into the medulla. Because CCR7 is widely accepted to be essential for medullary entry, we feel it is important to demonstrate the disconnect between the timing of medullary entry and CCR7 expression in multiple ways. The timing experiment design utilized MHCII-/- and β2m-/- slices to show that positive selection was necessary for expression of CCR4. To test whether CCR4 or CCR7 were required for medullary entry of early post-positive selection DPs, we evaluated medullary accumulation of this subset from WT, Ccr4-/-, Ccr7-/-, and Ccr4-/-Ccr7-/- mice. This experiment provided a more robust means of determining the extent to which CCR4 deficiency impacted medullary localization of a large cohort of cells that had passed positive selection (Figure 5), and again showed that the post-positive selection thymocytes, which express CCR4 but not CCR7, accumulate in the medulla in a CCR4-dependent manner. We note that in Figure 5, we show that all Ccr4-/-Ccr7-/- thymocyte subsets imaged have medullary:cortical density ratios of ~1, indicating an even distribution across cortex and medulla, which is highly consistent with an essential role for these two chemokine receptors in cooperating to mediate medullary accumulation of different stages of developing T cells.
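
      As an illustration only, the reading of a medullary:cortical density ratio described above can be sketched in a few lines of Python. The function names, cell counts, and region areas below are hypothetical and are not taken from the study; the sketch only shows why a ratio near 1 corresponds to an even distribution across the two regions.

```python
def density(cell_count: int, area_mm2: float) -> float:
    """Cells per unit area for one region (hypothetical helper)."""
    if area_mm2 <= 0:
        raise ValueError("area_mm2 must be positive")
    return cell_count / area_mm2


def medullary_cortical_ratio(med_cells: int, med_area_mm2: float,
                             cort_cells: int, cort_area_mm2: float) -> float:
    """Ratio of medullary to cortical cell density.

    A value near 1 means cells are spread evenly across both regions;
    a value above 1 means relative accumulation in the medulla.
    """
    return density(med_cells, med_area_mm2) / density(cort_cells, cort_area_mm2)


# Hypothetical numbers, for illustration only.
even = medullary_cortical_ratio(50, 0.5, 200, 2.0)       # same density in both regions
enriched = medullary_cortical_ratio(150, 0.5, 200, 2.0)  # denser in the medulla
```

      Dividing densities rather than raw counts keeps the ratio independent of how much cortex or medulla happens to be in the imaged field.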

      The reviewer makes an interesting point that survival cues could differ in the cortex versus medulla. However, if thymocytes lacking one or both chemokine receptors had impaired survival because they didn’t enter a region of the thymus efficiently to receive survival cues, we would expect to detect increased apoptosis in Ccr4-/-, Ccr7-/-, and Ccr4-/-Ccr7-/- thymocytes. Instead, we found that chemokine receptor deficiencies resulted in diminished apoptosis of different thymocyte subsets (Figure 6). This finding is more consistent with reduced negative selection of these subsets due to reduced clonal deletion. We nonetheless discuss this possibility in our revised manuscript, as it is important to consider that chemokine-mediated migration of thymocytes into different microenvironments could alter their access to cytokines and other pro-survival cues.

      Reviewer #3 (Public Review)

      In this manuscript, Li et al. examine how the expression of the chemokine receptor CCR4 impacts the movement of thymocytes within the thymus. It is currently known that the chemokine receptor CCR7 is important for developing thymocytes to migrate from the cortical region into the medullary region and CCR7 expression is therefore often used to define medullary localization. This is important because key developmental outcomes, like enforcing tolerance to self-antigens amongst others, occur in the medullary environment. The authors demonstrate that the chemokine receptor CCR4 is induced on thymocytes prior to expression of CCR7 and thymocytes exhibit responsiveness to CCR4 ligands earlier in development. Using elegant live confocal microscopy experiments, the authors demonstrate that CCR4 expression is important for the entry and accumulation of specific thymocyte subsets while CCR7 expression is needed for the accumulation of more mature thymocyte subsets. The use of cells deficient in both CCR4 and CCR7 and competitive migration/accumulation experiments provide strong support for this conclusion. The elimination of CCR4 expression results in decreases in apoptosis of thymocyte subsets that have been signalled through their antigen receptor and are responsive to CCR4 ligands. As expected, more mature thymocyte subsets show decreased apoptosis when CCR7 is absent. Distinct antigen-presenting cells in the thymus express CCR4 ligands supporting a model where CCR4 expressing thymocytes can interact with thymic antigen-presenting cells for induction of apoptosis. The absence of CCR4 results in an increase in peripheral T cells that can respond to self-antigens presented by LPS-activated antigen-presenting cells providing further support for the model. Collectively, the manuscript convincingly demonstrates a previously unappreciated role for CCR4 in directing a subset of thymocytes to the medulla.

      We thank the reviewer for appreciating the novelty of the finding that CCR4 directs distinct subsets of thymocytes into the medulla relative to CCR7, as supported by multiple lines of evidence.

    1. Author Response

      Reviewer #1 (Public Review):

      The sustainability of vaccination programs is subject to multiple threats, from a pandemic like COVID-19 to political changes. The present study assesses different strategies, including gender-neutral vaccination, to better respond to threats in HPV national immunization programs. The authors showed that vaccinating boys against HPV (compared to vaccinating girls alone), would not only prevent more cases of cervical cancer but also limit the impact of disruptions in the program. Moreover, it would help attain the goal set by the World Health Organization of eliminating cervical cancer as a public health problem sooner, even in the case of disruptions.

      Strengths and weaknesses: I found the manuscript well-written and easy to read. Decision-makers may find the results helpful in policy development and other researchers may use the study as an example to investigate similar scenarios in their local contexts. Nevertheless, there are some limitations. First, it should be considered that the present study is only applicable to India and other countries with a similar HPV context. Second, because it is a study based on a mathematical model, errors might arise from the assumptions considered for its construction. It also relies on the quality of the data used to construct and calibrate the model.

      Models are important tools for decision-making: they allow us to assess different scenarios when obtaining real-world data is not feasible. They also allow multiple sensitivity analyses to be carried out to test the robustness of the results. The study carries out a necessary assessment of different vaccination strategies to minimize the impact on cervical cancer prevention due to disruptions in the HPV immunization program. By using a mathematical model, the authors are able to assess different scenarios regarding vaccination coverage rates, disruption time, and cervical cancer incidence. Therefore, decision-makers can consider the scenario which best represents their current situation.

      The present study is valuable not only for decision-making but also from a methodological point of view, as future research can explore in more depth the impact of vaccination disruptions and prevention measures.

      The conclusions of this paper are mostly well supported by data, but some aspects of the methodology need clarification; furthermore, some aspects of the calculations can be improved. It would be more informative, and better for comparisons between the four scenarios, to have relative measures instead of the absolute numbers of cases prevented.

      We thank the reviewer for the kind acknowledgement of the merits of the paper. We have tried to address the suggestions and questions as much as possible in the revised manuscript.

      We agree with the weaknesses raised by the reviewer, namely that the applicability of our study results to other countries is limited and that errors may arise from using a mathematical model. We have added a more elaborate discussion of these points to the manuscript, as follows:

      - Page 15 lines 310-312: “Extrapolation of the results of this study to other populations will be limited to those sharing similar patterns of demography, social norms, and cervical cancer epidemiology as India.”
      - Page 17 lines 361-363: “…, within the limitations of our model, the model-based estimates show that shifting from GO to GN vaccination may improve the resilience of the Indian HPV vaccination programme while also enhancing progress towards the elimination of cervical cancer.”

      Furthermore, we have tried to clarify the rationale, advantages, and limitations of the measure of resilience we have adopted.

      Reviewer #2 (Public Review):

      This study evaluated the effect of population-based HPV vaccination programs in India which is suffering from the disease burden of cervical cancer. The authors used model simulations for estimating the outcomes by adopting the latest available data in the literature. The findings provide evidence-based support for policymakers to devise efficient strategies to reduce the impacts of cervical cancer in the country.

      Strengths.

      The study investigated the potential impact of cervical cancer elimination when HPV vaccination was disrupted (e.g., during the COVID-19 pandemic) and for meeting the WHO's initiatives. The authors considered several settings from the low to high effects of vaccination disruption when concluding the findings. The natural history was calibrated to local-specific epidemiological data which helps highlight the validity of the estimation.

      Weaknesses.

      Despite the importance and strengths, the current study may likely be improved in several directions. First, the study considered the scenario of using a recently developed domestic HPV vaccine but assuming vaccine efficacy based on another foreign HPV vaccine that has been developed and used (overseas) for more than 10 years. More information should be provided to support this important setting.

      Second, the authors are advised to discuss the vaccine acceptability and particularly the feasibility to achieve high coverage scenarios in relatively conservative countries where HPV vaccines aim to prevent sexually transmitted infection. Third, as the authors highlighted, the health economics of gender-neutral strategies, which is currently missing in the manuscript, would be a substantial consideration for policymakers to implement a national, population-based vaccination program.

      We thank the reviewer for the kind acknowledgement of the merits and strengths of the paper.

      We have tried to address the reviewer’s three points of weaknesses as comprehensively as possible in the revised manuscript.

      Regarding the first two points of weaknesses, we have provided more background information about the current situation of HPV introduction and screening in India (see the more specific replies below for where changes have been made), and some data of observed coverage in India in the states where HPV vaccination has been introduced.

      Regarding the reviewer’s third point about the health economics of gender-neutral strategies, we agree fully that it is an important aspect for local policymakers to consider. However, a health economic assessment is out of the scope of the present paper. In the present paper, we are interested in highlighting the potential health benefits of GN HPV vaccination. Given the current context of HPV vaccination in India, we think it is too early to provide a realistic assessment of the health-economic balance of GN vaccination. Please note that one manuscript (de Carvalho et al., MedRxiv, doi: https://doi.org/10.1101/2023.04.14.23288563) based on the same modelling exercise and reporting a health economic assessment of girls-only (routine and catch-up) HPV vaccination in India is currently submitted for peer-review.

      Reviewer #3 (Public Review):

      The authors put together a rigorous study to model the impact of HPV vaccine programme disruptions on cervical cancer incidence and meeting WHO elimination goals in a low-income country - using India as an example. The study explores possible scenarios by varying HPV vaccination strategies for 10-year-old children between a) increasing vaccine coverage in a girls-only vaccination programme and b) vaccinating boys in addition to girls (i.e., a gender-neutral vaccination programme).

      The main strength of this study is the strength of the modelling methodology in helping to make predictions and in contingency planning. The study methodology is rigorous and uses models that have been validated in other settings. The study employs a high level of detail in calibrating and adapting the model to the Indian context despite poor data availability. The detailed methodology allows future studies to employ the model and techniques with locally-contextualised parameters to study the potential impact of HPV vaccine programme disruptions in other countries.

      The work in this field can begin to help lower-income countries explore varying HPV vaccination strategies to reduce cervical cancer incidence, keeping in mind the potential for future supply chains or other related disruptions. However, the scenarios could be better sculpted to model potentially realistic scenarios to guide policymakers to make decisions in situations with limited vaccine supplies - in other words comparing scenario alternatives based on a fixed number of vaccines being available. Using comparative alternatives will help policymakers grapple with the decisions that need to be made regarding planning national HPV vaccination programmes. The results could afford to provide readers with a clearer measure of vaccine strategy 'resilience'.

      In all, the authors are able to successfully explore the potential impact of varying HPV vaccination strategies on cervical cancer cases prevented in the context of vaccine disruptions, and make valid conclusions. The results produced are rich in information and are worthy of deeper discussion.

      We thank the reviewer for the kind acknowledgement of the merits and strengths of the paper.

    1. As previously discussed, deliberate practice, in this case through frequent and active homework, helps build expertise in a domain. Now we know that deliberate practice works to build expertise because it helps build synaptic plasticity. Think-pair-share also increases synaptic plasticity by engaging students’ brains in ways that recall semantic information but also may include the formation of skills and habits, depending on the questions posed. Concept maps rationally encode knowledge, which allows memories to build as synaptic networks. Problem-based learning encourages students in terms of motivation and attention, which in turn increase learning by increasing synaptic plasticity. Using culturally diverse examples in one’s pedagogy helps to alleviate or eliminate stereotype threat, which decreases stress.

      I LOVE scientific-based information, especially when it comes to study techniques/effective ways to learn, so this is really helpful for me! I'll definitely be implementing these into my study habits in the future.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Manuscript number: RC-2023-01910

      Corresponding author(s): Michael W. Sereda

      1. General Statements

      Reviewer #1:

      In this paper the authors report a direct correlation between PMP22 and PTEN expression levels in the nerve of CMT mutants. In CMT1A Pmp22tg rat nerves, PTEN levels are increased, whereas in Pmp22+/- mutants, a model of the HNPP neuropathy, PTEN levels decrease. Consistent with this, Pmp22tg nerves display lower Akt phosphorylation and, vice versa, Pmp22+/- nerves have higher Akt phosphorylation. The authors lowered PTEN in the transgenic and inhibited mTOR using Rapamycin in the Pmp22+/- to support the functional relevance of the PMP22-PTEN correlation. ... In conclusion, the correlation between PMP22 and PTEN is a potential interesting observation. However, in my opinion, experiments as shown don't support the conclusion that PMP22 controls PTEN expression level and activity, which is suggested at the basis of the pathogenesis of PMP22 dosage-related neuropathies.

      We thank Reviewer #1 for this detailed feedback. We appreciate the Reviewer’s assessment that our observation that PMP22 and PTEN are correlated in CMT1A and HNPP is of potential interest. In the revised manuscript we addressed this key point by adding additional quantifications (Figure 1a, d; Figure 5d) and novel Western Blot analyses (Figure 1a, d). Regarding the pathophysiological significance of the correlation, we point out that both the original as well as the partially revised manuscript contain multiple pieces of evidence demonstrating that altered PTEN activity is critical for both PMP22 gene-dosage related neuropathies:

      1. The inhibition of the PI3K/PTEN/AKT/mTOR axis upstream (LY294002) or downstream (Rapamycin) of decreased PTEN ameliorates myelin defects in an in vitro HNPP model (Figure 2b, c).
      2. Downstream of PTEN, Rapamycin treatment ameliorates myelin defects, motor behavior and electrophysiology in the HNPP mouse model in vivo (Figure 3c, d, e, g, i).
      3. Targeting of increased PTEN directly by inhibiting its activity pharmacologically (VO-OHpic) in a CMT1A rat model or by depleting it genetically in a CMT1A model leads to ameliorated myelination in vitro (Figure 4b, c; Figure 5f, g).
      4. The genetic depletion of PTEN in a CMT1A mouse model increases myelination in vivo, albeit not in the long term (Figure 6a, b, c, d).

      We therefore feel that any additional evidence to show that "PMP22 controls PTEN activity" is not vital for supporting the major claims of the manuscript, i.e. that the observed correlation of PTEN levels with PMP22 gene dosage has relevance for the etiology of PMP22 dosage diseases and that targeting the PI3K-PTEN-AKT-mTOR axis downstream of PTEN provides a potential pharmacological therapy of HNPP (while directly targeting PTEN ultimately fails to rescue CMT1A). However, we agree that the activity of PTEN on the molecular level is interesting, and such evidence would further strengthen our conclusions. Therefore, in the final revised version, we plan to add further Western Blots and explore possible downstream effects of altered PTEN levels.

      Reviewer #2:

      This study investigates the modulation, both genetically and pharmacologically, of the PI3K/Akt/mTOR signaling in preclinical animal models for the inherited peripheral neuropathies HNPP and CMT1A. These conditions result from a gene dosage abnormality of the peripheral myelin protein gene PMP22. The exact biological molecular mechanisms remain enigmatic despite it having been over 30 years since the major genetic lesions, the CMT1A duplication and HNPP deletion, were described. With respect to myelin biology one observes focally slowed nerve conduction at pressure palsies and local/segmental hypermyelination in HNPP whereas hypomyelination occurs in CMT1A. The study is nicely conducted, data illustrations very informative, and writing clear and concise. This paper will likely be of great interest to your readers. The authors provide convincing evidence that the HNPP pathobiology is ameliorated by PI3K/Akt/mTOR inhibitors. Interestingly they found radial myelin growth was most affected by this approach and suggest an interesting transdermal approach in injured nerves in the acute prevention of pressure palsies.

      We thank Reviewer #2 for this positive evaluation.

      Reviewer #3:

      In this paper Sereda and co-workers demonstrate that the PTEN/mTOR pathway is indirectly involved in regulating myelin thickness and wrapping in models of altered PMP22 gene dosage both in vitro and in vivo. Inhibition of this pathway decreases myelin thickness in models of HNPP, while increasing myelin thickness in models of CMT1A. The evidence for these conclusions is complex but reasonably presented, and the conclusions mainly supported by the data. The abstract for this paper, however, presents a somewhat oversimplified conclusion that the PTEN pathway mainly modifies models of HNPP, where the paper clearly demonstrates that models of CMT1A are also affected by this same pathway. This should be clarified.

      We thank Reviewer #3 for the feedback on the manuscript. We agree with the Reviewer that the same pathway (PI3K/Akt/mTOR) also affects CMT1A, but it is important for us to highlight that the disease mechanisms are, at least partly, different between HNPP and CMT1A. This is supported by our observation that PTEN reduction in CMT1A only transiently improves myelination in vivo (Figure 6) and by the persistent alteration of differentiation markers despite PTEN reduction, which is not observed in HNPP (Figure 7).

      2. Description of the planned revisions

      Reviewer #1

      Regarding the activity of PTEN

      Figure 1

      • Additional experiments are needed to support the conclusion of Figure 1 that, in the two mutants, Pten levels reversely correlate with PI3K-Akt-mTOR pathway activation, which represents the rationale of all further experiments. For example, it should be shown systematically in both mutants both Akt and ERK phosphorylation levels (Akt at both T308 and S473), and mTOR activity read outs. In the previously published paper (Fledrich et al.) only increased Akt phosphorylation in Pmp22+/- nerves was reported, whereas Pmp22tg analysis was focused on the interdependence between Akt and ERK without exploring mTOR activation, which is relevant here. 2) (Figure 4) A different model, the C61 mouse a Pmp22tg overexpressing PMP22 is used here (rather than the CMT1A rat). This should be explained in the results. Is also this model characterized by increased Pten levels in the nerve? And low Akt-mTOR activation for instance? 3) (Figure 5) How is Akt-mTOR signaling in the double mutant as compared to Pmp22tg? Is that increased at P18? Response: We fully agree with the Reviewer that further exploration of PTEN downstream effects will add value to the manuscript. We already justified the usage of the C61 mouse model more clearly, added P-S6 staining of wildtype in addition to an improved representation in Figure 5e, and performed extra Western Blot analysis of PTEN expression (described in the next section “Incorporated revisions”). Moreover, we will further evaluate the downstream signaling components of PTEN and will perform additional Western Blot analyses of peripheral nerves of HNPP mice, CMT1A rats as well as C61 and C61xPTENhKO mice.

      Figure S1

      • Figure S1, page 4: what does it mean "in line with this finding we were unable to detect protein-protein...". May be the authors meant: since there is a direct correlation between Pmp22 and Pten expression levels in the mutants, the authors explored the possibility of an interaction between the two. Regarding the co-IPs, in panel a, the co-IP at the endogenous level, the immunoprecipitation efficiency of PMP22 is very low. May be a pull-down experiment using either exogenous purified PMP22 or PTEN and nerve lysates can help to rule out the possibility of an interaction. The experiments in b, c are performed in overexpression in a heterologous system (293 cells). Response: We agree with the Reviewer that we might have missed a possible interaction between PMP22 and PTEN in the experiments performed so far. Indeed, pull-down experiments may prove helpful to rule out / reveal protein-protein interaction. Therefore, we will use purified PMP22 and perform pull-down experiments using nerve lysates of wildtype and CMT1A rats.

      Figure 5

      • Pten Fl/+ Dhh-Cre cultures seem to have axonal fasciculation. Response: We thank the Reviewer for this observation. We will systematically inspect all recorded images for features of fasciculation. We will also assess whether fasciculation is a representative feature in cultures derived from any of the genotypes, and if so, whether the genotypes differ in this regard.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Changes in the text are highlighted in green in the revised manuscript

      Reviewer #1:

      Figure 1

      • Panel a: the decrease of Pten expression should be quantified with at least n=3 taking into account the variability among different samples at the different time points indicated (the same applies in panel b, even if here the increase of Pten expression level in Pmp22tg nerves is more evident). Response: We agree with the Reviewer that the timeline is not sufficient to demonstrate alteration in PTEN expression in the PMP22 gene dosage diseases CMT1A and HNPP. Therefore, we performed new Western Blot experiments evaluating PTEN expression in (i) HNPP mice, (ii) CMT1A rats, (iii) C61 mice and (iv) C61xPTENhKO mice with a minimum of n = 3 biological replicates and performed the respective quantification, which is shown in Figure 1 (i, ii) and Figure 5 (iii, iv). The results of the Western Blot analysis and quantification show an increase in PTEN abundance in CMT1A rats (Figure 1d) and C61 mice (Figure 5d), while a decrease is observed in HNPP mice (Figure 1a) and PTENhKOxC61 mice (Figure 5d) when compared to wildtype controls.

      • Panel a and b: the statement that Pten is more expressed at P18 at the peak of myelination in wildtype nerves is not supported by the blots as shown. Response: We agree that this observation is only partly supported by the Western Blot analysis, as seen in the HNPP mouse model, and deleted this part in the results section.

      • Figure S1, page 4: what does it mean "in line with this finding we were unable to detect protein-protein...". May be the authors meant: since there is a direct correlation between Pmp22 and Pten expression levels in the mutants, the authors explored the possibility of an interaction between the two. Response: We thank the Reviewer for pointing out the lack of clarity here. We changed the respective sentence accordingly:

      “Since there is a direct correlation between PMP22 and PTEN expression levels in the mutants, we explored the possibility of an interaction between the proteins. By immunoprecipitation experiments we were unable to detect protein-protein interaction between PMP22 and PTEN (Figure S1).” (Page 4)

      • Page 4: "Taken together, Pmp22 dosage inversely correlates with the abundance of PTEN...": please revise this statement. Response: We thank the reviewer for spotting this mistake. We changed the sentence accordingly, which now reads:

      “Taken together, Pmp22 dosage directly correlates with the abundance of PTEN and presumably the activation level of the PI3K/Akt/mTOR pathway in myelinating Schwann cells (Figure 1i)." (Page 4, Line 23)

      Figure 2:

      • The aberrant myelin figures displayed are similar to myelin ovoids preceding degeneration rather than myelin outfoldings. It is also strange that these alterations are in the wildtype cultures treated with RAPA, that instead, in this system, has been reported to increase myelination as it improves protein homeostasis (autophagy, quality control, etc). Response: We thank the Reviewer for pointing this out. Indeed, in the way the images have been presented the aberrant myelin profiles can be mistaken for ovoids. However, a close inspection of the TUJ1 channel images revealed continuity of the axons below the aberrant myelin, thereby excluding ovoid formation. In the partially revised manuscript, we now also show the TUJ1 channel individually (Figure 2), so that it can be appreciated that the defects are confined to the myelin. Concerning the incidence of the myelin defects in RAPA-treated wildtype cultures, our analysis may have missed a potential amelioration due to the rather high variability in the data.

      Figure 3

      Panel c-e: aberrant fibers should be normalized on total number of fibers and on the area, particularly because RAPA is used.

      Response: We agree with the Reviewer that the numbers of tomacula and recurrent loops should be normalized to the total number of fibers in the area. We have quantified the total number of fibers in the whole sciatic nerve and normalized the numbers of tomacula and recurrent loops accordingly. Results show a decrease in both tomacula and recurrent loops after Rapamycin treatment in the HNPP mice (Figure 3c, d, e, f).
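
      The normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function name and the example counts are hypothetical, and the sketch only shows aberrant-profile counts expressed as a fraction of all fibers counted in the same nerve.

```python
def normalized_counts(aberrant_counts: dict[str, int],
                      total_fibers: int) -> dict[str, float]:
    """Express each aberrant-profile count as a fraction of all fibers counted."""
    if total_fibers <= 0:
        raise ValueError("total_fibers must be positive")
    return {name: count / total_fibers for name, count in aberrant_counts.items()}


# Hypothetical counts for one sciatic nerve cross-section (illustration only).
example = normalized_counts({"tomacula": 30, "recurrent_loops": 12},
                            total_fibers=4000)
```

      Reporting fractions rather than raw counts guards against a treatment appearing to reduce aberrant profiles simply because it changes the total number of fibers.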

      Figure 4

      The improvement in the number of myelin segments following PTEN inhibition in Pmp22tg co-cultures is very weak. The 500 nM has instead a consistent effect in reducing myelin segments in the wildtype and I think that these results overall don't support the conclusion that myelination is ameliorated by reducing PTEN activity in Pmp22tg co-cultures.

      Response: We thank the Reviewer for this important point. We would like to emphasize that we treated whole cultures with the PTEN inhibitor and cannot rule out a (probably) negative effect on axonal PTEN, resulting in only weak improvement of myelination in PMP22tg cultures and strong effects also on the wildtype co-cultures. Therefore, we decided against a treatment of CMT1A models in vivo and further explored the effects of PTEN reduction specifically in Schwann cells using the genetic model as described in Figure 5. The Reviewer made clear to us that this was inadequately explained in the results section, and we therefore adapted this in the manuscript on page 6:

      “Similarly, the prolonged inhibition of PTEN with VO-OHpic (for 14 days) caused a dosage-dependent reduction in myelinated segments in wildtype co-cultures (Figure 4c, Figure S2). The mechanism is currently unexplained but cannot rule out a negative effect of PTEN inhibition on DRG neurons and myelination.”

      Figure 5:

      • *A different model, the C61 mouse a Pmp22tg overexpressing PMP22 is used here (rather than the CMT1A rat). This should be explained in the results. Is also this model characterized by increased Pten levels in the nerve? And low Akt-mTOR activation for instance? * Response: We agree with the Reviewer that it has not been clear in the text why we changed here to the C61 mouse model. We clarified this in the Results section which now reads on page 6:

      “To reduce Pten function in CMT1A models also in vivo, we applied a genetic approach (Figure 5a). As the genetic tools to specifically target Schwann cells were only available in the mouse and not the rat, we used the C61 mouse model of CMT1A. We reduced PTEN by about 50% selectively in CMT1A Schwann cells by crossbreeding Pmp22 transgenic mice with floxed Pten and Dhh-cre mice, yielding PTENfl/+Dhhcre/+PMP22tg experimental mutants (Figure 5b). Western blot analyses of sciatic nerve lysates confirmed the increase of PTEN in PMP22tg mice and the reduction of PTEN in the double mutants (Figure 5c, d).”

      Moreover, regarding the PTEN expression we added Western Blot analysis and quantification in Figure 5c, d showing increased PTEN expression in the C61 mouse model of CMT1A and decreased PTEN in the PTENhKOxC61 double mutants. Further analysis of the downstream signaling is planned (see “planned revision”).

      • *PTEN, Akt-mTOR expression/activation levels should be checked biochemically also in this model. And quantified (panel c). * Response: We added an explanation for the use of the C61 mouse model (see point Figure 5.1 above). Moreover, we quantified the Western Blot analysis and added it in Figure 5d. The expression of PTEN was included in the Western Blot analysis (Figure 5c) showing increased PTEN expression also in the C61 mouse model. Further biochemical analysis of the C61 mouse model is planned (see “planned revision”).

      • *In panel d overactivation of mTOR (PS6 staining) in Schwann cells is not evident. * Response: We agree with the Reviewer that the way the image was displayed is not sufficient to show P-S6 activation in the double mutants. We have now split the image (Figure 5e) to better visualize the P-S6 staining alone compared to the co-staining with P0 (marker for compact myelin) and DAPI (nuclei). Further, we added staining of wildtype nerve. We hope this way the differences in P-S6 activation can be easier appreciated.

      Figure 6:

      *G-ratio analysis: which are the mean values (numbers) with SEM in the three groups analyzed wildtype, Pmp22tg and Pmp22tg; Pten fl/+; Dhh-Cre? *

      Response: We thank the Reviewer for pointing this out. We added the quantification of the mean g-ratios in Figure 6d, f.
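
      For readers unfamiliar with the quantity requested here, the following is a minimal sketch of how a mean g-ratio with SEM can be computed. It assumes the usual definition of the g-ratio (axon diameter divided by total fiber diameter) and the SEM as the sample standard deviation divided by the square root of the sample size. The function names and numbers are hypothetical and do not come from the manuscript.

```python
import math
import statistics


def g_ratio(axon_diameter, fiber_diameter):
    """g-ratio of one fiber: axon diameter / diameter of axon plus myelin."""
    if not 0 < axon_diameter <= fiber_diameter:
        raise ValueError("expected 0 < axon_diameter <= fiber_diameter")
    return axon_diameter / fiber_diameter


def mean_and_sem(values):
    """Return (mean, SEM) where SEM = sample standard deviation / sqrt(n)."""
    if len(values) < 2:
        raise ValueError("at least two values are needed to estimate the SEM")
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Invented per-animal mean g-ratios for one hypothetical group.
per_animal_means = [0.6, 0.7, 0.8]
group_mean, group_sem = mean_and_sem(per_animal_means)
```

      In this sketch each animal contributes one value, on the assumption that the animal rather than the individual fiber is the statistical unit; whether that matches the analysis in Figure 6 is not stated in the response.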

      Figure 7:

      • *If more fibers are committed to myelinate in the double mutant as compared to the single Pmp22tg at P18, particularly, it is unclear why there is no difference in differentiation marker expression in Figure 7 (Oct6 and Hmgcr). * Response: We thank the reviewer for this comment. We do not necessarily expect to see a strong difference in the expression of differentiation markers given the mild increase in myelination in the double mutants. Similarly, we do not observe alterations in the expression of differentiation markers in HNPP mice, although these fibers produce more myelin. Therefore, we concluded that alterations in PTEN-PI3K/Akt/mTOR signaling do not influence differentiation in the mouse models, while in the PMP22 overexpressing situation of CMT1A other mechanisms alter differentiation of the Schwann cells. We also note that experiments were performed at postnatal day 18 and we cannot rule out possible alterations in differentiation marker expression at earlier time points in development in the double mutants.

      • In conclusion, the correlation between PMP22 and PTEN is a potential interesting observation. However, in my opinion, experiments as shown don't support the conclusion that PMP22 controls PTEN expression level and activity, which is suggested at the basis of the pathogenesis of PMP22 dosage-related neuropathies. Response: Please also see section 1. In order to avoid any overstatement that "PMP22 controls PTEN expression level and activity", in our revised version we have clarified this point and changed the wording in the main text:

      "The mechanisms that link the abundance of PMP22 to that of PTEN are still unclear and we here neither show direct nor indirect control of PTEN expression by PMP22." (Page 8)

      Reviewer #2:

      1. Regarding in the Introduction: "...the molecular mechanisms causative for the abnormal myelination remain largely unknown and still no therapy is available." Suggest consider modifying to perhaps: '...no small molecule or pharmacological therapeutic intervention exist.' To say "no therapy" exist is 'myopic' and untrue.

      *Suggest adding question mark to end of sentence or changing ‘asked’ to “investigated” for following thought: “Here, we asked whether PI3K/Akt/mTOR signaling provides therefore a therapeutic target to treat the consequences of altered Pmp22 gene-dosage.” *

      Rather than attempt to establish PRIORITY perhaps ‘softening’ the INTRODUCTION concluding statement “Our results thus identify a potential pharmacological target for this inherited neuropathy.”

      [This makes the PI3K/Akt/mTOR pathway a promising target for a preventive treatment of affected nerves also in human patients.] *Does this belong in RESULTS? Or rather DISCUSSION? *

      Response: We thank the Reviewer for the suggestions. We changed the sentences accordingly in the manuscript (1.: Page 3, Line 23; 2.: Page 3, Line 26; highlighted in green). Regarding point 3, we are convinced that identifying pharmacological targets for peripheral neuropathies should be given priority. Indeed, the aspect concerning point 4 is already highlighted in the discussion therefore we removed the sentence from the result section.

      Reviewer #3:

      *The abstract for this paper, however, presents a somewhat oversimplified conclusion that the PTEN pathway mainly modifies models of HNPP, where the paper clearly demonstrates that models of CMT1A are also affected by this same pathway. This should be clarified. *

      We agree with the Reviewer that the same pathway (PI3K/Akt/mTOR) also affects CMT1A, but it is of importance for us to highlight that the disease mechanisms are -at least partly- different between HNPP and CMT1A. This is supported by our observation that PTEN reduction in CMT1A only transiently improves myelination in vivo (Figure 6) and the persistent alteration of differentiation markers despite PTEN reduction, which is not observed in HNPP (Figure 7). For clarification we have altered the wording in the abstract which now reads: "In contrast, we found that CMT1A pathogenesis was only transiently ameliorated by altered PI3K/Akt/mTOR signaling, which drives radial but not longitudinal growth of peripheral myelin sheaths".

      3. Description of analyses that authors prefer not to carry out

      Reviewer #1:

      Figure 1:

      *Figure 1, Panel e: may be with this experiment the authors aim to suggest that Pten and Pmp22 are unlikely to interact directly or indirectly since Pten is cytosolic and Pmp22 myelin-membrane enriched. However, this myelin purification shows that Pmp22 as P0 expression levels are also abundant in the cytosol, may be also because P18 has been chosen as time point. What about a different type of membrane-cytosol fractionation experiment and/or another time point? *

      Response: We want to clarify that this experiment did not separate myelin and cytosol fractions, but myelin and whole sciatic nerve lysate (which is the input before isolation of the myelin fraction, called “lysate”). Therefore, the analysis aimed at showing an enrichment of PMP22 and P0 in the myelin fraction while PTEN and TUJ (as a control) are not enriched, which makes it more unlikely for PTEN and PMP22 to interact directly. This experiment, together with the immunohistochemical analysis in Figure 1h, should highlight the location of PMP22 and PTEN in the Schwann cell. Together with the newly suggested experiments of the Reviewer for Figure S1 (see planned Revision point 1), we do not see the need for extra membrane-cytosol fractionations and/or another timepoint, as the improved experiment on protein-protein interaction using nerve lysate (not only cell culture) is the experiment of choice to clarify whether there is a direct interaction or not.

      Regarding in vitro Schwann cell- DRG co-culture experiments:

      (Figure 2, Figure 4 and Figure 5e)

      1. *(Figure 2) For this experiment, pulse treatment may be beneficial rather than in continuous. Is Akt-mTOR phosphorylation-signaling increased also in Pmp22+/- co-cultures as in mutant nerves? Is the treatment reducing the overactivation? *
      2. *(Figure 4) Similarly to Figure 2, is PTEN level increased in Pmp22tg cultures along with Akt-mTOR downregulation? *
      3. *(Figure 5) Panel e: co-cultures are established using ex vivo Dhh-Cre recombination. The downregulation of Pten in the cultures should be documented. * Response: We agree with Reviewer #1 that a deeper analysis of the co-culture system regarding the downstream signaling of PTEN would increase the value of the experiments. Unfortunately, the experiments were designed on a very small scale with the intention of only evaluating myelin alterations on a histological level, and we did not have enough tissue to collect cells for deeper protein expression analysis. Moreover, we used the co-culture system as a proof-of-principle experiment in parallel to our in vivo studies, which we consider more important given the still quite artificial co-culture setup. We hope that the Reviewer can understand our approach and our focus on the in vivo work.

      Figure 3:

      1. *The RAPA treatment seems to increase Pten level in the mutant even above wildtype levels (panel b), which can result in decreased myelin thickness due to downregulation of Akt-mTOR. A different method to normalize expression levels should be used. * Response: Comparing the mean, relative expression levels resulting from our quantification as plotted in the graph (panel b) revealed no increase above wildtype level after Rapamycin treatment in the HNPP mouse. Further, we decided for whole protein staining as the superior approach to loading control because we have observed alterations in the expression of other frequently used “housekeepers” such as GAPDH, Actin and Vinculin in the CMT1A rodent models.

      *Panel c-e: Can these data also be reproduced in quadriceps nerves as tomacula are more prominent in these Pmp22+/- nerves showing less variability due to the prevalence of large caliber axons? *

      Response: Unfortunately, quadriceps nerves were not collected for histology in the experiment and therefore we cannot redo the quantification. Nevertheless, we agree that the quadriceps nerves have less variability than the sciatic nerve and will definitely include the tissue in our future experiments.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      In this paper the authors report a direct correlation between PMP22 and PTEN expression levels in the nerve of CMT mutants. In CMT1A Pmp22tg rat nerves, PTEN levels are increased, whereas in Pmp22+/- mutants, a model of the HNPP neuropathy, PTEN levels decrease. Consistent with this, Pmp22tg nerves display lower Akt phosphorylation and, vice versa, Pmp22+/- nerves have higher Akt phosphorylation. The authors lowered PTEN in the transgenic and inhibited mTOR using Rapamycin in the Pmp22+/- to support the functional relevance of the PMP22-PTEN correlation.

      I have major concerns on the data as shown, which, in my opinion, don't support the main conclusion of this paper. In more detail:

      Figure 1 Panel a: the decrease of Pten expression should be quantified with at least n=3 taking into account the variability among different samples at the different time points indicated (the same applies in panel b, even if here the increase of Pten expression level in Pmp22tg nerves is more evident) Panel a and b: the statement that Pten is more expressed at P18 at the peak of myelination in wildtype nerves is not supported by the blots as shown

      Figure S1, page 4: what does it mean "in line with this finding we were unable to detect protein-protein...". May be the authors meant: since there is a direct correlation between Pmp22 and Pten expression levels in the mutants, the authors explored the possibility of an interaction between the two. Regarding the co-IPs, in panel a, the co-IP at the endogenous level, the immunoprecipitation efficiency of PMP22 is very low. May be a pull-down experiment using either exogenous purified PMP22 or PTEN and nerve lysates can help to rule out the possibility of an interaction. The experiments in b, c are performed in overexpression in a heterologous system (293 cells).

      Panel e: may be with this experiment the authors aim to suggest that Pten and Pmp22 are unlikely to interact directly or indirectly since Pten is cytosolic and Pmp22 myelin-membrane enriched. However, this myelin purification shows that Pmp22 as P0 expression levels are also abundant in the cytosol, may be also because P18 has been chosen as time point. What about a different type of membrane-cytosol fractionation experiment and/or another time point?

      Page 4: "Taken together, Pmp22 dosage inversely correlates with the abundance of PTEN...": please revise this statement

      Additional experiments are needed to support the conclusion of Figure 1 that, in the two mutants, Pten levels reversely correlate with PI3K-Akt-mTOR pathway activation, which represents the rationale of all further experiments. For example, it should be shown systematically in both mutants both Akt and ERK phosphorylation levels (Akt at both T308 and S473), and mTOR activity read outs. In the previously published paper (Fledrich et al.) only increased Akt phosphorylation in Pmp22+/- nerves was reported, whereas Pmp22tg analysis was focused on the interdependence between Akt and ERK without exploring mTOR activation, which is relevant here.

      Figure 2 The aberrant myelin figures displayed are similar to myelin ovoids preceding degeneration rather than myelin outfoldings. It is also strange that these alterations are in the wildtype cultures treated with RAPA, that instead, in this system, has been reported to increase myelination as it improves protein homeostasis (autophagy, quality control, etc). Also for this experiment, pulse treatment may be beneficial rather than in continuous. Is Akt-mTOR phosphorylation-signaling increased also in Pmp22+/- co-cultures as in mutant nerves? Is the treatment reducing the overactivation?

      Figure 3 The RAPA treatment seems to increase Pten level in the mutant even above wildtype levels (panel b), which can result in decreased myelin thickness due to downregulation of Akt-mTOR. A different method to normalize expression levels should be used. Panel c-e: aberrant fibers should be normalized on total number of fibers and on the area, particularly because RAPA is used. Can these data also be reproduced in quadriceps nerves as tomacula are more prominent in these Pmp22+/- nerves showing less variability due to the prevalence of large caliber axons?

      Figure 4 A different model, the C61 mouse a Pmp22tg overexpressing PMP22 is used here (rather than the CMT1A rat). This should be explained in the results. Is also this model characterized by increased Pten levels in the nerve? And low Akt-mTOR activation for instance?

      The improvement in the number of myelin segments following PTEN inhibition in Pmp22tg co-cultures is very weak. The 500 nM has instead a consistent effect in reducing myelin segments in the wildtype and I think that these results overall don't support the conclusion that myelination is ameliorated by reducing PTEN activity in Pmp22tg co-cultures. Similarly to Figure 2, is PTEN level increased in Pmp22tg cultures along with Akt-mTOR downregulation?

      Figure 5 As for Figure 4, the use of the mouse transgenic instead of the CMT1A rat should be specified and PTEN, Akt-mTOR expression/activation levels should be checked biochemically also in this model. And quantified (panel c). In panel d overactivation of mTOR (PS6 staining) in Schwann cells is not evident. Panel e: co-cultures are established using ex vivo Dhh-Cre recombination. The downregulation of Pten in the cultures should be documented. Pten Fl/+ Dhh-Cre cultures seem to have axonal fasciculation.

      Figure 6 G-ratio analysis: which are the mean values (numbers) with SEM in the three groups analyzed wildtype, Pmp22tg and Pmp22tg; Pten fl/+; Dhh-Cre? How is Akt-mTOR signaling in the double mutant as compared to Pmp22tg? Is that increased at P18? If more fibers are committed to myelinate in the double mutant as compared to the single Pmp22tg at P18, particularly, it is unclear why there is no difference in differentiation marker expression in Figure 7 (Oct6 and Hmgcr).

      Significance

      In conclusion, the correlation between PMP22 and PTEN is a potential interesting observation. However, in my opinion, experiments as shown don't support the conclusion that PMP22 controls PTEN expression level and activity, which is suggested at the basis of the pathogenesis of PMP22 dosage-related neuropathies.

    1. Author Response:

      The following is the authors' response to the original reviews.

      We are very glad that the editor and reviewers found our paper of broad interest to the community of population, evolutionary, and ecological genetics. We thank them for their positive feedback and insightful comments and suggestions. We have revised our manuscript to address some of the issues raised by the review. The main change we made was providing a detailed discussion of limitations of simulated genomes, focusing on considerations one needs to make when selecting a demographic model. This can be found in a new section “Limitations of simulated genomes” (pages 9-10). We made a few additional adjustments in other parts of the text based on the reviewers’ suggestions. They are all listed in the detailed point-by-point response to reviewers comments and questions below.

      Editor:

      1) It was noted that demographic models (or genomic parameters) that are inferred based on certain aspects of the genomic data (eg., site frequency spectrum, haplotype structure) may not recapitulate other aspects of the data. In other words, any inferred demographic models are expected to reliably reproduce only some aspects of the genetic variation data but not necessarily all. It would be helpful to emphasize this limitation in the manuscript and to include a table summarizing the types of variation that the demographic models for the catalogued species were based on.

      This is a very important point, which we addressed in the revision by adding a section entitled “Limitations of simulated genomes”. This section discusses the considerations that one should make when selecting an inferred demographic model to implement in simulation. This includes the samples used in analysis, the method used for inference, as well as various filters. In this section we also point to the documentation page of the stdpopsim catalog, which provides information about each demographic model that can help users decide whether it is appropriate for their needs. We decided not to summarize this information in a succinct table in the manuscript because it is not straightforward to summarize the strengths and potential limitations of each model in a table. Instead, we will expand the summary provided for each demographic model in the documentation page to provide additional information. See response to the second reviewer’s comment on this topic for more details.

      2) It will make stdpopsim more user-friendly to include an automated module that can visualize a demographic model given the corresponding parameters (or simulation scripts).

      As mentioned in the response to the first reviewer’s comment on this subject, the documentation page of the stdpopsim catalog provides a brief summary for each demographic model, including a graphical representation. See response below for more details.

      Reviewer #1:

      In the introduction, the authors cite numerous efforts to generate high-quality reference genomes. That's not an issue in itself, but leading with this might send the message to some readers that it is these reference genome efforts that are driving the need for population genomics analysis and simulation tools, which is not really the case - why not instead give some citation attention to actual population genomics projects aiming to address the types of evolutionary questions this paper is concerned with? The reference genome citations would fit better in the section dealing with reference genomes, where they already appear.

      Indeed, the desire to answer complex evolutionary questions is the main motivation for sequencing these genomes and also for generating realistic genome simulations. The reason we chose to lead with the genome-sequencing efforts is that high quality genome data is an important prerequisite for obtaining parameters for chromosome-scale simulations. So, from that perspective, the efforts we cite are the driving force behind the expansion of stdpopsim in the near future. Thus, we decided to leave these citations in the introduction. To balance things out, we now start the introduction with a statement about broad questions in population genetics. Moreover, after we list the genome sequencing efforts, we added a list of specific types of questions that can be addressed by these newly emerging genomes, with relevant citations. The beginning of the introduction now reads:

      “Population genetics allows us to answer questions across scales from deep evolutionary time to ongoing ecological dynamics, and dramatic reductions in sequencing costs enable the generation of unprecedented amounts of genomic data that can be used to address these questions (Ellegren, 2014). Ongoing efforts to systematically sequence life on Earth by initiatives such as the Earth Biogenome (Lewin et al., 2022) and its affiliated project networks, such as Vertebrate Genomes (Rhie et al., 2021), 10,000 Plants (Cheng et al., 2018) and others (Darwin Tree of Life Project Consortium, 2022), are providing the backbone for enormous increases in the amount of population-level genomic data available for model and non-model species. These data are being used, among other things, in inference of population history and demographic parameters (Beichman et al., 2018), studying adaptive introgression (Gower et al., 2021), distinguishing adaptation from drift (e.g. Hsieh et al., 2021), and understanding the implications of deleterious variation in populations of conservation concern (e.g. Robinson et al., 2023).”

      Something that would be useful for the stdpopsim resource in general, though not necessarily something for the paper, would be some kind of more human-friendly representation of the demographic models implemented in the curated library. Perhaps I'm not looking in the right place, but as far as I can tell, if I want to study the curated demographic models, I need to go into the Python scripts on the stdpopsim GitHub page (e.g.

      https://github.com/popsim-consortium/stdpopsim/tree/main/stdpopsim/catalog/BosTau). Here the various parameters and demographic events are hard-coded into the scripts. To understand the model being implemented, one thus needs to go dig into these scripts - something which is not necessarily very accessible to all researchers. Visual representations, such as the one for Anopheles gambiae in Fig 2. in the paper, are more widely accessible. I wonder if such figures could be produced for all the curated models and included in the GitHub folders alongside the scripts, perhaps aided by an existing model visualization software such as POPdemog. Again, I would not suggest that this is necessary for the paper, but if practically feasible I think it would be a useful addition to the resource in the longer term.

      This is a very good point. The stdpopsim catalog actually has a documentation page that provides a brief summary for each demographic model, including a graphical representation. This graphical representation is generated using demesdraw applied to the demographic model object implemented in the code. Thus, potential users do not have to dig through the Python code to figure out the details of the demographic model. We used a similar approach to generate the image of the demographic history of A. gambiae for Fig. 2 of the paper. The documentation page is an important part of the stdpopsim catalog, and we now added a link to it in section “Data availability”, and we mention it in key places in the manuscript, such as the caption of Fig 2.

      Reviewer #2:

      An important update to the stdpopsim software is the capacity for researchers to annotate coding regions of the genome, permitting distributions of fitness effects and linked selection to be modeled. However, though this novel feature expands the breadth of processes that can be evaluated as well as is applicable to all species within the stdpopsim framework, the authors do not provide significant detail regarding this feature, stating that they will provide more details about it in a forthcoming publication. Compared to this feature, the additions of extra species, finite-site substitution models, and non-crossover recombination are more specialized updates to the software.

      It would be helpful to provide additional information regarding the coding annotation (and associated distribution of fitness effects and linked selection) that is implemented in the current version of stdpopsim, but will be detailed in a forthcoming paper. This is not to take away from the forthcoming paper, but I believe this is the most important update to the software, and the current manuscript only brushes over it.

      We agree that implementation of selection in simulations is a significant addition to stdpopsim. However, our intention in this manuscript is to focus on the separate effort we made in the last two years to expand the utility of stdpopsim to a more diverse set of species. We think the manuscript stands firmly even without discussing in detail the new features that allow modeling selection. The main reason we briefly mention these features in sections “Additions to stdpopsim” and “Basic setup for chromosome-level simulations” is that the released version of stdpopsim contains implemented DFEs for a few species, and we did not want to completely ignore this. We thus added a brief comment at the end of the “Basic setup” section (page 8) mentioning the three model species for which the stdpopsim catalog currently has annotations and implemented DFE models. We think that a more detailed description of these features and how they should be used is best left to the manuscript that the PopSim community is currently writing (preprint expected later this year).

      When it comes to simulating realistic genomic data, the authors clearly lay out that parameters obtained from the literature must be compatible, such as the same recombination and mutation rates used to infer a demographic history should also be used within stdpopsim if employing that demographic history for simulation. This is a highly important point, which is often overlooked. However, it is also important that readers understand that depending on the method used to estimate the demographic history, different demographic models within stdpopsim may not reproduce certain patterns of genetic variation well. The authors do touch on this a bit, providing the example that a constant size demographic history will be unable to capture variation expected from recent size changes (e.g., excess of low-frequency alleles). However, depending on the data used to estimate a demographic history, certain types of variation may be unreliably modeled (Beichman et al. 2017; G3, 7:3605-3620). For example, if a site frequency spectrum method was used to estimate a demographic history, then the simulations under this model from stdpopsim may not recapitulate the haplotype structure well in the observed species. Similarly, if a method such as PSMC applied to a single diploid genome was used to estimate a demographic history, then the simulations under this model from stdpopsim may not recapitulate the site frequency spectrum well in the observed species. Though the authors indicate that citations are given to each demographic model and model parameter for each species, this may not be sufficient for a novice researcher in this field to understand what forms of genomic variation the models may be capable of reliably producing. 
A potential worry is that the inclusion of a species within stdpopsim may serve as an endorsement to users regarding the available simulation models (though I understand this is not the case by the authors), and it would be helpful if users and readers were guided on the type of variation the models should be able to reliably reproduce for each species and demographic history available for each species. It would be helpful to include a table with types of observed variation that the current set of 21 species (and associated demographic histories) are likely and unlikely to recapitulate well.

      This is a very important point, which we now address in the section “Limitations of simulated genomes”, which we added to the manuscript. In this section, we expand on this topic and discuss various things that will affect the way simulated genomes reflect true sequence variation. This includes the choice of demographic inference method, but also the analyzed samples, and various filters. The main message of this section is that one should consider various things when deciding to implement a demographic model in simulation (or selecting a model among those implemented in stdpopsim). We also cite studies (including Beichman, et al. 2017), which compared different approaches to demography inference. However, we note that the conclusions of these comparisons are not as straightforward as the reviewer suggests. In particular, methods that make use of the site frequency spectrum (such as dadi) should be able to capture some aspects of haplotype structure, because this information is encoded in the demographic history. Furthermore, a demographic history inferred from a single genome (e.g., using PSMC) should do a reasonable job approximating some aspects of the site frequency spectrum. In other words, the aspects of genetic variation not modeled well by a given demographic inference method are not always predicted in a straightforward way. This is why we avoid summarizing this information in a table in the manuscript. The 2nd paragraph of the “Limitations of simulated genomes” section addresses some of these subtle considerations. In particular, we suggest that considering a demographic model for simulation requires some familiarity with the inference method and the way it was applied to data. Regarding the demographic models currently implemented in stdpopsim, we provide some information about each model in the documentation page of the catalog. 
When selecting a demographic model from the catalog, users should make use of this documentation to guide their decision. This is mentioned in the 3rd paragraph of the “Limitations of simulated genomes” section. Following up on this issue, we intend to review the documentation and make sure it provides sufficient information for each demographic model. See this GitHub issue.

      Reviewer #3:

      - p5, 2nd paragraph: I think many Biologists, myself included, will think of horizontal gene transfer mostly as plasmids being transferred among bacteria and adding extra genetic material, not as homologous bacterial recombination. This made me confused about modelling horizontal gene transfer in the same way as gene conversion. It may be helpful for some readers if you specify that you are modelling this particular type of horizontal gene transfer. Some explanation along the lines of what is in Cury et al (2022) would be enough.

      This is a good point. We modified the text in that sentence in the 2nd paragraph on page 5 to clarify that we are modeling non-crossover homologous recombination, and not incorporation of exogenous DNA (e.g., via plasmid transfer). The relevant part of the text now says:

      “In bacteria and archaea, genetic material can be exchanged through horizontal gene transfer, which can add new genetic material (e.g., via the transfer of plasmids) or replace homologous sequences through homologous recombination (Thomas and Nielsen, 2005; Didelot and Maiden, 2010; Gophna and Altman-Price, 2022). However, the initial version of stdpopsim used crossover recombination to stand in for these processes. Although we cannot currently simulate varying gene content (as would be required to simulate the addition of new genetic material by horizontal gene transfer), the msprime and SLiM simulation engines now allow gene conversion, which has the same effect as non-crossover homologous recombination.

      Following (Cury et al., 2022), we use this to include non-crossover homologous recombination in bacterial and archaeal species.”

      - p5, 3rd paragraph: When you say gene conversion is turned off by default, you could refer to table 1 and briefly mention the consequence of ignoring gene conversion.

      We agree that it is important to note that failing to model gene conversion may lead to incorrect lengths of shared haplotypes across individuals. This is implied by the statement we make at the beginning of the 3rd paragraph on page 5, where we lay out the motivation for modeling gene conversion in simulation. Following the reviewer’s suggestion, we have now added a statement about this at the end of that paragraph:

      “Note that ignoring gene conversion may result in a slightly skewed distribution of shared haplotypes between individuals (see Table 1)”

      -  p7, item 1 and p9, 1st paragraph: I am not sure what you mean by genetic map here, can you define this term? I am not sure if it is synonymous with gene annotations, a recombination map, or something else. The linkage map doesn't seem to make sense to me here.

      The term ‘genetic map’ referred to the recombination map whenever it was used in the manuscript. To avoid any confusion, we now removed all mentions of ‘genetic map’, and use ‘recombination map’ instead. The recombination map is relevant in item 1 of page 7 because in species with poor assemblies you will not be able to reliably estimate recombination maps, making chromosome-scale simulations less effective. In the 1st paragraph of page 9, we discuss the issue of lifting over coordinates from one assembly to another, and if you have a recombination map estimated in one assembly, you might need to lift it over to another assembly to apply it in your simulation.

      -  Table 1, last row, middle column: when you say "simulated population", I think it is a bit ambiguous. You mean "the true population that we are trying to simulate", but could be read as "the population data that was generated by simulation". I would delete the word simulated here.

      What we mean here is that the selected effective population size should reflect the observed genetic diversity in real genomic data. We realize that the previous wording was confusing, and changed this to the following:

      “Set the effective population size (Ne) to a value that reflects the observed genetic diversity”
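
      As a rough illustration of what “reflects the observed genetic diversity” can mean in practice, the sketch below uses the standard neutral expectation that nucleotide diversity is approximately π ≈ 4·Ne·μ in a diploid population of constant size. The function name `ne_from_diversity` and the input values are hypothetical; they are not taken from stdpopsim or from any species in its catalog.

```python
def ne_from_diversity(pi, mu, ploidy=2):
    """Estimate Ne from nucleotide diversity under a neutral, constant-size model.

    pi     -- observed nucleotide diversity (per base)
    mu     -- mutation rate (per base per generation)
    ploidy -- 2 for diploids (pi ~ 4*Ne*mu), 1 for haploids (pi ~ 2*Ne*mu)
    """
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return pi / (2 * ploidy * mu)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
example_ne = ne_from_diversity(pi=0.001, mu=1.25e-8)
print(f"Ne is roughly {example_ne:,.0f}")
```

      A value obtained this way is only a starting point: it assumes neutrality and a constant population size, which is why the choice is less clear for species with strong recent bottlenecks.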

      -  Figure 2, and other places when you refer to mutation and recombination rate (eg p11, last paragraph), can you include the units (e.g. per base pair, per generation)?

      Throughout the manuscript, rates are always specified per base per generation. In Figure 2, this is specified in the caption (3rd line). We added units in other places in section “Examples of added species” on pages 12-13, where they were indeed missing.

      -  p11, "default effective population size": can you use a more descriptive word instead of the default? Maybe the historical average? Also, what is this value used for in the simulations when there is a demographic model specified (as in the case of Anopheles)?

      We think that “default effective population size” is the most appropriate term to use here, since we are referring to the parameter in the species model in stdpopsim. It is correct that the value of this parameter should reflect the historical average size in some sense, but it is really unclear what this should be in the case of a species like Bos taurus, which experienced a very dramatic bottleneck in the recent past. We address this subtle, yet important, issue in the sentence preceding this one. If a demographic model is specified in simulation, it overrides the default effective population size, and its value is ignored (which is why we refer to it as ‘default’). We added a short sentence clarifying this in the 2nd paragraph of the “Bos Taurus” section (now page 12).

      “Note that the default Ne is only used in simulation if a demographic model is not specified.”

      -  p8, when you say "Such simulations are useful for a number of purposes, but they cannot be used to model the influence of natural selection on patterns of genetic variation.": You may want to bring up the discussion that many of these neutral parameters taken from the literature could have been estimated assuming genome-wide neutrality, and thus ignoring the effect of background selection. Therefore the parameter values might reflect some effect of background selection that was unaccounted for during their estimation.

      This is an important subtle point, which we now address in the section “Limitations of simulated genomes”, which we added to the revised manuscript. In that section, we discuss various limitations of simulations, focusing on inferred demographic models. We address the potential influence of the segments selected for analysis toward the end of 2nd paragraph in that section (page 9):

      “... all methods assume that the input sequences are neutrally evolving. This implies that technical choices, such as the specific genomic segments analyzed and various filters, may also influence the inferred model and its ability to model observed genetic variation.”

      Interestingly, background selection in itself typically does not have a strong effect on the inferred model. This is something that is examined in the forthcoming publication that presents simulations with natural selection in stdpopsim.

      -  Why are some concepts written in bold (eg effective population size, demographic model)? Were you planning to make a vocabulary box? I think this is a good idea given that you are aiming for a public that can include people who are not very familiar with some population genetics concepts.

      In the “Examples of added species” section, we use boldface fonts to highlight the model parameters that were determined for each species. We added a statement clarifying this in the beginning of this section (page 11), and made sure that all the relevant parameters were consistently highlighted throughout this section. In other sections, we use boldface fonts only for titles. A few cases that did not conform to this rule were removed in the current version. We did not intend on adding a vocabulary box, but considered this when revising the manuscript, due to the reviewer’s suggestion. However, we found it difficult to converge on a small (yet comprehensive) set of terms with accurate and succinct definitions. We think that the important terms are adequately defined within the text of the manuscript, providing sufficient information also for readers who are not expert population geneticists.

      - p4, 2nd paragraph: Are these automated scripts that are used to compare models publicly available? If you are suggesting that people use this approach generally when coming up with a simulation model (p8, penultimate paragraph), it would be helpful to have access to these automated scripts.

      The scripts are part of the public stdpopsim repository on GitHub, and may be used by anyone. Some components of these scripts are easier to apply in general, such as comparing a demographic model with one implemented separately by the reviewer. This step, for example, is achieved by applying the Demography.is_equivalent method in msprime. Other parts of the comparison depend on the specific structure of Python objects used by stdpopsim, so they are not likely to be useful when implementing simulations outside the framework of stdpopsim.

      -  p9, 1st paragraph, and p.12 2nd paragraph: instead of adjusting the mutation rate to fit the demographic model (and using an old estimate of the mutation rate), would it be ok to adjust the demographic model to fit the new mutation rate? E.g. with a new mutation rate that is the double of a previous estimate, would it be ok to just divide Ne by 2 such that Ne*mu is constant (in a constant population size model)? I imagine this could get complicated with population size changes.

      In principle, this could be done if you were simulating neutrally evolving sequences (without modeling natural selection). Since the coalescent is scale-free, you can scale down all population sizes and divergence times by a multiplicative factor, and scale up migration rates and the mutation rate by the same factor, and you get the exact same distribution over the output sequences. However, getting the scaling right is tricky and quite error-prone, especially considering that you have to do this every time the mutation rate of a species is updated. Moreover, once you start modeling natural selection, this scale-free property no longer holds. Thus, the simple solution we came up with in stdpopsim is to attach to each demographic model the mutation rate used in its inference.
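
      The scaling argument above can be sketched with a toy example. The code below is a minimal illustration and not part of stdpopsim: the `TwoPopModel` class, its fields, and the parameter values are all hypothetical. It checks that dividing sizes and times by a factor while multiplying the migration and mutation rates by the same factor leaves the usual coalescent-scaled quantities (4·Ne·μ, T/(2·Ne), and 4·Ne·m) unchanged, and that changing the mutation rate alone does not.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TwoPopModel:
    """Toy neutral split model with two equal-sized populations (hypothetical)."""
    ne: float              # effective size of each population (diploids)
    split_time: float      # generations since the split
    migration_rate: float  # per-generation migration fraction
    mutation_rate: float   # per base per generation


def rescale(model, factor):
    """Divide sizes and times by `factor`; multiply rates by `factor`."""
    return replace(
        model,
        ne=model.ne / factor,
        split_time=model.split_time / factor,
        migration_rate=model.migration_rate * factor,
        mutation_rate=model.mutation_rate * factor,
    )


def coalescent_units(model):
    """Quantities that determine the distribution of neutral sequence data."""
    return (
        4 * model.ne * model.mutation_rate,   # theta
        model.split_time / (2 * model.ne),    # split time in units of 2*Ne
        4 * model.ne * model.migration_rate,  # scaled migration rate
    )


def same_coalescent_units(a, b):
    """True if two models agree in all coalescent-scaled quantities."""
    return all(
        math.isclose(x, y, rel_tol=1e-12)
        for x, y in zip(coalescent_units(a), coalescent_units(b))
    )


original = TwoPopModel(ne=10_000, split_time=2_000,
                       migration_rate=1e-4, mutation_rate=1e-8)
scaled = rescale(original, factor=2)  # doubled mutation rate, halved Ne and times
print(same_coalescent_units(original, scaled))
```

      Every size, time, and rate in the model has to be rescaled consistently, which is where errors tend to creep in for models with many epochs and populations; the equivalence also holds only for neutral simulations.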

    1. Author Response:

      The following is the authors' response to the original reviews.

      We sincerely thank all the editors and reviewers for taking the time to evaluate this study. Here is our point-by-point response to the reviewers’ comments and concerns.

      Reviewer #1 (Public Review): The study by Oikawa and colleagues demonstrates for the first time that a descending inhibitory pathway for nociception exists in non-mammalian organisms, such as Drosophila. This descending inhibitory pathway is mediated by a Drosophila neuropeptide called Drosulfakinin (DSK), which is homologous to mammalian cholecystokinin (CCK). The study creates and uses several Drosophila mutants to convincingly show that DSK negatively regulates nociception. They then use several sophisticated transgenic manipulations to demonstrate that a descending inhibitory pathway for nociception exists in Drosophila.

      […]

      Weaknesses:

      A minor weakness in the study is that it is unclear how DSK negatively regulates nociception. An earlier study at the Drosophila nmj shows that loss of DSK signaling impairs neurotransmission and synaptic growth. In the current study, loss of CCKLR-17D1 in Goro neurons seems to increase intracellular calcium levels in the presence of noxious heat. An interesting future study would be the examination of the underlying mechanisms for this increase in intracellular calcium.

      We thank the reviewer for the kind and very positive evaluation of our manuscript. We agree that this study has not elucidated the intracellular molecular pathway(s) downstream of CCKLR-17D1 that are involved in the regulation of the activity of Goro neurons, and we think that it would definitely be an interesting topic for future research.       

      Reviewer #1 (Recommendations For The Authors):

      The response latencies for the control yw larvae seem large, with many larvae appearing to be insensitive to the thermal stimulus. Is this just an effect of the yw genetic background? A brief discussion of this might be helpful.

      We thank the reviewer for pointing this out. We have also noticed that the yw control larvae tend to show longer response latencies than the other control strains, and in the revised manuscript, we have added the following sentence in the Results section (Lines 91–94):

      “We have noticed that the yw control strain, which was used by us to generate the dsk and receptor deletion mutants, showed relatively longer response latencies to the 42 °C probe compared to the other control strains used in this study. This may be attributed to the effect of the genetic background, although, presently, the cause for this difference is unknown.”

      Reviewer #2 (Public Review):

      This is an exceptional study that provides conclusive evidence for the existence of a descending pathway from the brain that inhibits nociceptive behavioral outputs in larvae of Drosophila melanogaster. […] The study raises many interesting questions for future study such as what behavioral contexts might depend on this pathway. Using the CaMPARI approach, the authors do not find that the DSK neurons are activated in response to nociceptive input but instead suggest that these cells may be tonically active in gating nociception. Future studies may find contexts in which the output of the DSK neurons is inhibited to facilitate nociception, or contexts in which the cells are more active to inhibit nociception.

      Reviewer #2 (Recommendations For The Authors): I have no recommendations for the authors as this is a very complete and thoroughly executed study. The writing is crystal clear.

      We thank the reviewer for the kind and very positive evaluation of our manuscript. We are happy to know that our current manuscript was deemed to be clear and convincing by the reviewer.

      Reviewer #3 (Public Review): […] Overall the authors use clean logic to establish a role for DSK and its receptor in regulating nociception. I have made a few suggestions that I believe would strengthen the manuscript as this is an important discovery.

      Major comments:

      1. It's not completely clear why the authors are staining animals with an FLRFa antibody. Can the authors stain WT and DSK KO animals with a DSK antibody? Also, can the authors show in supplemental what antigen the FLRFa antibody was raised against, and what part of that peptide sequence is retained in the DSK sequence? This overall seems like a weakness in the study that could be improved on in some way by using DSK-specific tools.

      We thank the reviewer for this query. We would like to clarify that we first tried the FLRFa antibody to visualize an RFamide-type neuropeptide other than DSK in Drosophila and found that the staining pattern is quite similar to that of anti-DSK, as shown by Nichols et al. [1]. According to the original paper describing the anti-FLRFa antiserum [2] (already cited in the reviewed manuscript), the antigen used to raise it was the Phe-Met-Arg-Phe-NH2 peptide conjugated with succinylated thyroglobulin, and the study experimentally shows that the antibody binds well to peptides containing a Met-Arg-Phe-NH2 or Leu-Arg-Phe-NH2 sequence and has 100% cross-reactivity to FLRFa. As DSK contains a Met-Arg-Phe-NH2 sequence [3], the cross-reaction of this antibody to DSK is consistent with the description of the original study.

      Although we were unable to use an antibody specific to DSK, our staining data with dsk deletion mutants and the expression pattern of DSK-2A-GAL4 corroborate each other (Figure 2 and Figure 2-figure supplement 1). We believe this provides compelling evidence that DSK is specifically expressed in MP1 and Sv neurons, and that DSK-2A-GAL4 is a reasonably effective tool to specifically manipulate DSK-expressing neurons.

      2. What is the phenotype of DSK-Gal4 x UAS-TET animals? They should be hyper-reactive. If it's lethal maybe try an inducible approach.

      We thank the reviewer for this question. Unfortunately, we have not attempted this experiment, although we agree that this would be a nice addition to further strengthen the study if TET worked well in the DSKergic neurons.

      3. Figure 9. This was not totally clear, but I think the authors were evaluating spontaneous (i.e. TRPA1-driven) rolling at 35C. The critical question is "does activating DSK-expressing neurons suppress acute heat nociception" and this hasn't really been addressed. The inclusion of PPK Gal4 + DSK Gal4 in the same animal kind of clouds the overall conclusions the reader can draw. The essential experiment is to express UAS-dTRPA1 in DSK-Gal4 or GORO-Gal4 cells, heat the animals to ~29C, and then test latency to a thermal heat probe (over a range of sub and noxious temperatures). Basically prove the model in Figure 10 showing ectopic activation or inhibition for each major step, then test heat probe responses.

      We thank the reviewer for suggesting ideas for alternative experiments to potentially strengthen our conclusion. Regarding experiments using heat probes, previous studies have demonstrated that (i) Blocking ppk1.9-GAL4-positive C4da neurons almost completely abolishes the larval nociceptive response to local heat stimulations [4]; (ii) Local heat stimuli above 39 °C readily activate C4da neurons and larval nociceptive rolling [5-9]; and (iii) Thermogenetically or optogenetically activating these neurons is sufficient to trigger Goro neurons and larval rolling [4, 10-12]. Thus, it has now been made clear that heat probes induce larval nociceptive rolling via excitation of the C4da pathway, and we believe that our experiments using thermogenetic activation of C4da neurons can be safely interpreted as an alternative to experiments using heat probes. Using heat probes demands a more complicated experimental set-up to be combined with CaMPARI imaging experiments, and this is another reason why we preferred to take the thermogenetic approach.

      We have also considered the experiment using Goro-GAL4 instead of ppk-GAL4. However, if dTRPA1 artificially activates Goro neurons far downstream of the neuronal mechanism by which MP1 activation suppresses Goro neuron activity, the effect of MP1 activation may be bypassed and masked. As we currently do not know the epistasis between dTRPA1 function and the effect of MP1 activation in modulating the activity of Goro neurons, we chose instead to activate C4da neurons by using ppk-GAL4, which likely resulted in more natural activation of Goro neurons than direct dTRPA1-triggered activation.

      4. It would also then be interesting to see how strong the descending inhibition circuit is in the context of UV burn. If this is a real descending circuit, it should presumably be able to override sensitization after injury.

      Indeed, this is an interesting avenue to explore in future studies to understand the type of situation in which the DSKergic descending system functions to control nociception.

      Reviewer #3 (Recommendations For The Authors): Overall this is a good story and the claims are generally supported with experimental evidence. The way to really improve this study would be to use more precise and definitive tools, like specific antibodies, specifically targeted genes, and better temporal control of the descending circuit to prove this is inducible sufficient to suppress acute thermal nociception and this occurs only via a descending pathway, etc. However this would be exponentially more work, and so the authors I guess need to weigh the cost-benefit of definitive proof vs. strong evidence for their claims. Overall I think this study will be the beginning of a new line of inquiry in the field that has the potential to guide our understanding also of mammalian descending pathways, and as such, this study is of value to the community.

      We appreciate the reviewer’s many interesting ideas for experiments that could further reinforce our findings. We agree that some of the suggested experiments would potentially strengthen this work if added. However, as mentioned above, we think that the suggested experiments are either outside the scope of this paper or offer no significant benefit over the experiments already conducted, and hence are not essential to the present study.

      References

      1. Nichols, R. and I.A. Lim, Spatial and temporal immunocytochemical analysis of drosulfakinin (Dsk) gene products in the Drosophila melanogaster central nervous system. Cell Tissue Res, 1996. 283(1): p. 107-16.

      2. Marder, E., et al., Distribution and partial characterization of FMRFamide-like peptides in the stomatogastric nervous systems of the rock crab, Cancer borealis, and the spiny lobster, Panulirus interruptus. J Comp Neurol, 1987. 259(1): p. 150-63.

      3. Nassel, D.R. and M.J. Williams, Cholecystokinin-like peptide (DSK) in Drosophila, not only for satiety signaling. Front Endocrinol, 2014. 5.

      4. Hwang, R.Y., et al., Nociceptive neurons protect Drosophila larvae from parasitoid wasps. Curr Biol, 2007. 17(24): p. 2105-2116.

      5. Tracey, W.D., Jr., et al., painless, a Drosophila gene essential for nociception. Cell, 2003. 113(2): p. 261-73.

      6. Xiang, Y., et al., Light-avoidance-mediating photoreceptors tile the Drosophila larval body wall. Nature, 2010. 468(7326): p. 921-6.

      7. Burgos, A., et al., Nociceptive interneurons control modular motor pathways to promote escape behavior in Drosophila. eLife, 2018. 7.

      8. Honjo, K. and W.D. Tracey, Jr., BMP signaling downstream of the Highwire E3 ligase sensitizes nociceptors. PLoS Genet, 2018. 14(7): p. e1007464.

      9. Im, S.H., et al., Tachykinin acts upstream of autocrine Hedgehog signaling during nociceptive sensitization in Drosophila. eLife, 2015. 4: p. e10735.

      10. Ohyama, T., et al., A multilevel multimodal circuit enhances action selection in Drosophila. Nature, 2015. 520(7549): p. 633-9.

      11. Honjo, K., R.Y. Hwang, and W.D. Tracey, Jr., Optogenetic manipulation of neural circuits and behavior in Drosophila larvae. Nat Protoc, 2012. 7(8): p. 1470-8.

      12. Zhong, L., et al., Thermosensory and non-thermosensory isoforms of Drosophila melanogaster TRPA1 reveal heat sensor domains of a thermoTRP channel. Cell Rep, 2012. 1(1): p. 43-55.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Major points:

      1. Although the role of mitofusin on mitochondrial morphology has been established by others and comprehensively assessed in the present study, the authors should determine the functional outcome from the genetic manipulations on Mfn2 and Mfn1. As observed by increased glucose uptake, one could hypothesize an impairment in mitochondrial oxidative phosphorylation, leading the cells to rely uniquely or heavily on glycolysis as a fuel. Also, as mentioned by the authors in the discussion, ROS play a fundamental role in adipogenesis, and, therefore, mitochondrial ROS emission and/or cellular redox balance should also be assessed. I believe these two experiments will add insightful information to the current dataset.

      Thank you for these suggestions. Whilst we agree with the general premise of this point, unfortunately quantifying oxidative phosphorylation and ROS production with sufficient precision to detect relatively subtle changes remains very challenging. We have attempted these experiments but they require considerable optimisation (particularly using adipocytes). Preliminary studies done in MEFs (Cover letter figure 1) suggest that under some stimuli there may be higher ROS in Mfn1 and Mfn2 knock-out lines. However, these preliminary data would require further optimisation and repetition in adipocytes, which is more challenging.

      For now, we have amended the Discussion to specify that these experiments are of particular interest.

      Cover letter figure 1. Levels of reactive oxygen species (ROS) in mouse embryonic fibroblasts measured by flow cytometry for fluorometric dyes CellROX (total cellular ROS), D2-HDCFA (total cellular ROS), and MitoSOX (mitochondrial ROS). Levels are expressed relative to wild-type. MEFs were treated with antimycin A (or media only) for 20 minutes prior to incubation with the ROS dyes, then washed three times before being assayed. AntA, antimycin A; CR, CellROX; M1, Mfn1-/- MEFs; M2, Mfn2-/- MEFs; MS, MitoSOX; WT, wild-type.

      The insulin effect on glucose uptake does not allow one to conclude that there is any impairment in insulin responsivity. The fold change of glucose uptake mediated by insulin was roughly 1.2 in undifferentiated adipocytes, 2.3 in differentiated WT, and 2.5 in Mfn1KO differentiated adipocytes. The absolute increase in glucose uptake could be a compensatory mechanism due to impairment in mitochondrial bioenergetics (see point #1), given that the cells can still respond to insulin. Measuring Akt phosphorylation levels following insulin treatment would help solve this issue.

      As requested, we have assessed the effect of insulin treatment on Ser 473 phosphorylation of Akt2 (Pkb) in wild-type and knock-out MEFs differentiated into adipocytes (Fig 2D). Mfn1-/- MEFs show an increase in Akt phosphorylation relative to the other cell lines. They also have higher expression of insulin receptor and Glut4, consistent with their degree of adipogenic differentiation.

      We agree that impaired mitochondrial bioenergetics could account for the observations in perturbed glucose uptake in the knockout cell lines (especially Mfn2-null) and have therefore amended the text throughout to reflect this.

      Usually, working with clonal transgenic cell lines has the limitation that the cells might behave differently in terms of adipogenic potential over passages. A transient loss of function in the same cells would solve this concern. Also, introducing the patient mutations might be closer to the human situation than working with KO mouse fibroblasts.

      We agree with this potential concern, which is why we conducted knock-down studies in 3T3-L1 cells in addition to the work in knockout MEFs. These data were concordant with what we observed in the KO MEFs, so we don’t think it is necessary to conduct repeat KD experiments in WT MEFs.

      In our previous study we observed that human fibroblasts with biallelic MFN2-R707W mutations did not have any obvious phenotype (https://elifesciences.org/articles/23813). We have separate work studying these mutations in vivo where we provide further characterisation of murine adipocytes harbouring Mfn2-R707W; this work is now published here: https://elifesciences.org/articles/82283

      Minor points:

      1. Although the authors mention in the introduction that the differentiation of adipocytes is followed by an increase in mitochondrial mass, it would be interesting to determine the expression profile of mfn1 and mfn2 during the differentiation process.

      We have found that there is an increase in markers of mitochondrial fusion (Mfn1 & Mfn2) as well as fission (Fis1) throughout differentiation of 3T3-L1s. We have included this data in the manuscript (Supplementary Figure 6A).

      The authors should discuss other models, even though pre-clinical, of mitochondrial dysfunction that results in lipodystrophy but with different metabolic outcomes. To cite a few but not only PMID: 29588285; PMID: 21368114; PMID: 31925461.

      Thank you for this suggestion. We have added a section on this in the introduction.

      It would be interesting to discuss the role of Mfn1/2 in the context of cold-induced adipogenesis, given the prominent role of mitochondrial dynamics, as mentioned by the authors in the reference list, on cold-induced adaptative thermogenesis (Mahdaviane et al. 2017; Boutant et al. 2017).

      Thank you for this suggestion. We have added a section on this in the introduction.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      • In Fig.2A, the authors report "increased lipid accumulation in Mfn1-/- MEFs, but not in Mfn2-/- MEFs". While the overall content might be similar, the pattern of lipid accumulation seems to be different. Indeed, differences in lipid droplet morphology have been observed in Mfn2 KO MEFs upon oleate treatment (McFie et al., 2016). The manuscript would benefit from having quantifications of lipid droplet size and number.

      Thank you for highlighting this. We have quantified lipid droplet size and, consistent with McFie et al., have found increased size in Mfn2 knock-down. This data is now included in Supplementary Figure 6B.

      • Following the above point, McFie et al. also reported that Mfn1/Mfn2 double KO MEFs could differentiate into adipocytes. The authors should discuss these opposing observations.

      The contrasting observation may be due to acquired clonal differences in MEF lines. We have attempted ‘double’ knock-down (of both Mfn1 and Mfn2 concurrently) in 3T3-L1 cells; however, this was essentially lethal and also did not generate any cells capable of differentiation. We have added a section in the Discussion regarding this point.

      • In relation to the effects of Mitofusin deletions on glucose uptake, the authors mention that Mfn2 KO MEFs show impaired insulin-stimulated glucose uptake. The interpretation of the result is not straightforward, as basal glucose uptake is highly increased in Mfn2 KO MEFs. Maybe there is simply a threshold for maximal glucose uptake capacity in MEF-derived adipocytes. In any of these cases, the authors might want to check GLUT1 levels, in line with their suggestion that the increased basal glucose uptake might be related to higher GLUT1. Alternatively, the authors might also want to check elements of the insulin signaling path, in case there are alterations that could explain the phenomenon.

      As mentioned above in response to reviewer 1, we have now performed immunoblots to quantify some components of the insulin signalling cascade (Fig 2D). We observed lower expression of both Glut1 and Glut4 in the Mfn2-/- cells. Mfn2-/- cells did demonstrate some Akt phosphorylation but considerably less than Mfn1-/- cells. These results are now included in the revised manuscript (Figure 2D).

      • In line with the above point, one would have wished that mitochondrial biology was better characterized in the different MEF models. While mitochondrial shape analyses are provided, some information on, at least, mitochondrial respiratory capacity, glucose oxidation and/or fatty acid oxidation rates, would be important. This would allow for a more solid discussion on why Mfn2 KO MEFs display such high basal glucose uptake rates.

      We have responded to a similar suggestion from Reviewer 1, above.

      • In relation to the experiments in MEFs, one should never forget that WT, Mfn1 and Mfn2 KO MEFs derive from different mice. Hence, the phenotypes could be related to trait variabilities in the origin mice themselves, and not just the gene deletion. To control for this aspect, the authors could simply re-introduce Mfn1 or Mfn2 in their respective MEFs and evaluate if their alterations are normalized.

      __Yes one could try this but we have addressed this general concern by replicating the impact of Mfn1/2 KD in 3T3L1 cells so are not inclined to pursue this at this time. __

      • Transcriptomic analyses reveal a decrease in adipogenic gene expression in Mfn2 KO MEFs. However, lipid accumulation is comparable to that of WT MEFs. This could be due to defects in lipolytic capacity, leading to similar lipid accumulation despite lower adipogenic capacity. This could be tested by evaluating the adrenergic response of these cells (e.g.: glycerol release).

      Thank you for this suggestion. We have commented in the Discussion to explain that we have not fully characterised this mechanism.

      • The experiments in 3T3-L1 would also benefit from some gene expression analyses to evaluate if Mfn1 depletion leads to acceleration and/or magnification of the differentiation stages. In relation to this, 3T3-L1 cells could be used to monitor Mfn1 and Mfn2 through differentiation, which in itself would be valuable information.

      We have performed a protein-level time course for markers of mitochondrial fusion (Mfn1 & Mfn2) as well as fission (Fis1) throughout differentiation of 3T3-L1s. We have included this data in the manuscript (Supplementary Figure 6A). We think that changes in protein expression are more relevant than changes in mRNA so have not included gene expression changes at this time.

      CROSS-CONSULTATION COMMENTS The comments from the three independent reviewers are extremely well aligned and agree that improving the following aspects could largely benefit the manuscript:

      • A better metabolic characterisation of the models used
      • Provide measurements in relation to mitochondrial bioenergetics and ROS production – we have attempted this but the data is not very clear in our view and warrants further optimisation, which we are not inclined to pursue currently.
      • Explorations of insulin signaling – done, thank you.
      • Improve the validation and significance of the cellular models used, following the different suggestions from the three reviewers. Most notably, considering the introduction of human Mfn2 mutation forms – we have published a separate manuscript on follow up work on the human MFN2 variant as mentioned above.

      A number of additional comments are raised, all of which are very reasonable and, in my opinion, should not be difficult to address. I think we can all agree that a mechanistic underpinning of the observations would give a larger degree of novelty to the work. Also, none of us would like the revision's quality to be constrained by a tight deadline. I would therefore be totally OK to extend the timeframe for the revision beyond the original 3 months proposed.

      Reviewer #2 (Significance (Required)):

      This is an interesting and well-crafted manuscript. Mice deficient for Mfn2 or Mfn1 have been reported by different laboratories, yet most of them fail to explore the effects on early adipogenesis. The study is limited to cultured cells, but this is well acknowledged by the authors. Given the existence of human mutations in the mitofusin-2 gene that largely alter fat mass distribution, this work provides new clues on how these mutations might impact adipose tissue.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Mann et al. The objective of this study is to determine the extent to which mitofusins (Mfn1 and Mfn2) have redundant functions and assess their contributions to adipocyte differentiation. While a point mutation in the Mfn2 gene has been associated with severe adipose tissue dysfunction and lipodystrophy, no disease phenotypes have been linked to mutations in Mfn1. To address these objectives, the authors sought to characterize how adipocyte differentiation and function are affected in Mfn1, Mfn2 or double knockout adipocytes in two distinct in vitro models. Their findings indicate divergent effects of Mfn1 and Mfn2 on adipocyte differentiation and function despite similar alterations to mitochondrial morphology. Loss of Mfn1 promotes adipogenesis while loss of Mfn2 decreases it. The authors conclude that these findings are indicative of non-redundant functions in Mfn1 and Mfn2.

      Major comments: The observation that Mfn1 KO/KD leads to increased adipogenesis in vitro is somewhat novel and, perhaps, surprising, as the authors say. However, the molecular understanding underlying this phenotype remains unexplored. The analyses performed are mainly descriptive and don't dig deeper into the identification of the molecular mechanism. They do hypothesize that ROS production may be responsible for the observed effects, but that's how far they go.

      The authors do highlight the limitations of this work, but these limitations need careful consideration, for not addressing them seriously limits the novelty of this study, especially not testing these conditions in human cells. The current version of this work seems too preliminary to suggest useful experiments that could strengthen the study, since future analyses could take many different directions.

      Yes, we accept that the findings are rather preliminary, but our initial efforts suggest that precisely elucidating the underlying mechanism/s is likely to be more difficult and complicated than alluded to by the reviewers. We would therefore prefer to share our initial observations so that others can also attempt to clarify the underlying mechanisms.

      A few unanswered questions that the authors might consider are: What is the difference between the Arg707Trp mutation and the KO/KD? Mfn1 and 2 deletions lead to fragmented mitochondria, but opposite adipogenic potentials. What other mitochondrial defects can explain it? Are organelle contact sites disrupted only with Mfn2? How do Mfn1 and 2 KO/KD affect the mitochondrial proteome? What does mitochondrial bioenergetics look like? How is ROS production affected? Is the increased glucose uptake (basal) a compensatory mechanism for mitochondrial dysfunction?

      Thank you for these suggestions. We acknowledge that this work is largely descriptive in nature. These are all questions that should be addressed to improve mechanistic understanding of our observations.

      The difference between p.Arg707Trp and KO/KD is challenging to address because in the non-adipose cell lines studied so far (human and mouse fibroblasts) there has been no evidence of perturbation of the mitochondrial network.

      As discussed above, we have done preliminary studies into ROS production but are unable to provide a complete characterisation at this time. Similarly, we have not been able to perform bioenergetic studies (e.g. Seahorse, Oroboros) that would provide more insight into differences between Mfn1 and Mfn2 KO cell lines.

      CROSS-CONSULTATION COMMENTS I agree the work is interesting, but it is too preliminary and merely descriptive. The experiments suggested will significantly improve the manuscript. However, I don't think they will take only three months to be completed. This work needs a significant amount of work including the study of the mechanism, at least an idea of what the mechanism could be, to be considered novel.

      We accept this limitation and have responded to this general point above.

      Reviewer #3 (Significance (Required)):

      Understanding how mitochondrial dynamics affect adipogenic differentiation is critical to better understand how metabolism impacts cell signaling, cell fate and function.

      Strengths: this work reveals an interesting phenotype for Mfn1 and Mfn2 mutant preadipocytes. Weaknesses: this work is merely descriptive and too preliminary to provide a clear understanding of the observed phenotypes.

      Advance: Although the performed experiments are accurate, well designed, and well controlled, the fact that Mfn1 and 2 have distinct functions and cannot compensate for one another was already clear based on the embryonic lethality of either Mfn1 or Mfn2 KO mice as well as the Mfn2 mutation in humans that leads to a pathological condition. In the current version, this work minimally contributes to advancing the field.

      Audience: an extensively revised version of this work including deeper phenotyping of their models and human cell work would be of interest for scientists studying mitochondrial biology, adipose tissue, metabolic diseases, and human genetic diseases.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We deeply thank the reviewers for the time spent on evaluating our manuscript as well as providing comments and suggestions to improve our study.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      *In this manuscript Lebdy et al. describe a new role of GNL3 in DNA replication. They show that GNL3 controls replication fork stability in response to replication stress and they propose this is due to the regulation of ORC2 and the licensing of origins of replication. Their data suggest that GNL3 regulates the sub nuclear localization of ORC2 to limit the number of licensed origins of replication and to prevent resection of DNA at stalled forks in the presence of replication stress.

      While many of the points of the manuscript are proven and well supported by the results, there are some experiments that could improve the quality and impact of the manuscript. The main issue is that the connection between the role of GNL3 in controlling ORC2, the firing of new origins and the protection of replication forks is not clearly established. At the moment the model relies on mainly correlative data. In order to further substantiate the model, we propose to address some of the following issues:*

      1. *The authors indicate that RPA and RAD51 accumulation at stalled forks is not affected by GNL3 depletion. These data should be included and other proteins should be analysed. In addition, the role of helicases could be explored through the depletion of the main helicases involved in the remodelling of the forks.*

      Response: As asked by the reviewer, we will add the fractionation experiments that show that the level of RAD51 and RPA on chromatin is not affected by GNL3 depletion. So far, the other proteins we checked (RIF1 and BRCA1), both involved in nascent strand protection, did not show clear differences. Therefore, we concluded that depletion of GNL3 does not seem to affect the recruitment of major proteins required for protection of nascent DNA. Of course, we cannot exclude that other proteins may be affected by GNL3 depletion, but testing all the possible candidates would be time consuming with a very low chance of success. In addition, fractionation experiments are possibly not quantitative enough to uncover small differences and may not be that informative. Thus it remains possible that RPA exhaustion may be the cause of resection in absence of GNL3, as suggested by the work conducted in Lukas’ lab (Toledo et al. 2013. https://pubmed.ncbi.nlm.nih.gov/24267891/). To test this hypothesis, we will analyze if resection in absence of GNL3 still occurs in a well-characterized cell line that overexpresses the three RPA subunits, which we obtained from Lukas’ lab.

      To our knowledge not many helicases have been shown to be involved in remodeling of stalled forks. The best example is RECQ1, however we feel that testing RECQ1 involvement in resection upon GNL3 depletion will complicate our story without adding much regarding the mechanism. We hope the reviewer understands our concern.

      • *The proposed model implies that GNL3 depletion leads to increased origin licensing. The authors should address if the primary effect of GNL3 depletion is on origin firing by using CDC7 inhibition in the absence of stress (Rodríguez-Acebes et al., JBC 2018).*

      Response: This is an excellent point raised by the reviewer. To test if the primary effect of GNL3 depletion is on origin firing, we will test if the defect in replication fork progression is dependent on CDC7 using DNA fiber experiments and a CDC7 inhibitor.

      • A way to prove that origin firing mediates the effect of GNL3 on fork protection would be to reduce the number of available origins. The depletion of MCM complexes has been shown to limit the number of back-up origins that are licensed and leads to sensitivity to replication stress (Ibarra et al., PNAS 2008). If GNL3 depletion results in increased number of origins, this effect should be prevented by the partial depletion of MCM complexes. *

      Response: This is also an excellent point. We will test if MCM depletion decreases resection upon GNL3 depletion and treatment with HU. In addition, we will integrate in the manuscript experiments that we have done recently that show that treatment with roscovitine, a CDK inhibitor that impairs origin firing, decreases the level of resection observed in absence of GNL3. We think this experiment strengthens the results obtained with CDC7 inhibitors.

      *Alternatively, the authors could try to modulate the depletion of GNL3. Origin licensing takes place in the G1 phase and thus the depletion of GNL3 by siRNA could affect the following S phase. Using an inducible degron for GNL3 depletion would allow to deplete GNL3 in G1 or S phase specifically. If the model is correct, the removal of GNL3 in S phase should not affect fork protection but removing GNL3 in the previous G2/M phase should reduce the number of licensed origins and lead to impaired fork protection. *

      Response: This is obviously a good point given the fact that GNL3 deletion is not viable (see responses to reviewer 2). We tried to develop an auxin-induced degron of GNL3, but we could not obtain homozygous clones, meaning that our clones always had an untagged GNL3 allele. Since GNL3 is essential, its tagging may impair its function, explaining why we could not obtain homozygous clones. However, we are planning to optimize the design using other degron systems (for instance Halo-tag) to address the role of GNL3 specifically during S-phase. But we think this is beyond the scope of the present study.

      *In addition to the connection GNL3-origin firing-fork protection, it is unclear how the lack of GNL3 in the nucleolus and the change in the sub nuclear localization of ORC2 controls origin firing and resection. The strong interaction observed between GNL3-dB and ORC2, and the subsequent change in ORC2 localization does not explain how origin licensing can be affected. In this sense, the authors could address: *

      1. *Does the depletion of GNL3 and the expression of GNL3-dB affect the formation of the ORC complex, its subnuclear localization or its binding to chromatin? The authors have not explored if the interaction of GNL3 with ORC2 is established in the context of the ORC complex. An IF showing NOP1 with PLA data from GNL3-dB and ORC2 is needed to analyse how the expression of increasing amounts of GNL3-dB affects ORC2.*

      Response: We tested if GNL3 depletion impacts ORC2 and ORC1 recruitment on chromatin, but we could not observe significant differences. No clear differences were observed upon GNL3-dB expression either. One reason for this may be the excess of ORC complex on the chromatin; in addition, chromatin fractionation is likely not sensitive enough to observe small differences. We think that quantitative ChIP-seq of ORC2 or other ORC subunits upon GNL3 depletion is required to visualize such differences, but this is beyond the scope of the study and will be the next step of this project. We also tried to look at subnuclear localization of ORC2 using immunofluorescence, but the signal was not specific enough to observe differences. We think that the increased interaction (PLA) of ORC2 with GNL3-dB (Figure 5E) demonstrates a change in ORC2 subnuclear localization. To confirm this, we will perform the excellent experiment proposed by the reviewer to test if increasing levels of GNL3-dB affect its interaction with ORC2 using PLA.

      We do not think that the interaction between ORC2 and GNL3 is established in the context of the ORC complex, since only ORC2 (and not the other ORC subunits) was significantly enriched in the GNL3 Bio-ID experiment. The full list of proteins from the Bio-ID experiment (Figure 4A) will be provided in the revised version. Therefore, we think that either GNL3 regulates ORC2 subnuclear localization, which in turn impacts the ORC complex, or GNL3 regulates ORC2-specific functions. A growing body of evidence shows that ORC2 plays roles possibly independently of the ORC complex (see Huang et al. 2016 https://doi.org/10.1016/j.celrep.2016.02.091 or Richards et al. 2022 https://doi.org/10.1016/j.celrep.2022.111590 for instance). Future work should uncover how these ORC2 functions may regulate origin activity.

      *In order to confirm if the mislocalization of ORC2 by the expression of GNL3-dB increases origin firing and mediates the effects on fork protection, the authors could check DNA resection levels inhibiting CDC7 in high GNL3-dB conditions. Also, the levels of MCM2, phospho-MCM2, CDC45, have not been analysed upon expression of GNL3-dB.*

      Response: This is a good point; we will test if the resection observed upon expression of GNL3-dB is dependent on origin firing using a CDC7 inhibitor. We have not measured the level of the cited proteins but instead we performed DNA combing to measure Global Instant Fork Density. We now show that expression of GNL3-WT suppresses the increased origin firing observed upon GNL3 depletion; in contrast, expression of GNL3-dB does not suppress it. This important result indicates that origin firing is increased upon GNL3-dB expression, providing a link between aberrant localization and increased firing. These data will be part of the revised version of the manuscript.

      *The data in the paper suggest that GNL3 may affect the role of ORC2 in centromeres. Since depletion of GNL3 leads to increased levels of gH2AX, it would be interesting to address if this damage is due to incomplete replication in centromeres by analysing the co-localization of gH2AX and centromeric markers both in unstressed conditions and upon the induction of replication stress.*

      Response: This is indeed an interesting comment. However, since it has been previously shown that gH2AX signal is rather strong upon GNL3 depletion (see Lin et al. 2013. https://pubmed.ncbi.nlm.nih.gov/24610951/ ; Meng et al. 2013. https://pubmed.ncbi.nlm.nih.gov/23798389/), we do not think that co-localization experiments with CENP-A, for instance, will be informative given the high number of gH2AX foci.

      *Minor points: *

      1. *In the initial esiRNA screen the basal levels of gH2AX should also be shown.*

      Response: Our negative control is the transfection of an esiRNA that targets EGFP (a gene that is not expressed in the tested cell line). This esiRNA is ranked at the end of the list and therefore constitutes the basal level of gH2AX signal. In any case it is well-established that GNL3 depletion increases gH2AX signal (see Lin et al. 2013. https://pubmed.ncbi.nlm.nih.gov/24610951/ ; Meng et al. 2013. https://pubmed.ncbi.nlm.nih.gov/23798389/).

      *Figure EV1B: I think the rank needs another RS mark to see better the effect of each esiRNA on DNA lesions (high variability in all the conditions showed). *

      Response: We understand this issue, but we cannot repeat this set of experiments for technical reasons (reagents and cost mainly). In any case, we believe that the role of GNL3 in response to replication stress is extensively addressed by other experiments in this manuscript.

      *Figure 1C and Figure EV1D/E: the quantification of the pCHK1/CHK1 levels could be included to show that there are no changes in phosphorylation upon GNL3 depletion. *

      Response: This is a good point; we will include the quantification in the revised version.

      *In the first section of the results, at the end Figure 4B is incorrectly called for. *

      Response: Thanks for the comment, we will modify accordingly.

      The levels of GNL3 expression in 293 cells should be already included in section GNL3 interacts with ORC2.

      Response: We will add a figure that shows the level of expression in 293 cells.

      The full MS data needs to be included for both GNL3 and ORC2.

      Response: This will be integrated in the revised version.

      Figure 4B should be improved, since there is a faint band in the IgG mouse control.

      Response: It is true that the figure is not perfect, but we believe that our Bio-ID and PLA experiments fully demonstrate the interaction between GNL3 and ORC2.

      Reviewer #1 (Significance (Required)):

      *The work is nicely written, the figures are well presented and the experiments have the necessary controls. It provides relevant information to understand how replication stress is controlled and linked to replication fork protection through origin firing. These results are relevant to the field, linking GNL3 to origin firing and with potential to help understand the role of GNL3 in cancer. They provide new information and can give rise to new studies in the future. Many of the conclusions of the manuscript are well supported. Additional support for some of the main claims would strengthen the results and also increase the impact providing a bigger conceptual advance by performing some of the suggested experiments. *

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      *This manuscript explores the role of GNL3/nucleostemin in DNA replication and specifically in the response of DNA replication to DNA damage. GNL3 is a predominantly nucleolar protein, previously characterised as a GTP-binding protein and shown to be necessary for effective recruitment of the RAD51 recombinase to DNA breaks. The entry point for this report is a mini screen, based on proteins identified previously by the authors to associate with replication forks by iPOND, for factors that increase gamma-H2Ax (an indicator of DNA damage) after treatment with the Top1 inhibitor camptothecin (CPT). In this mini-screen GNL3 emerged as the top hit.

      The authors put forward the hypothesis that GNL3 is able to sequester the replication licensing factor ORC2 in the nucleolus and that failure of this mechanism leads to excessive origin firing and DNA resection following CPT treatment.*

      • The model put forward is interesting, but currently rather confusing. However, for the reasons upon which I expand below, I do not believe that the data provide a compelling mechanistic explanation for the effects that are reported and I am left not being certain about some of the links that are made between the various parts of the study, even though individual observations appear to be of good quality. *

      *Specific points: *

      *The knockdown of GNL3 is very incomplete. In this regard, the complementation experiments are welcome and important. However, is it an essential protein? Can it be simply deleted with CRISPR-Cas9?*

      Response: There are obviously variations between experiments but overall, the depletion of GNL3 using siRNA seems good in our opinion. Deletion of GNL3/nucleostemin leads to embryonic lethality in mouse (Beekman et al. 2006. https://pubmed.ncbi.nlm.nih.gov/17000755/ ; Zhu et al. 2006. https://pubmed.ncbi.nlm.nih.gov/17000763/). ES cells deleted for GNL3 can be obtained but do not proliferate, probably because of their inability to enter S-phase (Beekman et al. 2006. https://pubmed.ncbi.nlm.nih.gov/17000755/). We wanted to test if this was the case in our cellular model and we tried to delete it using CRISPR-Cas9. We managed to obtain a few clones deleted for GNL3, but they grew very poorly, which prevented us from performing experiments. To bypass this, and as suggested by reviewer 1, we tried to make an auxin-induced degron of GNL3. Unfortunately, we did not manage to obtain homozygous clones, only heterozygous ones. One possibility could be that the tagging induced a partial loss of function of GNL3, and since GNL3 is essential, it may explain why we did not obtain homozygous clones. We may also want to use alternative degron systems such as Halo-Tag, but we believe this is out of the scope of the study.

      *Global instant fork density is not quite the same as actually measuring origin firing. Ideally, it would be good to see some more direct evidence of additional origin firing e.g. by EdU-seq (Macheret & Halazonetis Nature 2018) but this would be quite a significant additional undertaking. However, given the authors have performed DNA combing with DNA counterstain, they should be able to provide accurate measurements of origin density and inter-origin distance.*

      Response: As indicated by the reviewer, EdU-seq would need a lot of development since we are not using this approach in our team. In addition, this method can detect replication origins only if performed in the beginning of S-phase, meaning that only the early firing origins will be detected and not the others. GIFD measurement is actually directly linked with origin firing since it counts the forks engaged in duplicating the genome. The measurements of IODs have at least two main limitations: (1) there is a bias for short IODs due to the length of analyzed fibers and (2) it focuses only on origins within a cluster, not globally. Overall, we believe that GIFD is the method of choice to measure origin firing. In addition, these experiments have been done by the lab of Etienne Schwob (see acknowledgments), a leader in the field.

      *'Replication stress' is induced with CPT. This term is frequently used to describe events that lead to helicase-polymerase uncoupling (e.g. O'Connor Mol Cell 2015) but that is not the case with CPT, which causes fork collapse and breaks. Are similar effects seen with e.g. UV or cisplatin? Additionally, a clear statement of the authors' definition of replication stress would be welcome.*

      Response: We will better define the term ‘replication stress’ in the revised version of the manuscript. In our case, it should be understood as any impediment that leads to replication fork stalling and is measurable by DNA combing or Chk1 phosphorylation. We have not performed experiments using UV and cisplatin.

      *It is really not clear how the authors explain the link between potential changes in origin firing and resection. i.e. What is the relationship between global origin firing and resection at a particular fork, presumably broken by encounter with a CPT-arrested TOP1 complex. What is the link mechanistically? This link needs elaborating experimentally or clearly explaining based on prior literature. *

      Response: Most of our resection experiments were performed with hydroxyurea, but it is true that we saw resection in absence of GNL3 in response to CPT. Treatment with HU or CPT reduces fork speed and activates additional replication origins (see Ge et al. 2007 https://pubmed.ncbi.nlm.nih.gov/18079179/ for HU or Hayakawa et al. 2021 https://pubmed.ncbi.nlm.nih.gov/34818230/ for CPT). When GNL3 is depleted, more forks are active, meaning more targets for HU and CPT. In addition, it is likely that the firing of additional origins in response to HU and CPT is stronger in absence of GNL3. Because of this we believe that factors required to protect stalled forks may be exhausted, explaining why resection is observed. This is inspired by the work of Lukas’ lab (Toledo et al. 2013 https://pubmed.ncbi.nlm.nih.gov/24267891/) and is described in Figure 6. One obvious candidate that may be exhausted is RPA; to test this we will check if resection upon GNL3 depletion and treatment with HU still occurs in cell lines provided by Lukas’ lab that overexpress the RPA complex (described in Toledo et al.). We will explain our model more carefully in the revised version.

      *Related to this, I remain unconvinced that the experiments in Figure 3 show that the effects of ATRi and Wee1i on origin firing and on resection are contingent on each other. I do not believe that the authors have adequately supported the statement (end of pg 9) 'We conclude that the enhanced resection observed upon GNL3 depletion is a consequence of increased origin firing.' The link between origin firing and resection really needs further substantiation and / or explanation.*

      Response: Our rationale was the following. Inhibition of ATR or WEE1 increases replication origin firing, a situation that may be like the one observed for GNL3 depletion. Toledo et al. show that inhibition of WEE1 or ATR induces exhaustion of RPA. This exhaustion is reduced in the presence of a CDC7 inhibitor, roscovitine (a CDK inhibitor that inhibits origin firing) or depletion of CDC45, indicating that it is due to excessive origin activation. In our case we show that the resection observed upon WEE1 or ATR inhibition is reduced upon treatment with a CDC7 inhibitor. We conclude that excessive replication origin firing induces DNA resection. Since we observed the same thing upon GNL3 depletion (but not upon BRCA1 depletion) we conclude that excessive origin firing favors DNA resection, likely through exhaustion of RPA. As indicated above, we will test this hypothesis by overexpressing RPA. In addition, we now show that treatment with roscovitine decreases resection upon GNL3 depletion (this will be part of the revised manuscript), an experiment that we believe confirms that excessive replication origin firing is responsible for resection upon GNL3 depletion. As suggested by reviewer 1, we will also test if depletion of MCM reduces the resection observed in absence of GNL3.

      *It is not clear whether the binding of ORC2 to GNL3 also sequesters other components of the origin recognition complex? Does loss of the ability of GNL3 to bind ORC2 actually lead to more ORC bound to chromatin? How does GNL3 contribute to regulation of origin firing under normal conditions? Is it a quantitatively significant sink for ORC2 and what regulates ORC2 release? *

      Response: The results of GNL3 Bio-ID were extremely clear: we could not significantly detect any ORC subunits other than ORC2 (these data were not present in the manuscript but will be added in the revised version), therefore we believe that GNL3 may sequester/regulate only ORC2. We tried to see if GNL3 depletion was changing the binding of ORC1 and ORC2 to the chromatin, but we could not see any difference; one possibility may be that small differences are not detectable by chromatin fractionation. We believe that ChIP-seq of ORC2 or other ORC subunits in absence of GNL3 is required, but this is out of the scope of the study. GNL3 may regulate the stability of the ORC complex on chromatin via ORC2, but GNL3 may also regulate other ORC2 functions, at centromeres for instance. It has been shown indeed that ORC2 plays roles possibly independently of the ORC complex (see Huang et al. 2016 https://doi.org/10.1016/j.celrep.2016.02.091 or Richards et al. 2022 https://doi.org/10.1016/j.celrep.2022.111590 for instance). How exactly this affects origin firing is still unclear. This is something we are planning to address in the future.

      We do not know if it is a quantitatively significant sink for ORC2 or how this is regulated; however, we believe that the ability of GNL3 to accumulate in the nucleolus may sequester ORC2. Consistent with this, we show that a mutant of GNL3 (GNL3-dB) that diffuses in the nucleoplasm interacts more with ORC2 in the nucleoplasm, suggesting a release. As suggested by reviewer 1, we will now test if the interaction between ORC2 and GNL3-dB is dependent on the level of expression of GNL3-dB. In addition, we now show that expression of GNL3-dB increases replication origin firing like GNL3 depletion (data that will be added in the revised version), suggesting that regulation of ORC2 is the major cause of increased firing upon GNL3 depletion.

      *Minor points: *

      *All blots should include size markers *

      __Response: __We will add them

      *Some use of language is not sufficiently precise. For instance:*

      * - the meaning of 'DNA lesions' at the end of the first paragraph of the introduction needs to be more explicit. *

      * - the approach to measurement of these 'lesions' (monitoring gamma-H2Ax) needs to be spelled out explicitly, e.g. line 4 of the last paragraph of the introduction. *

      * - 'we observed that the interaction between GNL3-dB and ORC2 was stronger' ... I do not see how number of foci indicates necessarily the strength of an interaction. *

      * - in many places throughout 'replication origins firing' should be 'replication origin firing' (or 'firing of replication origins'). *

      __Response: __We will correct these language mistakes.

      __Reviewer #2 (Significance (Required)): __

      The model put forward here has the potential to shed light on an important facet of the cellular response to DNA damage, namely the control of origin firing in response to replication stress that will certainly be of interest to the DNA repair / replication community and possibly more widely. The roles of GNL3 are poorly understood and this study could improve this state of affairs. However, the gaps in the mechanism outlined above and somewhat confusing conclusions do limit the ability of the paper to achieve this at present.

      __Reviewer #3 (Evidence, reproducibility and clarity (Required)): __

      *In this study, Lebdy et al propose a new mechanism to regulate the resection of nascent DNA at stalled replication forks. The central element of this mechanism is nucleolar protein GNL3, whose downregulation with siRNA stimulates DNA resection in the presence of stress induced by HU (Figure 1). Resection depends on the activity of nucleases MRE11 and CtIP, and can be rescued by reintroducing exogenous GNL3 protein in the cells (Figure 1G). GNL3 downregulation decreases fork speed and increases origin activity, without any strong effect on replication timing (Figure 2). Inhibition of Dbf4-dependent kinase CDC7 (a known origin-activating factor) also restricts fork resection (Figure 3). GNL3 interacts with ORC2, one of the subunits of the origin recognition complex, preferentially in nucleolar structures (Figure 4). A mutant version of GNL3 (GNL3-dB) that is not sufficiently retained in the nucleoli fails to prevent fork resection as the WT protein (Figure 5). In the final model, the authors propose that GNL3 controls the levels of origin activity (and indirectly, stalled fork resection) by maintaining a fraction of ORC2 in the nucleoli (Figure 6). *

      This model is interesting and provocative, but it also relies on a significant degree of speculation. The authors are not trying to "oversell" their observations, because the Discussion section entertains different interpretations and possibilities, and the model itself contains several interrogative statements (e.g. "ORC2-dependent?"; "exhaustion of factors?").

      While the article is honest about its own limitations, the major concern remains about its highly speculative nature. I have some questions and suggestions for the authors to consider that could contribute to test (and hopefully support) their model.

      • *If GNL3 downregulation induces an excess of licensed origins and mild replicative stress resulting in some G2/M accumulation (Figure 2), what is the consequence of longer-term GNL3 ablation? Do the cells adapt, or do they accumulate signs of chromosomal instability? (micronuclei, chromosome breaks and fusions, etc.)*

      __Response: __This is an important point, also raised by Reviewer 2: deletion of GNL3 leads to embryonic lethality in mouse, and ES cells deleted for GNL3 do not proliferate and fail to enter S-phase. Consistent with this, the clones deleted for GNL3 that we obtained using CRISPR-Cas9 grow poorly, preventing us from doing experiments. To our knowledge, micronuclei and chromosome breaks have never been analyzed upon transient depletion of GNL3 using siRNA. However, it is well established that depletion of GNL3 induces phosphorylation of H2A.X and the formation of ATR, RPA32 and 53BP1 foci due to S-phase arrest (Lin et al. 2013. https://pubmed.ncbi.nlm.nih.gov/24610951/ ; Meng et al. 2013. https://pubmed.ncbi.nlm.nih.gov/23798389/). DNA lesions have also been visualized by comet assay (Lin et al. 2019. https://pubmed.ncbi.nlm.nih.gov/30692636/). Consistent with this, we observed a weak increase in DNA double-strand breaks upon GNL3 depletion using pulsed-field gel electrophoresis, as well as mitotic DNA synthesis (MiDAS). We can integrate these data in the revised version of the manuscript if required. To sum up, it is clear that GNL3 depletion induces problems during S-phase that may lead to genomic rearrangements.

      • *The model relies on the link between origin activity and stalled fork resection that is almost exclusively based on the results obtained with CDC7i (Figure 3). But CDC7 has other targets besides pre-RC components at the origins, such as Exo1 (from the Weinreich lab, cited in the study), MERIT40 and PDS5B (from the Jallepalli lab, also cited). The effect of CDC7i could be exerted through these factors, which are linked to fork stability and DNA resection. The loss of BRCA1 (Figure 3F) could somehow entail the loss of control over these factors. Could the authors check the possible participation of these proteins?*

      __Response: __It is true that CDC7 has other targets than pre-RC components. We therefore decided to inhibit origin firing using roscovitine, a broad CDK inhibitor, a strategy previously used in the Lukas lab (Toledo et al. 2013. https://pubmed.ncbi.nlm.nih.gov/24267891/). We observed that treatment with roscovitine significantly decreased the resection observed upon GNL3 depletion, confirming the link between origin activity and stalled fork resection. This will be integrated in the revised version of the manuscript. As asked by Reviewer 1, we will also perform depletion of MCM to strengthen our model.

      Exo1 is indeed a target of CDC7, as shown by the Weinreich lab (Sasi et al. 2018. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6111017/); however, the authors do not formally demonstrate that Exo1 phosphorylation is required for its activity. We observed that depletion of Exo1 significantly reduced resection upon GNL3 depletion (data that will be added in the revised version), indicating that the effect of the CDC7 inhibitor could be exerted via the control of Exo1. This is why our BRCA1 control is important: it is well established that Exo1 is required for nascent strand degradation upon BRCA1 depletion (Lemaçon et al. 2017. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5643552/), but CDC7 inhibition has no effect on resection upon BRCA1 depletion, suggesting that resection by Exo1 may not be regulated by CDC7 in our context.

      As stated by the reviewer, MERIT40 and PDS5B are targets of DDK kinases (Jones et al. 2021 https://doi-org.insb.bib.cnrs.fr/10.1016/j.molcel.2021.01.004) and seem to be required for protection of nascent DNA and in response to HU. However, little is known about the role(s) of these proteins, and we think that adding them will complicate the message. We hope the reviewer understands this.

      The model also relies on the fact that GNL3-dB mutant (not retained in the nucleoli) is not sufficient to counteract fork resection induced by HU (Figure 5G). The authors should test directly whether GNL3-dB induces extra origin activation, using their available DNA fibers-based technique.

      __Response: __This is an excellent point. We now have GIFD (Global Instant Fork Density) data showing that the number of active forks is increased upon GNL3-dB expression. This demonstrates that when GNL3 is no longer retained in the nucleolus, more origins are active. These data will be integrated in the revised version of the manuscript, and we believe they further support the regulation of ORC2 by GNL3.

      *Finally, the model implies an exquisite regulation of the amount of ORC2 protein, which could influence the number of active origins and the extent of fork resection in case of stress. In this scenario, one could predict that ORC2 ectopic expression would have similar, or even stronger effects, than GNL3 downregulation. Is this the case? *

      __Response: __We completely agree with this prediction. However, we are concerned that overexpression of ORC2 may have indirect effects due to the many described functions of ORC2, so the data may be difficult to interpret. We will try the experiment anyway.

      *Even if the connection between origins and fork resection could be firmly established, the molecular link between them remains enigmatic. The authors hint (as "data not shown") that it is neither mediated by RPA nor RAD51. Unfortunately, the reader is left without a clear hypothesis about this point. *

      __Response: __We will add data showing that RPA and RAD51 recruitment is not affected by GNL3 depletion. However, the chromatin fractionation approach may not be sensitive enough to detect small differences. Based on the work of the Lukas lab (Toledo et al. 2013 https://pubmed.ncbi.nlm.nih.gov/24267891/), one possible mechanism may be exhaustion of the pool of RPA. This may link the excessive activation of origins observed upon GNL3 depletion to resection. To test this, we will check whether resection upon GNL3 depletion and treatment with HU still occurs in cell lines that overexpress the RPA complex (described in Toledo et al.), which we obtained from the Lukas lab.

      **Referees cross-commenting**

      In addition to each reviewer's more specific comments, the three reviews share a main criticism: the lack of mechanistic information about the proposed link between origin activity and resection of nascent DNA at stalled forks.

      __Reviewer #3 (Significance (Required)): __

      In principle, this study would appeal to the readership interested in fundamental mechanisms of DNA replication and the cellular responses to replicative stress.

      For the reasons outlined in the previous section, I believe that in its current version the study is not strong enough to provide a new paradigm about origins being regulated by partial ORC2 sequestering at nucleoli. The other potentially interesting advance is the connection between frequency of origin activity and the extent of nascent DNA resection at stalled forks, but the molecular link between both remains unknown.


    1. All art is quite useless.

      From ELIMIMIAN 626: By saying "All art is quite useless," Wilde is not only using paradox or a rhetorical style, but enforces the concept that contraries do not imply a negation. By claiming "Vice and Virtue" form suitable "material for the artist," Wilde encourages us, as readers, to think that the realm of artistic discourse is limitless. Art enables us to view life clearly, but art, by itself, is incapable of any permanent rendering. Similarly, when we interpret art, we interpret life, since life's colors are enshrined in it. Thus, Wilde would argue Art and life are not contingent but may complement each other.

      From WINWAR 171: This line struck right in the face of Victorian materialism and was coined indiscriminately; its defenders interpreted it as an exaltation of art, but its opposers claimed it was dangerous to established values.

      From DUGGAN 61: In this one sentence, Wilde encapsulates the complete principles of the Aesthetic Movement popular in Victorian England. That is to say, real art takes no part in molding the social or moral identities of society, nor should it. Art should be beautiful and please its observer, but to imply further-reaching influence would be a mistake.

    1. Reviewer #2 (Public Review):

      The authors had two aims in this study. First, to develop a tool that lets them quantify the synaptic strength and sign of upstream neurons in a large network of cultured neurons. Second, they aimed at disentangling the contributions of excitatory and inhibitory inputs to spike generation.

      For the quantification of synaptic currents, their method allows them to quantify excitatory and inhibitory currents simultaneously, as the sign of the current is determined by the neuron identity in the high-density extracellular recording. They further made sure that their method works for nonstationary firing rates, and they did a simulation to characterize what kind of connections their analysis does not capture. They did not include the possibility of (dendritic) nonlinearities, gap junctions, or any kind of homeostatic processes. I see a clear weakness in the way that they quantify their goodness of fit, as they only report the explained variance, while their data are quite nonstationary. It could help to partition the explained variance into frequency bands, to at least separate the effects of a bias in baseline, the (around 100 Hz) band of synaptic frequencies, and whatever high-frequency observation noise there may be. Another weak point is their explanation of unexplained variance by potential activation of extrasynaptic receptors without providing evidence. Given that these cultures are not a tissue and diffusion should be really high, this idea could easily be tested by adding a tiny amount of glutamate to the culture media.
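      The band-wise partition of explained variance suggested above could be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the band edges, and the toy traces are hypothetical, and a naive DFT is used so that the example needs nothing beyond the Python standard library.

```python
import cmath
import math


def dft_power(x):
    """One-sided power spectrum from a naive DFT (adequate for short demo traces)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n // 2 + 1)
    ]


def band_explained_variance(observed, predicted, fs, bands):
    """Explained variance per band: 1 - residual power / observed power.

    The DC bin (k = 0) is skipped, so a constant baseline offset does not
    enter any band and can be reported separately.
    """
    n = len(observed)
    residual = [o - p for o, p in zip(observed, predicted)]
    p_obs, p_res = dft_power(observed), dft_power(residual)
    result = {}
    for name, (lo, hi) in bands.items():
        bins = [k for k in range(1, len(p_obs)) if lo <= k * fs / n < hi]
        denom = sum(p_obs[k] for k in bins)
        result[name] = 1.0 - sum(p_res[k] for k in bins) / denom if denom > 0 else float("nan")
    return result


# Toy data (hypothetical): a 2 Hz drift plus a 100 Hz component;
# the "fit" captures only the 100 Hz component.
fs, n = 1000.0, 500
t = [i / fs for i in range(n)]
fast = [math.sin(2 * math.pi * 100 * ti) for ti in t]
observed = [0.5 * math.sin(2 * math.pi * 2 * ti) + f for ti, f in zip(t, fast)]
ev = band_explained_variance(
    observed, fast, fs, {"slow": (0.5, 20.0), "synaptic": (50.0, 200.0)}
)
```

      In this toy case the fit explains essentially all of the variance in the "synaptic" band and none in the "slow" band, which a single pooled explained-variance number would hide.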

      For the contributions of excitation and inhibition to neuronal spiking, the authors found a clear reduction of inhibitory inputs and increase of excitation associated with spiking when averaging across many spikes. Interestingly, the inhibition shows a reversal right after a spike, and the timescale is faster during higher network activity. While these findings are great and provide further support that their method is working, they stop at this exciting point where I would really have liked to see more detail. A concern, of course, is that the network bursts in cultures are quite stereotypical, and that might cause averages across many bursts to show strange behaviour. So what I am missing here is a reference or baseline or null hypothesis. How does it look when using inputs from neurons that are not connected? And then, it looks like the E/(E+I) curve has lots of peaks of similar amplitude (that could be quantified...), so why does the neuron spike where it does? If I compare to the peak (of similar amplitude) right before or right after (as a reference), are there some systematic changes? Is maybe the inhibition merely defining some general scaffold where spikes can happen and the excitation causes the spike as spiking is more irregular?

      The averaged trace reveals a different timescale for high and low activity states. But does that reflect a superposition of EPSCs in a single trial or rather a different jittering of a single EPSC across trials? For answering this question, it would be good to know the variance (and whether/how much it changes over time). Maybe not all spikes are preceded by a decrease in inhibition. Could you quantify (correlate, scatterplot?) how exactly excitation and inhibition contributions relate for single postsynaptic spikes (or single postsynaptic non-spikes)? After all, this would be the kind of detail that requires the large amount of data that this study provides.

      For the first part, the authors achieved their goal in developing a tool to study synaptic inputs driving subthreshold activity at the soma, and characterizing such connections. For the second part, they found an effect of EPSCs on firing, but they barely did any quantification of its relevance due to the lack of a reference.

      With the availability of Neuropixels probes, there is certainly use for their tool in in vivo applications, and their statistical analysis provides a reference for future studies.

      The relevance of excitatory and inhibitory currents on spiking remains to be seen in an updated version of the manuscript.

      I feel that specifically Figures 6 and 7 lack relevant detail and a consistent representation that would allow the reader to establish links between the different panels. The analysis shows very detailed examples, but then jumps into analyses that show population averages over averaged responses, losing or ignoring the variability across trials. In addition, while their results themselves pass a statistical test, it is crucial to establish some measure of how relevant these results are. For that, I would really want to know how much spiking would actually be restricted by the constraints that would be posed by these results, i.e. would this be reflected in tiny changes in spiking probabilities, or are there times when spiking probabilities are necessarily high, or do we see times when we would almost certainly get a spike, but neurons can fire during other times as well.

      I would agree that a detailed, quantitative analysis of this question is beyond the scope of this paper, but a qualitative analysis is feasible and should be done. In the following, I am detailing what I would consider necessary to be done about these two Figures:

      Figure 6C is indeed great, though I don't see why the authors would characterize synchrony as low. When comparing with Figure 4B, I'd think that some of these values are quite high. And it wouldn't help me to imagine error bars in panel 6D.

      Figure 6B is useful, but could be done better: The autocovariance of a shot-noise process is a convolution of the autocovariance of the underlying point process and the autocovariance of the EPSC kernel. So one would want to separate those to obtain a better temporal resolution. But a shot-noise process has well-defined peaks, and the times of these local maxima can be estimated quite precisely. Now if I did a peak-triggered average instead of the full convolution, I would do half of the deconvolution and obtain a temporally asymmetric curve of what is expected to happen around an EPSC. Importantly, one could directly see expected excitation after inhibition or expected inhibition after excitation, and this visualization could be much better and more intuitively compared to panel 6E.

      Panel D needs some variability estimate (i.e. standard deviation or interquartile range or even a probability density) for those traces.

      Figure 6E: Please use more visible colors. A sensitivity analysis to see traces for 2E/(2E+I) and E/(E+2I) would be great.

      Figure 6F: with an updated panel B, we should be able to have a slope for average inhibition after excitation for each of these cells. A second panel / third column showing those slopes would be of interest. It would serve as a reference for what could be expected from E-I interactions alone.

      Figure 6G: Could the authors provide an interquartile range here?
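      To make the peak-triggered average proposed for Figure 6B concrete, here is a minimal sketch. It assumes two equally sampled traces (one used to find local maxima, one to be averaged around them); the function names, the threshold, and the toy traces are hypothetical and are not taken from the manuscript. It also returns a per-lag standard deviation, in the spirit of the variability estimates requested for the other panels.

```python
import statistics


def local_maxima(x, threshold):
    """Indices of local maxima of `x` that exceed `threshold`."""
    return [
        i for i in range(1, len(x) - 1)
        if x[i] > threshold and x[i] > x[i - 1] and x[i] >= x[i + 1]
    ]


def peak_triggered_average(trigger, target, threshold, pre, post):
    """Mean and standard deviation of `target` in [-pre, +post] samples around peaks of `trigger`."""
    peaks = [
        p for p in local_maxima(trigger, threshold)
        if p - pre >= 0 and p + post < len(target)
    ]
    mean, sd = [], []
    for lag in range(-pre, post + 1):
        values = [target[p + lag] for p in peaks]
        mean.append(statistics.fmean(values) if values else float("nan"))
        sd.append(statistics.pstdev(values) if values else float("nan"))
    return mean, sd, len(peaks)


# Toy traces (hypothetical): two excitatory peaks, each followed
# two samples later by an inhibitory event.
excitation = [0.0] * 40
inhibition = [0.0] * 40
for peak in (10, 30):
    excitation[peak] = 1.0
    inhibition[peak + 2] = 1.0
mean, sd, n_peaks = peak_triggered_average(excitation, inhibition, 0.5, pre=3, post=5)
```

      Because the average is aligned to the peak rather than convolved symmetrically, the resulting curve is asymmetric in time: here the inhibitory bump appears only at positive lags, which is the kind of "inhibition after excitation" readout described above.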

      Figure 7A: it may be hard to squeeze in variability estimates here, but the information on whether and how much variance might be explained is essential. Maybe add another panel to provide a variability estimate? The variability estimates in panels 7B and 7D only reflect variability across connections, and it would be useful to add panels for the timecourses of the variability of g (or E/(E+I), respectively).

      As a suggestion for further analysis, though I am well aware that this is likely beyond the scope of this manuscript, I'd suggest the following analysis:

      I would split the data into the high and low activity states. Then I would compute the average of E/(E+I) values for spikes. Assuming that spikes tend to happen for local maxima of E/(E+I), I would find local maxima for periods without spike such that their average is equal to the value for actual spikes. Finally, I would test for a systematic difference in either excitation or inhibition.

      If there is no difference, you can make the claim that synaptic input does not guarantee a spike, and compare to a global average of E/(E+I).

    1. Author Response

      Reviewer #1 (Public Review):

      First, we thank the reviewer for his instructive remarks. In the following we address the queries of Reviewer 1.

      1.1) At several points, the authors make claims that I believe extend beyond the data presented here. For instance, in the Abstract (line 27), the authors state "the development of adult songs requires restructuring the entire HVC, including most HVC cell types, rather than altering only neuronal subpopulations or cellular components." The gene ontology analyses performed do suggest that there is a progression from cellular transcriptional changes to organ-level changes, however caution should be taken in claiming that "most HVC cell types" exhibit transcriptional changes. In fact, according to Fig. 3D most of the transcriptional changes appear restricted to neurons. As the authors themselves note elsewhere, claims at this resolution are difficult without support from single-cell approaches. I do not suggest that the authors need to perform single-cell RNA-seq for this work, but strong claims like this should be avoided.

      We have revised our claim to more accurately reflect our findings. Our intended message is that testosterone treatment leads to extensive transcriptional changes in the HVC, likely affecting a majority of neuronal subpopulations rather than solely targeting specific cellular components. The revised text in lines 29-32 now reads: "Thus, the development of adult songs stimulated by testosterone results in widespread transcriptional changes in the HVC, potentially affecting a majority of neuronal subpopulations, rather than altering only specific cellular components."

      1.2) Similarly the Abstract states that parallel regulation "directly" by androgen and estrogen receptors, as well as the transcription factor SP8, "lead" to the transcriptional and neural changes observed after testosterone treatment of females. However, experiments that demonstrate such a causal role have not been performed. The authors do perform a set of bioinformatic analyses that point in this direction - enrichment of androgen and estrogen receptor binding sites in the promoters of differentially expressed genes, high coexpression of SP8 with other genes, and the enrichment of predicted SP8 binding sites in coexpressed genes. However, further support for direct regulation, at the level that the authors claim, would require some form of transcription factor binding assay, e.g. ChIP-seq or CUT&RUN. I am fully aware that these assays are enormously challenging to perform in this system (and again I don’t suggest that these experiments need to be done for this work); however, statements of direct regulation should be tempered. This is especially true for the role of SP8. This does appear to be a compelling target, but without some manipulation of the activity of SP8 (e.g. through knockdowns) and subsequent analysis of gene expression, it is too much to claim that this transcription factor is a regulatory link in the testosterone-driven responses. SP8 does appear to be a highly connected hub gene in correlation network analysis, but this alone does not indicate that it acts as a hub transcription factor in a gene regulatory network.

      We appreciate the reviewer's comment and have revised the statement concerning the role of SP8. Indeed, we document the coexpression of ESR2 and SP8, and our bioinformatics analysis suggests that SP8 might play an important role in transcriptomics. We have rephrased the statement in lines 29-32 as follows: "Parallel gene regulation directly by androgen and estrogen receptors, potentially amplified by coexpressed transcription factors that are themselves steroid receptor regulated, leads to substantial transcriptomic and neural changes in specific behavior-controlling brain areas, resulting in the gradual seasonal occurrence of singing behavior." In addition, we have included a discussion of the limitations of promoter sequence analyses (lines 414 to 427).

      1.3. Along these lines, the in-situ hybridizations of ESR2 and SP8 presented in Figure 5 need significant improvement. The signals in the red and green channels, SP8 and ESR2, look suspiciously similar, showing almost identical subcellular colocalization. This signal pattern usually suggests bleed-through during image acquisition, as it’s highly unlikely that the mRNA of both genes would show this degree of overlap. I would suggest that control ISHs be run with one probe left out, either SP8 or ESR2, and compare these ISHs with the dual label ISHs to determine if signal intensity and cellular distribution look similar. Furthermore, on lines 354-356 the authors write, "The fact that the two genes were expressed nearby in the same cell may indicate physical interactions between the gene pair and warrant further investigation into the nature of their relationship.". Yet, even if the overlap between ESR2 and SP8 shown in Figure 5 is confirmed, close localization of transcripts does not imply that the protein products physically interact. The STRING bioinformatic analysis is more convincing that there is a putative regulatory interaction between ESR2 and the SP8 locus, and this suggestion of protein-protein interaction is weak and should be omitted. In addition, the authors note that ESR2 has not been detected in the songbird HVC in a previous study. To further demonstrate the expression of ESR2 (and SP8) in HVC, it would be useful to plot their expression from the microarray data across the different testosterone conditions.

      We repeated the coexpression study using confocal microscopy and fluorescent RNAScope in situ hybridization, which is now reflected in the revised Figure 5 and a new Figure 5 - Supplement Figure 1. We have also moderated our statement regarding the sparse co-expression of ESR2 and SP8 in HVC neurons. While the presence of co-expressing neurons may provide some anatomical basis for the bioinformatic findings, we have been cautious in our interpretation and have stated that "SP8 and ESR2 mRNAs exhibited low expression levels in HVC, co-localizing in a subset of cells, predominantly GABAergic cells" (lines 369-370). We have removed the speculation about potential protein interaction based on mRNA distribution. Additionally, we have highlighted that SP8 and ESR2 were differentially upregulated at T14d (lines 362-363).

      1.4) My final concern lies in the interpretation of these results as generalizable to other sex hormone-modulated behaviors. On lines 452-455, the authors write, "This suggests that the testosterone (or estrogen)-triggered induction of adult behaviors, such as parental behavior and courtship, requires a much more extensive reorganization of the transcriptome and the associated biological functions of the brain areas involved than previously thought.". The experiments and argument likely apply to other neural systems that undergo large seasonal fluctuations in sex hormones and similar morphological changes. However, the authors argue that the large number of transcriptional changes seen here may generalize broadly to sex hormone-modulated adult behaviors. I think there are a couple of problems with this argument. First, as described here and in past work, testosterone drives major morphological changes in the song system of adult canaries; such dramatic changes are not seen, for instance, in sex hormone-receptive areas underlying mating behavior in adult mammals. Similarly, the study introduced testosterone into female birds, which drives a greater morphological change in HVC relative to similar manipulations in males, which again may account for the large number of differentially expressed genes. I would temper the generality of these results and note how the experimental and biological differences between this system and other sex hormone-responsive systems and behaviors may contribute to the observed transcriptional differences.

      We modified this statement in lines 473-478: “The testosterone-driven changes in female HVC morphology and function represent some of the most notable modifications known in the vertebrate brain. However, how this extensive, testosterone-induced gene regulation in the HVC applies to other seasonally testosterone-sensitive brain areas remains to be seen. Endpoint analysis of testosterone-induced singing in male canaries during the non-reproductive season also indicates considerable regulation of HVC transcriptomes (Frankl-Vilches et al., 2015; Ko et al., 2021)”.

      Reviewer #2 (Public Review):

      First, we would like to express our gratitude to Reviewer #2 for the constructive feedback. We have addressed the concerns in detail below:

      2.1). The bulk of the manuscript details WGCNA, GO terms, and promoter ARE/ERE motif abundance, using the initial pairwise comparisons for each timepoint as input lists. However, there are no p/adjp values provided for these pair-wise comparisons that form the basis of all subsequent analyses. Nor are there supplementary tables to indicate how consistent the replicates are within each group or how abundantly the genes-of-interest are expressed. With the statistical tests used here, and the lack of relevant information in the supplementary tables, I cannot determine if the data support the authors’ conclusions. These omissions mar what is otherwise a conceptually intriguing line of investigation.

      We appreciate the reviewer’s concerns. Please refer to our response addressing this point and the subsequent one (2.2) together in the section below.

      Reviewer #3 (Public Review):

      We appreciate the positive feedback from the reviewer and address below the issues the reviewer pointed out.

      3.1) My biggest concern is the sample size. Most of the time points only have 5 or 6 individuals represented, and I question whether these numbers provide sufficient statistical power to uncover the effects the authors are trying to explore. This is a particular problem when it comes to evaluating the supposed "transient" effect of testosterone on gene expression. There is currently little basis for distinguishing such effects from noise that accrues because of low power. This can be a major problem with studies of gene expression in non-model species, like canaries, where among-individual variability in transcript abundance is quite high. Thus, it is possible that one or two outliers at a given time point cause the effect of testosterone at this time point to become indistinguishable from the controls; if so, then a gene may get put into the transient category, when in fact its regulation was not likely transient.

      We acknowledge that our sample sizes may appear moderate. To address the concern regarding temporal regulation analysis, we followed Reviewer 3's suggestion and conducted a probe-level power analysis (point 2 of recommendations for the authors; labelled as point 3.9 below). We then excluded differentially expressed genes with a power less than 0.8 prior to conducting temporal classification. Consequently, 93% of our differentially expressed genes demonstrated a power ≥ 0.8 (9025/9710). Following further classification by temporal regulation pattern, we identified 29 constantly upregulated, 41 constantly downregulated, 39 dynamically regulated, and 8916 transiently regulated genes. If we apply a stricter constraint by requiring each differentially expressed gene to have at least two probe-sets with a power ≥ 0.8, 83% of differentially expressed genes (8033/9710) still have sufficient power.

      We recognize that our sample size may not be sufficient to detect weakly differentially expressed genes. However, we have intentionally excluded these genes from the beginning (those with |log2(fold change)| ≤ 0.5 were excluded).

      The scenario outlined by the reviewer, where outliers might cause the effect of testosterone to blend with controls, leading to misclassification, is indeed plausible. This could occur either because the genes are weakly regulated, or because the power to detect differential expression is insufficient, thus preventing these genes from surpassing the threshold to be deemed significantly differentially expressed. However, this also illustrates that testosterone does not regulate every gene in the same way.

      We have appended a column indicating high-power genes (≥ 0.8) to the DiffExpression.tsv file, available in the Dryad repository. The power analysis has been incorporated into the Methods section at lines 801-808 and the Results section at lines 188-192.

      3.2) More on the transient categorization. Would a gene whose expression is not immediately upregulated (within 1 hour), but is upregulated later on (say in the 14d group) be considered transient? If so, this seems problematic. Aren’t the authors setting the null expectation of "non-transient" as a gene that does not increase immediately after 1 hour of treatment? The authors even recognize that it is quite surprising that gene expression changes after an hour. It may be that some genes whose regulation is classified as transient are simply slower to upregulate; but, really, would we say their expression is transient per se? Maybe I’m misunderstanding the categorizations?

      We appreciate the reviewer's insightful discussion regarding the transient categorization. We understand that it is indeed harder for a gene to be classified as constantly regulated than as transiently regulated, because the effect of testosterone may be small at some time points or may go undetected owing to low power. To address this concern, we further dissected the transiently regulated category by reporting the number of time points at which a gene is differentially expressed in Figure 2 - Figure supplement 1. Approximately half of the transiently regulated genes were regulated at only one time point, further illustrating that the effect of testosterone on gene expression was not constant during the time window we examined (see lines 184 - 187).
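
      The categorization under discussion can be made concrete with a small sketch. The rules below are an assumption made for illustration (constant: differentially expressed at every time point in one direction; dynamic: at every time point but with a change of direction; transient: at only some time points); the manuscript's exact definitions may differ, and the function name and inputs are hypothetical.

```python
def classify_gene(log2fc_by_time, significant_by_time):
    """Classify one gene from per-time-point results versus controls.

    log2fc_by_time: log2 fold changes, one per sampled time point.
    significant_by_time: booleans, True where the gene is differentially expressed.
    Returns (category, number of time points with differential expression).
    """
    n_sig = sum(significant_by_time)
    if n_sig == 0:
        return "not_regulated", 0
    if n_sig < len(significant_by_time):
        # Differentially expressed at only a subset of time points.
        return "transient", n_sig
    if all(fc > 0 for fc in log2fc_by_time):
        return "constant_up", n_sig
    if all(fc < 0 for fc in log2fc_by_time):
        return "constant_down", n_sig
    # Significant everywhere, but the direction of regulation changes.
    return "dynamic", n_sig


# The reviewer's example: no change early, upregulated only at the last time point.
late_responder = classify_gene([0.1, 0.2, 0.1, 1.4], [False, False, False, True])
always_up = classify_gene([0.9, 1.1, 1.3, 1.4], [True, True, True, True])
switching = classify_gene([0.9, 1.1, -1.3, -1.4], [True, True, True, True])
```

      Under these assumed rules, a slow responder is labelled transient with a count of one, which is why reporting the number of differentially expressed time points per gene helps separate slow or weakly powered genes from genuinely short-lived responses.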

      3.3) The authors don’t fully explain the logic for using females in this study to measure a "male-typical" behavior (singing). My understanding is that females have underlying circuitry to sing, and T administration triggers it; thus, this situation creates a natural experiment in which we can explore T’s effects on brain and behavior, unlike in males, which have fluctuating T. First, it might be good to clarify this logic for readers, unless perhaps I’m misunderstanding something. Second, I found myself questioning this logic a little. Our understanding of basic sex differences and the role that steroid hormones play in generating them has changed over the last few decades. There are, for example, a variety of genetic factors that underlie the development of sex differences in the brain (I’m especially thinking about the incredible work from Art Arnold and many others that harness the experimental power of the four core genotype mice). Might some of these factors influence female development, such that T’s effects on the female brain and subsequent ability to increase HVC size and sing are not the same as in males?

      Indeed, sex-chromosome dosage compensation is absent in birds leading to higher Z-chromosomal gene expression in males. We demonstrated substantial sex differences in gene expression in our earlier work [Ko, M.-C., Frankl-Vilches, C., Bakker, A., Gahr, M., 2021. The Gene Expression Profile of the Song Control Nucleus HVC Shows Sex Specificity, Hormone Responsiveness, and Species Specificity Among Songbirds. Frontiers in Neuroscience 15].

      We have revised the introduction (lines 96-98) to clarify our rationale for using female canaries as a model for adult behavioral development, not as a model for male canaries. After testosterone treatment, these females start to sing, with song structure developing over time, similar to male seasonal progression. This approach eliminates the confounding effect of fluctuating testosterone levels seen in males, supported by distinct HVC transcriptomes in testosterone-implanted singing female canaries compared to males (Ko et al., 2021).

      The revised paragraph reads as below: Female canaries (Serinus canaria) are typically non-singers, with their spontaneous songs displaying less complexity than their male counterparts (Hartley et al., 1997; Herrick and Harris, 1957; Ko et al., 2020; Pesch and Güttinger, 1985). Despite their infrequent singing, these females possess the necessary underlying circuitry that can be activated by testosterone. Following testosterone treatment, these females start to produce simple songs, which gradually evolve in structure over weeks—paralleling the seasonal progression of male singing (Hartog et al., 2009; Ko et al., 2020; Shoemaker, 1939; Vallet et al., 1996; Vellema et al., 2019). Moreover, testosterone induces the differentiation of song control-related brain nuclei in adult female canaries, a critical step for song development (Fusani et al., 2003; Madison et al., 2015; Nottebohm, 1980). In this study, we focus on these testosterone-treated female canaries as a model for adult behavioral development rather than a model for male canaries. This unique model allows us to examine transcriptional cascades in parallel with the differentiation of the song control system and the progression of song development, without the confounding impact of fluctuating testosterone levels seen in males, which often results in considerable individual differences in the non-reproductive season baseline singing behavior. This approach is backed by the observation that the HVC transcriptomes of testosterone-implanted singing female canaries are distinct from those of singing males (Ko et al., 2021).

      3.4) I was surprised by the authors' assertion that testosterone would only influence several tens or hundreds of genes. My read of the literature says that this is low, and I would have expected 100s, if not 1,000s, of genes to be influenced. I think that the total number of genes influenced by T is therefore quite consistent with the literature.

      We apologize for any confusion caused by our statement. We did not mean to imply that testosterone only influences several tens or hundreds of genes, but rather that we did not expect such an extensive transcriptional regulation in the HVC by testosterone. We have clarified this in our revised manuscript, specifically in lines 450-451. Thank you for helping us to clarify this point.

      3.5) I found the GO analyses presented herein uncompelling. As the authors likely know, not all GO terms are created equally. Some GO terms are enriched by hundreds of genes and thus reflect broad functional categories, whereas other GO terms are much more specific and thus are enriched by only a few genes. The authors report broad GO terms that don’t tell us much about what is happening in the HVC functionally. This is particularly the case when a good 50% of the genome is being differentially regulated.

      We appreciate the reviewer's comment. We have added KEGG pathway enrichment analysis in Figure 3 - Figure supplement 1 as an alternative. However, we believe that the GO term enrichment results still provide valuable insights, and therefore we have retained them in Fig. 3.
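
      The reviewer's point about broad versus specific GO terms can be illustrated with a one-sided hypergeometric test, a common choice for over-representation analysis. This is a minimal sketch and not the enrichment method used in the manuscript; the genome size, term sizes, and overlaps are hypothetical numbers chosen so that half of all genes are differentially expressed.

```python
from math import comb


def enrichment_p(n_genome, n_de, n_term, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_genome: number of annotated genes in the background.
    n_de: number of differentially expressed genes.
    n_term: number of background genes annotated with the GO term.
    n_overlap: number of differentially expressed genes carrying the term.
    """
    total = comb(n_genome, n_de)
    upper = min(n_de, n_term)
    tail = sum(
        comb(n_term, k) * comb(n_genome - n_term, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def fold_enrichment(n_genome, n_de, n_term, n_overlap):
    """Observed fraction of the term that is DE, relative to the genome-wide fraction."""
    return (n_overlap / n_term) / (n_de / n_genome)


N_GENOME, N_DE = 1000, 500  # hypothetical: 50% of genes differentially expressed

p_broad = enrichment_p(N_GENOME, N_DE, n_term=200, n_overlap=110)   # broad term
p_specific = enrichment_p(N_GENOME, N_DE, n_term=10, n_overlap=9)   # specific term

# Even a term whose genes are all DE cannot exceed this fold enrichment.
max_fold = fold_enrichment(N_GENOME, N_DE, n_term=10, n_overlap=10)
```

      When half of the background is differentially expressed, the fold enrichment of any term is capped at 2, so enrichment statistics have limited room to discriminate between terms. This is one way to read the reviewer's concern.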

      3.6) The Genomatix analyses are similarly uncompelling. This approach to finding putative response elements can uncover many false positives, and these should always be validated thoroughly. Don’t get me wrong: I appreciate that these validations are not trivial, and I value the authors' response element analysis.

      We appreciate the reviewer's comment on the presence of AR or ER motifs in promoters and acknowledge that in mammals, AR and ER predominantly bind at distal enhancers rather than promoters. Our analysis focused on promoter regions due to the limitations of available tools and resources for our study species. We understand that this approach may not capture the full complexity of AR and ER regulation. We have revised our manuscript to note the limitations of our approach and clarify that the presence of AREs and EREs alone is not indicative of active receptor binding or direct regulation (lines 416-427).

      3.7) I’m sceptical about the section of the paper that speculates about modification of steroid sensitivity in the HVC. These conclusions are based on analyses of mRNA expression of AKR1D1, SRD5A2, and the like. However, this does not reflect a difference in the capacity to metabolize steroids, or at least there is little evidence to suggest this. Note that many of these transcripts have different isoforms, which could also influence steroidal metabolism.

      We agree that mRNA expression levels of AKR1D1, SRD5A2, and other transcripts involved in steroid metabolism do not necessarily reflect changes in steroid metabolizing capacity. However, we believe that these changes in mRNA expression are indicative of potential changes in steroid sensitivity in the HVC, which could affect the neural response to steroids. We acknowledge that isoform differences of these transcripts may influence steroid metabolism and further studies are necessary to confirm our findings and elucidate the mechanisms underlying the observed changes in gene expression. In response to this comment, we have amended the text in lines 245-249 to reflect this consideration.

    2. Reviewer #3 (Public Review):

      I found this paper fascinating. It is a study that needed to be done in the field of behavioral endocrinology, as it addresses our understanding of exactly how steroid hormone action might regulate behavioral output like few other published studies. For decades, researchers have been implanting animals with steroids and observing corresponding changes in behavior, noting that some behavioral traits are immediately expressed, while others take time to be expressed. Why would this be? The answer lies in the temporal dynamics of steroid action, but few have ever addressed this. Having said this, I do have several issues with the manuscript that I think need to be addressed.

      1) My biggest concern is the sample size. Most of the time points only have 5 or 6 individuals represented, and I question whether these numbers provide sufficient statistical power to uncover the effects the authors are trying to explore. This is a particular problem when it comes to evaluating the supposed "transient" effect of testosterone on gene expression. There is currently little basis for distinguishing such effects from noise that accrues because of low power. This can be a major problem with studies of gene expression in non-model species, like canaries, where among-individual variability in transcript abundance is quite high. Thus, it is possible that one or two outliers at a given time point cause the effect of testosterone at this time point to become indistinguishable from the controls; if so, then a gene may get put into the transient category, when in fact its regulation was not likely transient.

      2) More on the transient categorization. Would a gene whose expression is not immediately upregulated (within 1 hour), but is upregulated later on (say in the 14d group) be considered transient? If so, this seems problematic. Aren't the authors setting the null expectation of "non-transient" as a gene that does not increase immediately after 1 hour of treatment? The authors even recognize that it is quite surprising that gene expression changes after an hour. It may be that some genes whose regulation is classified as transient are simply slower to upregulate; but, really, would we say their expression is transient per se? Maybe I'm misunderstanding the categorizations?

      3) The authors don't fully explain the logic for using females in this study to measure a "male-typical" behavior (singing). My understanding is that females have underlying circuitry to sing, and T administration triggers it; thus, this situation creates a natural experiment in which we can explore T's effects on brain and behavior, unlike in males, which have fluctuating T. First, it might be good to clarify this logic for readers, unless perhaps I'm misunderstanding something. Second, I found myself questioning this logic a little. Our understanding of basic sex differences and the role that steroid hormones play in generating them has changed over the last few decades. There are, for example, a variety of genetic factors that underlie the development of sex differences in the brain (I'm especially thinking about the incredible work from Art Arnold and many others that harness the experimental power of the four core genotype mice). Might some of these factors influence female development, such that T's effects on the female brain and subsequent ability to increase HVC size and sing are not the same as in males?

      4) I was surprised by the authors' assertion that testosterone would only influence several tens or hundreds of genes. My read of the literature says that this is low, and I would have expected 100s, if not 1,000s, of genes to be influenced. I think that the total number of genes influenced by T is therefore quite consistent with the literature.

      5) I found the GO analyses presented herein uncompelling. As the authors likely know, not all GO terms are created equally. Some GO terms are enriched by hundreds of genes and thus reflect broad functional categories, whereas other GO terms are much more specific and thus are enriched by only a few genes. The authors report broad GO terms that don't tell us much about what is happening in the HVC functionally. This is particularly the case when a good 50% of the genome is being differentially regulated.

      6) The Genomatix analyses are similarly uncompelling. This approach to finding putative response elements can uncover many false positives, and these should always be validated thoroughly. Don't get me wrong: I appreciate that these validations are not trivial, and I value the authors' response element analysis.

      7) I'm sceptical about the section of the paper that speculates about modification of steroid sensitivity in the HVC. These conclusions are based on analyses of mRNA expression of AKR1D1, SRD5A2, and the like. However, this does not reflect a difference in the capacity to metabolize steroids, or at least there is little evidence to suggest this. Note that many of these transcripts have different isoforms, which could also influence steroidal metabolism.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript confirms previous studies suggesting a great deal of heterogeneity of gene expression at the neural plate border in early vertebrate embryos, as neural, placodal, neural crest, and epidermal lineages gradually segregate. Using scRNA-seq, the study expands previous studies by using far larger numbers of genes as evidence of this heterogeneity. The evidence for this heterogeneity and the change in heterogeneity over time is compelling.

      Many studies have suggested that there is considerable heterogeneity of gene expression in the developing neural plate border as the neural, neural crest, placodal and epidermal lineages segregate. Although the evidence for such heterogeneity was strong, until the advent of scRNA-seq, the extent of this heterogeneity was not appreciated. By using scRNA-seq at different stages of chick development, the authors sought to characterize how this heterogeneity develops and resolves over time.

      The work is technically sound, and the level of analysis of gene expression, clustering, synexpression groups, and dynamic changes in gene modules over time is state-of-the-art. A weakness of the results as they stand now is that the conclusions of the analysis are not tested by the authors and thus, are over-interpreted. Such tests could be performed in future studies either by gain- and loss-of-function experiments or by using lineage tracing to demonstrate that the cell states the authors observe - especially the "unstable progenitors" they characterize - are biologically meaningful. The data will nevertheless be a useful resource to investigators interested in understanding the development of different cell lineages at the neural plate border.

      We thank the reviewer for the positive assessment of our work. We agree that our models will need to be tested experimentally in the future; however, this will require a substantial amount of work. We therefore opted to share our data as a resource to be used by the community.

      Reviewer #2 (Public Review):

      The study of Thiery et al. aims to elucidate how cells undergo fate decisions between neural crest and (pan-) placodal cells at the neural plate border (NPB). While several previous single-cell RNA-Seq studies in vertebrates have included neural plate border cells (e.g. Briggs et al., 2018; Wagner et al., 2018; Williams et al., 2022), these previous studies did not provide conclusive insights on cell fate decisions between neural crest and placodes, due to either the limited number of genes recovered, the limited number of cells sampled or the limited numbers of stages included. The present study overcomes these limitations by analyzing almost 18,000 cells at six stages of development ranging from gastrulation until after neural tube closure (8 somite-stage), with an average depth of almost 4000 genes/cell. Using this extensive and high-quality data set, the study first describes the timing of segregation of neural crest and placodal lineages at the NPB suggesting that at late neural fold stages (somite stage 4) most cells have decided between placodal and neural crest fates. It then identifies gene modules specific for neural crest and placodal lineages and characterizes their temporal and spatial expression. Focusing on an NPB-specific subset of cells, the study then shows that initially most of these cells co-express neural crest and placodal gene modules suggesting that these are undecided cells, which they term "border-located unstable progenitors" (BLUPs). The proportion of BLUPs decreases over time, while cells classified as placodal or neural crest cells increases, with few BLUPs remaining at late neural fold stages (and a few scattered BLUPs even at somite stage 8). 
      Based on these findings, the authors propose a new model of cell fate decisions at the NPB (termed the "gradient border model"), according to which the NPB is not defined by a specific transcriptional state but is rather a region of undecided cells, which diminishes in size between gastrulation and neural fold stages due to more and more cells committing to a placodal or neural crest fate based on their mediolateral position (with medial cells becoming specified as neural crest and lateral cells as placodal cells).

      The study of Thiery et al. provides an unprecedentedly detailed, methodologically careful, and well-argued analysis of cell fate decisions at the NPB. It provides novel insights into this process by clearly demonstrating that the NPB is an area of indecision, in which cells initially co-express gene modules for ectodermal fates (neural crest and placodes), which subsequently become segregated into mutually exclusive cell populations. The paper is very well written and largely succeeds in presenting the very complex strategy of data analysis in a clear way. By addressing the earliest cell fate decisions in the ectoderm and one of the earliest cell fate decisions in the developing vertebrate embryo, this study will have a significant impact and be of interest to a wide audience of developmental biologists. There are, however, two conceptual issues raised in the paper that require further discussion.

      We thank the reviewer for the positive comments on our work and its significance; we have addressed the conceptual issues below and in the revised version of the manuscript.

      First, the authors suggest that their data resolve a conflict between two previously proposed models, the "binary competence model" and the "neural plate border model". The authors correctly describe that the binary competence model proposed by Ahrens and Schlosser (2005) and Schlosser (2006) suggests that the ectoderm is first divided into two territories (neural and non-neural), which differ in competence, with the neural territory subsequently giving rise to the neural plate and neural crest and the non-neural territory giving rise to placodes and epidermis (sequence of cell fate decisions: [neural or neural crest]-[epidermal or placodal]). This model was proposed as an alternative to a "neural plate border state model", which instead suggests that initially the NPB is induced as a territory characterized by a specific transcriptional state, from which then neural crest and placodes are induced by different signals (sequence of cell fate decisions: neural-[placodal or neural crest]-epidermal) (see Schlosser, 2006, 2014). Instead, in this paper the authors contrast the binary competence model with a model they call the "neural plate border" model, according to which the NPB can give rise to all four ectodermal fates with equal probability. However, I think this misses the main point of contention, since all previously proposed models are in agreement that initially the neural plate border region is unspecified and can give rise to all four fates and that lineage restrictions only appear over time. The "binary competence" and "neural plate border state" models differ, however, in their predictions about the sequence in which these fate restrictions occur.

      We appreciate the reviewer's thoughtful feedback, but respectfully disagree with their comment regarding the sequence of events predicted by the neural plate border (NPB) model. While the NPB model does suggest that the NPB is a transcriptionally distinct state, it does not make specific predictions about the sequence of fate decisions. Although several papers cited in the Schlosser 2006 and 2014 reviews suggest that the NPB gives rise to all four ectodermal fates, none of them (and, to the best of our knowledge, no other primary paper referring to the NPB model) specifically defines the sequence of fate specification from the NPB.

      The key points of the NPB model are that the NPB is defined by overlapping expression of early neural/non-neural markers (which is also observed in Xenopus – see Pieper et al., 2012 supplementary material), contains progenitors for all four ectodermal fates, and that this "state" exists prior to the emergence of definitive neural crest and placodal cells.

      To investigate the heterogeneity in the order of cell fate decisions at the NPB, we carried out additional pairwise co-expression analyses of forebrain, mid-hindbrain, neural crest, and placodal gene modules, which reveals multiple different hierarchies of cell fate choice depending on a cell's axial positioning, as shown in Figure 6-figure supplement 1.
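
      One simple way to picture a pairwise co-expression analysis of gene modules is sketched below. It is an illustrative assumption, not the authors' method: each module is scored per cell as the mean expression of its genes, and a cell counts as co-expressing two modules when both scores exceed a threshold. The gene identifiers, expression values, and threshold are hypothetical placeholders.

```python
from itertools import combinations


def module_score(cell_expr, module_genes):
    """Mean expression of a module's genes in one cell (missing genes count as 0)."""
    values = [cell_expr.get(gene, 0.0) for gene in module_genes]
    return sum(values) / len(values)


def pairwise_coexpression(cells, modules, threshold=1.0):
    """Fraction of cells in which both modules of each pair score above threshold."""
    fractions = {}
    for mod_a, mod_b in combinations(sorted(modules), 2):
        both = sum(
            1
            for cell in cells
            if module_score(cell, modules[mod_a]) > threshold
            and module_score(cell, modules[mod_b]) > threshold
        )
        fractions[(mod_a, mod_b)] = both / len(cells)
    return fractions


# Hypothetical modules and cells; gene names are placeholders, not real markers.
modules = {"neural_crest": ["nc1", "nc2"], "placodal": ["pl1", "pl2"]}
cells = [
    {"nc1": 2.0, "nc2": 2.0, "pl1": 2.0, "pl2": 2.0},  # both modules active
    {"nc1": 3.0, "nc2": 2.0, "pl1": 0.0, "pl2": 0.0},  # neural crest only
    {"nc1": 0.0, "nc2": 0.0, "pl1": 2.5, "pl2": 2.0},  # placodal only
    {"nc1": 1.5, "nc2": 1.5, "pl1": 1.5, "pl2": 2.5},  # both modules active
]

fractions = pairwise_coexpression(cells, modules)
```

      Repeating such a calculation per stage or per axial position would show whether the share of cells co-expressing two modules shrinks over time, which is the kind of pattern the response describes.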

      Considering these findings, we have expanded our discussion of the previously proposed binary competence and neural plate border models to highlight how neither of these models is sufficient to fully characterize the heterogeneity in cell fate decisions observed in our study. We hope this clarification will help address any concerns the reviewer may have had about the NPB model and its implications for our results.

      Second, the authors should be more careful when relating their data to the specification or commitment of cells. Questions of specification and commitment can only be tested by experimental manipulation and cannot be inferred from a transcriptome analysis of normal development. So the conclusion that the activation of placodal, neural and neural crest-specific modules in that sequence suggests a sequence of specification in the same temporal order (lines 706-709) is not justified. Studies from the authors' own lab previously showed that epiblast cells from pre-gastrula stages are specified to express a large number of NPB border markers including neural crest and panplacodal markers, when cultured in vitro (Trevers et al., 2018; see also Basch et al., 2006 for early specification of the neural crest), which is not easily reconciled with this interpretation. I am not aware of any experimental evidence that shows that a panplacodal regulatory state is specified prior to neural crest in the chick (although I may have missed this). In Xenopus, experimental studies have shown instead that neural crest is specified and committed during late gastrulation, while the panplacodal states are specified much later, at neural fold stages (Mancilla and Mayor, 2006; Ahrens and Schlosser, 2005). It may well be the case that the relative timing of neural crest and panplacodal specification is different between species (and such easy dissociability may even be expected from the perspective of the binary competence model).

      We very much agree with the reviewer that definitions and correct terminology are important and apologise for the lack of clarity. We have reworded the text carefully.

      The reviewer is correct: specification of neural crest, placodes and neural plate is observed very early in chick, prior to gastrulation. However, in specification experiments tissue is removed from its normal environment to reveal what it does autonomously in the absence of additional signals. In the current study, we assess the activation of gene modules in normal development. We have therefore reworded the text to avoid ‘specification’ in this context.

      Reviewer #3 (Public Review):

      The goal of this work was to better understand how cell fate decisions at the neural plate border (NPB) occur. There are two prevailing models in the field for how neural, neural crest and placode fates emerge: (i) binary competence which suggests initial segregation of ectoderm into neural/neural crest versus placode/epidermis; (ii) neural plate border, where cells have mixed identity and retain the ability to generate all the ectodermal derivatives until after neurulation begins.

      The authors use single-cell sequencing to define the development of the NPB at a transcriptional level and suggest that their cell classification identified increased ectodermal cell diversity over time and that as cells age their fate probabilities become transcriptionally similar to their terminal state. The observation of a placode module emerging before the neural and neural crest modules is somewhat consistent with the binary competence model but the observation of cells with potentially mixed identity at earlier stages is consistent with the neural plate border model.

      Differences in the timing of analyses and techniques used can account for the generation of these two original models, and in essence, the authors have found some evidence for both models, possibly due to the period over which they performed their studies. However, the authors propose recognizing the neural plate border as an anatomical structure, containing transcriptionally unstable progenitors and that a gradient border model defines cell fate choice in concert with spatiotemporal positioning.

      The idea that the neural plate border is an anatomical structure is not new to most embryologists as this has been well-recognized in lineage tracing and transplantation assays in many different species over many decades.

      We appreciate the reviewer's comment and agree that the neural plate border has previously been characterised anatomically. However, many studies have applied the term literally in reference to a transcriptional state which is specified through the expression of ‘neural plate border specifiers’, prior to segregation of the placodes and neural crest. Here we highlight that treating the neural plate border as a definitive transcriptional state which can be identified through the expression of ‘neural plate border specifiers’ is false. Instead, we find these ‘specifiers’ are upregulated within either neural crest, placodal or neural cell lineages over time. Cells at the neural plate border co-express these alternate lineage markers and are therefore predicted to be undecided.

      The authors don't provide molecular evidence for transcriptional instability in any cells. It's a molecular term and phenomenon inaccurately applied to these cells that are simply bipotential progenitors.

      We thank the reviewer for pointing this out; we have therefore refrained from using the term unstable and instead refer to the cells as ‘undecided’ as suggested by reviewer 2.

      Lastly, there's no evidence of a gradient that fits the proper biochemical or molecular definition. Graded or sequential are more appropriate terms that reflect the lineage determination or segregation events the authors characterize, but there's no data provided to support a true role for a gradient such as that achieved by a concentration or time-dependent morphogen.

      We agree with the reviewer that ‘gradient’ was misleading. We have now replaced ‘gradient’ with ‘graded’ and expanded figure 6 to highlight the graded co-expression of gene modules associated with alternate fates. We have changed the title to reflect this.

      A limitation of the study is that much of it reads like a proof-of-principle because validation comes primarily from known genes, their expression patterns in vivo, and their subsequent in vivo functions. Thus, the authors need to qualify their interpretations and conclusions and provide caveats throughout the manuscript to reflect the fact that no functional testing was performed on any novel genes in the emerging modules classified as placode versus neural or neural crest.

      We agree with the reviewer that we do not provide any functional data to validate our predictions; it is for this reason that we submitted the manuscript as a ‘resource’ to make our data available to the community.

      Lastly, a limitation of gene expression studies is that they provide snapshots of cells in time and, while implying that cells have broad potential or are lineage fated, do not actually test and confirm their ultimate fate. Therefore, in parallel with their studies, the authors really need to consider the wealth of lineage tracing data, especially single-cell lineage tracing, which has been performed using embryos of the same stages as those sequenced in this study, and which has revealed critical data about the potential of cells and about when and where lineage segregation and cell fate determination occur.

      The reviewer rightly points out the significance of the classical experiments in the context of the neural plate border. However, only one of the mentioned studies (Bronner-Fraser and Fraser, 1989) analyses cells at a single-cell level, and it does not assess placodes, while the remaining studies use tissue transplantation or cell population labelling. Although these studies provide valuable insights, they do not examine the fate or potential of single cells, nor do they reveal the transcriptional signature of these progenitors.

      Our findings emphasize the transcriptional heterogeneity at the neural plate border, suggesting that distinct subsets of neural plate border progenitors undergo varying sequences of fate restrictions. The upcoming challenge will be to conduct clonal analysis alongside scRNAseq to determine if neural plate border progenitors with similar transcriptional signatures experience the same fate restrictions or if external factors, such as cell-cell signalling, dictate cell fate choices.

      We have amended the manuscript to clarify that predictions of fate decisions require future validation through lineage tracing. Additionally, we have acknowledged in the introduction that previous studies have demonstrated the intermingling of neural, neural crest, and placodal progenitors at the neural plate border.

    1. Author Response

      Reviewer #2 (Public Review):

      Kim et al. examined the properties of neuronal connections responsible for inhibitory cell activation to show that the characteristics examined were similar in humans and rodents. This is important, as it suggests that the many rodent studies carried out over the past decades are physiologically relevant to humans.

      Strengths

      1) Human brain tissues are difficult to obtain, hence the study provides valuable insights

      2) An impressive multipronged approach was used for cell classifications

      3) Despite the lack of novel findings, the revelation of the similarities between human and rodent synapses is important and has far-reaching implications. This important finding suggests the knowledge generated from rodent research is, at least partly, physiologically relevant to and transferrable to humans.

      Weaknesses

      1) The study is descriptive by design, and hence provides limited conceptual advances, especially with the retrospect that synaptic properties are similar between humans and rodents (although see strength #3). For example, very similar findings and techniques have already recently been reported by a number of the same authors in the Campagnola et al., Science 2022 paper.

      We agree that the stimulus protocols for the connectivity assays with multiple patch-clamp recordings in this study were adapted from the recent publication (Campagnola et al., Science 2022). In that study, especially for the human synaptic connectivity data, the main cell type categorization was at the level of excitatory and inhibitory neurons, which were identified based on morphological features and observed PSP characteristics (e.g., direction of membrane potential changes) when cells were connected to each other. However, we went further and identified interneuron subclasses in the connectivity assays using virally labeled slice cultures and post-hoc HCR staining in addition to an intrinsic classifier, which was not investigated in the recent publication (Campagnola et al., 2022). Therefore, the scientific findings and their implications are not the same as those shown in the previous study, and we think this study provides a significant advance in our understanding of human cortical circuit organization.

      2) Despite the fact that normal physiology was reported, the use of pathological human brain tissue could affect the results.

      We agree that the use of pathological human brain tissue to investigate normal physiology is not ideal. However, as mentioned in the METHODS below (section of “Acute slice preparation”), our surgically resected neocortical tissues show minimal pathology, and we believe these tissue preparations can be used to address normal physiological properties of human neurons. Importantly, we saw no effect of disease state (epilepsy vs. tumor) on the intrinsic or synaptic properties that we measured. Our METHODS state that “Surgically resected neocortical tissue was distal to the pathological core (i.e., tumor tissue or mesial temporal structures). Detailed histological assessment and using a curated panel of cellular marker antibodies indicated a lack of overt pathology in surgically resected cortical slices (Berg et al., 2021).”. We also state in the RESULTS that “These tissues were distal to the epileptic focus or tumor, and have shown minimal pathology when examined (Berg et al., 2021). Brain pathology was evaluated using six histological markers that were independently scored by three pathologists. Surgically resected tissues have been used extensively to characterize human cortical physiology and anatomy (Berg et al., 2021).”. Lastly, this is the best possible human tissue available for us to conduct physiological experiments. It is an unavoidable caveat of this work that our healthy brain tissue was derived from a donor brain exhibiting a serious disease.

      3) The manuscript may not be easy to understand for the uninitiated, because many concepts and abbreviations were not properly introduced.

      Thank you for pointing out this oversight. We have updated our manuscript and made sure that all abbreviations are fully described. We have changed the abbreviation MPC back to multiple patch-clamp recording, and other abbreviations such as LAMP5, SLC17A7 and DLX are now better explained. We have also changed the order of multiple figures (i.e., Figure 5 – Figure supplements to Figure 3 – Figure supplements) and removed some complicated figures (e.g., Figure 1 – Figure supplement 1) to present the data in a fashion that can be understood by a more general reader.

      4) The statistical treatment is not ideal, so some conclusions may not be valid.

      We performed additional statistical analyses as suggested and incorporated them into the text of the RESULTS.

      Furthermore, we also made additional Figure supplements (Figure 4 – Figure supplement 3, Figure 4 – Figure supplement 4, Figure 6 – Figure supplement 2, and Figure 6 – Figure supplement 3) to support our conclusions.

      5) The mixed usage of acute and cultured slices is not ideal and likely affects the outcome.

      We agree that the mixed usage of acute and cultured slices is not ideal, and it could affect the interpretation of the outcome. Therefore, we performed additional analyses to see whether any synaptic property (i.e., paired pulse ratio) changes in a correlated way with the number of days in slice culture (now implemented in Figure 4 – Figure supplement 4 and Figure 6 – Figure supplement 3), and we did not find any significant correlation. However, we noticed that the short-term synaptic dynamics differ between the acute and slice culture conditions, as shown in Figure 4 – Figure supplement 1d. We think this is due to sampling bias rather than a difference in tissue preparation, and these points are now more carefully described in the DISCUSSION as “This difference we observed in this study, i.e., more facilitating synapses were detected in slice cultures than in acute slices, could either reflect an acute vs. slice culture difference. However, we believe it is more likely to reflect a selection bias for PVALB neurons when patching in unlabeled acute slices, and that the AAV-based strategy with a pan-GABAergic enhancer allows a more unbiased sampling of interneuron subclasses whose properties are preserved in culture. In support of this, PPR analysis as a function of days after slice culture shows no relationship to acute versus slice culture preparation (Figure 4 – Figure supplement 4, Figure 6 – Figure supplement 3). Furthermore, we have observed that viral targeting of GABAergic interneurons greatly facilitates sampling of the SST subclass in the human cortex compared to unbiased patch-seq experiments (Lee et al., 2022), and this selection bias likely explains synapse type sampling differences in cultured slices compared to acute preparations.”.
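
      The check described above (paired pulse ratio against days in slice culture) can be sketched as follows. This is only an illustration: the amplitudes and culture days are hypothetical values, not data from the study, and Pearson correlation is assumed here because the response does not state which correlation measure was used.

```python
from math import sqrt

def paired_pulse_ratio(first_psp_mv, second_psp_mv):
    """Amplitude of the second PSP divided by the amplitude of the first."""
    return second_psp_mv / first_psp_mv

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical recordings: days in culture and PSP amplitudes in mV.
days_in_culture = [2, 3, 5, 7, 9, 12]
first_psp = [1.0, 0.8, 1.2, 0.9, 1.1, 1.0]
second_psp = [0.9, 0.8, 1.0, 0.9, 0.9, 1.0]

ppr = [paired_pulse_ratio(a, b) for a, b in zip(first_psp, second_psp)]
r = pearson_r(days_in_culture, ppr)
print(f"Pearson r between days in culture and PPR: {r:.2f}")
```

      A value of r near zero would be consistent with the absence of a relationship reported in the response; a significance test on r would still be needed before drawing that conclusion.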

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      In this manuscript, Wang and colleagues demonstrate that a single systemic injection of a high dose of Akkermansia muciniphila (A.m.) lysate drives a rapid pancytopenia followed by prolonged anaemia and hepatosplenomegaly with late-onset extramedullary hematopoiesis (EMH). The latter, as well as the splenomegaly, were likely mediated through activation of pattern recognition receptors and IL-1R signalling pathways. This was demonstrated through the partial and full phenotype reversal in Tlr2;4-/- and MyD88;Trif-/- mice, respectively. Moreover, the phenotype was partially reversed following IL-1R antagonism. After performing multiplex protein assays and flow cytometry, the authors conclude that EMH, in this model, is mediated by IL-1a produced within the spleen by local monocytes and DC.

      Overall, the manuscript by Wang et al. is quite well presented, the experiments are mostly well controlled, the methods are well reported, and the data fit a clearly defined story with clinical relevance. Nevertheless, there are several major concerns that, if addressed, would greatly increase the strength of the authors' conclusions.

      Major comments

      1. The "two wave" hypothesis of hematopoiesis - first in the bone marrow (BM) and then in the spleen - is interesting. However, although an early wave of BM hematopoiesis would make sense, under these circumstances, I don't think the data are strong enough to support this hypothesis as they stand. For example, although the frequency of LSK cells increase, the numbers of most LSK subsets decrease. Given the decrease in the absolute number of BM cells 1d after A.m. injection, isn't it possible that the LSK cells are only proportionally increased relative to the remaining Lin- cells? What happens to the absolute number of LSK cells following A.m. injection?

      Also, describing "two distinct waves of HSPC increase in the A.m.-injected spleen" (Fig 2 & S2 titles) and describing a "first wave" of HSPC expansion in the BM (lines 396, 399, 402 etc.) is misleading for the following reasons: (i) the data strongly support a single wave of increasing HSPC in the spleen, peaking at d14, and (ii) there is no evidence HSPC are increased in the BM until d56, although there does appear to be an early increase in MPP. The language should be changed accordingly.

      2. The flow cytometry panel is not comprehensive enough to fully characterize the mature hematopoietic cell populations to the levels that are claimed here. For example, it is a stretch to assume that all B220- CD3- CD11c- cells are DC (splenic NK cells, eosinophils, monocytes and red pulp macrophages, for example, can express CD11c, particularly following inflammatory insult), or that CD11b+ F4/80+ SSC-hi cells are eosinophils, especially when eosinophils should be F4/80-lo and are not known to express Ly6C in the spleen (for reference, see ImmGen). These gating issues may explain the conspicuous absence of macrophages (should be F4/80+CD11b+Ly6C- and would also have a higher SSC than monocytes) in the plots. The B cell gate will also contain PDC, which express B220 (but can be easily excluded using Ly6C and CD11c). With respect to assessing the mature leukocyte populations in the spleen, relabelling the gates (CD11c+ cells instead of DC, F4/80+ myeloid cells instead of eosinophils) would suffice; however, these issues become a problem when trying to identify which cell populations express IL-1a.

      Due to the limited antibody panel used here, there is not enough evidence to suggest that DC and monocytes are producing IL-1a. Moreover, the histograms showing the changes in expression of IL-1a on the "DC" and "Mo" are not very convincing. How does the IL-1a staining look on a dot plot? Is there good separation between positive and negative? These plots need to be included. What happens if you gate on the IL-1a+ cells first, then phenotype them?

      Macrophages and splenic stromal cells are also likely candidates for IL-1a production. To assess which cell types are the true source of IL-1a, the authors need to repeat these experiments (namely, injecting A.m. and assessing IL-1a expression by leukocytes (and ideally also mesenchymal cells)) at d1 and d14, using a more comprehensive panel. Consider adding MHCII, CD64, Siglec F and CD24 to help differentiate between DC, MF, eosinophils and monocytes. CD45+ vs CD45- could be used as a minimum to assess the expression of IL-1a on leukocytes vs. stroma.

      OPTIONAL: The mechanism could be better defined using bone marrow chimeras to assess the different contribution of TLR2/4 signalling and IL-1R signalling on the hematopoietic vs. mesenchymal cell compartments.

      3. From these experiments, it is difficult to fully rule out a contribution from the adaptive immune system to the splenomegaly phenotype due to the marked difference in the size of BALB/c and MSTRG spleens at steady state. The authors should show the differences in spleen weight and total cell number as a % increase from control. The number of HSPC should also be normalized per gram of tissue weight or represented as a fold change compared to the relevant control groups.

      4. When using fluorescent imaging to compare the abundance of HSPC and other cell populations in the spleen, the authors should provide absolute quantification from multiple FOV and multiple mice.

      5. Finally, although the experiments are adequately replicated, the stats are not always appropriate. For example, a t-test shouldn't be used when there are >2 groups, or for a time course. This needs to be amended.
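
      To illustrate the statistical point in comment 5, the sketch below computes a one-way ANOVA F statistic for three groups and shows how the chance of at least one false positive grows when several pairwise t-tests are run instead. The spleen weights are hypothetical numbers, not data from the manuscript, and the familywise estimate assumes independent tests, which is a simplification.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

# Hypothetical spleen weights (mg) for three groups.
control = [90.0, 95.0, 100.0]
day_14 = [240.0, 260.0, 250.0]
day_56 = [150.0, 160.0, 155.0]

f_stat, df_b, df_w = one_way_anova_f([control, day_14, day_56])
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
print(f"Familywise error for 3 pairwise t-tests: {familywise_error(3):.3f}")
```

      The F statistic still has to be converted to a p-value using the F distribution (for example with `scipy.stats.f_oneway`), and a post-hoc test is needed to identify which groups differ. A time course with repeated measurements on the same animals would call for a repeated-measures design instead.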

      Minor comments

      • Line 82-83: I'm fairly certain monocytes and inflammatory Ly6Chi cells are the same thing.
      • Line 83-84: "IL-1a is crucial for sustaining inflammatory responses, recruiting myeloid cells to infected tissue and inducing hematopoietic stem and progenitor cell (HSPC) mobilization and expansion both in vitro and in vivo" - I don't believe IL-1a has been shown to be crucial for either, even if it has been shown to play a role. If I am mistaken, please reference with a manuscript showing relevant phenotypes using KO mice.
      • Line 214: "Thus, we decided to use 200ug of lysate for the rest of all experiments." - is this what was used for Figures 1A-C? This is not mentioned anywhere.
      • Line 227: "containing both non-hematopoietic cells and immature HSPCs" Please reference Fig. 1H here. Otherwise, it is unclear how you identified the "HSPC and other cell types" in Fig. 1G.
      • Figure S2A is described in text before Supp 1I-O and Fig S1H is not referenced in text at all.
      • It would be interesting to include what happens to hepatomegaly in MSTRG, Tlr2;4-/- and MyD88;Trif-/- mice.
      • Please define WBM. Presumably whole bone marrow?
      • Notably, CCL2 is increased in spleen lysate, BM lysate and serum. Given its role in myeloid cell mobilization from the BM, I would expect its role in the phenotype described here to at least be discussed.
      • HSPC LT gate includes MPP1, and should be labelled as such.

      Significance

      General assessment: The manuscript provided by Wang et al. describes, for the first time, a prolonged anaemia and hepatosplenomegaly with late-onset extramedullary hematopoiesis following a single systemic injection of A.m. lysate. The EMH phenotype appears robust and the data implicating TLR-signalling and IL-1a production are compelling. The work has clinical relevance as it increases our understanding of the factors driving EMH.

      There are two key limitations that let this study down. Firstly, the lack of depth in the flow cytometry panel used for immunophenotyping means it is not at all clear which cell types are producing IL-1a. Secondly, the authors use an enormous dose of bacterial lysate that is well above physiological levels, even following a loss of barrier integrity (e.g., in patients with IBD). This makes me question the biological relevance of the study, particularly with respect to Akkermansia translocation.

      Advance: With some improvement, this study will advance the field in general. Previous work has looked at EMH following LPS injection or live E. coli infection; however, the authors are able to demonstrate a distinct Akkermansia-specific effect that differs from that of LPS and from that of membrane components of a different Gram-negative bacterium, B. theta. The advance implicates IL-1a in the modulation of EMH for the first time, providing some mechanistic insight into this phenomenon.

      Audience: This work will likely be of interest to basic researchers interested in EMH. It may also be of interest to clinical researchers of pathologies where EMH is a known complication, such as rheumatoid arthritis and cancer. The impact of the work will depend on whether or not EMH contributes to pathogenesis, or is an epiphenomenon. To my knowledge, this has not been fully established, although this is not my area of research.

      I am a basic researcher with expertise in immunology focused on host-microbe interactions, both within the intestine and at distal tissues. I have knowledge of BM hematopoiesis and the microbial factors that influence it, although my knowledge of extramedullary hematopoiesis is limited.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript presents a detailed numerical model of blood flow in a region of the zebrafish vasculature.

      The results section is quite intense and detailed, and it is difficult to understand what the authors are after. I think a rewrite would be beneficial. The authors present simulations for a wild type and a couple of phenotypes. For each of these they speculate on the possible adaptation mechanism leading to the discussed phenotype, such as preservation of constant wall shear stress. However, the comparison between experiments and numerical simulations is really elusive, as are the conclusions on those mechanisms. Overall, we suggest a rewrite with a clearer organisation so that the reader is not overwhelmed with unnecessary details.

      We thank the reviewer for the advice on the general writing standard and data organization. We have reanalyzed experiment data and interpreted the findings more conservatively for application into the simulation models. As a result, some conclusions to the results sections have changed. Accordingly, we have done a major revision of the entire Results, Discussion and Models and Methods sections in the paper to articulate these reinterpretations while removing superfluous details that may obfuscate the data.

      It is not always clear what information from the experiments is used in the simulations on top of the anatomy. Our understanding is that the pressure boundary conditions are set to match the red blood cell velocity observed in experiments. Is this always the case for the three phenotypes, and for which vessels?

      We thank the reviewer for the question. Only WT and Marcksl1 KO have been matched for peak velocities in the CA, CV and ISVs between experiments and simulations. WT results were compared to both the experimental reference of 27 embryos in Table 3 and the current experiment pool of WT (5 embryos) in Table 6. Marcksl1 KO simulation models 1, 2 and 3 were compared against the average level seen in the low and moderate perfusion Marcksl1 KO phenotypes (8 embryos) from the experiment (Table 5 and Table 6). Additionally, we have matched the RBC hematocrit level in the CA of the WT model to the WT experimental data from the reference cited in Table 3.

      In addition to the velocity comparisons, we now use the experimentally observed trend of decreased flow rate in the CA of Marcksl1 KO experiment data to assess the model boundary conditions amongst Marcksl1 KO models 1, 2 and 3 that best reflect the experimental observations:

      Page 11, lines 1 to 20

      The Marcksl1 OE model cannot be matched because we do not have experimental data for it. The same goes for PlxnD1, for which we have no experimental flow data. These two networks represent more conceptual discussions, particularly in the PlxnD1 case, as we have explicitly stated in the new discussion section:

      Page 15, lines 24 to 34

      There are about 7 inlets and outlets where to impose pressure boundary conditions. Can the author comment on the uniqueness of this problem?

      Can different combinations of pressure boundary conditions lead to the same result? At how many points/vessels is the measured velocity matched?

      We thank the reviewer for this insightful concern. Indeed, the uniqueness of the flow and pressure fields can be a problem without careful consideration. We have tried to address this to some extent: because the CA and CV are connected by the ISV and DLAV network, matching the flow velocity in all regions means that the pressure distribution ought to be unique to the particular setting we employed.

      As shown in table 3, average systolic peak flow velocities in the entire CA and CV encompassing the 5 ISV segment domain is matched between the simulation and the population-averaged experimental data from the experimental reference (27 fish sampled in the cited reference) for the same regions in WT network. Average systolic peak flow velocities for the 10 ISVs in the simulation were matched against WT experiment population-averaged systolic peak flow velocities in arterial and venous ISVs in the same caudal region.

      Additionally, we also compared the flow velocities to the experiment conducted within this study (5 WT, and embryos). This comparison data is shown in Table 6. Admittedly the discrepancy was large for CV and ISVs regions likely due to a smaller data set sampled in this study and biological variations that happen from one experiment to another. We have acknowledged this deficiency in the revised discussion section:

      Page 15, lines 3 to 9

      The argument that similar beating frequency in the WT and GATA1 MO suggest pressure does not change is not clear. If the heart was a volumetric pump it would impose the same flow rate, not the same pressure. It would be more useful to measure the cardiac output in terms of flow rate in the Dorsal Aorta. Previous measurements by Vermot suggested the latter would not change much in gata1 MO. It could be that the cardiac output is the same but the vasculature network is different in a way that the shear stress remain the same. It does not look like this was checked by the authors.

      We thank the reviewer for this insight. In accordance with the reviewer’s suspicion, we have estimated the flow rates in the CA of gata1 MO injected embryos and found the level to be similar to WT. This supports the reviewer’s opinion that the heart rate similarity indicates cardiac output similarity and not arterial pressure similarity as we previously put forward. Furthermore, we have checked that the gata1 morphants do in fact present reduced ISV diameters. In light of this reinterpretation, we performed an additional zero hematocrit model (model 3 in section 2.1). We have consequently rewritten the entire section on how RBC hematocrit modulates hemodynamics in a microvascular network:

      Page 6, line 18 to page 8 line 10.

      Additionally, it would be useful to provide an effective viscosity for the different vessels, and an effective hydraulic impedance relating DP and Q, to interpret the results.

      We have followed the reviewer’s advice and have analyzed for vessel hydraulic impedance and effective viscosity in all the network models presented. This is included in the main figures and discussion. The vessel impedances are discussed for the various models in these following parts of the manuscript:

      Page 9, lines 20 to 29

      Page 11, lines 28 to 30

      Page 12, line 1 to page 13 line 10
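
      For readers unfamiliar with the two quantities requested above, a minimal sketch follows. It assumes steady Poiseuille flow in a straight vessel of circular cross-section, which is a simplification relative to the irregular lumens modelled in the manuscript; the function names and numerical values are illustrative assumptions, not taken from the paper.

```python
from math import pi

def hydraulic_impedance(delta_p_pa, flow_m3_s):
    """Impedance Z = DP / Q, in Pa*s/m^3."""
    return delta_p_pa / flow_m3_s

def effective_viscosity(delta_p_pa, flow_m3_s, diameter_m, length_m):
    """Viscosity that reproduces the observed DP and Q under Poiseuille's law.

    Poiseuille: Q = pi * DP * d^4 / (128 * mu * L), solved here for mu.
    """
    return pi * delta_p_pa * diameter_m ** 4 / (128.0 * flow_m3_s * length_m)

# Hypothetical vessel: 8 um diameter, 200 um long, 50 Pa drop, 2e-14 m^3/s flow.
z = hydraulic_impedance(50.0, 2.0e-14)
mu_eff = effective_viscosity(50.0, 2.0e-14, 8.0e-6, 200.0e-6)
print(f"Z = {z:.3e} Pa*s/m^3, effective viscosity = {mu_eff:.3e} Pa*s")
```

      Because RBCs raise the apparent viscosity of flow in narrow vessels, comparing the effective viscosity between models with and without RBCs gives a compact way to read the effect of hematocrit on each vessel.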

      Is the hydraulic impedance of the vessels kept constant in the smooth-geometry model? This needs clarification

      The SGM diameters have been determined based on geometric averages and not impedance equivalency. We did this because the impedance is not known until the CFD is performed for the WT network: without a pressure distribution (which cannot be determined experimentally) we cannot calculate vessel impedance, since only flow can be measured and both flow and pressure are required to calculate impedance. Our intention with the SGM is to highlight how geometric averaging of morphological characteristics leads to incorrect flow and stress predictions. However, we understand the reviewer’s concern and have revised the entire section of the SGM results. We now discuss three SGM models with varying degrees of geometry simplification. SGM1 in the revised manuscript matches the WT network impedance in the ISVs by including both the axial variation in lumen diameter of the WT network and the elliptical fit representation of the cross-sectional skewness seen in WT ISV lumens. SGM2 represents the axial variation but not the luminal skewness, and SGM3 has only geometric average similarity to WT ISVs. The new findings and discussion can be found in the revised manuscript here:

      Page 8, line 19 to page 9 line 36.
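
      The reason an averaged diameter does not preserve impedance can be seen from a short sketch. It treats a vessel as cylindrical segments in series under Poiseuille's law, which is an assumption made here for illustration and not the authors' CFD method; the diameters are hypothetical, and the arithmetic mean stands in for the averaging used in the manuscript.

```python
from math import pi

def poiseuille_resistance(diameters_m, segment_length_m, viscosity_pa_s):
    """Total resistance of equal-length cylindrical segments in series."""
    return sum(
        128.0 * viscosity_pa_s * segment_length_m / (pi * d ** 4)
        for d in diameters_m
    )

# Hypothetical lumen diameters along one vessel, in metres.
varying = [4e-6, 6e-6, 8e-6, 6e-6, 4e-6]
mean_diameter = sum(varying) / len(varying)
uniform = [mean_diameter] * len(varying)

r_varying = poiseuille_resistance(varying, 20e-6, 0.003)
r_uniform = poiseuille_resistance(uniform, 20e-6, 0.003)
ratio = r_varying / r_uniform
print(f"Resistance with axial variation / uniform mean diameter: {ratio:.2f}")
```

      Resistance scales with the inverse fourth power of diameter, so the narrow segments dominate and the vessel with axial variation has a higher resistance than a uniform vessel of the same mean diameter. For a fixed pressure drop, the averaged geometry would therefore overestimate flow.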

      As mentioned by the authors they propose a very complex and time expensive simulation. However the results they report are kind of intuitive. Given the availability of the experimental results, would it be useful to use a simpler red blood cell model in the future, to make their simulation more practical? Or clarify when such demanding simulations can add something new?

      We agree that how intuitive the results are depends on the expertise of the investigator. The boundary condition selection is intuitive from the experimental findings, but key data such as pressures in the network cannot be measured. Furthermore, population-averaged flow data do not always match the flow-to-geometry situations that vary from sample to sample, as demonstrated by the high margin of prediction discrepancy for flow velocities in Table 6. We have discussed these challenges and our recommendations for improvement in the Discussion section:

      Page 15, lines 3 to 9

      Page 15, lines 35 to 40

      Page 16, lines 12 to 15

      On the topic of RBC model simplification, we agree with the reviewer that our work suggests the methodology would benefit from a further coarse-graining approach to the RBC phase. Accordingly, we discussed the possibility of using a low-dimensional RBC model already published in literature:

      Page 14, lines 13 to 17

      The authors should check their references as this is not the first time work has been done on the topic. Would be good to have a check in the work of Freund JB and colleagues, as well as Dickinson and colleagues and Franco and colleagues to discuss how the work compares. There may be interesting work in modelling cardiac flow forces in the embryo too.

      Thank you for referring us to other publications that are related to our study. To our knowledge, and after performing a publication search on these authors, we find that although Dickinson and colleagues performed experiments to examine the effects of perturbed blood flow on vessel remodelling (Udan et al., 2013), they did not perform any numerical modelling to calculate hemodynamic forces such as WSS and luminal pressure. Instead, changes in the vessel morphogenetic process were only correlated with blood flow velocity. In our study, we attempt to quantitatively correlate WSS and pressure distributions within a vascular network. Franco and colleagues (Bernabeu et al., 2014) developed PolNet to model haemodynamic forces in the mouse retina model of angiogenesis. From what we understand, PolNet differs from our 3D CFD model in that 1) red blood cells are not incorporated in their model and, as such, blood viscosity is modelled using a shear-rate dependent formulation and not through red blood cell hematocrit, 2) cross sections of blood vessels are assumed to be circular and therefore have no irregularity and 3) live imaging of blood flow is difficult in the mouse retina, which prevents accurate boundary conditions for the model.

      We have included the reference to work of Franco and colleagues:

      Page 14, line 28 to line 31

      Page 9, lines 12 to 14

      Freund JB indeed has had extensive work on RBC and cellular flow in microvessels. We have included a reference of his work in:

      Page 14, lines 22 to 25.

      Reviewer #1 (Significance (Required)):

      The authors discuss the applicability of a detailed numerical model of blood flow in a region of the zebrafish vasculature.

      We are not experts in the lattice Boltzmann method used here, but the results are what would be expected from a physical standpoint, and together with the information from the method section, we do not have major concerns about the numerics.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: The authors report corroborating numerical-experimental studies on the relationship between morphological alterations (e.g. vessel lumen dilation/constriction, network mispatterning) and hemodynamical changes (e.g. variation in flow rate, pressure, wall shear stress) in the vascular network of zebrafish trunk circulation. Various physiological or pathological adaptation scenarios were proposed and tested, with a range of simulation and experiment models. Where I found it a solid piece of work supported by abundant data, certain aspects need to be clarified/enhanced to improve the scientific rigor and potential impact of the manuscript. Below are my detailed comments in the hope of helping the authors improve the manuscript's quality.

      Major comments:

      1. Cellular blood flow in vascular networks has been extensively studied in recent years by existing computational models (some of which were published open-source) with similar methods and features to the one proposed by the present work. Can the authors be more explicit about the original contributions of the current model, and provide evidence accordingly (e.g. Github repository or code resources)?

      The RBC model is essentially the model developed by Fedosov and colleagues (Fedosov, et al., 2010). Likewise, the LBM solver for fluid flow calculation is not novel. Following the reviewer’s advice, we have removed the details of these non-novel aspects of the methodology and placed them in sections E and F of the supplementary material instead. The new Models and Methods section now shows condensed descriptions of the three numerical solvers used and adds a discussion of a grid independence matrix:

      Page 17, line 8 to page 20, line 33.

      Crucial details for the simulation setup and model configuration are missing. What were the exact boundary conditions (e.g. inlet and outlet pressures) and initial conditions (e.g. feeding hematocrit of RBCs), and how the numerical-experimental validation process of "to match the velocities of various segments of the network by iteratively altering the pressure inputs ..." as stated on page 13 (lines 1-2) was performed for simulations in this work?

      We apologize for the vagueness of our description on how numerical to experimental validations were performed. As replied to reviewer 1 for a similar clarification, we have indicated in Table 3 how average systolic peak flow velocities in the entire CA and CV encompassing the 5 ISV segment domain were matched between the simulation and the population-averaged experimental data for the same regions in WT network. Average systolic peak flow velocities for the 10 ISVs in the simulation were matched against WT experiment population-averaged systolic peak flow velocities in arterial and venous ISVs in the same caudal region.

      With regard to what iterative alterations of pressure inputs mean, we monitored the average systolic peak velocities and hematocrit levels in the CA, CV and ISVs at intervals of 5 cardiac cycles before manually correcting the pressure input levels to better match the experimental averages of systolic peak velocity in these vessels. Since we are using population-averaged flow data, we do not expect their levels to match the levels in a particular fish-specific geometry, and the degree of discrepancy between experiment averages and the model predictions of systolic velocities can be large (Table 6). Admittedly, this is one of the weaknesses of our approach, and this limitation is stated in the Discussion section:

      Page 15, lines 3 to 9

      As RBC flow typically requires roughly 5 cardiac cycles to become fully developed, this process of iterative correction typically takes place over 10 to 20 cardiac cycles. We understand that validation may be a subject of keen interest to readers, hence we have now briefly described the solution initialization and flow development protocol in our modeling approach here:

      Page 6, lines 5 to 8
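
      The correction loop described above can be sketched as follows. This is only an illustrative, automated analogue of the manual procedure: the function names, the proportional update rule, the 5% tolerance and the toy velocity model are all assumptions made for this sketch and are not taken from the manuscript, where each "simulation" step would be a block of 5 cardiac cycles of the cell-resolved model.

```python
def calibrate_pressure(simulate_velocity, target_velocity, pressure_drop,
                       tolerance=0.05, max_rounds=10):
    """Rescale a pressure input until the simulated velocity matches a target.

    simulate_velocity: callable returning the average systolic peak velocity
                       for a given pressure drop (hypothetical stand-in for
                       one monitoring interval of the full simulation).
    """
    history = []
    for _ in range(max_rounds):
        velocity = simulate_velocity(pressure_drop)
        history.append((pressure_drop, velocity))
        relative_error = abs(velocity - target_velocity) / target_velocity
        if relative_error <= tolerance:
            break
        # Proportional correction: raise the pressure if flow is too slow,
        # lower it if flow is too fast.
        pressure_drop *= target_velocity / velocity
    return pressure_drop, history


def toy_network(pressure_drop, gain=2.0, exponent=0.8):
    """Hypothetical, mildly nonlinear pressure-velocity relation (arbitrary units)."""
    return gain * pressure_drop ** exponent


calibrated, steps = calibrate_pressure(toy_network, target_velocity=3.0,
                                       pressure_drop=1.0)
print(f"calibrated pressure drop: {calibrated:.3f} after {len(steps)} evaluations")
```

      In a network with several inlets and outlets the same idea would have to be applied to several pressures at once, which is where the question of uniqueness raised by reviewer 1 becomes relevant.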

      What lattice resolution was used for the flow solver and was the RBC membrane mesh chosen accordingly? Was there any sensitivity analysis (regarding pressure input) or grid-independence study (regarding lattice resolution)?

      We originally decided on the grid (∆X) and time (∆T) discretization resolutions (0.5 µm and 0.5 µs) based on the acceptable computing turnaround time for each model within our scale of resources. We have now included a section on the grid independence matrix in Models and Methods:

      Page 19, line 20 to page 20, line 33

      Details of the statistical tests (type of tests used, assessment of data normality, sample size etc.) should be given in the figure caption where applicable (e.g. Fig. 3C, Figs. 7-9).

      We apologize for the lack of clarity. All statistical tests used have now been mentioned at least once in each section of results and also in Figure captions wherever significance bars are displayed.

      The regression models should also be used with caution, e.g. in Fig. 4B, why should data from two different fish types, namely Gata1 MO and WT, be grouped to fit a linear regression model?

      We understand the reviewer’s concern that two population data sets should not be carelessly pooled together for regression analysis without adequate justification. In this case we are using gata1 morpholino injection as a means to alter hematocrit level. To the best of our knowledge, there is no reported side effect; only hematocrit, and possibly the hemodynamic and morphological responses related to hematocrit level, should be affected. Moreover, we had mislabelled the companion set to the gata1 morpholino as WT; the data are in fact from control morphants and not WT. This change has been applied to the Fig. 3 graphs, Table 4 and the results section:

      Page 7, lines 3 to 16

      Finally, we want to generate a continuous range of hematocrit levels for embryos of the same developmental age. In this regard, we think that, within the scope of our intentions and given the well-accepted use of gata1 morpholino as a hematocrit reduction protocol, it is reasonable to pool the two data sets together for regression analysis.

      4. I found the data presented in Fig. 7 insufficient to confidently exclude the numerical models 2, 3 but favor model 1 as the adaptation scenario for the Marcksl1KO case. The first question is, how are the threshold RBC perfusion levels determined to categorize the experimented Marcksl1KO fishes into four groups, i.e. "high", "moderate", "low", "zero"? The authors also need to justify why the "high", "moderate", "low" groups can be mapped to the three modelling scenarios (namely models 1, 2, 3): is it just because of "a qualitative match with the experimental trend of ascending CA blood velocity" (Fig. 7F)?

      We thank the reviewer for his interpretation of our results. Firstly, we apologize for the confusion, but we are not trying to map simulation models 1, 2 and 3 to the high, moderate and low groups respectively in Fig. 7. The high, moderate and low categorizations of experimental Marcksl1 KO phenotypes are based on RBC flux levels observed experimentally. We are trying to ascertain which Marcksl1 KO phenotype models 1, 2 and 3 fit, if they fit the experimental trend at all.

      Second, in Fig. 7C, it is shown that no significant difference exists between the "high" group and the WT in their average ISV diameter, then what is defining that group as Marcksl1KO type ?

      We apologize for the confusion generated. The high flow phenotype is similar to WT flow, and the diameter is also similar to WT. In Marcksl1 KO mutants we do not always see a clear phenotype, and a range is often observed from mutant to mutant. Hence the high group is essentially morphometrically and hemodynamically similar to WT; the only reason we know it is a mutant is that we have genotyped the zebrafish (marcksl1a-/-;marcksl1b1-/-).

      Third, a central assumption here is using heart rate as a measure of the pressure drop in different fish individuals (Fig. 7D). Can't two fishes with similar heart rate have distinct pressure drops in the trunk due to difference in network architecture and topology, vice versa?

      We agree with the reviewer’s opinion and now feel that our initial proposition was naïve. After addressing the interpretation of heart rate similarity in the gata1 morphants with more convincing CA flow rate estimations, we now believe that heart rates might not be useful indicators of flow or pressure levels in the network. Instead, cardiac output in the form of CA flow rate, as reviewer 1 has suggested, might be a better indicator. The reanalysis has overturned the earlier interpretation: based on the flow rate estimation for the CA, Marcksl1 KO networks have reduced blood flow rates in the CA.

      Page 11, lines 9 to 20

      This finding has been incorporated into the consideration of flow adaptation scenarios predicted by the simulation models accordingly in the revised manuscript:

      Page 12, line 1 to page 13, line 10

      Fourth, the authors should explain why a power-law fit (note that it is not "exponential" as stated on page 10, line 3) should be adopted for the regression analysis in Figs. 7E-v,vi (a useful reference may be Joseph et al. eLife 2019: 10.7554/eLife.45077).

      We thank the reviewer for the useful reference and apologize for the careless mislabeling of the regression curve used. This figure has been redone, and a linear regression is used instead, which does not attempt to imply any physical law as a power or exponential fit would.

      Change made: Fig. 7C

      Minor comments:

      1. The state of art of cell-resolved blood flow models employed to simulate microcirculatory hemodynamics is not accurately described in the introduction (page 4). More recent works should be cited and critically reviewed to present a fair view on the novelty of the computational model developed herein.

      We apologize that the models were mentioned only in passing. However, the need for brevity in the introduction somewhat limits their expansion. We have instead given a more direct discussion of similar studies and their relevance to our present work in the Discussion section:

      Page 14, lines 13 to 31

      It is unclear what "realistic representation of local topologies in the network" (page 7, lines 28-31) means as a claim of novelty. If it means vessel "diameter variation", this geometric feature has been modeled by the works the author referenced (namely Roustaei et al. 2022, Zhou et al. 2021). If it means something else, for example, unsmooth or non-circular vessel surface (or "irregularity of the local endothelium surface" as mentioned on page 5, line 2), then strangely the effects of such features are actually not described in the manuscript.

      We apologize for not meeting the expectation of novelty as claimed. We see value in the SGM study matrix and have now generated data on three SGM scenarios. SGM1 in the revised manuscript matches WT network impedance in the ISVs by including both the axial variation in lumen diameter of the WT network and the elliptical fit representation of cross-sectional skewness seen in WT ISV lumens. SGM2 represents axial variation but not luminal skewness, and SGM3 has only geometric average similarity to WT ISVs. Essentially, the comparison between SGM1 and SGM2 highlights the role of luminal cross-sectional shape skewness, while the comparison between SGM2 and SGM3 highlights the role of axial variation in luminal diameter. With this new SGM data set, we think we can better qualify the aspiration of demonstrating how vessel shape “irregularities” can alter network hemodynamics. The new findings and discussion can be found in the revised manuscript here:

      Page 8, line 19 to page 9 line 36.

      Why should Fig. 8 contain data from Marcksl1KO model 2? The scenario underlying model 2 was rejected earlier in the manuscript (see point 6 above), and the Marcksl1KO model 2 data are not mentioned in the text when describing the results of Fig. 8, either.

      We have reanalyzed the experimental trend and rewritten the outcome of this results section. In summary, both model 1 and model 2 meet the trend of flow rate reduction (with respect to WT levels) in the CA observed in the experiment. Hence, the inclusion of model 2 is relevant to the WSS analysis. The changes pertaining to this can be found here:

      Page 11, line 9 to page 13 line 10.

      It is a dense article with loads of data, which is an advantage but only if appropriately streamlined. More subheadings should be considered, especially for section 2.3 (for which the current subsections appear mistaken, 2.3.1 followed by 2.4.2) The manuscript could also benefit from restructuring through optimal combination of simulation visualizations and quantitative analyses. For example, in Fig. 6, not all simulation snapshots are needed here (it is difficult to visually compare the changes between different cases), whereas some quantification in the form of histograms or boxplots will be handy for the readers to note the variation of WSS magnitudes and ranges.

      Thank you for the advice, we removed the unnecessary graphical plots and refer to simulation videos in supplementary data instead for such cases. The bad indexing of results subsections has been fixed, while new subsections have been made for better directional narrative to the paper. These changes are colored in red throughout the revised results section:

      Page 4, line 37 to page 13 line 39

      Related to point 8, the authors could also consider integrating or synthesizing the analyses for individual aISVs and vISVs presented in various figures. Current descriptions for the ISV data appear scattered with frequent exceptions to the summarized trends or relationships. Some minor formatting issues should also be addressed, e.g. the confusing color codes in Figs. 9D-i, E-i.

      Thank you for the advice, we have now pooled aISVs together into one group and vISVs into another, instead of discussing data trends on each of the 10 ISVs.

      The mispatterning case presented at the end of the results section (section "2.4.2") is interesting but appears loosely connected to the preceding contents. Also, it seems not even mentioned in the discussion section.

      We agree that the mispatterning case has been only tangentially relevant to the rest of the manuscript. We have linked the topic thematically by network alterations transforming network flows. It is also now included in the discussion section here:

      Page 15, lines 30 to 34

      Finally, apart from the effect of topological features on local blood flow, the authors should consider the global flow redistribution arising from the network structure (useful refs. include Chang et al. PLOS Computational Biology 2017: 10.1371/journal.pcbi.1005892; Meigel et al. Physical Review Letters 2019: 10.1103/PhysRevLett.123.228103; Schmid et al. eLife 2021: 10.7554/eLife.60208).

      Thank you for the additional references. These are solid pieces of work that have been added to the discussion here:

      Page 16, lines 3 to 10

      **Referees cross-commenting**

      This review report resonates with mine from an experimental perspective and I agree with all points made regarding issues of the current manuscript that the authors need to address with a revised version.

      Reviewer #2 (Significance (Required)):

      Significance: The particular merit of the work lies in its comprehensiveness of design and abundance of data, which will be of great interest to both the computational and experimental communities in this research field. However, some crucial details (especially with respect to the modelling aspects) are missing, thus hampering the scientific rigor and potential impact of the work. Furthermore, certain justifying statements appear speculative and inconclusive to explain the obtained data, especially regarding the effect of boundary conditions and systemic parameters. The citation of references (some not cited, some cited already but not properly discussed) also needs to be enhanced with engaging discussions to better bridge the findings of the current work (e.g. RBC partitioning in vascular network, effect of WSS on vasculature morphogenesis) with recent works on this research topic.

      References

      Fedosov DA, Caswell B, Karniadakis GE. 2010. A Multiscale Red Blood Cell Model with Accurate Mechanics, Rheology, and Dynamics. Biophys J 98:2215–2225. doi:10.1016/j.bpj.2010.02.002

      Freund JB, Goetz JG, Hill KL, Vermot J. 2012. Fluid flows and forces in development: functions, features and biophysical principles. Dev Camb Engl 139:1229–45. doi:10.1242/dev.073593

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript presents a detailed numerical model of blood flow in a region of the zebrafish vasculature.

      The results section is quite intense and detailed. It is difficult to understand what the authors are after. I think a rewrite would be beneficial. The authors present simulations for a wild type and a couple of phenotypes. For each of these they speculate on the possible adaptation mechanism leading to the discussed phenotype, such as preservation of constant wall shear stress. However, the comparison between experiments and numerical simulations is really elusive, as are the conclusions on those mechanisms. Overall we suggest a rewrite with clearer organisation so that the reader is not overwhelmed with unnecessary details.

      It is not always clear what information from the experiments is used in the simulations on top of the anatomy. Our understanding is that the pressure boundary conditions are set to match the red blood cell velocity observed in experiments. Is this always the case for the three phenotypes, and for which vessels? There are about 7 inlets and outlets where pressure boundary conditions are imposed. Can the authors comment on the uniqueness of this problem? Can different combinations of pressure boundary conditions lead to the same result? In how many points/vessels is the measured velocity matched?

      The argument that similar beating frequency in the WT and GATA1 MO suggests pressure does not change is not clear. If the heart was a volumetric pump it would impose the same flow rate, not the same pressure. It would be more useful to measure the cardiac output in terms of flow rate in the Dorsal Aorta. Previous measurements by Vermot suggested the latter would not change much in gata1 MO. It could be that the cardiac output is the same but the vasculature network is different in a way that the shear stress remains the same. It does not look like this was checked by the authors.

      Additionally, it would be useful to provide an effective viscosity for the different vessels, and an effective hydraulic impedance relating the pressure drop (ΔP) and the flow rate (Q), to interpret the results.
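
      One common way to define the two quantities requested here, assuming steady Poiseuille flow in a straight cylindrical vessel (a simplification introduced for this sketch, not a statement about the authors' model), is to take the impedance as the ratio of pressure drop to flow rate and to invert Poiseuille's law for the viscosity. The function names and the numerical values below are hypothetical.

```python
import math


def hydraulic_impedance(pressure_drop_pa, flow_rate_m3_s):
    """Effective hydraulic impedance Z = dP / Q of a vessel segment."""
    return pressure_drop_pa / flow_rate_m3_s


def effective_viscosity(pressure_drop_pa, flow_rate_m3_s, radius_m, length_m):
    """Invert Poiseuille's law Q = pi * dP * R^4 / (8 * mu * L) for mu."""
    return (math.pi * pressure_drop_pa * radius_m ** 4
            / (8.0 * flow_rate_m3_s * length_m))


# Hypothetical vessel segment: 5 um radius, 100 um long, viscosity 2 mPa.s.
radius, length, viscosity = 5e-6, 100e-6, 2e-3
pressure_drop = 50.0  # Pa, illustrative only
# Forward Poiseuille calculation, then recover the viscosity from dP and Q.
flow_rate = math.pi * pressure_drop * radius ** 4 / (8.0 * viscosity * length)
print(hydraulic_impedance(pressure_drop, flow_rate))
print(effective_viscosity(pressure_drop, flow_rate, radius, length))
```

      For a vessel carrying RBCs, the value recovered this way is an apparent viscosity that lumps together plasma viscosity, hematocrit and cell deformation effects.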

      Is the hydraulic impedance of the vessels kept constant in the smooth-geometry model? This needs clarification.

      As mentioned by the authors they propose a very complex and time expensive simulation. However the results they report are kind of intuitive. Given the availability of the experimental results, would it be useful to use a simpler red blood cell model in the future, to make their simulation more practical? Or clarify when such demanding simulations can add something new?

      The authors should check their references as this is not the first time work has been done on the topic. Would be good to have a check in the work of Freund JB and colleagues, as well as Dickinson and colleagues and Franco and colleagues to discuss how the work compares. There may be interesting work in modelling cardiac flow forces in the embryo too.

      Significance

      The authors discuss the applicability of a detailed numerical model of blood flow in a region of the zebrafish vasculature.

      We are not experts in the lattice Boltzmann method used here, but the results are what would be expected from a physical standpoint, and together with the information from the methods section, we do not have major concerns about the numerics.

    1. Author Response:

      The following is the authors' response to the original reviews.

      We would like to thank all Reviewers for their careful evaluation of our work. Below please find our responses and comments.

      Reviewer #1 (Recommendations For The Authors):

      1) The detection of cell-released GLP-1 is addressed in an indirect, averaged way in Fig. 2 - Supplement 1. This question seems like a good opportunity for an antagonist experiment (Exendin-9), which presumably would require much lower concentrations than those used to antagonize a saturating dose of GLP-1. It would also be much more convincing if GLPLight1 could be used to detect stimulated release of GLP-1 from the GLUTag cells.

      We tried multiple times to acutely stimulate GLUTag cells using Forskolin and IBMX, but unfortunately we did not observe any robust fluorescence increase of GLPLight1. The only observation that was consistent was the higher baseline fluorescence of GLPLight1, and the reduced maximal response to saturating GLP-1 when GLPLight1 expressing HEK cells were cultured overnight with GLUTag cells. We considered this assay to be at best qualitative and — despite the aforementioned attempts — could not determine quantitative values.

      2) The excitation-ratiometric response of the sensor, shown in Fig. 1D, is usually accompanied by strong pH-dependence of sensor function. It would be valuable to characterize this pH-dependence, using permeabilized cells in which the pH is changed; the ability of small (0.2-0.5 unit) pH changes to produce changes in fluorescence, as well as to affect the dynamic range of the sensor, should be characterized. This will prevent the misidentification of agents that affect cellular pH as having (for instance) an inhibitory effect on the binding of GLP-1 to GLPLight.

      The pH sensitivity of cpGFP-based sensors is a valid concern. However, considering that the cpGFP module from GLPLight1 is intracellular (and thus largely protected from potential extracellular pH changes) we assume that GLPLight1 signal should be robust in most in-vivo or cell-based assays. In fact we have previously characterized this for a similarly-built neuropeptide sensor (PMID: 35145320) and believe that this will be the case also for GLPLight1.

      3) The reported Kd for Exendin-9 is in the low nM range. Please explain the partial response at 1000x the concentration (including a discussion of the Kd of GLP-1 itself, as well as its off kinetics, and a comparison of this assay to the assays used previously).

      The partial response is due to the presence of 1 uM GLP-1 in the imaging buffer, which is in constant competition with Exendin-9 for binding to GLPLight1. Because GLP-1 has an affinity similar to that of Exendin-9 (see for example PMIDs: 34351033 and 21210113) and both are present at saturating concentration, we did expect to observe a partial response from GLPLight1. In this study, we did not exactly determine the on and off kinetics of GLP-1 and Exendin-9 on the GLPLight1 sensor due to technical challenges: to perform these experiments, we would need to set up a perfusion system where we could remove the unbound ligand and either wash off the bound ligand with buffer or compete it out with an antagonist. Unfortunately, we currently do not have access to such a setup.
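
      The competition argument can be illustrated with the standard equilibrium expression for two ligands competing for a single binding site. This is a minimal sketch under assumed conditions (equilibrium, one site, no cooperativity); the function name and the Kd and concentration values are hypothetical placeholders and are not measurements from this study.

```python
def agonist_occupancy(agonist_nM, kd_agonist_nM,
                      antagonist_nM=0.0, kd_antagonist_nM=1.0):
    """Fraction of sites bound by agonist with a competitive antagonist present.

    occupancy = (A/Kd_A) / (1 + A/Kd_A + B/Kd_B)
    """
    a = agonist_nM / kd_agonist_nM
    b = antagonist_nM / kd_antagonist_nM
    return a / (1.0 + a + b)


# Hypothetical numbers: both ligands with a Kd of 10 nM, agonist at 1 uM.
alone = agonist_occupancy(1000.0, 10.0)
competed = agonist_occupancy(1000.0, 10.0,
                             antagonist_nM=1000.0, kd_antagonist_nM=10.0)
print(f"agonist alone: {alone:.3f}, with equal antagonist: {competed:.3f}")
```

      With similar affinities, the agonist occupancy depends mainly on the concentration ratio of the two ligands, so an antagonist added on top of a saturating agonist reduces the signal only partially unless it is in large excess.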

      4) Are the turn-on kinetics in Fig. 2C limited by drug application or by association? Are the on-rates much slower for the lower concentrations used for Fig. 2C? This is important for knowing how fast responses are likely to be at the lower concentrations likely to be achieved by endogenous release.

      If we consider Fig 2B and 2C, we assumed the on-kinetics to be mostly driven by association since the ligand is expected to be homogeneously distributed.

      The on-rate kinetics are indeed slower when lower concentrations of GLP-1 are used, as shown in Figure 2b, where we observe a TauOn of 4.7 s with 10 uM GLP-1, and much slower kinetics when GLP-1 is applied at 1 uM, for example (Figure 3d). As a result, we chose to incubate the ligand with GLPLight1 expressing cells for at least 30 minutes before the measurement of the dose-response to be close to equilibrium.
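
      The concentration dependence of the on-kinetics can be sketched with a simple pseudo-first-order binding model, in which the observed rate is k_obs = k_on[L] + k_off and the time constant is its inverse. The model choice, the function name and the rate constants are assumptions made for illustration; they were not determined in this study.

```python
def tau_on_seconds(ligand_uM, k_on_per_uM_s, k_off_per_s):
    """Time constant of association for a one-step binding reaction.

    Assumes the ligand is in excess so that its concentration stays constant.
    """
    k_obs = k_on_per_uM_s * ligand_uM + k_off_per_s
    return 1.0 / k_obs


# Hypothetical rate constants, chosen only to show the trend.
K_ON = 0.02    # per uM per second
K_OFF = 0.001  # per second

for concentration in (10.0, 1.0, 0.1):
    tau = tau_on_seconds(concentration, K_ON, K_OFF)
    print(f"{concentration:5.1f} uM -> tau_on = {tau:7.1f} s")
```

      Under this model the time constant grows roughly in inverse proportion to ligand concentration until dissociation starts to dominate, which is consistent with the need for a long pre-incubation at low concentrations.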

      5) The parameters for the fitted dose-response curves in Fig 2C should be listed. The ~4x discrepancy between the dose-response in HEK-293 cells and neurons should be discussed. Are there known auxiliary subunits, dimerization, or lipid dependence that might account for this? It seems important to understand this if the sensors are to be used in an assay that may compare different systems.

      We added the EC50 values to Fig 2C as requested. We did not consider a 4x discrepancy to be significant, because the measurement error in the EC50 region is relatively high and this difference seemed to be within the error range. In fact, the 95% confidence intervals are 7.8 to 11.1 nM in neurons and 23.8 to 32.1 nM in HEK cells; if we consider the upper and lower boundaries of each, the difference drops to around 1-fold. We also performed a statistical test to compare the two fits (extra sum-of-squares F-test), which confirmed that the two fits were not significantly different (P value = 0.3736). Of course, the interaction partners and membrane composition are different in HEK cells and neurons and probably have an influence on the EC50 of GLPLight1, but their exact influence is unclear.
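
      The interval arithmetic quoted in this response can be reproduced directly from the two confidence intervals. The helper name below is hypothetical; the only inputs are the interval bounds stated above.

```python
NEURON_CI_NM = (7.8, 11.1)   # 95% CI of the EC50 in neurons
HEK_CI_NM = (23.8, 32.1)     # 95% CI of the EC50 in HEK cells


def ratio_range(lower_interval, higher_interval):
    """Smallest and largest EC50 ratio compatible with two intervals."""
    smallest = higher_interval[0] / lower_interval[1]
    largest = higher_interval[1] / lower_interval[0]
    return smallest, largest


smallest, largest = ratio_range(NEURON_CI_NM, HEK_CI_NM)
print(f"ratio between nearest bounds:  {smallest:.2f}")
print(f"ratio between farthest bounds: {largest:.2f}")
```

      The ratio between the nearest bounds is about 2.1, that is, an increase of roughly 1-fold over the neuronal value, while the ratio between the farthest bounds is about 4.1.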

      6) It seems surprising that removal of the endogenous N-terminal secretory sequence is actually helpful for membrane expression. Do the authors have any suggested explanation for this?

      GLPLight1 contains an N-terminal hemagglutinin (HA) secretory motif. The hmGLP1R sequence that we chose also contained an endogenous secretory sequence, which most likely interfered with the membrane transport mechanism and resulted in lower sensor expression when both secretory sequences were present. We thus decided to keep the HA motif instead of the endogenous sequence, to remain consistent with other sensors created in-house.

      7) In Fig. 1, supplement 3, are the transient responses real? Do they occur with the control construct?

      While we have not measured the G-protein recruitment on GLPLight-ctr, we have often observed this phenomenon for various receptors and ligands. The transient responses are thus most likely an artifact after manual addition of the ligand possibly due to:

      -       Temperature difference

      -       Exposure of the plate to ambient light before resuming measurement (phosphorescence)

      -       Re-suspension of the cells affecting the proximity to the detector

      -       Other unknown variables

      If these responses were real, we would also expect them to be more sustained over time.

      8) Please include a sentence or two explaining the luminescence complementation assay, and a reference.

      We updated the results section of the manuscript with a section describing the luminescence complementation assay along with a reference:

      “Next, we compared the coupling of GLPLight1 and its parent receptor (WT GLP1R) to downstream signaling. We first measured the agonist-induced membrane recruitment of cytosolic mini-G proteins and β-arrestin-2 using a split nanoluciferase complementation assay (Dixon et al., 2016). In this assay both the sensor/receptor and the mini-G proteins contain part of a functional luciferase (smBit on the sensor/receptor and LgBit for Mini-G proteins) that becomes active only when these two partners are in close proximity (Wan et al., 2018).”

      Bravo to the authors for already making the sensor plasmids available at addgene.com. It would be helpful to include the plasmid IDs and/or a URL in the manuscript.

      We would like to thank Reviewer #1 for noticing this. We have updated the data availability section of the manuscript and added the AddGene plasmid numbers of the constructs generated in this study.

      Reviewer #2 (Recommendations For The Authors):

      1) There are some parts of the introduction that need clarification. For example, GLP1 is quoted as an anorexigenic peptide, however, that is probably only true for centrally-derived GLP1. There is no evidence that enteroendocrine-derived GLP1 (the major pool) is anorexigenic; it is likely to be substantially degraded by DPPIV before reaching the brain. In any case, the discovery of GLP1 was always one of glucose-dependent insulin secretion, with the brain system being described decades later. Overall, the intro needs to be slightly reframed. While the tools presented here are more useful for assessment of central GLP1-releasing circuitry, they are ultimately based upon GLP1R signaling that is much better validated in the periphery.

      We have slightly reframed the introduction accordingly.

      2) "The human GLP1R (hmGLP1R) is a prime target for drug screening and drug development efforts, since GLP-1 receptor agonists (GLP1RAs) are among the most effective and widely-used weight-loss drugs available to date (Shah and Vella, 2014)." GLP1R was for two decades the breakthrough drug for treatment of type 2 diabetes mellitus and correction of glucose tolerance as assessed through HbA1c. It is only through reporting on millions of patients receiving GLP1RA that the weight loss effects were noted, leading to Phase1-3 trials and eventual approval for obesity indication. Again, some slight reframing of the introduction is required here.

      Also for this point, we have slightly reframed the introduction accordingly.

      3) GLP1 was applied at a maximal dose of 10 uM, which is 10-fold higher than maximal. Can the authors confirm absence of cytotoxic effects of exposing to peptide at such concentration? Ex4 (9-39) at such concentrations is usually cytotoxic at least in primary tissue.

      We did not observe any obvious cytotoxic effect of GLP-1 at this concentration in HEK293T cells or Neurons.

      4) "As expected, GLPLight1 responded to both GLP1RAs with almost maximal activation, on par with GLP1 (Figure 2a)." Such a claim is difficult to interpret without concentration-response curves, since the maximal concentration of liraglutide and semaglutide might not have been achieved in these experiments.

      We agree that this statement is difficult to interpret without further clarification. We know from the literature that GLP-1, liraglutide and semaglutide all have very high affinity for the hmGLP1R (PMID: 31031702). We also showed that the GLPLight1 signal saturates at concentrations above 1 uM of GLP-1 (Figure 2C); we thus applied a 10x excess of all ligands and considered this signal as maximal.

      5) "These results indicate that GLPLight1 can serve as a direct readout of pharmacological drug action on the hmGLP1R with higher temporal resolution than previously available approaches, such as downstream signaling assays (Zhang et al., 2020)." Many investigators use cAMP imaging to investigate GLP1R signaling, which is arguably of similar spatiotemporal resolution, also with the advantage of FRET quantification in some cases (e.g. EpacVV). Direct GLP1R signaling can also be inferred using cell lines heterologously-expressing GLP1R. Thus, the advantage of the current probes is that they can be used to readout direct GLP1R activation in native cells/tissues where promiscuous class B binding might limit signaling measures or where endogenous GLP1 release needs to be investigated.

      We have edited the manuscript text accordingly.

      6) "State-of-the-art techniques for detecting endogenous GLP-1 or glucagon release in vitro from cultured cells or tissues consist of costly and time-consuming antibody-based assays (Kuhre et al., 2016) or analytical chemistry procedures (Amao et al., 2015)." Agreed, but non-specificity/cross-reactivity of such assays is more prohibitive/problematic (e.g. against glicentin).

      We have edited the introduction accordingly.

      7) The studies using co-culture of GLUTag and GLP1Light1-HEK293 cells, whilst interesting, are not entirely convincing in their current form. Firstly, co-culture could influence GLP1Light expression levels (can the authors label FLAG?). Secondly, specificity of the response is not tested e.g. by adding Ex4 (9-39). Thirdly, titration with GLUTag conditioned media is not performed.

      We partially addressed this issue in the answer to comment #1 from Reviewer #1. We previously performed a FLAG staining of GLPLight1 in the presence or absence of GLUTag cells and we did not notice any obvious difference. This is in line with the fact that GLPLight1 is signaling inert, so the presence of GLP1 should not interfere with the surface expression of the sensor. We also checked that HEK293T cells did not express high levels of GLP1R according to the BioGPS Cell Line Gene Expression Profiles (https://maayanlab.cloud/Harmonizome/gene_set/HEK293/BioGPS+Cell+Line+Gene+Expression+Profiles).

      We also tried to add GLUTag media after stimulation in bolus to GLPLight1 expressing cells and observed no response. This indicated that the “sniffer” cells must be present in close proximity to GLUTag cells for an extended period of time to observe any substantial difference in response, justifying our choice of experimental setup.

      8) "Given that our photocage was placed at the very N-terminus of photo-GLP1, our results show that this caging approach prevents the peptide's ability to activate GLP1R but, at the same time, preserves its ability to interact with the ECD." An alternative hypothesis is that PhotoGLP1 does activate GLP1R, but this is undetectable with the sensitivity of GLP1Light. PhotoGLP1 cAMP concentration-response assays are needed (uncaged versus cage) to properly characterize and validate the compound (as would be standard for any newly-described GLP1R peptide ligand).

      While we agree that there is a chance that Photo-GLP1 could activate GLP1R at high concentrations, we think that the characterization of Photo-GLP1 has to be determined by the end user directly with the technique of choice (GLPLight1 in our case) in order to get a reliable comparison of potency and efficacy. We modified the text accordingly to more accurately reflect the direct conclusions from our data, as follows:

      “our results show that this caging approach prevents the peptide's ability to activate GLPLight1”.

      9) "Surprisingly, GLPLight1 shows a fluorescent response in all three uncaged areas, while its fluorescence remained unaltered throughout the rest of the FOV, indicating high spatial localization of the response to GLP-1 (Figure 3f)." Why is this surprising?

      We agree that this result is, indeed, not surprising and would like to thank Reviewer #2 for spotting this mistake, which has now been corrected in the manuscript.

      10) The localized PhotoGLP1 experiments are interesting and show the utility of the ligand. There is however activation outside of the region of uncaging, which would argue against a pre-bound ECD mode of action. Possibly some PhotoGLP1 is pre-bound to the ECD, and some is freely diffusing? Alternatively, the scan area might be below the diffraction limit/accuracy of the microscope?

      We would like to thank Reviewer #2 for this comment and agree with their observation. There could be some free Photo-GLP1 that gets photo-activated and binds regions around the uncaging area (similar to what has been observed for Photo-OXB; PMID: 36481097). The activation around the uncaging area could also be due to lateral diffusion of the activated receptor on the membrane. There is also most likely some light diffraction at the uncaging area that could account for this phenomenon. To increase the spatial resolution, future studies could involve uncaging during sensor imaging via two-photon microscopy.

      11) What was the rationale for caging native GLP1, which is then susceptible to DPPIV-mediated degradation? Would the N-terminal cage and first 2 amino acids also not be cleaved by DPPIV, thus rendering the tool of limited in vivo application? Conversely, PhotoGLP1 provides a template for similar light-activated (stabilized) GLP1R agonists such as Ex4 or liraglutide.

      Thank you for making us aware of this (in vivo) limitation. We designed photoGLP1 as a tool for neurobiological experiments in the brain, where DPPIV expression would be low compared to peripheral organs (https://www.proteinatlas.org/ENSG00000197635-DPP4/tissue). We also envisage that the presence of the photocage would be enough to hinder binding to DPP4, which cleaves the first two amino acids. This hypothesis, however, was never tested experimentally, and we therefore acknowledge the limitation in the manuscript. We would furthermore like to thank the reviewer for their comment on additional photo-caged GLP1 agonists, which could be developed in future studies.

      12) It wasn't clear how GLP1Light could be used as a HTS screen for drug discovery? Surely, conventional systems (e.g. GLP1R + BAR/Ca2+/cAMP reporting) allow signal bias, an important component of GLP1RA action, to be assessed. Or could GLP1Light1 be used as a pre-screen to exclude any ligands that do not orthosterically bind GLP1R?

      We would like to thank Reviewer #2 for this comment and would like to offer some clarification. We indeed thought that GLPLight1 could be used as a first line of screening to exclude ligands that do not bind in the orthosteric pocket. It is also a rather flexible method as the fluorescence increase of those sensors can be monitored using various techniques/devices that are available in most labs (e.g. microscopy, plate reader, flow cytometry).

      13) Limitations of GLP1Light1 and PhotoGLP1 are not acknowledged in the discussion.

      We would like to thank Reviewer #2 for pointing out the lack of description of the limitations of these tools, which have now been added to the Discussion.

      14) Full characterization of PhotoGLP1 is missing, to include UV/Vis, Tr and HRMS.

      PhotoGLP1 was fully characterized by UV/Vis and HRMS, and all experimental and analytical data was uploaded as supplementary data when the manuscript was initially submitted for publication in eLife.

      Reviewer #3 (Recommendations For The Authors):

      1) The ~1000 fold lower EC50 for GLP1 of GLPLight1 compared with native GLP1R needs to be openly acknowledged as a major limitation of the sensor, as this will substantially reduce the types of experiment for which it will be useful. Because it needs 1000 times higher GLP1 levels than wild type GLP1R to be activated, it is unlikely, for example, to be useful for monitoring the dynamics of activation of native GLP1R in vivo. The claim that the sensor could be used for in vivo imaging for fibre photometry is therefore an exaggeration.

      We would like to first thank Reviewer #3 for this comment and to further provide some clarification. We recognize that the data presented in this manuscript might have been confusing when comparing the affinity of GLP1R (using cAMP) and GLPLight1 (using the fluorescence increase, because there is no coupling to cAMP). We believe that the low EC50 measured in the cAMP assay cannot accurately be compared to the GLPLight1 response because cAMP production is an enzymatically amplified process. To support this claim, we included another set of experiments in which we titrated agonist-induced recruitment of miniGs protein to GLP1R and found an EC50 of 3.8 nM for native GLP-1 using this assay (added as panel l in Figure 1 Supplement 3). We thus confirmed that the nature of the assay itself has a drastic influence on the measured EC50, and it is not unusual to observe a 100-fold difference in EC50 for the same receptor-ligand pair.

      We believe that the miniGs protein recruitment is a better comparison to GLPLight1 because it is not enzymatically amplified. This assay reveals that GLPLight1 has around 8-fold lower affinity for GLP1 than its parent receptor, which is in line with the EC50 loss observed previously for other GPCR-based sensors of this class. We are thus confident that GLPLight1 has the potential to be used in vivo under specific circumstances, specifically in brain tissue. We elaborated on this point in the Discussion part of the manuscript.
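
      The comparison above can be illustrated with a minimal sketch, assuming a simple Hill-type dose-response with a Hill coefficient of 1 (an assumption, not something stated in the response). The 3.8 nM value and the roughly 8-fold shift are taken from the response; the sensor EC50 below is derived from them purely for illustration and is not a reported measurement.

```python
def hill_response(concentration_nm, ec50_nm, hill_coefficient=1.0):
    """Fractional response (0..1) of a Hill-type dose-response curve."""
    c = concentration_nm ** hill_coefficient
    return c / (c + ec50_nm ** hill_coefficient)


# Value quoted in the response for the miniGs recruitment assay.
parent_ec50_nm = 3.8
# "Around 8-fold lower affinity" quoted in the response.
fold_shift = 8.0
# Illustrative only: derived from the two numbers above, not a reported EC50.
sensor_ec50_nm = parent_ec50_nm * fold_shift

for conc_nm in (1.0, 3.8, 30.4, 1000.0):
    parent = hill_response(conc_nm, parent_ec50_nm)
    sensor = hill_response(conc_nm, sensor_ec50_nm)
    print(f"{conc_nm:8.1f} nM  parent={parent:.2f}  sensor={sensor:.2f}")
```

      Under these assumptions, an 8-fold EC50 shift moves the curve along the concentration axis without changing its shape, so the two responses converge at saturating ligand concentrations.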

      2) Fig2 suppl 1 is described as demonstrating a reduced response of GLPLight1 to GLP-1 when HEK cells were cultured with GLUTag cells. However, it is speculation to conclude that this is because GLP1Light1 was partially pre-activated by endogenous GLP-1, without demonstrating the response of GLPLight1 before and after GLUTag cell stimulation. Unless additional data are generated, the presented data do not convincingly demonstrate that GLP1Light1 can detect GLP1 released from GLUTag cells.

      We would like to thank Reviewer #3 for this comment which has been addressed already in the replies to Comment#1 from Reviewer #1 and Reviewer #2.

      3) The authors should openly acknowledge that photo-uncaging the GLP1 probe might not be very helpful for monitoring the temporal dynamics of the GLP1-GLP1R interaction, because unless all the photocaged glp1 is released by the light stimulus, the activation of photo-released GLP1 will be slowed by the remaining caged GLP1, and the dynamics will be slower than for native GLP1. This makes it unsuitable for many temporal questions, although it might be useful to deliver GLP1 in a spatial restricted manner.

      We do agree that the biggest advantage of Photo-GLP1 is its ability to be activated in a very localized manner. We also agree that the presence of caged Photo-GLP1 will influence the binding of the uncaged GLP-1. Nevertheless, there is still an advantage of using Photo-GLP1 in some assays such as pharmacological activation on brain slices. In fact, we have shown for our Photo-OXB molecule that the perfusion of OXB was much slower at eliciting neuronal depolarization compared to uncaging of Photo-OXB (see PMID: 36481097). We think that this was mainly due to the slow diffusion kinetics of the peptide into the brain tissue. We also think that uncaging can provide a more controlled activation with varying laser power and uncaging duration.

      4) To claim (as currently in the discussion) that GLPLight1 has potential to be used for investigating the dynamics of endogenous GLP1, the authors would need to compare the dynamics of the GLP1Light sensor with wild type GLP1R. We do not know that its activation dynamics will reproduce native glp1r.

      We would like to thank Reviewer #3 for this comment and would like to offer some clarification. Since GLPLight1 does not couple to intracellular signaling, it was impossible to compare its activation kinetics to GLP1R WT using the same assay. However, we can offer a relative comparison since we know that GLPLight1 takes around 50 seconds to be activated using 1 µM GLP-1 (figure 2B) and that it takes a similar time for GLP1R to be activated in the miniG protein recruitment assay (Fig 1 Supplement 3) using 100 nM GLP-1. Considering that GLPLight1 has a lower affinity than the GLP1R (8-10x lower), we think that the activation kinetics of both the sensor and GLP1R are comparable.

      Additional comments:

      1) In fig 2A,B, it is not clear whether the trace shows a partial reversal of GLP1-triggered activation by Ex9, or Ex9-independent receptor desensitization. A control trace is required to show the kinetics of GLP1-triggered activation without the addition of Ex9.

      We would like to thank Reviewer #3 for this comment. We can exclude the possibility of Ex9-independent desensitization because GLPLight1 has been shown to be signaling inert to all G-proteins, Beta arrestin-2 and cAMP. Moreover, we have observed that the fluorescence signal was stable for more than 30 minutes for the GLP-1 titrations, even at high concentrations of ligand.

      2) It would be helpful if the pEC50 for WT GLP1 were also shown in table 1, for comparison with the GLP1 mutants.

      We would like to thank Reviewer #3 for this comment, and we have now added the respective pEC50 for WT GLP1 to Table 1.

      3) Fig2 suppl 1. The methods and analysis for this figure are inadequately explained. To show that the HEK-GLPLight1 cells are responding to GLP1 released from GLUTag cells, the GLPLight1 response needs to be shown before and after GLUTag cell stimulation with an agent that should trigger GLP-1 release.

      We would like to thank Reviewer #3 for this comment which has been partially addressed already in the replies to Comment#1 from Reviewer #1 and Reviewer #2.

      Since we did not observe any response to acute stimulation of GLUTag cells we considered the high glucose concentration present in the culture media being a stimulation agent for GLUTag cells, which has been previously reported (PMID: 17643200).

      4) Fig 3g and others: The end of the photo activation period needs to be represented correctly on the timeline. In 3g, the bar that should indicate when photoactivation was applied does not end at the zero time point (which is labelled as the time relative to photoactivation).

      We would like to thank Reviewer #3 for pointing this out. The shaded area representing the photo-activation has been matched accordingly.

      5) Discussion para 1: the authors claim their data show that ligand-induced activation of human GLP1R occurs more slowly than other similar GPCR sensors - they should give actual data to substantiate this claim, since the time course of glp1r activation has not been analysed and compared with other sensors in the manuscript.

      We added data to support this claim to the discussion: “As a reference, other previously-characterized class-A GPCR-based neuropeptide biosensors showed sub-second activation kinetics (Duffet et al., 2022a; Ino et al., 2022).”

      6) Methods: what wavelength was used for recording emission from GLP1Light1? The excitation wavelength is given, but I can't see the emission wavelength(s). In fig 1d, the excitation and emission spectra should be depicted in different colours/line properties, otherwise this figure is very confusing.

      We updated Figure 1d and changed the colors to improve data visualization. Regarding the missing wavelength, we would like to clarify that both wavelengths were already described in the Methods section as: “The excitation and emission spectra were measured at λem = 560 nm and λex = 470 nm, respectively, on a TECAN M200 Pro plate reader at 37 °C.” We would be happy to rewrite this paragraph if it remains unclear to the reader.

    1. Sites like 4chan and 8chan bill themselves as sites that support free-speech, in the sense that they don’t ban trolling and hateful speech, though they may remove some illegal content, like child pornography. One thing these sites do ban though, is spam. While much of spam is certainly legal, and a form of speech, this speech is restricted on these sites. If the chat boards filled up with spam, the users would find it boring and leave, so for practical reasons, these sites still moderate for spam (though they may allow some uses of ironic spam, copypasta)

      I think there is definitely an interesting point to be made if we should restrict certain parts of free-speech. Do we get rid of hate-speech and spam even when they are harmful? Does that contradict democracy or uphold it? I don't know that I can answer these questions as online free-speech would set precedent to a slew of things (online +in person) and requires much more knowledge than I can provide as a freshman in college lol.

    1. Author Response:

      Reviewer #1 (Public Review):

      This manuscript features a key technical advance in single-molecule force spectroscopy. The critical advance is to employ click chemistry (DBCO-cycloaddition) for making a stable covalent connection between a target biomacromolecule and solid support in place of conventional antigen-antibody binding. This tweak dramatically improves the mechanical stability of the pulling system such that the pulling/relaxation can be repeated up to a thousand times (the previous limit was a few hundred cycles at best). This improvement is broadly applicable to various molecular interactions and other types of single-molecule force spectroscopy, allowing for more statistically reliable force measurements. Another strength of this method is that all conjugation steps are chemically orthogonal (except for Spy-catcher conjugation to the termini of a target molecule) such that the probability of side reactions could be reduced.

      The reliability of kinetic and thermodynamic parameters obtained from single-molecule force spectroscopy depends on statistics, that is, the number of pulling measurements and their distribution. By extending the number of measurements, this robust method enables fundamental/critical statistical assessment of those parameters. That is, it is an important and interesting lesson from this study that ~200 repeats can yield statistically reasonable parameters.

      The authors carried out carefully designed optimization steps and inform readers of the critical aspects of each. The merit, quality, and rigor as a method-oriented manuscript are impressive. Overall, this is an excellent study.

      We appreciate the positive evaluation of our work. Additionally, the minor suggestions helped us improve our manuscript. Thank you!

      Reviewer #2 (Public Review):

      In this study, the authors have developed methods that allow for repeatedly unfolding and refolding a membrane protein using a magnetic tweezers setup. The goal is to extend the lifespan of the single-molecule construct and gather more data from the same tether under force. This is achieved through the use of a metal-free DBCO-azide click reaction that covalently attaches a DNA handle to a superparamagnetic bead, a traptavdin-dual biotin linkage that provides a strong connection between another DNA handle and the coverslip surface, and SpyTag-SpyCatcher association for covalent connection of the membrane protein to the two DNA handles.

      The method may offer a long lifetime for single-molecule linkage; however, it does not represent a significant technological advancement. These reactions are commonly used in the field of single-molecule manipulation studies. The use of multiple tags including biotin and digoxygenin to enhance the connection's mechanical stability has already been explored in previous DNA mechanics studies by multiple research labs. Additionally, conducting single-molecule manipulation experiments on a single DNA or protein tether for an extended period of time (hours or even days) has been documented by several research groups.

      One of the unique features of our work is the development of a robust single-molecule tweezer method that is applicable to membrane proteins, rather than simply another stable system. As now described in the rewritten Introduction, this is not straightforward because membrane reconstitution has to be taken into account. We believe that our work can overcome the bottleneck in membrane protein studies that arises when using single-molecule tweezer methods.

      To improve the delivery of the contextual information, we revised Introduction, Results, and Discussion. The first four paragraphs in the Introduction briefly review previous tweezer methods with an improved stability and delineate where our work is placed. In the first paragraph of the Results, we also briefly discussed how and why our DBCO tethering strategy differs from previous DBCO methods. In the first paragraph of the Discussion, we compared the previous methods regarding the stability improvement.

      Additionally, the revised manuscript now includes new findings – the full dissection of structural transitions of a helical membrane protein, the observation of hidden helix-coil transitions at a constant force, and the estimation of kinetic pre-exponential factors. We believe that the new findings provide important insights into membrane protein folding, in addition to the usefulness of our method itself for membrane protein studies. We extensively edited the main text and Methods accordingly. Relevant figures are Figures 6 and 7, Figure 6–figure supplements 1–3, and Figure 7–source data 1.

      Reviewer #3 (Public Review):

      The authors describe a method to tether proteins via DNA linkers in magnetic tweezers and apply it to a model membrane protein. The main novelty appears to be the use of DBCO click chemistry to covalently couple to the magnetic bead, which creates stable tethers for which the authors report up to >1000 force-extension cycles. Novel and stable attachment strategies are indeed important for force spectroscopy measurements, in particular for membrane proteins that are harder and therefore less studied in this regard than soluble proteins, and recording >1000 stretch and release cycles is an impressive achievement. Unfortunately, I feel that the current work falls short in some regards to exploring the full potential of the method, or at least does not provide sufficient information to fully assess the performance of the new method. Specific questions and points of attention are included below.

      We appreciate the positive evaluation. We were able to substantially improve our manuscript while preparing our responses to the comments. Thank you!

      - The main improvement appears to be the more stable and robust tethering approach, compared to previous methods. However, the stability is hard to evaluate from the data provided. The much more common way to test stability in the tweezers is to report lifetimes at constant force(s). Also, there are actually previous methods that report on covalent attachment, even working using DBCO. These papers should be compared.

      As shown in Figure 4E, we evaluated the robustness of our method in the way you suggested: a lifetime measurement at a constant force, specifically ~12 hours at 50 pN. Our tweezer approach established here is definitely the most robust method for membrane protein studies. Please refer to the section “Assessing robustness of our single-molecule tweezers” on page 7, line 31.

      We discussed the previous covalent methods for which quantitative data are presented in light of the system stability. Please refer to the first paragraph of Discussion. We also briefly discussed how and why our DBCO tethering strategy differs from previous DBCO methods, in the first paragraph of Results.

      - The authors use the attachment to the surface via two biotin-traptavidin linkages. How does the stability of this (double) bond compare to using a single biotin? Engineered streptavidin versions have been studied previously in the magnetic tweezers, again reporting lifetimes under constant force, which appears to be a relevant point of comparison.

      The papers in this comment showed that the tethering lifetimes of biotin-streptavidin variants were affected by the asymmetric bead anchoring point. However, the situation does not apply to our work as we do not anchor traptavidin to beads. Besides, the stability comparison between the single- and double-biotin systems is not the main point of our work, so we do not have the answer to the question. However, we cited the reference in the first paragraph of Discussion where we discuss the system stability.

      - Very long measurements of protein unfolding and refolding have been reported previously. Here, too, a comparison would be relevant.

      We briefly discussed the relevant previous works in the first paragraph of Discussion.

      In light of this previous work, the statement in the abstract "However, the weak molecular tethers used in the tweezers limit a long time, repetitive mechanical manipulation because of their force-induced bond breakage" seems a little dubious. I do not doubt that there is a need for new and better attachment chemistries, but I think it is important to be clear about what has been done already.

      The sentence is in the Abstract, so we also had to consider conciseness. By simply adding the phrase “used for the membrane protein studies”, we can place our work in a more appropriate context.

      On page 2, line 3: “…However, the weak molecular tethers used for the membrane protein studies have limited long-time, repetitive molecular transitions due to force-induced bond breakage…”

      - Page 5, line 99: If the PEG layer prevents any sticking of beads, how do the authors attach reference beads, which are typically used in magnetic tweezers to subtract drift?

      The PEG layer consists of biotin-PEG and methyl-PEG at a 1:27.5 molar ratio. As the reference beads are coated with streptavidin, they attach to the PEG layer through the regular biotin-streptavidin interaction. On page 19, line 7, you can refer to “…The polystyrene beads are attached to the PEG surface via biotin-streptavidin interaction. The beads are used as reference beads for the correction of microscope stage drifts…”

      - Figure 3 left me somewhat puzzled. It appears to suggest that the "no detergent/lipid" condition actually works best, since it provides functional "single-molecule conjugation" for two different DBCO concentrations and two different DNA handles, unlike any other condition. But how can you have a membrane protein without any detergent or lipid? This seems hard to believe.

      We explained this point on page 6, line 18:

      “…Indeed, the best condition was in the absence of any detergents or lipids (Figure 3; no detergents/lipids only during the conjugation step). This situation is possible because membrane proteins are sparsely tethered to the chamber surface, which kept them from aggregating. However, not using detergents or lipids means that the membrane proteins are definitely deformed from their native folds. Therefore, we sought an optimal solubilization condition for membrane proteins during the DBCO-azide conjugation step...”

      Figure 3 also seems to imply that the bicelle conditions never work. The schematic in Figure 1 is then fairly misleading since it implies that bicelles also work.

      The buffer conditions shown in Figure 3 are those ONLY during the DBCO-azide conjugation step. In this step, the bicelle conditions did not work. Therefore, after the conjugation in 0.5% DDM, the buffer was exchanged with a bicelle solution. This process is shown in Figure 2 and the finally assembled system is depicted in Figure 1.

      To clarify this point, we added the note “Buffer conditions only during the DBCO-azide conjugation step” just above the buffer conditions in Figure 3. The relevant exchange step is also described on page 6, line 31: “…Following a 1 h incubation of the beads in the single-molecule chamber at 25°C, unconjugated beads were washed, and the detergent micelles were exchanged with bicelles to reconstitute the lipid bilayer environment for membrane proteins…”

      - When it comes to investigating the unfolding and refolding of scTMHC2, it would be nice to see some traces also at a constant force. As the authors state themselves: magnetic tweezers have the advantage that they "enable constant low-force measurements" (page 8, line 189). Why not use this advantage? In particular, I would be curious to see constant force traces in the "helix coil transition zone". Can steps in the unfolding landscape be identified? Are there intermediates?

      Yes, please refer to Figure 6. We were able to dissect three distinct transitions from the fully unstructured state to the native state, including the helix-coil transitions. We also reconstructed the folding energy landscape using a deconvolution method.

      Please refer to the pertinent sections in the main text, which are titled “Structural transitions and folding energy landscape over extended time scales” and “Mechanistic dissection of folding transitions”.

      - Speaking of loading rates and forces: How were the forces calibrated? This seems to not be discussed.

      We wrote an additional section in Methods titled “Instrumentation of single-molecule magnetic tweezers”, where we discuss the force calibration. For the actual force calibration data, please see Figure 4–figure supplement 1A.

      On page 20, line 10: “…The mechanical force applied to a bead-tethered molecule was calibrated as a function of the magnet position using the formula F = k_B·T·L/δx² derived from the inverted pendulum model (ref. 96), where F is the applied force, k_B is the Boltzmann constant, T is the absolute temperature, L is the extension, and δx² is the magnitude of lateral fluctuations…”
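
      As a rough illustration of the calibration formula quoted above, the following sketch estimates the force from a trace of lateral bead positions. It assumes that the "magnitude of lateral fluctuations" is the variance of the lateral position about its mean; the function name, the synthetic trace, and all numerical values are hypothetical and are not taken from the manuscript.

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann constant in J/K


def calibrate_force(extension_m, lateral_positions_m, temperature_k=298.15):
    """Estimate the applied force (N) from F = k_B * T * L / <dx^2>.

    Assumption: <dx^2> is the variance of the lateral bead position.
    """
    positions = np.asarray(lateral_positions_m, dtype=float)
    variance = np.var(positions)  # fluctuations about the mean position
    return K_B * temperature_k * extension_m / variance


# Synthetic check with hypothetical numbers: generate lateral fluctuations
# whose variance corresponds to a known force, then recover that force.
rng = np.random.default_rng(0)
true_force_n = 10e-12      # 10 pN, hypothetical
extension_m = 1.0e-6       # 1 micrometre tether extension, hypothetical
sigma_m = np.sqrt(K_B * 298.15 * extension_m / true_force_n)
trace_m = rng.normal(0.0, sigma_m, size=200_000)

estimated_force_n = calibrate_force(extension_m, trace_m)
print(f"estimated force: {estimated_force_n * 1e12:.2f} pN")
```

      A real calibration would also need to account for camera blur, finite acquisition time, and the bead radius, none of which are modelled in this sketch.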

      And how were constant loading rates achieved? In Figure 4 it is stated that experiments are performed at "different pulling speeds". How is this possible? In AFM (and OT) one controls position and measures force. In MT, however, you set the force and the bead position is not directly controlled, so how is a given pulling speed ensured? It appears to me that the numbers indicated in Figures 4A and B are actually the speeds at which the magnets are moved. This is not "pulling speed" as it is usually defined in the AFM and OT literature. Even more confusing, moving the magnets at a constant speed would NOT correspond to a constant loading rate (which seems to be suggested in Figure 4A), given that the relationship between magnet positions and force is non-linear (in fact, it is approximately exponential in the configuration shown schematically in Figure 1).

      You are correct, so we changed “pulling speed” to “magnet speed” in the figure caption. The loading rates provided in the figure (with the notation <>) were average loading rates over 1–50 pN, intended as rough estimates. We had specified this in the caption as “average force-loading rate”. However, this can be misleading at a glance, so we deleted all the loading-rate values from the figure and caption.
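
      The reviewer's point about magnet speed versus loading rate can be sketched as follows, assuming a purely exponential force-distance relation F(d) = F0·exp(-d/d0). The functional form follows the reviewer's "approximately exponential" remark; the parameter values F0 and d0 and the function names are hypothetical and do not describe the instrument used in the study.

```python
import numpy as np


def force_from_magnet_distance(distance_mm, f0_pn=50.0, decay_mm=1.5):
    """Hypothetical exponential force law F(d) = F0 * exp(-d / d0), in pN."""
    return f0_pn * np.exp(-np.asarray(distance_mm, dtype=float) / decay_mm)


def loading_rate(distance_mm, magnet_speed_mm_s, f0_pn=50.0, decay_mm=1.5):
    """Instantaneous loading rate dF/dt (pN/s) for a magnet approaching at constant speed.

    Chain rule: dF/dt = (dF/dd) * (dd/dt), with dd/dt = -speed during approach,
    which gives dF/dt = F(d) * speed / d0.
    """
    force_pn = force_from_magnet_distance(distance_mm, f0_pn, decay_mm)
    return force_pn * magnet_speed_mm_s / decay_mm


# Magnet moving toward the sample at a constant 1 mm/s (hypothetical).
distances_mm = np.array([6.0, 4.0, 2.0, 0.0])
rates_pn_s = loading_rate(distances_mm, magnet_speed_mm_s=1.0)
for d, r in zip(distances_mm, rates_pn_s):
    print(f"distance {d:.1f} mm -> loading rate {r:.2f} pN/s")
```

      Under this assumed force law the loading rate is proportional to the force itself, so a constant magnet speed gives a loading rate that rises steadily as the magnet approaches, which is why a single average value can be misleading.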

      - Finally, when it comes to the analysis of errors, I am again puzzled. For the M270 beads used in this work, the bead-to-bead variation in force is about 10%. However, it will be constant for a given bead throughout the experiment. I would expect the apparent unfolding force to exhibit fluctuations from cycle to cycle for a given bead (due to its intrinsically stochastic nature), but also some systematic trends in a bead-to-bead comparison since the actual force will be different (by 10% standard deviation) for different beads. Unfortunately, the authors average this effect away, by averaging over beads for each cycle (Figure 4). To me, it makes much more sense to average over the 1000 cycles for each bead and then compare. Not surprisingly, they find a larger error "with bead size error" than without it (Figure 5A). However, this information could likely be used (and the error corrected), if they would only first analyze the beads separately.

      We might be wrong, but there seems to be a misunderstanding. First, we added Figure 5–figure supplement 1, where you can see individual traces. As expected, the levels of unfolding forces/sizes appear consistent over the course of the pulling cycles. Second, the advantage of averaging over different beads is that you can effectively remove the bead size effect. This “averaging-out” is the key strategy in our kinetic analysis. Based on the error estimation, if you average the values of kinetic parameters obtained from different beads, you can then estimate them with reasonably small errors despite the bead size variations. This becomes more evident after the initial hundreds of pulling cycles. The errors for 200 and 1000 cycles differ by only ~1%, indicating that you do not need to blindly run the pulling cycles. These results are based on the “averaging-out” strategy, which is the merit of our analysis. For more details, please see the section in the main text titled “Assessing statistical reliability of pulling-cycle experiments”, where the relevant figures, figure supplements, and Methods sections are referenced.

      What is the physical explanation of the first fast and then slow decay of the error (Figure 5B)? I would have expected the error for a given bead after N pulling cycles to decrease as 1/sqrt(N) since each cycle gives an independent measurement. Has this been tested?

      If the sampling was from one population (here, unfolding probability profile), the error would follow a 1/√n decay as expected for the standard error. In our analysis, however, we estimated the expected “mean” errors, regardless of detailed shapes of the unfolding probability profiles. To this end, we sampled the data from different possible profiles (shown in Figure 5–figure supplement 5). We then averaged all the error plots to obtain the plot of the mean errors during progress of pulling cycles (black curve in Figure 5D). In this case, the plot does not have to follow the standard error curve represented by the factor 1/√n.

      We tested this by fitting with the model function y = A/√n, for various lower limits of N = 10, 30, 50, 100, 300, and 500 in the regression analysis (Figure 5–figure supplement 6). The reduced chi-square (χ²) values used for a goodness-of-fit test (χ² = 1 for the best fit) indicate that the two-term exponential model (χ² = 1.60) shows a better fit than the reciprocal square root model (χ² = 2.30–6.01). The regression model adopted in our analysis is a phenomenological model that more properly describes the error decay curve. The trend of the first fast and then slow decay is not unusual because it is also expected for the reciprocal square root model – the plot 1/√n decays fast and then slowly, too (Figure 5–figure supplement 6).
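
      For reference, the single-population baseline discussed above (the standard error decaying as 1/√n) can be checked with a short simulation. This is a minimal sketch with hypothetical numbers: unfolding forces are drawn from one normal distribution, which is the case the reviewer describes and not the multi-profile averaging used in the manuscript.

```python
import numpy as np

rng = np.random.default_rng(1)
n_repeats, n_cycles = 2000, 1000

# Hypothetical unfolding forces (pN) drawn from a single population.
forces_pn = rng.normal(loc=20.0, scale=2.0, size=(n_repeats, n_cycles))

# Running mean after each pulling cycle, for every simulated experiment.
cycle_counts = np.arange(1, n_cycles + 1)
running_means = np.cumsum(forces_pn, axis=1) / cycle_counts

# Empirical error of the mean after n cycles: spread across repeats.
empirical_error = running_means.std(axis=0)

# A log-log slope close to -0.5 corresponds to a 1/sqrt(n) decay.
slope, intercept = np.polyfit(
    np.log(cycle_counts[9:]), np.log(empirical_error[9:]), 1
)
print(f"log-log slope of error vs. cycle number: {slope:.3f}")
```

      On a linear axis this curve also drops quickly at first and then flattens, consistent with the remark that a fast-then-slow decay by itself does not distinguish the two models.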

    1. Author Response

      Reviewer #1 (Public Review):

      Estimating the effects of mutations on the thermal stability of proteins is fundamentally important and also has practical importance, e.g., for engineering of stable proteins. Changes can be measured using calorimetric methods and values are reported as differences in free energy (dG) of the mutant compared to wt proteins, i.e., ddG. Values typically range between -1 kcal/mol through +7 kcal/mol. However, measurements are highly demanding. The manuscript introduces a novel deep learning approach to this end, which is similar in accuracy to ROSETTA-based estimates, but much faster, enabling proteome-wide studies. To demonstrate this the authors apply it to over 1000 human proteins.

      The main strength here is the novelty of the approach and the high speed of the computation. The main weakness is that the results are not compared to existing machine learning alternatives.

      We thank Prof. Ben-Tal for taking the time to assess our work, and for his comments and suggestions below.

      Reviewer #2 (Public Review):

      Summary:

      This work presents a new machine-learning method, RaSP, to predict changes in protein stability due to point mutations, measured by the change in folding free energy ΔΔG. The model consists of two coupled neural networks, a 3D self-supervised convolutional neural network that produces a reduced-dimensionality representation of the structural environment of a given residue, and a downstream supervised fully-connected neural network that, using the former network's structural representation as input, predicts the ΔΔG of any given amino-acid mutation. The first network is trained on a large dataset of protein structures, and the second network is trained using a dataset of the ΔΔG values of all mutants of 35 proteins, predicted by the biophysics-based method Rosetta.

      The paper shows that RaSP gives good approximations of Rosetta ΔΔG predictions while being several orders of magnitude faster. As compared to experimental data, judging by a comparison made for a few proteins, RaSP and Rosetta predictions perform similarly. In addition, it is shown that both RaSP and Rosetta are robust to variations of input structure, so good predictions are obtained using either structures predicted by homology or structures predicted using AlphaFold2. Finally, the usefulness of a rapid approach such as RaSP is clearly demonstrated by applying it to calculate ΔΔG values for all mutations of a large dataset of human proteins, for which this method is shown to reproduce previous findings of the overall ΔΔG distribution and the relationship between ΔΔG and the pathological consequences of mutations. The RaSP tool and the dataset of mutations of human proteins are shared.

      Strengths:

      The single main strength of this work is that the model developed, RaSP, is much faster than Rosetta (5 to 6 dex), and still produces ΔΔG predictions of comparable accuracy (as compared with Rosetta, and with the experiment). The usefulness of such a rapid approach is convincingly demonstrated by its application to predicting the ΔΔG of all single-point mutations of a large dataset of human proteins, for which using this new method they reproduce previous findings on the relationship between stability and disease. Such a large-scale calculation would be prohibitive with Rosetta. Importantly, other researchers will be able to take advantage of the method because the code and data are shared, and a Google Colab site where RaSP can be easily run has been set up. An additional bonus is that the dataset of human proteins and their RaSP ΔΔG predictions, annotated as beneficial/pathological (according to the ClinVar database) and/or by their allele frequency (from the gnomAD database), is also made available, which may be very useful for further studies.

      Weaknesses:

      The paper presents a solid case in support of the speed, accuracy, and usefulness of RaSP. However, it does suffer from a few weaknesses.

      The main weakness is, in my opinion, that it is not clear where RaSP is positioned in the accuracy-vs-speed landscape of current ΔΔG prediction methods. The paper does show that RaSP is much faster than Rosetta, and provides evidence that supports that its accuracy is comparable with that of Rosetta, but RaSP is not compared to any other method. For instance, FoldX has been used in large-scale studies of similar size to the one used here to exemplify RaSP. How does RaSP compare with FoldX? Is it more accurate? Is it faster? Also, as the paper mentions in the introduction, several ML methods have been developed recently; how does RaSP compare with them regarding accuracy and CPU time? How RaSP fares in comparison with other fast approaches such as FoldX and/or ML methods will strongly affect the potential usefulness and impact of the present work.

      Second, this work being about presenting a new model, a notable weakness is that the model is not sufficiently described. I had to read a previous paper of 2017 on which this work builds to understand the self-supervised CNN used to model the structure, and even so, I still don't know which of 3 different 3D grids used in that original paper is used in the present work.

      A third weakness is, I think, that a stronger case needs to be made for fitting RaSP to Rosetta ΔΔG predictions rather than experimental ΔΔGs. The justification put forward by the authors is that the dataset of Rosetta predictions is large and unbiased while the dataset of experimental data is smaller and biased, which may result in overfitting. While I understand that this may be a problem and that, in general, it is better to have a large unbiased dataset in place of a small biased one, it is not so obvious to me from reading the paper how much of a problem this is, and whether trying to fix it by fitting the model to the predictions of another model rather than to empirical data does not introduce other issues.

      Finally, the method is claimed to be "accurate", but it is not clear to me what this means. Accuracy is quantified by the correlation coefficient between Rosetta and RaSP predictions, R = 0.82, and by the Mean Absolute Error, MAE = 0.73 kcal/mol. Also, both RaSP and Rosetta have R ~ 0.7 with experiment for the few cases where they were tested on experimental data. This seems to be a rather modest accuracy; I wouldn't claim that a method that produces this sort of fit is "accurate". I suppose the case is that this may be as accurate as one can hope it to be, given the limitations of current experimental data, Rosetta, RaSP, and other current methods, but if this is the case, it is not clearly discussed in the paper.

      We thank the reviewer for their detailed comments and suggestions.

      As discussed in our general comments above and also below, we have now added additional benchmarking, making it easier to compare the accuracy of RaSP with other methods. Regarding the model description, we have now also added a more detailed description of the 3D CNN.

      Regarding whether to fit the model to experiments or computational data, we agree that it is not clear-cut that the former would not also work. Indeed, a main problem is that in both cases it is hard to answer which approach is better because of the scarcity of experimental data. One major problem with the larger sets of experimental data is, as we mention, the bias and variability; another is the provenance. While some databases exist, they rarely contain exactly the raw data, and may for example contain ∆∆G values estimated from ∆Tm values. In the revised manuscript we now explain better why we chose to target Rosetta, but also acknowledge that one might also have used experiments.

      As to the question of accuracy, we agree completely that the methods could be better. One problem, however, is that it is very difficult to answer how much better because of problems with experiments. As mentioned also by reviewer 1, variation across different experiments suggests that even a “perfect” predictor would only achieve Pearson correlation coefficients in the range 0.7–0.8 (https://doi.org/10.1093/bioinformatics/bty880). Clearly, this is an issue with imperfect data curation (it is possible to measure ∆∆G quite accurately), but in the absence of larger and better curated experiments, one will not expect much better accuracy than what we report here. This is now discussed in the revised manuscript.
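
      The two accuracy measures discussed in this exchange, the Pearson correlation coefficient and the mean absolute error (MAE), can be computed as in the minimal sketch below. The function names and the ΔΔG values are hypothetical and serve only as an illustration; they are not taken from the RaSP code or from the paper's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(x, y):
    """Mean absolute difference between paired values (same units as the input)."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Hypothetical ΔΔG values in kcal/mol (illustrative only, not data from the paper).
reference = [0.5, 1.2, 3.0, -0.4, 2.1]
predicted = [0.8, 1.0, 2.2, 0.1, 1.9]

r = pearson_r(reference, predicted)
mae = mean_absolute_error(reference, predicted)
print(f"Pearson R = {r:.2f}, MAE = {mae:.2f} kcal/mol")
```

      Note that the two measures answer different questions: R captures how well the ranking and linear trend are reproduced, while MAE reports the typical size of the error in kcal/mol, so a method can score well on one and modestly on the other.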

      Reviewer 3 (Public Review):

      The authors present a machine learning method for predicting the effects of mutations on the free energy of protein stability. The method performs similarly to existing methods, but has the advantage that it is faster to run. Overall this is reasonable and a faster method will likely have some potential uses. However, not improving performance beyond the reasonable but not great performance of existing methods of course makes this a less useful advance. The authors provide predictions for a set of human proteins, but the impact of their method would be much greater if they provided predictions for all substitutions in all human proteins, for example. In places the text somewhat overstates the performance of computational methods for predicting free energy changes and is potentially misleading about when ddGs are predicted vs. experimentally measured. In addition, the comparison to existing methods is rather slim and there isn't a formal evaluation of how well RaSP discriminates pathological from benign variants.

      We thank the reviewer for taking time to read our work and for their various suggestions.

    1. Author Response

      Reviewer #1 (Public Review):

      Alignment between high dimensional data which express their dynamics in a subspace is a challenge which has recently been addressed both with analytic-based solutions like the Procrustes transformation, and, most interestingly, via deep learning approaches based on adversarial networks. The authors have previously proposed an adversarial network approach for alignment which relied on first dimensionally-reducing the binned neural spikes using an autoencoder. Here, they use an alternative approach to align data without use of an initial dimensional-reduction step.

      The results are fairly clear - the Cycle-GAN approach works better than their previous ADAN approach and one based on dimensionality reduction followed by the Procrustes transform. In general, a criticism of this entire field is to understand what alignment teaches us about the brain or how it specifically will be used in a BCI context.

      There are a few issues with the paper.

      1.) To increase the impact of their work, the investigators have now used it to align data in multiple types of tasks. There was an unanswered question about this related to neuroscience - does alignment in one task predict alignment for another?

      This is a great question! We anticipate that it will be challenging for an alignment learned on one task to be used on another task, because we know that M1 decoders trained on data from one behavior often do not generalize when tested using a different behavior (Naufel et al., 2019)*. The same nonlinearities that prevent zero-shot decoding across tasks are also likely to impair the ability of an aligner trained on data from one task to successfully align data from another task. Furthermore, the results of Naufel et al. indicate that even if neural alignment is successful, we would need a decoder already trained on the new task to produce reliable predictions-- in which case the data needed to train that decoder could simply be used for alignment. A systematic study of the relation between the ability to align and decode from data is well warranted, but beyond the scope of our current work.

      *Naufel, S., Glaser, J. I., Kording, K. P., Perreault, E. J., & Miller, L. E. (2019). A muscle-activity-dependent gain between motor cortex and EMG. Journal of neurophysiology, 121(1), 61-73.

      Action in the text: none.

      2) Investigators use decoding as a way of comparing alignment performance. The description of the cycle GAN was not super detailed, and it wasn't clear whether there was any dynamic information stored in the network that might create questions of causality in actual use. It seems that input is simply the neural activity at a current time point rather than neural activity across the trial, which would alleviate this concern. However, they mention temporal alignment but never describe in detail whether all periods of spikes are properly modeled by the system or if only subsets of data (specific portions of task or non-task time) will work. Perhaps this is more a question of the Wiener filter, for which precise details are missing.

      As intuited by the reviewer, we did only use the neural activity at a current time point as the inputs for Cycle-GAN training, so the system is causal and can be used in real time. We have modified the text to clarify this.

      We apologize for any confusion caused by our use of the term "temporal alignment", which was for the sake of consistency with earlier-published, CCA-based alignment methods (e.g., in Gallego et al., 2020), but is indeed confusing. In the revised manuscript, we have switched to the term ‘trial alignment’ which we believe better reflects this pre-processing step, and we have included additional explanations in the introduction.

      Importantly, while CCA-style trial alignment is not required by our methods, we do still preprocess our data to exclude behaviors not related to the investigated task. Since monkeys were resting or performing task-irrelevant movements during the inter-trial period, we chose to use data only from trial start to trial end, but without any explicit trial matching or alignment (see Appendix 1 - Behavior tasks). In the revised manuscript, we now show that our methods still work well even when applied to the continuous recordings, with Cycle-GAN significantly outperforming both ADAN and PAF.

      Action in the text (page 2, lines 72-74): clarifying CCA description and replacing “temporal alignment” with “trial alignment”.

      Action in the text (page 5, lines 191-192): stating that ADAN and Cycle-GAN have no knowledge of dynamics.

      Action in the text (page 6, lines 258-272): documenting performance on full-day recordings without trial matching.

      Action in the text (page 13, lines 647-649): again, stating that Cycle-GAN has no knowledge of dynamics.

      3) In general, precise details of the algorithms should have been provided.

      We appreciate the reviewer noting this-- in the submitted manuscript, the full descriptions of Cycle-GAN and ADAN were included as supplementary methods in Appendix 4, but we did not extensively reference this and it may have been missed. In the revised manuscript, we added more references to Appendix 4 and in the Methods section of the main text. We provided further details on the choice of hyperparameters for each method (including PAF) in Appendix 4 itself.

      Action in the text (page 13, lines 643-644): added “For a full description of the ADAN architecture and its training strategy, please refer to “ADAN based aligner” in Appendix 4 and (Farshchian et al., 2018).”

      Action in the text (page 14, line 669): added “Further details about the Cycle-GAN based aligner are provided in “Cycle-GAN based aligner”, Appendix 4.”

      Action in the text (Appendix 4 Tables 1-2): We have added a summary table of hyperparameters for each method in Appendix 4 (ADAN: Appendix 4 Table 1; Cycle-GAN: Appendix 4 Table 2).

      4) Cross validation for day-0 alignment is not explained.

      As mentioned above, the training and validation details of day-0 models were included in Appendix 4, which was not extensively referenced in the manuscript and may have been missed. We have now added more references to the Appendix in the revised manuscript.

      Action in the text (page 13, lines 627-629): added “(Note that this LSTM based decoder is only used for latent space discovery, not the later decoding stage that is used for performance evaluation (see “ADAN day-0 training” in Appendix 4 for full details)).”

      5) Details of statistical tests are not provided.

      We apologize for this omission. In the revised manuscript, we have added a section in the methods summarizing all the statistical tests. In addition, we added the sample sizes for each stat reported in the results section.

      Action in the text (page 15, lines 754-768): new Methods section added.

      6) (minor) The idea that for neurons that have disappeared that the CycleGAN can "infer their response properties", seems an incorrect description. A proper description should be that it "hallucinates" their response properties?

      We prefer to avoid the term “hallucinate”, due to its recent increased (appropriate) use in the context of large language models describing content generation that is “nonsensical or unfaithful to the provided source content” (as per the Wikipedia article on hallucination in AI). The synthesized “responses” of vanished neurons are not nonsensical, but are indeed inferred: they are the model’s best estimate of how these neurons would have responded, had they been observed. While not explored further here, this prediction could be of potential scientific use: a strong discrepancy between predicted and observed activity might be a clue to look for further evidence of learning or remodeling of neural representations of behavior.

      Action in the text: none.

      Reviewer #2 (Public Review):

      In this manuscript, the authors use generative adversarial networks (GANs) to manipulate neural data recorded from intracortical arrays in the context of intracortical BCIs so that these decoders are robust. Specifically, the authors deal with the hard problem where signals from an intracortical array change over time and decoders that are trained on day 0 do not work on day K. Either the decoder or the neural data needs to be updated to achieve the same performance as initially. GANs try to alter the neural data from day K to make it indistinguishable from day 0 and thus in principle the decoder should perform better. The authors compare their GAN approach to an older GAN approach (by an overlapping group of authors) and suggest that this new GAN approach is somewhat better. Major strengths are the multiple datasets from behaving monkeys performing various tasks that involve motor function, and the comparison between two different GAN approaches and a classical approach that uses factor analysis. The weakness is insufficient comparison to another state-of-the-art approach that has been applied on the same dataset (NoMAD, Karpowicz et al., bioRxiv).

      The results are very reasonable and they show their approach, Cycle GANs, does slightly better than the traditional GAN approach. However, the Cycle GANs have many more modules and also, as I understand it, perform a forward-backward mapping of day-0 and day-k, and are thus theoretically better. But it seems quite slow.

      We are concerned that the reviewer may have mistaken the Cycle-GAN training time (the time it takes to find an alignment, Figure 4B) with its inference time (the time it takes to transform data once an alignment has been found). Whereas inference time is critical for practical deployment of a model, we argue that Cycle-GAN's somewhat longer training time is not a substantial barrier to use: it is still reasonably fast (a few minutes) and training will only need to be performed on the order of once per day. We have modified the y-axis label of Figure 4B to make this distinction clearer.

      We have also now added information on the inference speed of trained models to the paper: we find that both Cycle-GAN and ADAN perform the inference step in under 1 ms per 50 ms sample of data – this is because the forward map in both models consists of a fully connected network with only two hidden layers. We also note that while forward-backward mapping between days does occur during Cycle-GAN training, only the forward mapping is performed during inference.

      Action in the text (page 7, lines 303-306): added inference time for Cycle-GAN and ADAN.
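
      To make the causality and inference-time points above concrete, the sketch below shows a stateless forward map built from a fully connected network with two hidden layers, applied to one time bin at a time. It is only an illustration under stated assumptions: the ReLU activation, the layer sizes, the channel count, and the random weights are all hypothetical and are not taken from the manuscript or its Appendix 4.

```python
import random


def relu(v):
    """Element-wise rectified linear unit (the activation is an assumption)."""
    return [max(0.0, x) for x in v]


def dense(weights, bias, v):
    """One fully connected layer; `weights` is a list of rows, one per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def make_layer(n_in, n_out, rng):
    """Random placeholder weights standing in for a trained layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def forward_map(layers, x):
    """Map a single day-k time bin to a day-0-like vector.

    The function keeps no state between calls, so the output for a bin
    depends only on that bin, which is what makes the mapping causal.
    """
    h = x
    for weights, bias in layers[:-1]:
        h = relu(dense(weights, bias, h))
    weights, bias = layers[-1]
    return dense(weights, bias, h)


rng = random.Random(0)
n_channels = 8  # hypothetical number of recorded channels
n_hidden = 16   # hypothetical hidden-layer width

# Two hidden layers followed by a linear output layer.
layers = [
    make_layer(n_channels, n_hidden, rng),
    make_layer(n_hidden, n_hidden, rng),
    make_layer(n_hidden, n_channels, rng),
]

day_k_bin = [rng.random() for _ in range(n_channels)]  # one 50 ms bin of activity
aligned_bin = forward_map(layers, day_k_bin)
```

      Because only this forward pass runs at inference, the cost per bin is three small matrix-vector products, independent of how long the adversarial training took.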

      I think the results are interesting but as such, I am not sure this is such a fundamental advance compared to the Farshchian et al. paper, which introduced GANs to improve decoding in the face of changing neural data. There are other approaches that also use GANs and I think they all need to be compared against each other. Finally, these are all offline results and what happens online is anyone's real guess. Of course, this is not just a weakness of this study but many such studies of its ilk.

    1. Author Response

      Reviewer #1 (Public Review):

      The manuscript by Gochman and colleagues reports the discovery of a very strong sensitization of TRPV2 channels by the herbal compound cannabidiol (CBD) to activation by the synthetic agonist 2-aminoethoxydiphenyl borate (2-APB). Using patch-clamp electrophysiology the authors show that the ~100-fold enhancement by micromolar CBD of TRPV2 current responses to low concentrations of 2-APB reflects a robust increase in apparent affinity for the latter agonist. Cryo-EM structures of TRPV2 in lipid nanodiscs in the presence of both drugs report two channel conformations. One conformation resembles previously solved structures whereas the second conformation reveals two distinct CBD binding sites per subunit, as well as changes in the conformation of the S4-S5 linker. Interestingly, although TRPV1 and TRPV3 are highly homologous to TRPV2 and both CBD binding sites are relatively conserved, the CBD-induced sensitization towards 2-APB is observable only for TRPV3 but not for TRPV1. Moreover, the simultaneous substitution of non-conserved residues in the CBD binding sites and the pore region of TRPV1 with the amino acids present in TRPV2 fails to confirm strong CBD-induced sensitization. The authors conclude that CBD-dependent sensitization of TRPV2 channels depends on structural features of the channel that are not restricted to the CBD binding site but involve multiple channel regions.

      These are important findings that promote our understanding of the molecular mechanisms of TRPV family channels, and the data provide convincing evidence for the conclusions.

      We appreciate the supportive evaluation of the reviewer.

      Reviewer #2 (Public Review):

      In this manuscript, Gochman et al. studied the molecular mechanism by which cannabidiol (CBD) sensitizes the TRPV2 channel to activation by 2-APB. While CBD itself can activate TRPV2 with low efficacy, it can sensitize TRPV2 current activated by 2-APB by two orders of magnitude. The authors showed, via single-channel recording, that the CBD-dependent sensitization arises from an increase in Po when the channel binds to both CBD and 2-APB. The authors then used cryo-EM to investigate how CBD binds to TRPV2 and identified two CBD binding sites in each subunit, with one site being previously reported and the other being newly discovered.

      TRPV1 and TRPV3 are two channels closely related to TRPV2. All three channels can be activated by CBD and 2-APB, but only TRPV2 and TRPV3 are strongly sensitized by CBD. To understand the molecular basis of the different sensitivity to CBD, the authors compared the residues within the CBD binding sites and generated mutants by swapping non-conserved residues between TRPV1 and TRPV2. They then performed patch-clamp recordings on these mutants and found that mutations on non-conserved residues indeed influenced the CBD-dependent sensitization, thereby supporting the observed CBD binding sites.

      Unexpectedly, the authors did not identify the binding site of 2-APB, despite its robust effect in electrophysiology recordings, especially when combined with CBD. Although previous structural studies of TRPV2 have reported 2-APB binding sites, the associated densities in these studies were not well-resolved. Therefore, the authors called on the field to re-examine published structural data with regard to the 2-APB binding sites.

      Overall, this is an important study with well-designed and well-conducted experiments.

      We appreciated the supportive comments of the reviewer.

      Reviewer #3 (Public Review):

      In this paper, Gochman et al examine TRPV1-3 channel sensitization by CBD, specifically in the context of 2-APB activation. The authors primarily used classic electrophysiological techniques to address their questions about channel behavior but have also used structural biology in the form of cryo-EM to examine drug binding to TRPV2. The authors have carefully observed and quantified sensitization of the rat TRPV2 channel to 2-APB by CBD. While this sensitization has been reported previously (Pumroy et al, Nat Commun 2022), the authors have gone into much more detail here and carefully examined this process from several angles, including a comparison to some other known methods of sensitizing TRPV2. Additionally, the authors have also revealed that CBD sensitizes rat TRPV1 and mouse TRPV3 to 2-APB, which has not been reported previously. Up to this point, the work is well thought through and cohesive.

      The major weakness of this paper is that the authors' efforts to track down the structural and molecular basis for CBD sensitization neither give insight into how sensitization occurs nor provide a solid footing for future work on the topic. The structural work presented in this paper lacks proper controls to interpret the observed states and the authors do nothing to follow up on a potentially interesting second binding site for CBD. Overall, the structural work feels detached from the rest of the paper. The mutations chosen to examine sensitization are based on setting up TRPV1 in opposition to TRPV2 and TRPV3, which makes little sense as all three channels show sensitization by CBD, even if to different extents. The authors chose their mutations based on the assumption that response to CBD is the key difference between the channels for sensitization, yet the overall state of each channel or the different modes of activation by 2-APB seem to be more likely candidates. As a result, it is not particularly surprising that none of the mutations the authors make reduce CBD sensitization in TRPV2 or increase CBD sensitization in TRPV1.

      A difficulty in examining TRPV1-3 as a group is that while they are highly conserved in sequence and structure, there are key differences in drug responses. While it does seem likely that CBD would bind to the same location in TRPV1-3, there is extensive evidence that 2-APB binds at different sites in each channel, as the authors discuss in the paper. Without more basic information about where 2-APB binds to each channel and confirmation that CBD does indeed bind TRPV1-3 at the same site, it may not be possible to untangle this particular mode of channel sensitization.

      We appreciate this reviewer’s perspective and we too were disappointed that our approach did not yield more definitive answers to why some TRPV channels are more sensitive to CBD. We have revised the results and discussion sections to more clearly articulate what we think our results reveal. We have also added a section to the discussion to present the idea that the differential sensitivity of TRPV channels to CBD may have more to do with where 2-APB binds and how it activates the channel than CBD. These challenging points are all excellent and they have helped us to present our message more clearly.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The current study uses 3D organotypic rafts to culture primary keratinocytes from Foreskin, Tonsil and Cervix. Further the authors looked at the transcriptomic profiles of each tissue types to study similarities and differences depending on the tissue of origin as well as show the similarity in the tissue specific gene signatures and the ex-vivo samples (data from GTEx). As mentioned by authors Skin and Cervix keratinocytes have been previously cultured on collagen rafts however extending it to Tonsil provides resource and possibility of growing more tissue specific epithelial cells in 3D.

      Major comments

      1. As the paper's focus is to culture epithelial/epidermal cells on 3D rafts, the methods section needs more details about the raft composition, preparation, and fibroblast embedding. What was the plate size used for raft preparation and culturing of cells on those rafts? What culture media was used for epithelial raft cultures?

      We have a detailed published protocol that highlights these details. However, we will expand on some of these details in the manuscript.

      Results: Figure 1, authors show IF stainings for COL17A1 as a marker for basal cells and cornulin for differentiated layers. However, it is important to show how many cells in the basal layer are proliferative (or how many layers of proliferative cells are present in the different epithelia analysed here). After 14 days the majority of cells might already start losing their stemness potential (maybe staining for at least Ki67, if staining for a basal stem cell marker is not possible, along with loricrin or involucrin, might be a good idea).

      We will stain for Ki67 as suggested. However, based on published data using these raft cultures, we do not expect that many cells will be positive.

      This is also important as from supp fig 3 you can see F1 has higher expression of Loricrin, filaggrin etc as compared to all other samples indicating higher diff in this sample. Also, if authors can comment on what was the passage of cells used? And have they observed any difference in the re-epithelization in early passage versus late passage of keratinocytes?

      We will expand on this in the updated manuscript. Importantly, we grow these cells in a rho-kinase inhibitor that ‘conditionally’ immortalizes these cells as described (DOI: 10.1172/JCI42297).

      It is interesting to see Tonsillar 3D epithelia recapitulate the crypt and surface epithelia and authors also show this with gene expression profile, if possible (Optional), can authors show staining for crypt specific and surface specific markers.

      We agree that this is an important control. This will be included.

      For all the Supplementary tables where only Ensembl ids are represented, please add gene Id column alongside (it is easier to get biological context from gene id for the reader rather than looking up Ensembl ids). Rename the file names to include the Supplementary file 1, 2, 3?

      Since there is a 1-to-1 conversion from Ensembl ID to gene ID, we elected not to include these. The online app does try to accommodate this as much as possible. We propose to include two versions of each table: one with Ensembl IDs only and one with both IDs.
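
      As an illustration of the second table version proposed above, the sketch below adds a gene symbol column next to an Ensembl ID column. The function name, the column names, and the fold-change values are hypothetical; the mapping would in practice come from an annotation file matching the Ensembl release used for alignment, which is not specified here. IDs missing from the mapping are left blank rather than guessed.

```python
def add_gene_symbols(rows, id_to_symbol, id_key="ensembl_id"):
    """Return copies of `rows` with an added `gene_symbol` column.

    IDs that are absent from `id_to_symbol` get an empty string, so that
    unmapped entries stay visible instead of being silently dropped.
    """
    return [dict(row, gene_symbol=id_to_symbol.get(row[id_key], "")) for row in rows]


# Small illustrative mapping; a real one would be loaded from an annotation file.
id_to_symbol = {
    "ENSG00000141510": "TP53",
    "ENSG00000012048": "BRCA1",
}

# Hypothetical table rows (the fold-change values are made up).
table = [
    {"ensembl_id": "ENSG00000141510", "log2_fold_change": 1.5},
    {"ensembl_id": "ENSG00000000000", "log2_fold_change": -0.2},  # placeholder ID, not in the mapping
]

annotated = add_gene_symbols(table, id_to_symbol)
for row in annotated:
    print(row)
```

      Keeping the Ensembl ID as the primary key and treating the symbol as an annotation avoids problems when symbols are renamed between annotation releases.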

      It's excellent to see that the in vitro tissue signature matched the in vivo tissue samples (Figure 8), but it will be interesting to show the gene expression differences, if any are found, between the in vitro and in vivo samples; that will give insight into the changes that arise as a result of the in vitro system.

      Since the in vivo data will be a mixture of epithelial cells and stroma, these comparisons are not straightforward. However, we are currently examining the use of existing scRNA-Seq data to begin addressing these concerns. This data will be included in the next revision.

      Minor comments

      1. Abstract: Give sample number (n?) and brief results about the genes that had tissue-specific expression patterns.
      2. Gene names need to be in italics throughout.
      3. Introduction: page 5 line 9, the authors claim that, based on comparisons, they can "identify potential therapeutic targets for various disease". I think this statement either needs experimental evidence or the claim needs to be modified.
      4. Data submission to GEO???
      5. Typo (page 15, line 16 should be "HFK-down", same on page 23 "ectocervix", "endocervix", "uterus", so on, please correct, comma needs to be placed after "
      6. Page 24 last line is the heatmap referred here Fig 9B?
      7. Fig. 1 legends please indicate what F1, F2, F3, C1--- T1--- represent. Fig 1C Please add axis range/ values for protein atlas data as well.
      8. Can authors comment in discussion how was current 3D cervix cells on raft method different from Meyers, C., 1996 3D system?

      All these ‘minor’ comments will be addressed.

      Reviewer #1 (Significance (Required)):

      This article does extend and validate the 3D raft culture method to different epithelial tissues in addition to skin and cervix. This will be useful for researchers using co-culture systems and interested in understanding epithelial cell and immune cell interactions, host-pathogen interactions, etc.

      Describe your expertise: establishing and maintaining primary skin and oral keratinocyte cultures on feeders and 3D cultures on DEDs, Organoid cultures from oral keratinocytes, Oral cancer biology, Histopathology, transcriptomics study, Immuno-oncology.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary Jackson et al.'s manuscript describes an experiment that directly compares 3D organotypic assays created with primary human epithelial cells from foreskin, cervix and tonsil using histological and bulk RNA sequencing approaches. The authors convincingly show the retention of site-specific histological and transcriptomic differences between the stratified epithelial tissues in culture. Differentially expressed genes are identified and pathway analyses suggest genes that might be involved in the different differentiation processes between these tissue sites and differential regulation of ECM and immune pathways. Differentially expressed genes are used to develop a classifier for tissue identification, which is tested using GTEx data.

      Major Comments

      • The interferon stimulated genes of B cells and macrophages (from Mostafavi et al., 2016) are likely to be very different from those in epithelial cells, so the analysis presented in Figure 9 seems like a stretch to me.

      We will include caveats to this interpretation. We are planning stimulation experiments of each tissue to compare IFN responses. However, depending on the outcomes, these may end up being outside of the scope of the current manuscript.

      • OPTIONAL: Further data comparing the nature and magnitude of the interferon responses of the three epithelia would improve interest in the manuscript but are not necessary for publication of the current dataset.

      See above

      Minor Comments

      • Details of n numbers and what each point represents should be added to Figure 1C. Are these points measurements from 25 µm intervals of just one raft per donor? What are 'fields of view' here? Are measurements from one section or from multiple sections per raft?
      • Page 12 - provide a figure/panel citation for the "micrograph derived from a tonsillectomy" that is suggested for comparison.
      • In Figure 1 - Figure Supplement 1, how representative of the whole raft are these images? Does the extent of stratification change near to the edge of the collagen gel, for example? How well matched for location within a raft are the images shown?
      • Page 24 - clarify uses of the phrase "down-regulated in tonsils". Presumably this section refers to tonsil epithelium in 3D organotypic rafts.

      Typos

      • Page 3 - "the cervix is lined with stratified squamous epithelia", should be epithelium.
      • "J.G. Rheinwald" in in-text references.
      • Page 6 - 'or' not 'and' in first sentence of primary cell culture section.

      All these ‘minor’ comments and typos will be addressed.

      Reviewer #2 (Significance (Required)):

      This highly descriptive study provides a detailed analysis of a bulk RNA sequencing experiment comparing foreskin, cervix and tonsil 3D organotypic rafts. Retained histological and transcriptional differences between epithelial tissues of different origins in organotypic assays are well documented in the literature (e.g., palmoplantar vs non-palmoplantar skin, PMID: 36732947; airway tract, PMID: 32526206) so the observed differences between these three very distinct anatomical tissues are unsurprising overall. The data have been made available via SRA and a shiny web app and are likely to be of interest and use to other researchers working on these tissues in culture. The experiment was performed in matched cell culture conditions so replicates are well-controlled, if limited in number (n=3).

      We appreciate this feedback. We agree this is a descriptive study. Nonetheless, we believe there is value in formally demonstrating differences and similarities between these tissues. The provided references will be included to expand our discussion.

      I am an epithelial cell biologist specializing in human cell culture models. I do not have sufficient computational background to comment in detail on the RNA sequencing methods or analysis within the manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This is a very carefully analysed and well-written study describing the transcriptional differences between in vitro models of epithelia derived from cervix, foreskin and tonsil tissues. Importantly, the authors compare the findings to in vivo samples using publicly available data. The findings are significant and will be of interest to the scientific community. I cannot fault the analysis pathways or the conclusions, and the manuscript is a pleasure to read. I recommend it is accepted for publication as is.

      Reviewer #3 (Significance (Required)):

      This is an important study that is highly significant for researchers interested in epithelial tissue and infection. The data are clearly presented and the analysis is thorough. The authors state that they will make the data publicly available. This will be an important resource for the community.

      We appreciate the kind words

    1. Author Response

      Reviewer #1 (Public Review):

      The authors present a study of visuo-motor coupling primarily using wide-field calcium imaging to measure activity across the dorsal visual cortex. They used different mouse lines or systemically injected viral vectors to allow imaging of calcium activity from specific cell-types with a particular focus on a mouse-line that expresses GCaMP in layer 5 IT (intratelencephalic) neurons. They examined the question of how the neural response to predictable visual input, as a consequence of self-motion, differed from responses to unpredictable input. They identify layer 5 IT cells as having a different response pattern to other cell-types/layers in that they show differences in their response to closed-loop (i.e. predictable) vs open-loop (i.e. unpredictable) stimulation whereas other cell-types showed similar activity patterns between these two conditions. They analyze the latencies of responses to visuomotor prediction errors obtained by briefly pausing the display while the mouse is running, causing a negative prediction error, or by presenting an unpredicted visual input causing a positive prediction error. They suggest that neural responses related to these prediction errors originate in V1, however, I would caution against over-interpretation of this finding as judging the latency of slow calcium responses in wide-field signals is very challenging and this result was not statistically compared between areas.

      Surprisingly, they find that presentation of a visual grating actually decreases the responses of L5 IT cells in V1. They interpret their results within a predictive coding framework that the last author has previously proposed. The response pattern of the L5 IT cells leads them to propose that these cells may act as 'internal representation' neurons that carry a representation of the brain's model of its environment. Though this is rather speculative. They subsequently examine the responses of these cells to anti-psychotic drugs (e.g. clozapine) with the reasoning that a leading theory of schizophrenia is a disturbance of the brain's internal model and/or a failure to correctly predict the sensory consequences of self-movement. They find that anti-psychotic drugs strongly enhance responses of L5 IT cells to locomotion while having little effect on other cell-types. Finally, they suggest that anti-psychotics reduce long-range correlations between (predominantly) L5 cells and reduce the propagation of prediction errors to higher visual areas and suggest this may be a mechanism by which these drugs reduce hallucinations/psychosis.

      This is a large study containing a screening of many mouse-lines/expression profiles using wide-field calcium imaging. Wide-field imaging has its caveats, including a broad point-spread function of the signal and susceptibility to hemodynamic artifacts, which can make interpretation of results difficult. The authors acknowledge these problems and directly address the hemodynamic occlusion problem. It was reassuring to see supplementary 2-photon imaging of soma to complement this data-set, even though this is rather briefly described in the paper.

      We will expand on the discussion of caveats as suggested.

      Overall, the paper's main strength is its identification of a very different response profile in the L5 IT cells compared with other layers/cell-types, which suggests an important role for these cells in integrating self-motion generated sensory predictions with sensory input. The interpretation of the responses to anti-psychotic drugs is more speculative but the result appears robust and provides an interesting basis for further studies of this effect with more specific recording techniques and possibly behavioral measures.

      Reviewer #2 (Public Review):

      Summary:

      This work investigates the effects of various antipsychotic drugs on cortical responses during visuomotor integration. Using wide-field calcium imaging in a virtual reality setup, the researchers compare neuronal responses to self-generated movement during locomotion-congruent (closed loop) or locomotion-incongruent (open loop) visual stimulation. Moreover, they probe responses to unexpected visual events (halt of visual flow, sudden-onset drifting grating). The researchers find that, in contrast to a variety of excitatory and inhibitory cell types, genetically defined layer 5 excitatory neurons distinguish between the closed and the open loop condition and exhibit activity patterns in visual cortex in response to unexpected events, consistent with unsigned prediction error coding. Motivated by the idea that prediction error coding is aberrant in psychosis, the authors then inject the antipsychotic drug clozapine, and observe that this intervention specifically affects closed loop responses of layer 5 excitatory neurons, blunting the distinction between the open and closed loop conditions. Clozapine also leads to a decrease in long-range correlations between L5 activity in different brain regions, and similar effects are observed for two other antipsychotics, aripiprazole and haloperidol, but not for the stimulant amphetamine. The authors suggest that altered prediction error coding in layer 5 excitatory neurons due to reduced long-range correlations in L5 neurons might be a major effect of antipsychotic drugs and speculate that this might serve as a new biomarker for drug development.

      Strengths:

      • Relevant and interesting research question:

      The distinction between expected and unexpected stimuli is blunted in psychosis but the neural mechanisms remain unclear. Therefore, it is critical to understand whether and how antipsychotic drugs used to treat psychosis affect cortical responses to expected and unexpected stimuli. This study provides important insights into this question by identifying a specific cortical cell type and long-range interactions as potential targets. The authors identify layer 5 excitatory neurons as a site where functional effects of antipsychotic drugs manifest. This is particularly interesting as these deep layer neurons have been proposed to play a crucial role in computing the integration of predictions, which is thought to be disrupted in psychosis. This work therefore has the potential to guide future investigations on psychosis and predictive coding towards these layer 5 neurons, and ultimately improve our understanding of the neural basis of psychotic symptoms.

      • Broad investigation of different cell types and cortical regions:

      One of the major strengths of this study is its quasi-systematic approach towards cell types and cortical regions. By analysing a wide range of genetically defined excitatory and inhibitory cell types, the authors were able to identify layer 5 excitatory neurons as exhibiting the strongest responses to unexpected vs. expected stimuli and being the most affected by antipsychotic drugs. Hence, this quasi-systematic approach provides valuable insights into the functional effects of antipsychotic drugs on the brain, and can guide future investigations towards the mechanisms by which these medications affect cortical neurons.

      • Bridging theory with experiments

      Another strength of this study is its theoretical framework, which is grounded in the predictive coding theory. The authors use this theory as a guiding principle to motivate their experimental approach connecting visual responses in different layers with psychosis and antipsychotic drugs. This integration of theory and experimentation is a powerful approach to tie together the various findings the authors present and to contribute to the development of a coherent model of how the brain processes visual information both in health and in disease.

      Weaknesses:

      • Unclear relevance for psychosis research

      From the study, it remains unclear whether the findings might indeed be able to normalise altered predictive coding in psychosis. Psychosis is characterised by a blunted distinction between predicted and unpredicted stimuli. The results of this study indicate that antipsychotic drugs further blunt the distinction between predicted and unpredicted stimuli, which would suggest that antipsychotic drugs would deteriorate rather than ameliorate the predictive coding deficit found in psychosis. However, these findings were based on observations in wild-type mice at baseline. Given that antipsychotics are thought to have little effect in health but potent antipsychotic effects in psychosis, it seems possible that the presented results might be different in a condition modelling a psychotic state, for example after a dopamine-agonistic or an NMDA-antagonistic challenge. Therefore, future work in models of psychotic states is needed to further investigate the translational relevance of these findings.

      We fully agree that it is unclear how the effects of antipsychotics in mice relate to the drug effects that would be observed in schizophrenic patients. It is also correct that the reduction of the difference between closed and open loop locomotion onset response in L5 IT neurons (Figure 4) is not what we would have expected to find under the assumption that psychosis is characterized by a blunted distinction between predicted and unpredicted stimuli. We are not sure how to interpret this finding. However, it is probably important to note that the difference is only reduced when using a normalized comparison. Looking just at the subtraction of the two curves, the difference between closed and open loop locomotion onset responses remains unchanged before and after antipsychotic drug injection. The finding of a decorrelation of layer 5 activity, however, is easier to interpret under the assumption that layer 5 functions as an internal representation. If speech hallucinations, for example, are the consequence of a spurious activation of internal representations in speech processing areas of cortex, then antipsychotics might reduce the probability of these spurious activation events by reducing the lateral influence between layer 5 neurons in different cortical areas.

      We do indeed plan to address the question of how antipsychotics influence cortical processing in mouse models of schizophrenia in the future.
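
      The distinction drawn in the response above between a normalized and a subtractive comparison can be illustrated with a small sketch. The numbers are invented for illustration and are not values from the study, and the normalization used here (difference divided by the closed loop response) is an assumption, since the exact normalization is not given in this text.

```python
def subtractive_difference(closed, open_loop):
    """Raw difference between closed and open loop onset responses."""
    return closed - open_loop


def normalized_difference(closed, open_loop):
    """Difference relative to the closed loop response (assumed normalization)."""
    return (closed - open_loop) / closed


# Hypothetical response amplitudes (arbitrary units), NOT data from the paper.
before = {"closed": 2.0, "open_loop": 1.0}
# Suppose the drug adds the same offset to both responses.
after = {"closed": 5.0, "open_loop": 4.0}

sub_before = subtractive_difference(**before)
sub_after = subtractive_difference(**after)
norm_before = normalized_difference(**before)
norm_after = normalized_difference(**after)
```

      In this toy case the subtractive difference is identical before and after, while the normalized difference shrinks, which is the pattern the authors describe.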

      • Incomplete testing of predictive coding interpretation

      While the investigation of neuronal responses to different visual flow stimuli is interesting, it remains open whether these responses indeed reflect internal representations in the framework of predictive coding. While the responses are consistent with internal representation as defined by the researchers, i.e., unsigned prediction error signals, an alternative interpretation might be that responses simply reflect sensory bottom-up signals that are more related to some low-level stimulus characteristics than to prediction errors.

      This is correct – we will expand on the discussion of this point in the manuscript.

      Moreover, this interpretational uncertainty is compounded by the fact that the experimental paradigms used were not suited to test whether behaviour is impacted as a function of the visual stimulation, which makes it difficult to assess what the internal representation of the animal actually was. For these reasons, the observed effects might reflect simple bottom-up sensory processing alterations and not necessarily have any functional consequences. While this potential alternative explanation does not detract from the value of the study, future work would be needed to explain the effect of antipsychotic drugs on responses to visual flow. For example, experimental designs that systematically vary the predictive strength of coupled events or that include a behavioural readout might be more suited to draw conclusions about whether antipsychotic drugs indeed alter internal representations.

      We agree that much additional work will be necessary to identify internal representation neurons. However, it is difficult to envision how behavioral output could be used to make inferences about internal representations in sensory areas of cortex. In humans, for example, there is evidence that internal representations in visual cortex and behavioral output are not always directly related: binocular rivalry activates representations of both stimuli shown in visual cortex, while the conscious experience that drives behavioral output is only of one of the two stimuli. Hence, we would assume that the internal representation in visual cortex does not necessarily relate to behavioral output.

      • Methodological constraints of experimental design

      While the study findings provide valuable insights into the potential effects of antipsychotic drugs, it is important to acknowledge that there may be some methodological constraints that could impact the interpretation of the results. More specifically, the experimental design does not include a negative control condition or different doses. These conditions would help to ensure that the observed effects are not due to unspecific effects related to injection-induced stress or time, and not confined to a narrow dose range that might or might not reflect therapeutic doses used in humans. Hence, future work is needed to confirm that the observed effects indeed represent specific drug effects that are relevant to antipsychotic action.

      We agree that both dosages and a broader spectrum of non-antipsychotic compounds will need to be investigated. We are in the process of building a screening pipeline to perform exactly these types of experiments. We would however argue that the paper already includes a control condition in the form of the amphetamine data (Figure 7). While it is possible that amphetamine might have an effect that exactly cancels out potential i.p. injection- or stress-induced changes, we would argue it is more probable that these changes had no measurable effect on Tlx3 positive L5 IT neuron calcium activity per se. We will provide additional evidence that time or injection stress alone do not result in the observed effects.

      Conclusion:

      Overall, the results support the idea that antipsychotic drugs affect neural responses to predicted and unpredicted stimuli in deep layers of cortex. Although some future work is required to establish whether this observation can indeed be explained by a drug-specific effect on predictive coding, the study provides important insights into the neural underpinnings of visual processing and antipsychotic drugs, which is expected to guide future investigations on the predictive coding hypothesis of psychosis. This will be of broad interest to neuroscientists working on predictive coding in health and in disease.

      Reviewer #3 (Public Review):

      The study examines how different cell types in various regions of the mouse dorsal cortex respond to visuomotor integration and how antipsychotic drugs impact these responses. Specifically, in contrast to most cell types, the authors found that activity in Layer 5 intratelencephalic neurons (Tlx3+) and Layer 6 neurons (Ntsr1+) differentiated between open loop and closed loop visuomotor conditions. Focussing on Layer 5 neurons, they found that the activity of these neurons also differentiated between negative and positive prediction errors during visuomotor integration. The authors further demonstrated that the antipsychotic drugs reduced the correlation of Layer 5 neuronal activity across regions of the cortex, and impaired the propagation of visuomotor mismatch responses (specifically, negative prediction errors) across Layer 5 neurons of the cortex, suggesting a decoupling of long-range cortical interactions.

      The data when taken as a whole demonstrate that visuomotor integration in deeper cortical layers is different than in superficial layers and is more susceptible to disruption by antipsychotics. Whilst it is already known that deep layers integrate information differently from superficial layers, this study provides more specific insight into these differences. Moreover, this study provides a first step into understanding the potential mechanism by which antipsychotics may exert their effect.

      Whilst the paper has several strengths, the robustness of its conclusions is limited by its questionable statistical analyses. A summary of the paper's strengths and weaknesses follows.

      Strengths:

      The authors perform an extensive investigation of how different cortical cell types (including Layer 2/3, 4, 5, and 6 excitatory neurons, as well as PV, VIP, and SST inhibitory interneurons) in different cortical areas (including primary and secondary visual areas as well as motor and premotor areas) respond to visuomotor integration. This investigation provides strong support to the idea that deep layer neurons are indeed unique in their computational properties. This large data set will be of considerable interest to neuroscientists interested in cortical processing.

      The authors also provide several lines of evidence that visuomotor information is differentially integrated in deep vs. superficial layers. They show that this is true across experimental paradigms of visuomotor processing (open loop, closed loop, mismatch, drifting grating conditions) and experimental manipulations, with the demonstration that Layer 5 visuomotor integration is more sensitive to disruption by the antipsychotic drug clozapine, compared with cortex as a whole.

      The study further uses multiple drugs (clozapine, aripiprazole and haloperidol) to bolster its conclusion that antipsychotic drugs disrupt correlated cortical activity in Layer 5 neurons, and further demonstrates that this disruption is specific to antipsychotics, as the psychostimulant amphetamine shows no such effect.

      In widefield calcium imaging experiments, the authors effectively control for the impact of hemodynamic occlusions in their results, and try to minimize this impact using a crystal skull preparation, which performs better than traditional glass windows. Moreover, they examine key findings in widefield calcium imaging experiments with two-photon imaging.

      Weaknesses:

      A critical weakness of the paper is its statistical analysis. The study does not use mice as its independent unit for statistical comparisons but rather relies on other definitions, without appropriate justification, which results in an inflation of sample sizes.

      We will expand on both analyses and justifications throughout.

      For example, in Figure 1, independent samples are defined as locomotion onsets, leading to sample sizes of approx. 400-2000 despite only using 6 mice for the experiment. This is only justified if the data from locomotion onsets within a mouse is actually statistically independent, which the authors do not test for, and which seems unlikely. With such inflated sample sizes, it becomes more likely to find spurious differences between groups as significant. It also remains unclear how many locomotion onsets come from each mouse; the results could be dominated by a small subset of mice with the most locomotion onsets. The more disciplined approach to statistical analysis of the dataset is to average the data associated with locomotion onsets within a mouse, and then use the mouse as an independent unit for statistical comparison. A second example, for instance, is in Figure 2L, where the independent statistical unit is defined as cortical regions instead of mice, with the left and right hemispheres counting as independent samples; again this is not justified. Is the activity of cortical regions within a mouse and across cortical hemispheres really statistically independent? The problem is apparent throughout the manuscript and for each data set collected.

      This may partially be a misunderstanding. Figures 1F-1K indeed use locomotion onsets as a unit, but there were no statistical comparisons. In these Figures we were addressing the question of whether locomotion onsets in closed loop differ from those in open loop. Thus, we quantify variability as a unit of locomotion onsets. The question of mouse-to-mouse variability of this analysis is a slightly different one. We did include the same analysis (for visual cortex) with the variability calculated across mice as Figure S2. We will expand this supplementary figure with the equivalent data of Figure 3 to further address this concern.

      For Figure 1L (we assume the reviewer means Figure 1L, not Figure 2L), the unit we used for analysis was cortical area. We will update and improve the analysis. This was indeed not optimal, and we will replace the statistical testing with hierarchical bootstrap (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7906290/) to account for nested data.
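
      As a rough illustration of what a hierarchical bootstrap over nested data can look like, the sketch below resamples mice first and then locomotion onsets within each resampled mouse. It is a minimal sketch under stated assumptions, not the authors' analysis: the function names, the two-level nesting (mouse, then event), the use of the mean as the statistic, and the paired comparison of bootstrap draws are all illustrative choices.

```python
import numpy as np


def hierarchical_bootstrap_mean(data_by_mouse, n_boot=2000, seed=0):
    """Bootstrap a grand mean while respecting the mouse -> event nesting.

    data_by_mouse: list of 1-D arrays, one array of per-event responses per mouse
                   (hypothetical data layout).
    Returns an array of n_boot bootstrapped grand means.
    """
    rng = np.random.default_rng(seed)
    n_mice = len(data_by_mouse)
    boot_means = np.empty(n_boot)
    for b in range(n_boot):
        # Level 1: resample mice with replacement.
        mouse_idx = rng.integers(0, n_mice, size=n_mice)
        per_mouse_means = []
        for i in mouse_idx:
            events = np.asarray(data_by_mouse[i], dtype=float)
            # Level 2: resample events within the selected mouse.
            resampled = rng.choice(events, size=events.size, replace=True)
            per_mouse_means.append(resampled.mean())
        # Averaging per-mouse means gives every mouse equal weight,
        # regardless of how many events it contributed.
        boot_means[b] = np.mean(per_mouse_means)
    return boot_means


def prob_a_greater_than_b(boot_a, boot_b):
    """Fraction of paired bootstrap draws in which condition A exceeds B."""
    return float(np.mean(np.asarray(boot_a) > np.asarray(boot_b)))
```

      Because mice are resampled at the top level, a mouse with many locomotion onsets cannot dominate the estimate, which is the concern raised by the reviewer about inflated sample sizes.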

      An additional statistical issue is that it is unclear if the authors are correcting for the use of multiple statistical tests (as in for example Figure 1L and Figure 2B,D). In general, the use of statistics by the authors is not justified in the text.

      We will update and improve the analysis shown in Figure 1L.

      In Figures 2B and 2D, we think adding family-wise error correction would be slightly misleading. We could add a correction – our conclusions would remain unchanged almost independent of the choice of correction (most of the significant p values are infinitesimally small, see Table S1). However, our interpretation is not focusing on one particular comparison (of many possible comparisons) that is significant - all comparisons between closed and open loop data points were significant in the L5 IT recordings and none of them were significant in the recordings in C57BL/6 mice that expressed GCaMP brain-wide.
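
      For readers unfamiliar with family-wise error correction, the sketch below shows one common choice, the Holm-Bonferroni step-down procedure. It is only an illustration of why very small p values survive such a correction; the p values in the usage note are invented and are not taken from Table S1, and the choice of Holm's method is an assumption rather than something the authors specify.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure.

    Returns a list of booleans, True where the corresponding null
    hypothesis is rejected at family-wise error rate alpha.
    """
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest p value (k = rank + 1) is compared
        # against alpha / (m - k + 1) = alpha / (m - rank).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            # Once one test fails, all larger p values fail as well.
            break
    return reject
```

      With hypothetical inputs such as `holm_bonferroni([1e-12, 0.04, 0.03, 1e-8])`, the two extremely small p values remain significant while the two borderline ones do not, which mirrors the authors' point that a correction would leave conclusions based on very small p values unchanged.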

      Finally, it is important to note that whilst the study demonstrates that antipsychotics may selectively impact visuomotor integration in L5 neurons, it does not show that this effect is necessary or sufficient for the action of antipsychotics; though this is likely beyond the scope of the study it is something for readers to keep in mind.

      We fully agree, it is still unclear how the effects we observe in our work relate to the treatment relevant effects in patients. We will expand on this point in the discussion.

    1. The idea here focuses a lot on memory retention of past events which I think can be approached in two ways:

      1. At the level of the EDR itself: just like Dinil already suggested, instead of simply checking a threshold of maliciousness, we can monitor the gradient as well to effectively raise signals with an increasing maliciousness of event sequences. The idea of aggregated events across boots or even distributed analysis across similar clients also fit into this scenario

      2. At the ML level using such architectures as LSTMs, attention layers, memory based graph networks, etc.

      We may need to use both approaches here
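
      A minimal sketch of the first approach (watching the gradient of a maliciousness score rather than only a fixed threshold) might look like the following. Everything here is hypothetical: the class name, the score scale of 0 to 1, the default thresholds, and the simple end-to-end slope over a sliding window are illustrative assumptions, not the behaviour of any particular EDR product.

```python
from collections import deque


class MaliciousnessMonitor:
    """Raise a signal on a high score OR on a steadily rising score."""

    def __init__(self, threshold=0.8, slope_threshold=0.05, window=5):
        if window < 2:
            raise ValueError("window must be at least 2 to estimate a slope")
        self.threshold = threshold
        self.slope_threshold = slope_threshold
        # Keeps only the most recent `window` scores.
        self.scores = deque(maxlen=window)

    def update(self, score):
        """Record a new event score and return a signal name or None."""
        self.scores.append(score)
        # Classic check: the latest score alone crosses the threshold.
        if score >= self.threshold:
            return "threshold"
        # Gradient check: average change per event across a full window.
        if len(self.scores) == self.scores.maxlen:
            slope = (self.scores[-1] - self.scores[0]) / (len(self.scores) - 1)
            if slope >= self.slope_threshold:
                return "rising"
        return None
```

      Persisting the window across reboots, or pooling windows from similar clients, would be one way to extend this towards the aggregated analysis mentioned above.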

    1. single pathways are exclusionary.

      I agree with this statement. There are so many factors to consider when creating something that is to be used by various kinds of people. It may feel as though it is fair in the moment, but we need to stop and think of whether we missed anything or disregarded a disability because we ourselves do not have it.

      I think we’re about to enter a stage of sharing the web with lots of non-human agents that are very different to our current bots – they have a lot more data on how to behave like realistic humans and are rapidly going to get more and more capable. Soon we won’t be able to tell the difference between generative agents and real humans on the web. Sharing the web with agents isn’t inherently bad and could have good use cases such as automated moderators and search assistants, but it’s going to get complicated.

      An internet swarmed by generative agents will be unlike one populated by current bots and scripts. It will be harder to tell the difference between humans and machines online. This may be problematic for those of us who treat the web as a space for human interaction.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      We thank the reviewers for their comments and constructive suggestions to improve the manuscript. We are encouraged to see that both reviewers acknowledge how our manuscript uses state-of-the-art technologies to advance understanding of the molecular underpinnings of centriole length, integrity and function regulation. Both reviewers also highlighted that the manuscript is well laid out and presents clear, rigorous, and convincing data. Reviewer #1 described our manuscript as being of the highest experimental quality and of broad interest to the field of centrosome and cell biology from a basic research and genetics/clinical point of view. Here, we explain the revisions, additional experiments and analyses planned to address the points raised by the referees. We will perform most of the experiments and corrections requested by the reviewers. We have already made several revisions and are currently working on additional experiments.

      Our responses to each reviewer comment in bold are listed below. References mentioned here are listed in the references section included at the end of this document.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: __In this manuscript, Arslanhan and colleagues use proximity proteomics to identify CCDC15 as a new centriolar protein that co-localizes and interacts with known inner scaffold proteins in cell culture-based systems. Functional characterization using state-of-the-art expansion microscopy techniques reveals defects in centriole length and integrity. The authors further reveal intriguing aberrations in the recruitment of other centriole inner scaffold proteins, such as POC1B and the SFI1/centrin complex, in CCDC15-deficient cells, and observe defects in primary cilia.__

      We thank the reviewer for the accurate summary of the major conclusions of our manuscript.

      Major points:

      1) The authors present a high-quality manuscript that identifies a novel centriolar protein by elegantly revealing and comparing the proximity proteomes of two known centriolar proteins, which represents an important component for the maintenance of centrioles.

      We thank the reviewer for highlighting that our manuscript is of high quality and presents important advances for the field.

      __2) Data are often presented from two independent experiments (n = 2), which is nice, but also the minimum for experiments in biology. It is strongly recommended to perform at least three independent experiments.__

      We agree with the reviewer that analysis of data from three experimental replicates is ideal for statistical analysis. We performed three replicates for the majority of experiments in the manuscript. However, as the reviewer pointed out, we included analysis from two experiments for the following figures:

      • Fig. 4H: quantification of CCDC15 total cellular levels throughout the cell cycle by western blotting
      • Fig. 5A: CCDC15-positive centrioles in control and CCDC15 siRNA-transfected cells
      • Fig. 6B: % centriolar coverage of POC5, FAM161A, POC1B and Centrin-2 in control and CCDC15 siRNA-transfected cells
      • Fig. 6C, 6E: Centrin-2 or SFI1-positive centrioles in control and CCDC15 siRNA-transfected cells
      • Fig. 6J, K: normalized tubulin length and percentage of defective centrioles in cells depleted for CCDC15 or co-depleted for CCDC15 and POC1B
      • Fig. 7F, H: SMO-positive cilia and basal body IFT88 levels in control and CCDC15 siRNA-transfected cells
      • Fig. S3H: centriole amplification in HU-treated control and CCDC15 siRNA-transfected cells (no)
      • Fig. S3A: centrosomal levels upon CCDC15 depletion

      There are two reasons why we performed two experimental replicates for these experiments: 1) results from the two experimental replicates were similar, and 2) quantification of data by U-ExM is laborious. To address the reviewer’s comments, we will perform the third experimental replicate for the sets of data that led to the major conclusions of our manuscript, which are Figures 4H, 6C, 6E, 6J, 6K, 7F, 7H and S3A.

      3) The protein interaction studies presented in Fig. 3 could be of higher quality. While it is great that the authors compared interactions to the centriolar protein SAS6, which is not expected to interact with CCDC15, the presented data raise many questions.

      __a) In most cases, co-expression of tagged CCDC15 stabilizes the tested interaction partners, such that the overall abundance seems to be higher. The increase in protein abundance is substantial for Flag-FAM161A (Fig. 3D) and GFP-Centrin-2 (Fig. 3E) and is even higher for the non-interactor SAS6 (Fig. 3G), while it cannot be assessed for GFP-POC1B (Fig. 3F). Hence, the higher expression levels under these conditions make it more likely that these proteins are "pulled down" and therefore do not represent appropriate controls. __

      We agree with the reviewer that the differences in protein abundance of the prey proteins upon expression of CCDC15 relative to control might impact the interpretation of the interaction data. To address this concern, we will perform the following experiments:

      • To account for the potential stabilizing effects of CCDC15 expression, we will change the relative ratio of plasmids expressing proteins of interest and assess bait and prey protein levels. We will then repeat the co-immunoprecipitation experiments in conditions where prey expression levels are similar.
      • To avoid the potential stabilizing effects of CCDC15 overexpression, we will perform immunoprecipitation experiments in cells expressing GFP or V5-tagged inner scaffold proteins and assess their potential physical or proximity interaction by blotting for endogenous CCDC15.

      __b) All Co-IP experiments are lacking negative controls in the form of proteins that are not pulled down under the presented conditions. __

      For the co-IP experiments, we only included a specificity control for the interaction of the bait protein with the tag of the prey protein (i.e. GBP pulldown of GFP or GFP-CCDC15-expressing cells). As the reviewer suggested, we will also include a specificity control for the interaction of bait with the tag of the prey protein for co-immunoprecipitation experiments (i.e. GFP pulldown of cells expressing GFP-CCDC15 with V5-BirA* or V5-BirA*-FAM161A).

      __c) The amounts of co-precipitation of the tested proteins appears very different. Could this reflect strong or weak interactors, or does it reflect the abundance of the respective proteins in centrioles? __

      We agree with the reviewer that the quantity of the co-precipitated prey proteins might be a proxy for the interaction strength if the abundance of the bait proteins is similar. However, the expression levels of bait and prey proteins in co-transfected cells are different and thus, cannot be used to derive a conclusion on the interaction strength. For the revised manuscript, we will repeat the IP experiments and comment on this in the discussion section.

      __4) The observation that IFT88 is supposedly decreased at the base of cilia in CCDC15-depleted cells requires additional experiments/evidence. Fig. 7G shows the results of n = 2 and more importantly, a similar reduction of gamma-tubulin in siCCDC15. Could the observed reduction in IFT88 be explained by a decrease in accessibility to immunofluorescence microscopy? Would the reduction in IFT88 at the base also be apparent when the signals were normalized to gamma-tubulin signals? __

      To address the reviewer’s concern, we quantified the basal body gamma-tubulin and IFT88 levels in control and CCDC15-depleted cells and plotted the basal body IFT88 levels normalized to gamma-tubulin levels in Fig. 7H. Similar to the reduction in IFT88 levels, gamma-tubulin-normalized IFT88 levels were significantly lower in CCDC15-depleted cells relative to control cells. Moreover, the gamma-tubulin basal body levels were similar between control and CCDC15-depleted cells. We revised the gamma-tubulin micrographs in Fig. 7G to represent this. These results indicate that the reduction in basal body IFT88 levels upon CCDC15 depletion is specific.
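      The normalization described above can be illustrated with a minimal sketch. It assumes one IFT88 and one gamma-tubulin intensity measurement per basal body; the function names and the intensity values are hypothetical and are not data or code from the manuscript.

```python
from statistics import mean


def normalized_ratios(ift88, gamma_tubulin):
    """Divide each basal-body IFT88 intensity by its gamma-tubulin intensity."""
    if len(ift88) != len(gamma_tubulin):
        raise ValueError("need one gamma-tubulin value per IFT88 value")
    return [i / g for i, g in zip(ift88, gamma_tubulin)]


def relative_to_control(control_ratios, treated_ratios):
    """Rescale both groups so that the control mean equals 1."""
    reference = mean(control_ratios)
    scaled_control = [r / reference for r in control_ratios]
    scaled_treated = [r / reference for r in treated_ratios]
    return scaled_control, scaled_treated


# Hypothetical intensities in arbitrary units (illustration only).
control = normalized_ratios([100.0, 120.0, 110.0], [50.0, 60.0, 55.0])
depleted = normalized_ratios([60.0, 66.0, 54.0], [50.0, 55.0, 45.0])
control_rel, depleted_rel = relative_to_control(control, depleted)
```

      If the normalized ratio stays lower in depleted cells while gamma-tubulin itself is unchanged, reduced antibody accessibility at the basal body becomes a less likely explanation for the IFT88 decrease.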

      __5) The observed Hedgehog signaling defects are described as follows: "CCDC15 depletion significantly decreased the percentage of SMO-positive cells". It is similarly described in the figure legend. If this was true, the simplest explanation would be that it reflects the reduction in ciliation rate (which is in a similar range). If SMO-positive cilia (instead of "cells") were determined, the text needs to be changed accordingly. __

      As the reviewer pointed out, we quantified SMO-positive cilia, not cells. We apologize for this error. We replaced “SMO-positive cells” with “SMO-positive cilia” in the manuscript text, Fig. 7 and the figure legends.

      __6) OPTIONAL: While expansion microscopy is slowly becoming one of the standard super-resolution microscopy methods, which is particularly well validated for studying centrioles, the authors should consider confirming part of their findings (as a proof of principle, surely not in all instances) by more established techniques. This could serve to convince critical reviewers that may argue that the expansion process may induce architectural defects of destabilized centrioles, as observed after disruptions of components, such as in Fig. 6. Alternatively, the authors could cite additional work that make strong cases about the suitability of expansion microscopy for their studies, ideally with comparisons to other methods. __

      • SIM imaging was previously successfully applied for nanoscale mapping of other centriole proteins including CEP44, MDM1 and PPP1R35 (Atorino et al., 2020; Sydor et al., 2018; Van de Mark et al., 2015). To complement the U-ExM analysis, we have started imaging cells stained for CCDC15 and different centriole markers (i.e. distal appendage, proximal linker, centriole wall) using a recently purchased 3D-SIM superresolution microscope. We already included the SIM imaging data for CCDC15 localization in centrosome fractions purified from HEK293T cells in Fig. S5B. In the revised manuscript, we will replace confocal imaging data in Fig. 3A and 3B with SIM imaging data.
      • As the reviewer noted, expansion microscopy has been successfully used for the analysis of a wide range of cellular structures and scientific questions including nanoscale mapping of cellular structures across different organisms. In particular, U-ExM analysis of previously characterized centrosome proteins has significantly advanced our understanding of centriole ultrastructure. In our manuscript, we used the U-ExM protocol that was validated for centrioles by comparative analysis of U-ExM and cryo-ET imaging by our co-authors (Gambarotto et al., 2019; Hamel et al., 2017). To clarify these points, we included the following sentence along with the relevant references in the introduction: “Application of the U-ExM method to investigate known centrosome proteins has started to define the composition of the inner scaffold as well as other centriolar sub-compartments (Chen et al., 2015; Gambarotto et al., 2021; Gambarotto et al., 2019; Kong and Loncarek, 2021; Laporte et al., 2022; Mahen, 2022; Mercey et al., 2022; Odabasi et al., 2023; Sahabandu et al., 2019; Schweizer et al., 2021; Steib et al., 2022; Tiryaki et al., 2022; Tsekitsidou et al., 2023).”

      Minor points:

      1) Text, figures, and referencing are clear and accurate, apart from minor exceptions.

      We clarified and corrected the points regarding text, figures and references as suggested by the two reviewers.

      __ 2) The title suggests a regulator role for CCDC15 in centriole integrity and ciliogenesis, which has formally not been shown. __

      We revised the title as “CCDC15 localizes to the centriole inner scaffold and functions in centriole length control and integrity”.

      __3) As the authors observe changes in centriole lengths in the absence of CCDC15, it would be very insightful to compare these phenotypes to other components that affect centriolar length, such as C2CD3, human Augmin complex components (as HAUS6 is identified in Fig. 1) or others. These could be interesting aspects for discussion, additional experiments are OPTIONAL. __

      We agree with the reviewer that comparative analysis of centriole length phenotypes for CCDC15 and other components that regulate centriole length will provide insight into how these components work together at the centriole inner core. To this end, we compared CCDC15 loss-of-function phenotypes to those of other components of the inner scaffold (POC5, POC1B, FAM161A) that interact with CCDC15. In agreement with their previously reported functions in U2OS or RPE1 cells, we found that POC5 depletion resulted in a slight (4%) but significant increase in centriole length and POC1B depletion resulted in a significant 15% decrease. In contrast, FAM161A depletion did not alter centriole length (siControl: 447.8±59.7 nm, siFAM161A: 436.3±64 nm). Together, our analysis of their centriolar localization dependency and their roles in centriole length regulation suggests that CCDC15 and POC1B might form a functional complex as positive regulators of centriole length. In contrast, POC5 functions as a negative regulator and might be part of a different pathway for centriole length regulation. We integrated the following sub-paragraph in the results section and also included discussion of this data in the discussion section:

      “Moreover, we quantified centriole length in control cells and cells depleted for POC5 or POC1B. While POC5 depletion resulted in longer centrioles, POC1B depletion resulted in shorter centrioles (POC5: siControl: 414.1±38.3 nm, siPOC5: 432.7±44.8 nm; POC1B: siControl: 400.6±36.1 nm, siPOC1B: 341.5±44.39 nm). FAM161A depletion did not alter centriole length (siControl: 447.8±59.7 nm, siFAM161A: 436.3±64 nm). Together, these results suggest that CCDC15 might cooperate with POC1B and compete with POC5 to establish and maintain proper centriole length.”
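      As a worked check of the percentages quoted in this response, the relative changes can be recomputed from the mean centriole lengths reported above. This is only a sketch: the helper name is hypothetical, and statistical significance cannot be derived from the means alone.

```python
def percent_change(control_mean_nm, depleted_mean_nm):
    """Relative change of the depleted mean versus the control mean, in percent."""
    return 100.0 * (depleted_mean_nm - control_mean_nm) / control_mean_nm


# Mean centriole lengths (nm) as quoted in the response text.
poc5_change = percent_change(414.1, 432.7)      # siControl vs siPOC5
poc1b_change = percent_change(400.6, 341.5)     # siControl vs siPOC1B
fam161a_change = percent_change(447.8, 436.3)   # siControl vs siFAM161A

for name, change in [("POC5", poc5_change),
                     ("POC1B", poc1b_change),
                     ("FAM161A", fam161a_change)]:
    print(f"{name}: {change:+.1f}%")
```

      The POC5 and POC1B values land close to the roughly 4% increase and 15% decrease stated in the response, and the FAM161A change is the smallest of the three.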

      __ 4) While the reduced ciliation rate in the absence of CCDC15 is convincing, the authors did not investigate "ciliogenesis", i.e. the formation of cilia, and hence should re-phrase. The sentence in the discussion that "CCDC15 functions during assembly" should be removed. __

      To clarify that we only investigated the role of CCDC15 in the ability of cells to form cilia, we replaced sentences that indicate “CCDC15 functions in cilium assembly” with “CCDC15 is required for the efficiency of cilia formation”.

      __5) The existence of stably associated CCDC15 pools with centrosomes (Fig. 2) requires further evidence. The recovery of fluorescence after photobleaching in FRAP experiments is strongly dependent on experimental setups and is only semi-quantitative. A full recovery is unrealistic, hence, it is ideally compared to a known static or known mobile component. I personally think this experiment -as it is presented now- is of little value to the overall fantastic study. The authors may consider omitting this piece of data. __

      We agree with the reviewer that FRAP data by themselves do not prove the existence of a stably associated CCDC15 pool. As controls in these experiments, we used FRAP analysis of GFP-CCDC66, which has a 100% immobile pool at the cilia and a 50% immobile pool at the centrosomes as assessed by FRAP (Conkar et al., 2019). To address these points, we toned down the conclusions derived from this experiment by revising the sentence as follows:

      Additionally, we note that the following data provide support for the stable association of CCDC15 at the centrioles:

      • About 49.6% (± 3.96) of the centrioles still had CCDC15 fluorescence signal at one of the centrioles upon CCDC15 siRNA treatment (Fig. 5A, 5B). The inefficient depletion of the mature centriole pool of CCDC15 is analogous to what was observed upon depletion of other centriole lumen and inner scaffold proteins including WDR90 and HAUS6 (Schweizer et al., 2021; Steib et al., 2020).

      __6) The data that CCDC15 is a cell cycle-regulated protein is not very convincing (see Fig. 3H), as the signals are weak and the experiment has been performed only once (n= 1). This piece of data does not appear to be very critical for the main conclusions of the manuscript and may be omitted. Otherwise, this experiment should be repeated to allow for proper statistical analysis. __

      We will perform these experiments two more times, quantify cellular abundance of CCDC15 in synchronized populations from three experimental replicates and plot it with proper statistical analysis.

      __7) Experimental details on how "defective centrioles" are determined are missing. __

      We included the following experimental details to the methods section:

      “Centrioles were considered as defective when the roundness of the centriole was lost or the microtubule walls were broken or incomplete. In the longitudinal views of centrioles, defective centrioles were visualized as heterogenous acetylated signal along the centriole wall or irregularities in the cylindrical organization of the centriole wall (Fig. 5F).” We clarified these points in the methods section.

      __ 8) For figures, in which the focus should be on growing centrioles (see Fig. 4), it could be helpful to guide the reader and indicate the respective areas of the micrographs by arrows. __

      We added arrows to point to the respective areas of the micrographs in Fig. 4F.

      __ 9) Page18: "centriole length shortening" could be changed to "centriole shortening". __

      We corrected this description as suggested.

      __10) It is unclear how the authors determine distal from proximal ends of centrioles in presented micrographs (see Fig. 5D). __

      We determined the proximal and distal ends of the centrioles by taking the centriole pairs as a proxy. Even though we only represent a micrograph containing a single centriole in some of the U-ExM figures including Fig. 5D, the uncropped micrographs contain two centrioles, which are oriented orthogonally and tethered to each other at their proximal ends in interphase cells. We added the following sentence to the methods section to clarify this point:

      *“Since centrioles are oriented orthogonally and tethered to each other at their proximal ends in interphase cells, we also used the orientation of the centriole pairs as a proxy to determine the proximal and distal ends of the centrioles.” *

      __11) Fig. 7A is missing scale bars and Fig.7 overall is lacking rectangle indicators of the areas that are shown at higher magnification in the insets. __

      We added a scale bar to Fig. 7A and rectangle indicators for the zoomed-in regions in Fig. 7A, E, G.

      12) Fig. 7C displays cilia that appear very short, especially when comparing to the micrographs and bar graphs presented. The authors may want to explain this discrepancy.

      We thank the reviewer for the comment. The zoomed-in representative cilium is 4.1 µm long in control cells and 1.4 µm long in CCDC15-depleted cells. Therefore, the representative cilia are in agreement with the quantification of cilia in Fig. 7C.

      Reviewer #1 (Significance (Required)):

      From a technical point of view the authors use two state-of-the-art technologies, namely proximity labeling combined with proteomics and ultrastructure expansion microscopy, that are both challenging and very well suited to address the main questions of this study.

      • General assessment: The presented study is of highest experimental quality. Despite being very challenging, the expansion microscopy and proximity proteomics experiments have been designed and performed very well to allow solid interpretation. The results of the central data are consistent and allow strong first conclusions about the putative function of the newly identified centriolar protein CCDC15. The study presents a solid foundation for future hypothesis-driven, mechanistic analysis of CCDC15 and inner scaffold proteins in centriole length control and maintaining centriole integrity. The only limitation of the study is that the technically simpler experiments should be repeated to allow proper statistical assessment, which can be addressed easily.
      • Advance: This is the first study that identifies CCDC15 as a centriolar protein and localizes it to the inner scaffold. It further describes a function for CCDC15 in centriole length control and shows its importance in maintaining centriole integrity with consequences for stable cilia formation in tissue culture. The study provides further functional insights into the interdependence of inner scaffold proteins and the role of CCDC15 in the recruitment of the SFI1/centrin distal complex.
      • Audience: The manuscript will be of broad interest to the fields of centrosome and cell biology, both from a basic research and genetics/clinical point of view due to the association with human disorders. The state-of-the-art technologies applied will be of interest to a broader cell and molecular biology readership that studies subcellular compartments and microtubules.
      • Reviewer's field of expertise: Genetics, imaging, and protein-protein interaction studies with a focus on centrosomes and cilia.

      We thank the reviewer for recognizing the importance of our work and for supportive and insightful comments that will further strengthen the conclusions of our manuscript. Our planned revisions will address the only major technical limitation raised by the reviewer that requires adding one more experimental replicate for analysis of the data detailed in major point#1. Notably, we also thank the reviewer for specifying the experiments that are not essential or will be out of the scope of our manuscript as “optional”.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      __In this study, Arslanhan et al. propose CCDC15 as a novel component of the centriole inner scaffold structure with potential roles in centriole length control, stability and the primary cilium formation in cultured epithelial cells. Using proximity labelling they explore the common interactors of Poc5 and Centrin-2, two resident molecules of the centriole inner scaffold, to hunt for novel regulators of this structure. The authors leverage expansion microscopy-based localization and siRNA-dependent loss-of-function experiments to follow up on one such protein they identify, CCDC15, with the aforementioned roles in centriole and cilia biology.

      This study is designed and laid out nicely; however, to be able to support some of the important claims regarding their proximity labelling results and exploration on the roles of CCDC15, there are several major technical and reproducibility concerns that deem major revision. Similarly, the introduction (perhaps inadvertently) omits much of the recent studies on centriole size control that have highlighted the complexity of this biological problem. As such, addressing the following major points will be essential in further considering this work for publication. __

      __We thank the reviewer for recognizing the importance of our work and appreciate the positive reflections on our manuscript and the feedback comments that were well thought-out and articulated and will further strengthen the conclusions of our manuscript. Our planned revisions focus on addressing the reviewer’s comments especially in further supporting our conclusions for proximity-labeling, phenotypic characterization and immunoprecipitation experiments, examining CCDC15 centriole localization in an additional cell line and investigating how CCDC15 works together during centriole length control with known components of the inner scaffold. __

      Major points:

      __1a) The authors use Poc5 and Centrin-2 molecules as joint baits to reveal the interactome of the centriole inner scaffold, however the work lacks appropriate experimental and analytical controls to argue that this is a proximity mapping "at the centriole inner scaffold". In its current state, it is simply an interactome of total Poc5 and Centrin-2, and it might be misleading to call it an interactome at the centriole inner scaffold (the statistical identification of shared interactors cannot do full justice to their biology at the centrosome). Appropriate expression data needed to delineate how large the centrosomal vs. cytoplasmic (or nucleoplasmic) fraction is for either of these molecules, both without and upon the addition of biotin (to see whether the bulk of interaction data stem from the cytoplasm/nucleoplasm or the centrioles themselves). The authors can test this by selectively blotting a lysate fraction containing the centrosomes after centrifugation, and compare them with the simultaneous blot of the supernatant (which were readily used for the blots presented in Fig. 1B). This experiment also becomes very relevant for the case of Centrin-2, as it also heavily localizes to the nucleoplasm as the authors found out (see Fig. 1A and Fig. S1A). __

      __ Additionally, an orthogonal approach should be taken to perform bio-image analysis on their biotin/streptavidin imaging data to demonstrate the exact ratios between the centrosomal vs. cytoplasmic/nucleoplasmic biotin activation with appropriate signal normalization between the biotin/streptavidin images. This is particularly important, as although the authors claim that these cells stably express the V5BirA*, it seems that there is partial clonality to the expression. Some cells in both the Poc5 and Centrin-2 fusion constructs appear to lack the V5/Streptavidin signals upon Biotin addition (such as the two cells in the centre right in Poc5, and again a cell in the centre right for Centrin-2 images). In its current form, Fig. 1A lacks signal quantification and does not report any information about the replicates and distributions of the data. I worry that this may raise concerns on the reproducibility if published in its current form. __

      a) We agree with the reviewer that the proximity maps of POC5 and Centrin-2 are not specific to the centriole inner scaffold and thus, do not represent the inner scaffold interactome. The proximity maps identified interactions across different pools of POC5 and Centrin-2 in nucleus, cytoplasm and centrosomes (Fig. 1, S1). To highlight these important points, we already included extensive analysis of the different cellular compartments and biological processes identified by the POC5 and Centrin-2 proximity maps in the results section (pg. 9-10).

      We think that there are two reasons that caused the misinterpretation of the use of these proximity maps as the “inner scaffold interactome”: 1) the way we introduced the motivation for proximity mapping studies, 2) proposing the use of the resulting interactomes as resources for identification of the full repertoire of the inner scaffold proteins. To clarify these points, we revised the manuscript in all relevant parts that might have led to misinterpretation. Following are the specific revisions:

      • To clarify that the proximity maps are not specific to the inner scaffold pools of POC5 and Centrin-2, we revised the title of the results section for Fig. 1 and 2 as follows: “Proximity mapping of POC5 and Centrin-2 identifies new centriolar proteins”.

      • To indicate that POC5 and Centrin-2 localize to the cytoplasm and/or nucleus in addition to the centrosome, we added the following sentence to the results section: “In addition to centrosomes, both fusion proteins also localized to and induced biotinylation diffusely in the cytoplasm and/or nucleus (Fig. 1A).”

      • In the introduction, we revised the following sentence “Here, we used the known inner scaffold proteins as probes to identify the molecular makeup of the inner scaffold in an unbiased way.” as follows: *“Here, we used the known inner scaffold proteins as probes to identify new components of the inner scaffold”. *

      • To highlight the different cellular pools of POC5 and Centrin-2 and identification of their interactors in these pools, we included the following sentence in the results section: “As shown in Fig. S1, Centrin-2 and POC5 proximity interactomes were enriched for GO categories that are relevant for their published functions during centrosomal, cytoplasmic and/or nuclear biological processes and related cellular compartments (Azimzadeh et al., 2009; Dantas et al., 2013; Heydeck et al., 2020; Khouj et al., 2019; Resendes et al., 2008; Salisbury et al., 2002; Steib et al., 2020; Yang et al., 2010; Ying et al., 2019).”

      • We replaced the “interactome” statement with “proximity interaction maps” or “proximity interactors” throughout the manuscript to prevent the conclusion that the proximity maps represent the inner scaffold interactome.

      b) As the reviewer noted, most centrosome proteins have multiple different cellular pools including the centrosome. For most proteins like gamma-tubulin and centrin, their cytoplasmic/nucleoplasmic pools are more abundant than their centrosomal pools (Moudjou et al., 1996; Paoletti et al., 1996). For the Firat-Karalar et al. Current Biology 2015 paper, I compared the biotinylation levels of centrosomal fractions versus cytoplasmic fractions and confirmed that this is also true in cells expressing myc-BirA* fusions of CDK5RAP2, CEP192, CEP152 and CEP63 (unpublished) (Firat-Karalar et al., 2014). For the revised manuscript, we will compare the biotinylation level of centrosomal, nuclear and cytoplasmic pools of V5BirA*-POC5 and V5BirA*-Centrin-2 using the stable lines. To this end, we will use published centrosome purification protocols. We will include this data in Fig. S1 to highlight that the proximity interactomes represent the different pools of the bait proteins and to show the relative levels of the baits across their different pools.

      c) The BioID approach has been successfully used to probe centrosome interactions by my lab and other labs in the field. In fact, proximity interaction maps of over 50 centrosome proteins were published as resource papers by the Pelletier and Gingras labs (Gheiratmand et al., 2019; Gupta et al., 2015). Analogous to our strategy in this manuscript, these studies generated proximity maps of centrosome proteins by creating cell lines that stably express BioID-fusions of centrosome proteins followed by streptavidin pulldowns from whole cell extracts and mass spectrometry analysis. Since the majority of centrosome proteins also have pools in multiple cellular locations, the published BioID proximity maps for centrosome proteins are not specific to centrosomes. However, the proximity maps included all known centrosome proteins and identified new proteins, which shows that centrosome interactions are represented in pulldowns from whole cell lysates. Moreover, maps from whole cell lysates are also advantageous as they are unbiased and can be used in future studies as resources for studying the functions and interactions of the bait proteins in different contexts.

      In the Firat-Karalar et al. Current Biology 2015 paper, I combined centrosome purifications with BioID pulldowns to enrich for the centrosomal interactions in the proximity maps of centriole duplication proteins (Firat-Karalar et al., 2014). However, I started the purification with cells transiently transfected with the BioID-fusion constructs, which resulted in high ectopic expression of the fusions in the cytoplasm and/or nucleus. Therefore, centrosome enrichments were useful as an additional step before mass spectrometry. As compared in the Gupta et al. Cell 2015 study, the proximity maps of 4 centrosome proteins generated from stable lines substantially overlapped with those generated from centrosome fractions of transiently transfected cells and were more comprehensive (Table S2) (Gupta et al., 2015). Therefore, we are confident that the proximity interactomes we generated for POC5 and Centrin-2 include their centrosomal interactions.

      __1b) Similarly, it is not clear whether the expression of Poc5 and Centrin-2 fusion molecules somehow interfere with their endogenous interactions or function. At least some loss-of-function (e.g., RNAi) experiments should be performed where the depletion of endogenous proteins should be attempted to rescue by the fusion constructs. This will help evaluate whether the fusion proteins can rescue the depletion of their endogenous counterparts and behave as expected from a wild-type scenario. __

      The reviewer raises an important concern regarding the physiological relevance of the POC5 and Centrin-2 proximity maps. In the manuscript, we showed and discussed the validation of their proximity interactomes by two lines of evidence, which are: 1) the interactomes identified the previously described cellular compartments, biological processes or interactors of POC5 and Centrin-2, 2) the interactomes led to the identification of CCDC15 as a new inner scaffold protein.

      As the reviewer indicated, stable expression of POC5 and Centrin-2 in the presence of their endogenous pools might affect cellular physiology and thereby the landscape of the interactomes. We plan to address this using the following experiments:

      a) We will perform a set of functional assays to assess whether stable V5BirA*-Centrin-2 and V5BirA*-POC5 cells behave like control cells in terms of their centrosome number, cell cycle profiles and mitotic progression. We will specifically quantify:

      • centrosome number (immunofluorescence analysis for gamma-tubulin and centrin)
      • mitotic index (immunofluorescence analysis by DAPI)
      • spindle polarity and percentage of multinucleation (immunofluorescence analysis for microtubules, gamma-tubulin and DAPI)
      • cell cycle profiles (flow cytometry and immunofluorescence)
      • apoptosis (immunoblotting for caspase 3)

      Together, results from these experiments will indicate whether the V5BirA*-POC5 or V5BirA*-Centrin-2-expressing stable lines exhibit defects associated with stable expression of the fusion proteins.

      b) We will perform expansion microscopy in V5BirA*-Centrin-2 and V5BirA*-POC5 cells to assess whether the fusion protein specifically localizes to the centriole inner scaffold, which will provide support for the presence of inner scaffold proteins in their proximity maps. Specifically, we plan to stain the fusion proteins by V5 or BirA antibodies and include the data for the antibody that works for expansion microscopy. This experiment will address whether their stable expression results in specific localization of these proteins at the centriole inner scaffold.

      1c) Overall, as the entire claim around the proximity mapping revolve around its assumption about the centriole inner scaffold, these controls seem imperative to substantiate the ground truth of the biology presented in the manuscript.

      In the revised manuscript, we toned down our claims and made it clear that the Centrin-2 and POC5 proximity maps are not specific to the inner scaffold and do not represent the inner scaffold interactome. Since the maps were generated from the whole cell extract, they will provide a resource for future studies aimed at studying functions and mechanisms of POC5 and Centrin-2 across their different cellular pools including the centrosome.

      We would like to also highlight that the proximity maps of POC5 and Centrin-2 are not the major advances of our manuscript. The major advance of our manuscript is the identification of CCDC15 as a new inner scaffold protein that is required for regulation of centriole size and architectural integrity and thereby, for maintaining the ability of centrioles to template the assembly of functional cilia. Importantly, our results identified CCDC15 as the first dual regulator of centriolar recruitment of inner scaffold protein POC1B and the distal end SFI1/Centrin complex and provided important insight into how inner scaffold proteins work together during centriole integrity and size regulation. The new set of experiments we will perform for the revisions of the paper will strengthen these conclusions.

      __2) I am curious about the choices of the cell lines in this work. The proximity mapping to reveal CCDC15 as a candidate protein for centriole inner scaffold was performed in HEK293T cells (human embryonic kidney), however its immunostaining was performed using RPE1 and U2OS cells (human retinal and osteosarcoma epithelial cells respectively). This raises questions regarding the generality of CCDC15 as a centriole inner scaffold protein. Could CCDC15 be simply unique to the centriole inner scaffold of epithelial cells such as RPE1 and U2OS cells? Or could the authors demonstrate any information/data on whether it's similarly localized to the inner scaffold in embryonic kidney cells or other cell types? If not, the claims should be moderated to reflect this fine detail. __

      To test whether CCDC15 localizes to the inner scaffold in other cell types, we performed U-ExM analysis of CCDC15 localization relative to the centriolar microtubules in differentiating multiciliated epithelial cultures (MTEC). As shown in Fig. S3A, CCDC15 localized to the inner scaffold of centrioles in MTEC ALI+4 cells. Given that the inner scaffold proteins, including CCDC15 and previously characterized ones, have not been studied in multiciliated epithelia, this result is important and provides support for a potential role of the inner scaffold in ensuring centriole integrity during ciliary beating. Additionally, we examined CCDC15 localization by 3D-SIM in centrosomes purified from HEK293T cells, which showed that CCDC15 localizes between the distal centriole markers CEP164 and Centrin-3 and the proximal centriole markers gamma-tubulin and rootletin (Fig. S3B).

      3) Discussions and data on the localization of CCDC15 to centriolar satellites appear anecdotal and not fully convincing (Fig. S2D). Given that the authors test the relevance of PCM1 for CCDC15's centriolar localization, it is key to have quantitative data supporting their claim that centriolar satellites can help recruit CCDC15 to the centriole. Could the authors quantify what proportion of CCDC15 localizes to the centriolar satellites? One way to do this could be to quantify the colocalization coefficient of CCDC15 and PCM1 signals.

      We only observed co-localization of CCDC15 with the centriolar satellite marker PCM1 in cells transiently transfected with mNG-CCDC15. In Fig. S2E, we included the quantification of the percentage of U2OS and RPE1 cells that exhibit co-localization of mNG-CCDC15 with PCM1 (100% of U2OS cells, about 80% of RPE1 cells). Like CCDC15, ectopic expression of WDR90 revealed its centriolar satellite localization, suggesting a potential link between centriolar satellites and inner scaffold proteins that can be investigated in future studies (Steib et al., 2020). We have now included these results in the discussion section as follows:

      “As assessed by co-localization with the centriolar satellite marker PCM1, mNG-CCDC15 localized to centriolar satellites in all U2OS cells and in about 80% of RPE1 cells (Fig. S2C-E). Association of CCDC15 with centriolar satellites is further supported by its identification in the centriolar satellite proteomes (Gheiratmand et al., 2019; Quarantotti et al., 2019).”

      Even though endogenous staining for CCDC15 did not reveal its localization to centriolar satellites, the following lines of evidence support the presence of a dynamic and low-abundance pool of CCDC15 at the centriolar satellites: 1) CCDC15 was identified in the centriolar satellite proteome and interactome (Gheiratmand et al., 2019; Quarantotti et al., 2019). 2) CCDC15 centrosomal targeting is in part regulated by PCM1 (Fig. S2F, S2G). For the majority of the proteins identified in the centriolar satellite proteome, the satellite pool can only be observed upon ectopic expression. This might be because their centriolar satellite pool is of low abundance and transient, as satellite interactions are extensively identified only in proximity mapping studies but not in traditional pulldowns.
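      The colocalization coefficient suggested by the reviewer could be computed along the following lines. This is only a minimal sketch of a Pearson colocalization coefficient between two image channels, not the analysis used in the manuscript. The function name `pearson_colocalization`, the optional `mask` argument, and the synthetic arrays standing in for CCDC15 and PCM1 images are all hypothetical.

```python
import numpy as np


def pearson_colocalization(channel_a, channel_b, mask=None):
    """Pearson correlation of pixel intensities between two channels.

    Hypothetical helper for illustration: `mask` optionally restricts the
    calculation to a region of interest (for example, a single cell).
    """
    a = np.asarray(channel_a, dtype=float)
    b = np.asarray(channel_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("channels must have the same shape")
    if mask is not None:
        selected = np.asarray(mask, dtype=bool)
        a, b = a[selected], b[selected]
    else:
        a, b = a.ravel(), b.ravel()
    # Centre both channels, then divide the covariance term by the
    # product of the two standard-deviation terms.
    a = a - a.mean()
    b = b - b.mean()
    denominator = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    if denominator == 0:
        return float("nan")  # a flat channel has no defined correlation
    return float((a * b).sum() / denominator)


# Synthetic stand-ins for two fluorescence channels (not real data).
rng = np.random.default_rng(0)
ccdc15 = rng.random((64, 64))
pcm1_overlapping = 2.0 * ccdc15 + 1.0   # perfectly correlated signal
pcm1_excluded = 1.0 - ccdc15            # perfectly anti-correlated signal

r_overlap = pearson_colocalization(ccdc15, pcm1_overlapping)
r_excluded = pearson_colocalization(ccdc15, pcm1_excluded)
```

      A value near 1 indicates that the two signals rise and fall together, a value near 0 indicates no linear relationship, and a value near -1 indicates mutual exclusion. On real images, background subtraction and a per-cell mask would normally be applied first.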

      __4) Similar to above (#3), there is no quantitative information on the co-localization or partial co-localization of the signal foci in Fig. 3A and 3B. The authors readily study CCDC15's localization in wonderful detail in their expansion microscopy data, so they could actually consider taking out Fig. 3A and 3B, as the data seem redundant without any quantification. __

      To address the reviewer’s concern, we included plot intensity profile analysis of CCDC15 and different centriole markers along a line drawn at the centrioles in Fig. 3A and 3B, which shows the extent of their overlap. As part of our revision plan, we will replace the confocal imaging data in Fig. 3A and 3B with 3D-SIM imaging data of CCDC15 relative to different centriole markers together with plot profile analysis. We already included 3D-SIM imaging of centrosomes purified from HEK293T cells in Fig. S3B. The 3D-SIM imaging data will complement the localization data revealed by U-ExM.

      __5) Do the authors also feel that CCDC15 localizes to the core lumen in a somewhat helical manner (Fig. 1A, Fig. 1F top and bottom panels, Fig. 5A etc.)? Le Guennec et al. 2020's helical lattice proposal for the inner scaffold further reaffirms that CCDC15 is indeed a likely major component of the inner scaffold. In my view, authors should state this physical similarity explicitly to further support their findings on CCDC15. __

      As the reviewer indicated, cryo–electron tomography and subtomogram averaging of centrioles from four evolutionarily distant species showed that centriolar microtubules are bound together by a helical inner scaffold covering ~70% of the centriole length (Le Guennec et al., 2020). Although U-ExM data do not have enough resolution to show that CCDC15 localizes in a helical manner, we agree with the reviewer that the discussion of this possibility is important and thus we included the following sentence in the results:

      “Longitudinal views suggest potential helical organization of CCDC15 at the inner scaffold, which is consistent with its reported periodic, helical structure (Le Guennec et al., 2020).”

      __6a) The data on the link between the CCDC15 recruitment and the centriole growth (Fig. 4F) or the G2 phase of the cell cycle (Fig. 4H) are not fully convincing without quantitative data. For Fig. 4F, the authors should consider plotting the daughter centriole length vs the daughter CCDC15 intensities against each another, to see whether more elongated daughters truly tend to have more CCDC15. __

      To address the reviewer’s concern, we will plot the daughter centriole length versus CCDC15 intensity at different stages of centriole duplication. In the asynchronous cultures that we analyzed with U-ExM, we were not able to find enough cells to perform such quantification. To overcome this limitation, we will perform U-ExM analysis of cells fixed at different time points after mitotic shake-off and stained for CCDC15 and tubulin. In the revised manuscript, we will include at least 10 representative U-ExM images for the different stages of centriole duplication, along with quantification of length versus signal.

      As detailed in the results section, the goal of these experiments was to determine when CCDC15 is recruited to the procentrioles during centriole duplication, but not to suggest a role for CCDC15 in centriole growth. We clarified this by including the following sentence:

      “To investigate the timing of CCDC15 centriolar recruitment during centriole biogenesis, we examined CCDC15 localization relative to the length of procentrioles that represent cells at different stages of centriole duplication (Fig. 4F).”

      __6b) For Fig. 4H, the argument regarding the cell cycle regulation requires quantification of the bands from several WB repeats, normalized to the expression of GAPDH within each blot (this is particularly relevant, as the bands of CCDC15 do not look dramatically different enough to draw conclusions by eye). __

      We will perform these experiments two more times, quantify cellular abundance of CCDC15 in synchronized populations from three experimental replicates and plot it with proper statistical analysis.

      __7a) The authors find herein that CCDC15 depletion leads to centrioles that are ~10% shorter than the controls. With the depletion of Poc5 and Wdr90 (other proposed components of the inner scaffold), the centrioles end up larger however (Steib et al., 2020). If the role of inner scaffold in promoting centriole elongation is structural, why are these two results the opposite of each other? I realize there is a brief discussion about this at the end of the paper, however, this requires a detailed discussion and speculation on the relevance of these findings. It would be key to clarify whether the inner scaffold as a structure inhibits or promotes centriole growth - or somehow both? If so, how? __

      We agree with the reviewer that comparative analysis of centriole length phenotypes for CCDC15 and other components that regulate centriole length will provide insight into how these components work together at the centriole inner core. To this end, we compared CCDC15 loss-of-function phenotypes to those of other components of the inner scaffold (POC5, POC1B, FAM161A) that interact with CCDC15. In agreement with their previously reported functions in U2OS or RPE1 cells, we found that POC5 depletion resulted in a slight but significant increase in centriole length of about 4%, and POC1B depletion resulted in a significant decrease of about 15%. In contrast, FAM161A depletion did not alter centriole length (siControl: 447.8±59.7 nm, siFAM161A: 436.3±64 nm). Together, our analysis of their centriolar localization dependency and regulatory roles in centriole length control suggests that CCDC15 and POC1B might form a functional complex as positive regulators of centriole length. In contrast, POC5 functions as a negative regulator and might be part of a different pathway for centriole length regulation. We integrated the following sub-paragraph in the results section on pg. 19 and also included discussion of this data in the discussion section on pg. 23:

      “Moreover, we quantified centriole length in control cells and cells depleted for POC5 or POC1B. While POC5 depletion resulted in longer centrioles, POC1B depletion resulted in shorter centrioles (POC5: siControl: 414.1±38.3 nm, siPOC5: 432.7±44.8 nm; POC1B: siControl: 400.6±36.1 nm, siPOC1B: 341.5±44.39 nm). FAM161A depletion did not alter centriole length (siControl: 447.8±59.7 nm, siFAM161A: 436.3±64 nm). Together, these results suggest that CCDC15 might cooperate with POC1B and compete with POC5 to establish and maintain proper centriole length.”
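      The approximate percentage changes quoted in our response follow directly from the mean lengths reported in this paragraph, as the short calculation below illustrates. It uses only the means given above; the helper name `percent_change` is ours, and the calculation says nothing about statistical significance, which depends on the spread and the number of centrioles measured.

```python
def percent_change(control_nm, depleted_nm):
    """Relative change in mean centriole length, in percent of control."""
    return 100.0 * (depleted_nm - control_nm) / control_nm


# (siControl mean, siRNA-depleted mean) in nm, as quoted in the text above.
mean_lengths_nm = {
    "POC5": (414.1, 432.7),
    "POC1B": (400.6, 341.5),
    "FAM161A": (447.8, 436.3),
}

changes = {
    target: percent_change(control, depleted)
    for target, (control, depleted) in mean_lengths_nm.items()
}

for target, change in changes.items():
    print(f"{target}: {change:+.1f}%")
# POC5: +4.5%
# POC1B: -14.8%
# FAM161A: -2.6%
```

      These values are in line with the roughly 4% increase for POC5 and roughly 15% decrease for POC1B stated above.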

      __7b) There might be some intriguing opposing regulatory action of Poc5 and CCDC15 as demonstrated here, where CCDC15 depletion leads to slightly over-recruitment of Poc5, and vice versa. Does this suggest that a tug-of-war going on between different molecules that localize to the inner scaffold? Does this provide some dynamicity to this structure, which might in turn regulate centriole length both positively and negatively? This may be analogous to how opposing forces of dyneins and kinesins provide robust length control for mitotic spindles. I am speculating here, but hopefully these may provide some useful grounds for further discussion in the paper. If the authors deem it interesting experimentally, they can test whether the two molecules indeed regulate centriole length by opposing each other's action, by a double siRNA of CCDC15 and Poc5 to see if this retains the centriole length at its control siRNA size (like how they do a similar test for Poc1's potential co-operativity with CCDC15 in Fig. 6J). __

      We thank the reviewer for proposing excellent ideas on how inner scaffold proteins work together to regulate centriole length. As proposed by the reviewer, different proteins might oppose each other, analogous to how dynein and kinesin regulate mitotic spindle length. Loss-of-function and localization dependency data suggest that CCDC15 cooperates with POC1B, which was supported by phenotypic characterization of co-depleted cells (Fig. 6I-K).

      The increase in POC5 levels and coverage at the centrioles upon CCDC15 depletion and vice versa (Fig. 7B, 7G) suggests that CCDC15 and POC5 compete with each other in centriole length regulation. As suggested by the reviewer, we attempted to test this by comparing centriole length in cells co-depleted for CCDC15 and POC5 relative to their individual depletions. Although we tried different depletion workflows, we were not able to co-deplete CCDC15 and POC5. Specifically, we tried transfecting cells with CCDC15 and POC5 siRNAs at the same time or sequentially for 48 h or 96 h. The centrioles in cells that survived co-depletion were positive for both CCDC15 and POC5. This might be because co-depletion of both proteins is toxic to cells. Since CCDC15 and POC5 are likely part of two different pathways in the regulation of centrioles and also have other cellular functions, their co-depletion might have caused cell death. We included the following statement in the discussion to address the excellent model proposed by the reviewer:

      “Taken together, our results suggest that CCDC15 cooperates with POC1B and competes with POC5 during centriole length regulation. Moreover, they also raise the exciting possibility that centriole length can be regulated by opposing activities of inner scaffold proteins. Future studies that explore the relationship among centriole core proteins are required to uncover the precise mechanisms by which they regulate centriole integrity and size.”

      __8) In their introduction section, the authors discuss how relatively little is known about the size control of centrioles, however they fail to mention a series of recent primary literature that uncover striking, new mechanisms and novel molecular players that highlight the complexity of centriole size control. This complexity appears to arise from the existence of multitude of length control mechanisms that influence the cartwheel or the microtubule length individually, or simultaneously via yet-to-be further explored crosstalk mechanisms. a. As such, when the authors talk about the procentriole size control in the introduction, they should discuss and refer to the following studies, in terms of: • How theoretical and experimental work demonstrate that procentriole length may vary dependent on the levels of its building block Sas-6 in animals (Dias Louro et al., 2021 PMID: 33970906; Grzonka and Bazzi, 2022 bioRxiv). • How a homeostatic Polo-like kinase 4 clock regulates centriole size during the cell cycle (Aydogan et al., 2018 JCB PMID: 29500190), and how biochemistry and genetics coupled with mathematical modelling unravel a conserved negative feedback loop between Cep152 and Plk4 that constitutes the oscillations of this clock in flies (Boese et al., 2018 PMID: 30256714; Aydogan et al., 2020 PMID: 32531200) and human cells (Takao et al., 2019 PMID: 31533936). __

      __b. Similarly, when the authors refer to centriole size control induced by microtubule-related proteins, they should highlight the further complexity of this process by referring to: • How a molecule located at the microtubule wall, Cep295/Ana1, can regulate centriole length in flies (Saurya et al., 2016 PMID:27206860) and human cells (Chang et al., 2016 PMID:27185865) - like all the other centriolar MT molecules that the authors discuss in the manuscript. • How a crosstalk between Cep97 and Cep152 influences centriole growth in fly spermatids (Galletta et al., 2016 PMID:27185836). • How a crosstalk between CP110-Cep97 and Plk4 influences centriole growth in flies (Aydogan et al., 2022 PMID:35707992), and this molecular crosstalk is conserved, at least biochemically, in human cells (Lee et al., 2017 PMID:28562169). __

      We thank the reviewer for highlighting the papers that uncovered new mechanisms and players of centriole size and integrity control as well as for the detailed explanation of how different studies led to these discoveries in different organisms. We should have discussed these proteins, functional complexes and mechanisms in our manuscript and cited the relevant literature. We inadvertently focused on literature that uncovered centriole length regulation by MAPs and the inner scaffold. In the introduction section of the revised manuscript where we introduced centriole size regulation in pg. 5, we summarized the major findings on the role of different MAPs, cartwheel and PLK4 homeostatic clock in ensuring formation of centrioles at the correct size in different organisms.

      __Minor points: __

      __1) Introduction section: Literature reference missing for the sentence starting with "Importantly, the stable nature of centrioles enables them to withstand...". __

      We cited research articles that show the importance of centriole stability during ciliary motility and cell division.

      “Importantly, the stable nature of centrioles enables them to withstand mechanical forces during cell division and upon ciliary and flagellar motility (Abal et al., 2005; Bayless et al., 2012; Meehl et al., 2016; Pearson et al., 2009).”

      __2) Fig. S1 legend: A typo as follows: CRAPome banalysis should read CRAPome analysis. __

      We corrected this typo.

      __3) Fig. S2: Info on the scale bar in the legend is missing in Fig. S2A. Scale bars for different panels are missing in general in Fig. S2A. __

      We added scale bar information for Fig. S2A and to all other supplementary figure legends that lack scale bar information.

      __4) Fig. 3A and 3B: When displaying the data, coloured cartoon diagrams would be beneficial to guide the reader who are not fully familiar with the spatial orientation of these proteins. __

      As suggested by the reviewer, we will remove the confocal imaging data for CCDC15 localization from Fig. 3A and 3B. For the revised version, we will include 3D-SIM imaging data along with a diagram that represents the spatial orientation of CCDC15 relative to the chosen centriole markers.

      __5) Fig. 3H: No information about the sample number (number of cells or technical repeats examined) reported. __

      We included information on the number of experimental replicates and cells analyzed.

      __6) Fig. S3B legend: A typo as follows: CCD15-depelted RPE1 cells should read CCDC15-depleted RPE1 cells. __

      We corrected this typo.

      __7) Fig. S3B legend: A typo as follows: cellswere fixed with should read cells were fixed with. __

      We corrected this typo.

      __8) There are many spelling mistakes and typos throughout the paper. I have listed a few examples above, but please carefully read through the manuscript to correct all the errors. __

      Thank you for pointing out the spelling mistakes that we failed to correct before the initial submission. We have carefully read through the revised manuscript and corrected the mistakes.

      __9) Fig. S3E: The orange columns depicting % of cells with Sas-6 dots look awkward. Why do the columns look larger than the mean line? Please correct as appropriate. __

      The total percentage of cells in the two categories (orange and purple) we counted is 100%, which corresponds to the column value at the y-axis. Therefore, the value for each experimental replicate for the orange category is less than 100% and is marked below the 100% line.

      __10) Although authors provide microscopy information for the U-ExM and FRAP experiments, there is no information about the microscopy on regular confocal imaging experiments which should be detailed in Materials and Methods. Also, there is no information about the lenses, laser lines and the filter sets that were used in the imaging experiments. These should be provided as well. __

      In the methods section, we now included detailed information for the microscopes we used and imaging setup (lenses, laser lines, filter sets, detectors, z-stack size, resolution).

      11)

      • __ Fig. 2A: lacks a scale bar. __
      • __ Fig. 2C legend: lacks info on the scale bar length. __
      • __ Fig. 5A legend: lacks info on the scale bar length. __
      • __ Fig. 7A: lacks a scale bar. __
      • __ Fig. 7G legend: lacks info on the scale bar length. __
      • __ Fig. S2C-E: lack scale bars. __
      • __ Fig. S3D, F and H: lack scale bars. (Fig. S4 in the revised manuscript)__
      • __ Fig. S3J legend: lacks info on the scale bar length. (Fig. S4 in the revised manuscript)__
      • __ Fig. S4A, B, D and E: lack scale bars. (Fig. S5 in the revised manuscript)__
      • __ Fig. S4C legend: lacks info on the scale bar length. (Fig. S5 in the revised manuscript)__
      • __ Fig. S4G legend: lacks info on the scale bar length. (Fig. S5 in the revised manuscript)__

      We added the scale bars and the size information to the figures and figure legends for the above figures.

      Reviewer #2 (Significance (Required)): __The findings of this study join among the relatively new literature (e.g., Steib et al., 2020 and Le Guennec et al. 2020) on the nature of centriole inner scaffold and its potential roles in centriole formation, integrity and its propensity to form the primary cilium. Therefore, it will be of interest to a group of scientists studying these topics in the field of centrosomes/cilia.

      My expertise is on the biochemistry and genetics of centriole formation in animals.__

      We thank the reviewer for his/her comments and constructive feedback to improve our manuscript. We are encouraged to see that the reviewer acknowledges how the results from our manuscript advance our understanding of the regulation of centriole length, integrity and function.

      References

      Abal, M., G. Keryer, and M. Bornens. 2005. Centrioles resist forces applied on centrosomes during G2/M transition. Biol Cell. 97:425-434.

      Atorino, E.S., S. Hata, C. Funaya, A. Neuner, and E. Schiebel. 2020. CEP44 ensures the formation of bona fide centriole wall, a requirement for the centriole-to-centrosome conversion. Nat Commun. 11:903.

      Azimzadeh, J., P. Hergert, A. Delouvee, U. Euteneuer, E. Formstecher, A. Khodjakov, and M. Bornens. 2009. hPOC5 is a centrin-binding protein required for assembly of full-length centrioles. J Cell Biol. 185:101-114.

      Bayless, B.A., T.H. Giddings, Jr., M. Winey, and C.G. Pearson. 2012. Bld10/Cep135 stabilizes basal bodies to resist cilia-generated forces. Mol Biol Cell. 23:4820-4832.

      Chen, F., P.W. Tillberg, and E.S. Boyden. 2015. Optical imaging. Expansion microscopy. Science. 347:543-548.

      Conkar, D., H. Bayraktar, and E.N. Firat-Karalar. 2019. Centrosomal and ciliary targeting of CCDC66 requires cooperative action of centriolar satellites, microtubules and molecular motors. Sci Rep. 9:14250.

      Dantas, T.J., O.M. Daly, P.C. Conroy, M. Tomas, Y. Wang, P. Lalor, P. Dockery, E. Ferrando-May, and C.G. Morrison. 2013. Calcium-binding capacity of centrin2 is required for linear POC5 assembly but not for nucleotide excision repair. PLoS One. 8:e68487.

      Firat-Karalar, E.N., N. Rauniyar, J.R. Yates, 3rd, and T. Stearns. 2014. Proximity interactions among centrosome components identify regulators of centriole duplication. Curr Biol. 24:664-670.

      Gambarotto, D., V. Hamel, and P. Guichard. 2021. Ultrastructure expansion microscopy (U-ExM). Methods Cell Biol. 161:57-81.

      Gambarotto, D., F.U. Zwettler, M. Le Guennec, M. Schmidt-Cernohorska, D. Fortun, S. Borgers, J. Heine, J.G. Schloetel, M. Reuss, M. Unser, E.S. Boyden, M. Sauer, V. Hamel, and P. Guichard. 2019. Imaging cellular ultrastructures using expansion microscopy (U-ExM). Nat Methods. 16:71-74.

      Gheiratmand, L., E. Coyaud, G.D. Gupta, E.M. Laurent, M. Hasegan, S.L. Prosser, J. Goncalves, B. Raught, and L. Pelletier. 2019. Spatial and proteomic profiling reveals centrosome-independent features of centriolar satellites. EMBO J.

      Gupta, G.D., E. Coyaud, J. Goncalves, B.A. Mojarad, Y. Liu, Q. Wu, L. Gheiratmand, D. Comartin, J.M. Tkach, S.W. Cheung, M. Bashkurov, M. Hasegan, J.D. Knight, Z.Y. Lin, M. Schueler, F. Hildebrandt, J. Moffat, A.C. Gingras, B. Raught, and L. Pelletier. 2015. A Dynamic Protein Interaction Landscape of the Human Centrosome-Cilium Interface. Cell. 163:1484-1499.

      Hamel, V., E. Steib, R. Hamelin, F. Armand, S. Borgers, I. Fluckiger, C. Busso, N. Olieric, C.O.S. Sorzano, M.O. Steinmetz, P. Guichard, and P. Gonczy. 2017. Identification of Chlamydomonas Central Core Centriolar Proteins Reveals a Role for Human WDR90 in Ciliogenesis. Curr Biol. 27:2486-2498 e2486.

      Heydeck, W., B.A. Bayless, A.J. Stemm-Wolf, E.T. O'Toole, A.S. Fabritius, C. Ozzello, M. Nguyen, and M. Winey. 2020. Tetrahymena Poc5 is a transient basal body component that is important for basal body maturation. J Cell Sci. 133.

      Khouj, E.M., S.L. Prosser, H. Tada, W.M. Chong, J.C. Liao, K. Sugasawa, and C.G. Morrison. 2019. Differential requirements for the EF-hand domains of human centrin 2 in primary ciliogenesis and nucleotide excision repair. J Cell Sci. 132.

      Kong, D., and J. Loncarek. 2021. Analyzing Centrioles and Cilia by Expansion Microscopy. Methods Mol Biol. 2329:249-263.

      Laporte, M.H., I.B. Bouhlel, E. Bertiaux, C.G. Morrison, A. Giroud, S. Borgers, J. Azimzadeh, M. Bornens, P. Guichard, A. Paoletti, and V. Hamel. 2022. Human SFI1 and Centrin form a complex critical for centriole architecture and ciliogenesis. EMBO J. 41:e112107.

      Le Guennec, M., N. Klena, D. Gambarotto, M.H. Laporte, A.M. Tassin, H. van den Hoek, P.S. Erdmann, M. Schaffer, L. Kovacik, S. Borgers, K.N. Goldie, H. Stahlberg, M. Bornens, J. Azimzadeh, B.D. Engel, V. Hamel, and P. Guichard. 2020. A helical inner scaffold provides a structural basis for centriole cohesion. Sci Adv. 6:eaaz4137.

      Mahen, R. 2022. cNap1 bridges centriole contact sites to maintain centrosome cohesion. PLoS Biol. 20:e3001854.

      Meehl, J.B., B.A. Bayless, T.H. Giddings, Jr., C.G. Pearson, and M. Winey. 2016. Tetrahymena Poc1 ensures proper intertriplet microtubule linkages to maintain basal body integrity. Mol Biol Cell. 27:2394-2403.

      Mercey, O., C. Kostic, E. Bertiaux, A. Giroud, Y. Sadian, D.C.A. Gaboriau, C.G. Morrison, N. Chang, Y. Arsenijevic, P. Guichard, and V. Hamel. 2022. The connecting cilium inner scaffold provides a structural foundation that protects against retinal degeneration. PLoS Biol. 20:e3001649.

      Moudjou, M., N. Bordes, M. Paintrand, and M. Bornens. 1996. gamma-Tubulin in mammalian cells: the centrosomal and the cytosolic forms. J Cell Sci. 109 ( Pt 4):875-887.

      Odabasi, E., D. Conkar, J. Deretic, U. Batman, K.M. Frikstad, S. Patzke, and E.N. Firat-Karalar. 2023. CCDC66 regulates primary cilium length and signaling via interactions with transition zone and axonemal proteins. J Cell Sci. 136.

      Paoletti, A., M. Moudjou, M. Paintrand, J.L. Salisbury, and M. Bornens. 1996. Most of centrin in animal cells is not centrosome-associated and centrosomal centrin is confined to the distal lumen of centrioles. J Cell Sci. 109 ( Pt 13):3089-3102.

      Pearson, C.G., D.P. Osborn, T.H. Giddings, Jr., P.L. Beales, and M. Winey. 2009. Basal body stability and ciliogenesis requires the conserved component Poc1. J Cell Biol. 187:905-920.

      Quarantotti, V., J.X. Chen, J. Tischer, C. Gonzalez Tejedo, E.K. Papachristou, C.S. D'Santos, J.V. Kilmartin, M.L. Miller, and F. Gergely. 2019. Centriolar satellites are acentriolar assemblies of centrosomal proteins. EMBO J.

      Resendes, K.K., B.A. Rasala, and D.J. Forbes. 2008. Centrin 2 localizes to the vertebrate nuclear pore and plays a role in mRNA and protein export. Mol Cell Biol. 28:1755-1769.

      Sahabandu, N., D. Kong, V. Magidson, R. Nanjundappa, C. Sullenberger, M.R. Mahjoub, and J. Loncarek. 2019. Expansion microscopy for the analysis of centrioles and cilia. J Microsc. 276:145-159.

      Salisbury, J.L., K.M. Suino, R. Busby, and M. Springett. 2002. Centrin-2 is required for centriole duplication in mammalian cells. Curr Biol. 12:1287-1292.

      Schweizer, N., L. Haren, I. Dutto, R. Viais, C. Lacasa, A. Merdes, and J. Luders. 2021. Sub-centrosomal mapping identifies augmin-gammaTuRC as part of a centriole-stabilizing scaffold. Nat Commun. 12:6042.

      Steib, E., M.H. Laporte, D. Gambarotto, N. Olieric, C. Zheng, S. Borgers, V. Olieric, M. Le Guennec, F. Koll, A.M. Tassin, M.O. Steinmetz, P. Guichard, and V. Hamel. 2020. WDR90 is a centriolar microtubule wall protein important for centriole architecture integrity. eLife.

      Steib, E., R. Tetley, R.F. Laine, D.P. Norris, Y. Mao, and J. Vermot. 2022. TissUExM enables quantitative ultrastructural analysis in whole vertebrate embryos by expansion microscopy. Cell Rep Methods. 2:100311.

      Sydor, A.M., E. Coyaud, C. Rovelli, E. Laurent, H. Liu, B. Raught, and V. Mennella. 2018. PPP1R35 is a novel centrosomal protein that regulates centriole length in concert with the microcephaly protein RTTN. eLife. 7.

      Tiryaki, F., J. Deretic, and E.N. Firat-Karalar. 2022. ENKD1 is a centrosomal and ciliary microtubule-associated protein important for primary cilium content regulation. FEBS J. 289:3789-3812.

      Tsekitsidou, E., C.J. Wong, I. Ulengin-Talkish, A.I.M. Barth, T. Stearns, A.C. Gingras, J.T. Wang, and M.S. Cyert. 2023. Calcineurin associates with centrosomes and regulates cilia length maintenance. J Cell Sci. 136.

      Van de Mark, D., D. Kong, J. Loncarek, and T. Stearns. 2015. MDM1 is a microtubule-binding protein that negatively regulates centriole duplication. Mol Biol Cell. 26:3788-3802.

      Yang, C.H., C. Kasbek, S. Majumder, A.M. Yusof, and H.A. Fisk. 2010. Mps1 phosphorylation sites regulate the function of centrin 2 in centriole assembly. Mol Biol Cell. 21:4361-4372.

      Ying, G., J.M. Frederick, and W. Baehr. 2019. Deletion of both centrin 2 (CETN2) and CETN3 destabilizes the distal connecting cilium of mouse photoreceptors. J Biol Chem. 294:3957-3973.

    1. Some recommendation algorithms can be simple such as reverse chronological order, meaning it shows users the latest posts (like how blogs work, or Twitter’s “See latest tweets” option). They can also be very complicated taking into account many factors, such as:
      • Time since posting (e.g., show newer posts, or remind me of posts that were made 5 years ago today)
      • Whether the post was made or liked by my friends or people I’m following
      • How much this post has been liked, interacted with, or hovered over
      • Which other posts I’ve been liking, interacting with, or hovering over
      • What people connected to me or similar to me have been liking, interacting with, or hovering over
      • What people near you have been liking, interacting with, or hovering over (they can find your approximate location, like your city, from your internet IP address, and they may know even more precisely). This perhaps explains why sometimes when you talk about something out loud it gets recommended to you (because someone around you then searched for it). Or maybe they are actually recording what you are saying and recommending based on that.
      • Phone numbers or email addresses (sometimes collected deceptively) can be used to suggest friends or contacts.
      And probably many more factors as well!

      I think recommendation algorithms are very interesting and complex because different social media platforms use different algorithms to showcase content for users. For example, when I use Instagram, my recommended posts can be very different from my friends' because we have different interests and interactions with the app. However, I do think these algorithms have become so accurate and complex that it can be really creepy.
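      To make the contrast concrete, here is a toy sketch comparing a reverse chronological feed with a feed ranked by a weighted score over a few of the factors mentioned in the quoted passage. Everything in it is made up for illustration: the `Post` fields, the weights in `score`, and the example posts do not describe how any real platform ranks content.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Post:
    post_id: str
    posted_at: datetime
    likes: int
    by_followed_account: bool


def reverse_chronological(posts):
    """The simple algorithm: newest post first."""
    return sorted(posts, key=lambda p: p.posted_at, reverse=True)


def score(post, now):
    """Toy weighted score; the weights are arbitrary assumptions."""
    age_hours = (now - post.posted_at).total_seconds() / 3600
    recency = 1.0 / (1.0 + age_hours)        # newer posts score higher
    popularity = post.likes / 100.0          # more likes score higher
    social = 1.0 if post.by_followed_account else 0.0
    return 2.0 * recency + 1.0 * popularity + 1.5 * social


def ranked_feed(posts, now):
    """The more complicated algorithm: highest score first."""
    return sorted(posts, key=lambda p: score(p, now), reverse=True)


now = datetime(2023, 7, 1, 12, 0)
posts = [
    Post("old_viral", now - timedelta(hours=48), likes=900, by_followed_account=False),
    Post("fresh_quiet", now - timedelta(hours=1), likes=3, by_followed_account=False),
    Post("friend_post", now - timedelta(hours=6), likes=40, by_followed_account=True),
]

chronological_ids = [p.post_id for p in reverse_chronological(posts)]
ranked_ids = [p.post_id for p in ranked_feed(posts, now)]
```

      With these made-up numbers, the chronological feed shows `fresh_quiet` first, while the scored feed puts `old_viral` on top because its like count outweighs its age. Two users with different follow lists or interaction histories would get different scores for the same posts, which is one way the same app can show friends very different feeds.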

    1. Author Response

      Reviewer #1 (Public Review):

      This is an interesting manuscript that proposes a new approach to accounting for viral diversity within hosts in phylogenetic analyses of pathogens. Concretely, the authors consider sites for which a minor allele exists as an additional base in the substitution model. For example, if at a particular site 60% of reads have a C and 40% have a G, then this site is assigned Cg, as opposed to a C, which is typical of analysing consensus sequences. Because we typically model sequence evolution as a Markovian process, as is the case here, the data become naturally more informative, given that there are more states in the Markov chain when adding these bases. As a result, phylogenetic trees estimated using these data are better resolved than those from consensus sequences. The branches of the trees are probably also longer, which is why temporal signal becomes more apparent.
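      The encoding described in this comment can be sketched as follows. This is a hedged illustration of the idea rather than the authors' implementation: the function name `encode_site`, the read-count dictionary format, and the `min_minor_freq` threshold for calling a minor allele are assumptions introduced here.

```python
BASES = "ACGT"

# Four states with no minor allele plus 4 x 3 major/minor combinations.
ALL_STATES = list(BASES) + [
    major + minor.lower() for major in BASES for minor in BASES if major != minor
]


def encode_site(read_counts, min_minor_freq=0.05):
    """Encode one alignment site as a major base plus optional minor base.

    `read_counts` maps each base to its number of supporting reads.
    `min_minor_freq` is a hypothetical cutoff below which the second most
    common base is ignored and the site is treated as a consensus base.
    """
    total = sum(read_counts.get(base, 0) for base in BASES)
    if total == 0:
        raise ValueError("site has no read support")
    # Sort bases by read support; ties keep the A, C, G, T order.
    ranked = sorted(BASES, key=lambda base: read_counts.get(base, 0), reverse=True)
    major, minor = ranked[0], ranked[1]
    minor_count = read_counts.get(minor, 0)
    if minor_count > 0 and minor_count / total >= min_minor_freq:
        return major + minor.lower()
    return major


example_state = encode_site({"C": 60, "G": 40})    # the case from the text
consensus_state = encode_site({"A": 100})          # no within-host diversity
below_cutoff_state = encode_site({"C": 98, "G": 2})
```

      Here `example_state` is `"Cg"`, matching the 60% C and 40% G example above, and `ALL_STATES` contains the 16 possible character states. Sites with three or more alleles are collapsed to the two most common bases in this sketch.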

      I commend the authors on their rigorous simulation study and careful empirical data analyses. However, I strongly suggest they consider whether treating minor alleles as an additional base is biologically realistic and whether this may have implications for other analyses, particularly when there is very high within-host diversity and the number of states becomes very large.

      We thank the reviewer for the helpful and thorough review. We have included a paragraph in the Discussion regarding the biological interpretation of the 16-state model (Line 344-351), as well as the consequences when there’s high within-host diversity (Line 398).

      Reviewer #2 (Public Review):

      I agree that minor genetic variation could potentially be used to more accurately infer who-infected-whom in an outbreak scenario. Indeed, the use of minor genetic variation has proven very useful in reconstructing transmission chains for chronic infections such as HIV (e.g., see applications using Phyloscanner). To me, it seems that considering the full spectrum of viral genetic diversity within infected hosts would necessarily do the same if not better than considering only consensus-level viral sequence data. This is because there is necessarily a loss of data and potentially a loss of information when going from considering the genetic composition of viral populations within a host to only considering the consensus sequences of those viral populations. As such, Ortiz et al.'s hypothesis stated on lines 66-70 is a reasonable one, and I was looking forward to seeing this hypothesis evaluated in detail in this manuscript.

      R2.1 There are several parts of this manuscript I really like. In particular, encoding within-sample diversity as character states and using that alternative representation of sequence data for phylogenetic inference (as shown in Figure 3) is a very interesting idea, I think. There are some limitations that are not explicitly mentioned, however. For example, when using this 16-character state representation for phylogenetic inference, they assume independence between nucleotide sites. This is a major assumption that can be violated when considering longitudinal intrahost data and transmission dynamics in an outbreak setting, given genetic linkage between sites.

      We have generated another set of simulations where the starting tree was a coalescent tree rather than a random phylogeny. This is described in the Results section, Line 228, and Figure 4—figure supplement 2. By using a coalescent tree, we increase the genetic linkage between sites. For all metrics used, the 16-state model performed better than the consensus sequence model. It is also important to note, as the reviewer points out, that longitudinal isolates should be removed from transmission inference, as we do in Figure 7 and Figure 7—figure supplement 2. This point is now reflected in the Results (Line 286) and Methods (Line 534).

      I have several major concerns about the work as it stands, particularly in the context of the SARS-CoV-2 application.

      Concerns not related to the SARS-CoV-2 application:

      R2.2 Figure 4 shows that a model using within-sample diversity can more accurately reconstruct evolutionary histories than a model that uses only consensus-level genetic data. This is really interesting. The Materials and Methods section (particularly lines 351-354) indicates that the sequence data were generated using certain specified substitution rates. The rates specified seem to be chosen in such a way to facilitate finding an improvement when using within-sample diversity. I don't know whether the relative rates of these 'substitutions' at all mirror "real-life". It would be very useful to have a broader set of analyses here to examine the effect of these 'substitution' rates on the utility of incorporating within-sample diversity into phylogenetic inference. (Also, 1, 100, 200 (line 353) inconsistent with 1, 20, 200 in Supp Table 3)

      We have now corrected Supp Table 3 to reflect the rates described in the Methods section.

      We defined our model with three rates: rate of minor variant acquisition, rate of minor-major variant switch, and rate of minor variant loss. We chose the rates for the simulations (1, 100, 200) to reflect a low rate of minor variant acquisition (1) and high rates of minor-major variant switch (200) and minor variant loss (100). These rates will result in pure bases (A,C,G and T) 100 times more likely to be present than low frequency variants, as seen in the base frequencies in Supp Table 1 and 3, which would in turn minimize the effect of including minor variations. We chose these rates to reflect the high turnover of minor variation often observed in real data and the frequencies of minor alleles in the SARS-CoV-2 dataset, but we agree with the reviewer that this may not always be the case. We also agree with the reviewer that changing the parameters in the simulations also affects the effect of including low frequency variation in the model. As such, we have now included simulations using different sets of rates (Figure 4—figure supplement ):

      1) With a high rate of variant switch and loss compared to acquisition (1, 10, 100), reducing the frequency of minor variation.

      2) With a lower rate of switch and loss (1, 10, 10), promoting a stable landscape of low frequency variation.

      3) With no low frequency variation (Jukes-Cantor model).
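
      The statement that rates of (1, 100, 200) leave pure bases about 100 times more frequent than low frequency variants can be checked with a small sketch. The transition structure below is an assumption made for illustration, not taken from the manuscript: a pure base X acquires a minor variant (X to Xy) at rate `acquire`, a mixed state loses its minor variant (Xy to X) at rate `loss`, and major and minor swap (Xy to Yx) at rate `switch`.

```python
from itertools import permutations

BASES = "ACGT"


def build_states():
    """4 pure states (e.g. 'A') followed by 12 major/minor states (e.g. 'Cg')."""
    pure = list(BASES)
    mixed = [maj + mnr.lower() for maj, mnr in permutations(BASES, 2)]
    return pure + mixed


def build_rate_matrix(acquire, loss, switch):
    """Rate matrix Q for the assumed 16-state transition structure."""
    states = build_states()
    index = {s: i for i, s in enumerate(states)}
    n = len(states)
    q = [[0.0] * n for _ in range(n)]
    for s in states:
        i = index[s]
        if len(s) == 1:
            # A pure base can gain any of the three other bases as a minor variant.
            for mnr in BASES:
                if mnr != s:
                    q[i][index[s + mnr.lower()]] = acquire
        else:
            maj, mnr = s[0], s[1].upper()
            q[i][index[maj]] = loss                   # minor variant is lost
            q[i][index[mnr + maj.lower()]] = switch   # minor becomes major
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    return states, q


def stationary(q, iterations=5000):
    """Stationary distribution via power iteration on the uniformized chain."""
    n = len(q)
    scale = 1.1 * max(-q[i][i] for i in range(n))
    p = [[(1.0 if i == j else 0.0) + q[i][j] / scale for j in range(n)]
         for i in range(n)]
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
    return pi


states, q = build_rate_matrix(acquire=1, loss=100, switch=200)
pi = stationary(q)
ratio = pi[states.index("A")] / pi[states.index("Ac")]
```

      Under these assumed transitions, balancing the flow into and out of a pure state gives a stationary frequency ratio of `loss / acquire` between a pure base and any single major/minor state, so `ratio` comes out at 100 for rates (1, 100, 200) and would drop to 10 for the (1, 10, 10) setting.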

      R2.3 Figure 5 is very interesting, particularly the results at bottleneck sizes of 1-10. What are the 'substitution' rates that are inferred here from using this simulated dataset? The Material and Methods section also does not mention the within-host viral generation time anywhere, as far as I can see (~line 384 states the mutation rate per base per generation cycle but not the length of the generation cycle anywhere).

      Fastsimcoal2 is a coalescent simulator of population histories over several generations, given a population size and a mutation rate. For our purposes, transmissions are simulated as bottlenecks of constant size, and a generation is represented by each time step in the outbreak simulation, which corresponds to 1 day. This is further clarified in the Methods section (Line 475).

      Concerns related to the SARS-CoV-2 application:

      R2.4 I am very concerned about the testing of this hypothesis on the SARS-CoV-2 data presented. First, 1% is a very low variant calling threshold. Second, analysis of the 17 samples that were resequenced (out of 454) indicated that on average, 39% of iSNVs (intrahost single nucleotide variants) called between duplicate runs were only observed in one of the two runs (line 117). Their analysis in Figure 1 indicates that these discrepant (and seemingly spurious) variants occur at higher levels in high Ct samples (which makes sense; Figure 1b). They therefore decide to limit their analyses to samples with Ct values <= 30. This results in 249 samples. However, if we look at Figure 1b, only ~10% of iSNVs called across duplicate runs with Ct = 30 are shared! That means that 90% of iSNVs in the set appear to be spurious. If we assume that each duplicate run of a sample has approximately the same number of spurious iSNVs, then approximately 82% of iSNVs called in a sample with a Ct of 30 would be spurious. This fraction decreases with samples that have lower Ct values, but even at a Ct of 27, only ~60% of iSNVs called across duplicate runs are shared. All the downstream SARS-CoV-2 analyses based on within-host sample diversity therefore are based on samples where the large majority of considered sample diversity is not real. This leads to me necessarily discounting all of those downstream SARS-CoV-2 results.
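
      The step from "10% shared across duplicate runs" to "about 82% spurious per sample" can be reproduced with a short calculation. The sketch below makes the same assumption as the comment above (both runs carry the same number of unshared iSNVs, and unshared iSNVs are treated as spurious); the function name is hypothetical.

```python
def per_sample_spurious_fraction(shared_fraction):
    """Fraction of one run's iSNVs that are not shared with its duplicate.

    shared_fraction: shared iSNVs divided by all iSNVs called in either run.
    If the union is scaled to 1, shared = f and each run has (1 - f) / 2
    unshared calls, so the per-run fraction is
    ((1 - f) / 2) / (f + (1 - f) / 2) = (1 - f) / (1 + f).
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    return (1.0 - shared_fraction) / (1.0 + shared_fraction)
```

      For a shared fraction of 0.10 this gives 0.9 / 1.1, about 0.82, and for 0.60 it gives 0.4 / 1.6 = 0.25.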

      We agree with the reviewer that, as the results show, datasets that incorporate within-sample low frequency variation are expected to have considerably more noise than using exclusively consensus sequences, and perhaps this wasn’t properly discussed in the manuscript. We have incorporated some notes about this in the Discussion section (Line 408-413).

      The 1% variant frequency threshold was used to generate the analysis of Fig. 1 and Supp. Fig. 1-4. Looking at these results, we decided to establish the Ct cut-off of 30 as mentioned by the reviewer, as well as a variant frequency threshold of 2% (as shown in the x-axis of Fig. 2). We overlooked this second variant frequency threshold in the manuscript, which has been added. As shown in Supp. Fig 4, this variant frequency threshold will increase the concordance between technical replicates, although some level of noise persists.

      R2.5 Lines 153-167: I can't figure out how to square the quantitative results given in this paragraph with what is shown in Figure 2. To me, Figure 2 shows only that Technical Replicates have higher probabilities of sharing a variant than with 'No' relationship. What would also be helpful here so that the reader can get a better feel for the data would be to see the iSNV frequencies plotted over time for the longitudinal replicate samples in the supplement and, for the 'epidemiological' samples to show 'TV plots' in the supplement (as in Fig 3c in McCrone et al. eLife)

      Figure 2 shows that technical replicates, longitudinal replicates, epidemiological samples and, in some instances, samples from the same department have a higher probability of sharing low frequency variants than those with no relationship (also shown in Supp Figure 5). However, Figure 2 also shows that the 95% CI is very wide, and therefore in many instances low frequency variants won’t be shared between epidemiological samples or samples from the same department.

      We have also added Figure 2—figure supplement showing the low frequency variants plotted over time for longitudinal replicates. Unlike McCrone et al, we don’t have proven transmission between pairs of samples, although we believe our analysis also shows a pattern of shared low frequency variants among potential epidemiological links.

      R2.6 Figure 6 and associated text: (a) root-to-tip distance: what units is this distance in? (b) That the authors find a temporal signal in these transmission clusters (where all consensus sequences within a cluster are the same) is interesting but also a bit baffling to me. Given the inference of very small transmission bottlenecks in previous studies (e.g., Martin & Koelle - reanalysis of Popa et al.; Lythgoe et al.; Braun et al.), I don't understand where the temporal signal comes in. Do the samples become more genetically diverse over the outbreak (this seems to be indicated in lines 260-262 but never shown and unlikely given bottleneck sizes)? Additional analyses to help the reader understand WHY within-sample diversity allows for the identification of temporal signal is important. This could involve plotting genetic diversity of the samples by collection date or some other, similar analyses.

      a) The units of the y-axis (root-to-tip distance) are measured in substitutions per genome. This is now reflected in the legend of the figure.

      b) As shown in Figure 5, even at small bottleneck sizes we are able to pick up some of the diversity that evolves during the course of an outbreak. As hinted by the reviewer, the smaller the bottleneck the less diversity we can leverage for phylogenetic inference, and in fact for some epidemiological samples all the diversity will be lost during transmission, which is why many of the within-sample variants are not shared between the epidemiologically related samples. Figure 6 is indeed showing that the genetic distance (measured as number of substitutions per genome) increases per collection date. We have also added a Figure 6—figure supplement showing the increase in low frequency variants within outbreaks as the outbreaks progress in time (explained in Line 261 of the Results section), which explains in part the increasing temporal signal in clusters.
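
      A temporal signal of this kind is usually assessed by regressing root-to-tip distance on collection date. The following is a generic least-squares sketch of that check, not the authors' analysis; the function name is hypothetical and the example numbers are invented purely to show the call.

```python
def root_to_tip_regression(days, distances):
    """Ordinary least squares of root-to-tip distance on collection day.

    days: collection dates as days since the first sample.
    distances: root-to-tip distances (e.g. substitutions per genome).
    Returns (slope, intercept, r_squared); a positive slope with a high
    r_squared is read as evidence of temporal signal.
    """
    n = len(days)
    if n < 2 or n != len(distances):
        raise ValueError("need two or more paired observations")
    mean_x = sum(days) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, distances))
    if sxx == 0:
        raise ValueError("all samples share one collection day")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Illustrative values only (not data from the study).
slope, intercept, r_squared = root_to_tip_regression(
    [0, 2, 4, 6, 8], [0.0, 0.5, 1.0, 1.5, 2.0]
)
```

      When every consensus sequence in a cluster is identical, all distances are equal and the slope is zero, which is why such a regression needs the additional within-sample states to show any signal.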

      R2.7 Paragraph consisting of lines 229-238 and Figure 7: This analysis stops abruptly. What are the conclusions here? Figure 7a (right) seems inconsistent to me with Figure 7b and 7C results. Also, the main hypothesis put forward in this paper is that within-sample sequence data can better resolve who-infected-whom in an outbreak setting. Figure 7b and 7c however are never compared against analogous panels that use just consensus sequences. (Even though the consensus sequences are the same, according to Figure 7a, the inferences shown in Figures 7b and 7c could use additional data such as collection times, etc. that would provide information even when using exclusively consensus-level data). Also, do the analyses in Figures 7b and 7c use the 16-character state model at all? I think Supp Figure 9 is relevant here but not sure how?)

      We have extended this section of the results to make it more coherent and clear (Line 284-293) and in the Discussion (Line 385-395). As added into the Discussion, we agree with the reviewer that even with equal sequences some inferences about transmission can be made with epidemiological data, especially collection dates. However, such data can’t be used to infer the genetic structure of the cluster, which complicates any analysis that can use a phylogeny as input.

      Additional concerns:

      R2.8 Some of the stated conclusions, particularly in the Discussion section and in the Abstract, do not seem to be supported by the presented results. For example, line 27: 'within-sample diversity is stable among repeated serial samples from the same host': Figure 2 does not show this conclusively. Line 28: 'within-sample diversity... is transmitted between those cases with known epidemiological links': Figure 2 also does not show this conclusively. Line 29: 'within-sample diversity... improves phylogenetic inference and our understanding of who infected whom': Figure 7b/c results using within-sample diversity is never compared against results that use only consensus, so improvement not demonstrated. Line 272-273: 'samples with shorter distance in the consensus phylogeny were more likely to share low frequency variants'. Line 287: 'We demonstrated that phylogenies... were heavily biased'.

      Line 27 and Line 28: We agree with the reviewer that the genomic analysis of SARS-CoV-2 sequences shows only partial congruence within technical replicates and epidemiological links. We have appropriately addressed this in the Abstract.

      Line 29 and Fig 7: Transmission inference using the consensus sequence in Figure 7b/c couldn’t be performed because the lack of any genetic difference between the consensus sequence meant that all sequences had the same transmission likelihood. This is now better explained in the Discussion section, lines 385-395.

      Line 272-273: We have removed this section as we did not perform this analysis, as pointed out by the reviewer.

      Line 287: The conclusion expressed in line 287 (now line 340) has been changed.

      R2.9 The manuscript at times does not cite previous work that is highly relevant and thus overstates the novelty of the current work. For example: lines 21-23: '..conventional whole-genome sequencing phylogenetic approaches to reconstruct outbreaks exclusively use consensus sequences...' Phyloscanner uses within-sample diversity, for example, as does SCOTTI. These are finally cited in the discussion section (~line 310), but because this previous work is not acknowledged earlier in the manuscript, the novelty of the work presented here is somewhat overstated.

      We have included background information in the introduction regarding the use of within-sample diversity for transmission inference (Line 69-73), as well as emphasizing that the novelty of our work lies more in the use of within-sample diversity in phylogenetic inference rather than exclusively transmission inference (Line 74, and other instances throughout the manuscript).

      In sum, I think that the 16 character-state model is a very interesting model. More analyses on simulated data would be helpful to expand on when below-the-consensus level genetic data would truly be informative of phylogenetic relationships and who-infected-whom in outbreak settings. The SARS-CoV-2 analyses are very worrisome to me, given the inclusion of samples where the majority of considered within-sample genetic diversity is very likely not real. Some of the stated conclusions appear to either be at odds with the results presented or not directly evaluated.

    1. Author Response

      Reviewer #1 (Public Review):

      In this interesting manuscript, Nasser et al explore long-term patterns of behavior and individuality in C. elegans following early-life nutritional stress. Using a rigorous, highly quantitative, high-throughput approach, they track patterns of motor behavior in many individual nematodes from L1 to young adulthood. Interestingly, they find that early-life food deprivation leads to decreased activity in young larvae and adults, but that activity between these times, during L2-L4, is largely unaffected. Further, they show that this "buffering" of stress requires dopamine signaling, as L2-L4 activity is significantly reduced by early-life starvation in cat-2 mutants. The paper also provides evidence that serotonin signaling has a role in modulating sensitivity to stress in L1 larvae and adults, but the size of these effects is modest. To evaluate patterns of individuality, the authors use principal components analysis to find that three temporal patterns of activity account for much of the variation in the data. While the paper refers to these as "individuality types," it may be more reasonable to think of these as "dimensions of individuality." Further, they provide evidence that stress may alter the strength and/or features of these dimensions. Though the circuit mechanisms underlying individuality and stress-induced changes in behavior remain unknown, this paper lays an important foundation for evaluating these questions. As the authors note, the behaviors studied here represent only a small fraction of the behavioral repertoire of this system. As such, the findings here are an interesting and very promising entry point for a deeper understanding of behavioral individuality, particularly because of the cellular/synaptic-level analysis that is possible in this system. This paper should be of interest to those studying C. elegans behavior and also more generally to those interested in behavioral plasticity and individuality.

      We thank the reviewer for finding our results interesting.

      Reviewer #2 (Public Review):

      This paper set out to understand the impact of early life stress on the behavior and individuality of animals, and how that impact might be amplified or masked by neuromodulation. To do so, the authors built on a previously established assay (Stern et al 2017) to measure the roaming fraction and speed of individuals. This technique allowed the authors to assess the effects of early life starvation on behavior across the entire developmental trajectory of the individual. Combining this with strains with mutant neuromodulatory systems enabled the authors to produce a rich dataset ripe for analyzing the complicated interactions between behavior, starvation intensity, developmental time, individuality, and neuromodulatory systems.

      The richness of this dataset - 2 behavioral measures continuous across 5 developmental stages, 3 different neuromodulatory conditions (with the dopamine system subject to decomposition by receptor types) and 4 different levels of starvation, with ~50-500 individuals in each condition - underlies the strength of this paper. This dataset enabled the authors to convincingly demonstrate that starvation triggers a behavioral effect in L1 and adult animals that is largely masked in intermediate stages, and that this effect becomes larger with increased severity of starvation. Furthermore, they convincingly show that the masking of the effect of starvation in L2-L4 animals depends on dopaminergic systems. The richness of the dataset also allowed a careful analysis of individuality, though only neuromodulatory mutants convincingly manipulated individuality, recapitulating earlier research. Nonetheless, a few caveats exist on some of their findings and conclusions:

      We thank the reviewer for the constructive comments. In the revised manuscript we include additional analyses and textual changes as detailed below, to address the points raised.

      1) Lack of quantitative analysis for effects within developmental stages. In making the argument for buffered effects of starvation on behavior during periods of larval development, the authors make claims regarding the temporal structure of behavior within specific stages. However, no formal analysis is performed and the traces are provided without confidence intervals, making it difficult to judge the significance of potential deviations between starvation conditions.

      In the revised manuscript, we include additional analyses of roaming fraction effects across shorter developmental-windows, showing within-stage differences in behavioral patterns following starvation (Figure 1 - figure supplement 1E; Figure 3 - figure supplement 1C). In addition, we further temper and rewrite our conclusions to clearly describe these effects (now- “…while 1 day of early starvation modified within-stage temporal behavioral structures by shifting roaming activity peaks to later time-windows during the L2 and L3 stages…” in p. 4 and “Interestingly, during the L2 intermediate stage the effects on roaming activity patterns were more pronounced during earlier time-windows of the stage…” in p. 8).

      2) Incorrect inferences from differences in significance demonstrating significant differences. The authors claim that there is an increase in PC1 inter-individual variation in tph-1 individuals, however the difference in significance is not evidence of a significant difference between conditions (see Nieuwenhuis et al. 2011). This undermines claims about an interaction of starvation, neuromodulators, and individuality.

      In the revised manuscript we now provide a direct comparison of PC inter-individual variances between starved and unstarved populations, demonstrating significant differences in inter-individual variation in specific PC individuality dimensions following stress (Figure 6 and Figure 6 - figure supplement 1). These results include the increase in PC1 inter-individual variation in tph-1 mutants following 3 and 4 days of starvation (Figure 6A,E).
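
      One generic way to compare inter-individual variation directly between two populations, rather than comparing two separate significance tests, is a permutation test on the variance ratio. The sketch below is offered under that assumption; it is not the test used in the manuscript, and the function name and defaults are hypothetical.

```python
import math
import random
from statistics import mean, variance


def variance_permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in variance.

    Each group is centered on its own mean first, so that a shift in the
    mean is not mistaken for a change in spread. The test statistic is
    |log(var_a / var_b)|.
    """
    rng = random.Random(seed)
    mean_a, mean_b = mean(group_a), mean(group_b)
    a = [x - mean_a for x in group_a]
    b = [x - mean_b for x in group_b]

    def statistic(xs, ys):
        return abs(math.log(variance(xs) / variance(ys)))

    observed = statistic(a, b)
    pooled = a + b
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign individuals to groups at random
        if statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly above zero.
    return (hits + 1) / (n_permutations + 1)
```

      Here `group_a` and `group_b` would hold one PC score per individual for, say, starved and unstarved animals of the same genotype. A small p-value indicates that the spread itself differs between conditions.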

      3) Sensitivity of analysis to baseline effects and assumptions of additive/proportional effects. The neuromodulatory and stress conditions in this paper have a mixture of effects on baseline activity and differences from baseline. The authors normalize to the roaming fraction without starvation, making the reasonable assumption that the effect due to starvation is proportional to baseline, rather than an additive effect. This confound is most visible in the adult subpanel of figure 5d, where an ~2-3 fold difference in relative roaming due to starvation is clearly noted, however, this is from a baseline roaming fraction in tph-1 animals that are ~2 fold higher, suggesting that the effect could plausibly be comparable in absolute terms.

      Unavoidably, any such assumptions on the expected interaction between multiple effects will be a gross simplification in complicated nonlinear systems, and the data are largely shown with sufficient clarity to allow the reader to make their own conclusions. However, some of the interpretations in the paper lean heavily on an assumption that the data support a direct interpretation (e.g. "neuronal mechanisms actively buffer behavioral alterations at specific development times") rather than an indirect interpretation (e.g. that serotonin reduces baseline roaming fraction which makes a fixed sized effect more noticeable). Parsing the differences requires either more detailed mechanistic study or careful characterization of the effect of different baselines on the sensitivity of behavior to perturbation; barring that, it's worth noting that many of these interactions may be due to differences in biological and experimental sensitivity to change under different conditions, rather than a direct interaction of stress and neuromodulatory processes or evidence of differing neuromodulatory activity at different stages of development.

      In the revised manuscript we added a discussion of the potential complicated interactions between neuromodulation and stress, altering baseline levels and deviations from baseline. We also discuss the interpretation of the results in the context of non-linear systems in which sensitivity of the behavioral response to underlying variations may be modified by specific neuromodulatory and environmental perturbations, without assuming direct differences in neuromodulatory states over development or across individuals (p. 16).

      Reviewer #3 (Public Review):

      In this study, Nasser et al. aim to understand how early-life experience affects 1) developmental behavior trajectory and 2) individuality. They use early life starvation and longitudinal recording of C. elegans locomotion across development as a model to address these questions. They focus on one specific behavioral response (roaming vs. dwelling) and demonstrate that early life (right after embryo hatching) starvation reduces roaming in the first larval (L1) and adult stages. However, roaming/dwelling behavior during mid-larval stages (L2 through L4) is buffered from early life starvation. Using dopamine and serotonin biosynthesis null mutant animals, they demonstrated that dopamine is important for the buffering/protection of behavioral responses to starvation in mid-larval stages, while in contrast, serotonin contributes to early-life starvation's effects on reduced roaming in the L1 and adult stages. While the technique and analysis approaches used are mostly solid and support many of the conclusions made in the manuscript for part 1), there are some technical limitations (e.g., whether the method has sufficient resolution to analyze the behaviors of younger animals) and confounding factors (e.g., size of the animal) that the authors do not yet sufficiently address, and these can affect interpretation of the results. Additionally, much of the study is descriptive and lacks deep mechanistic insight. Furthermore, the focus on a single behavioral parameter (dwelling vs. roaming) limits the broad applicability of the study's conclusions. Lastly, the manuscript does not provide clear presentation or analysis to address part 2), the question of how early life experience affects individuality.

      We thank the reviewer for these important comments. As described below, in the revised manuscript we include new analyses (following extraction of size data), showing behavioral modifications across different conditions/genotypes also in size-matched individuals (within the same size range) (Figure 1 - figure supplement 1F; Figure 3 - figure supplement 1D,E; Figure 5 - figure supplement 1B,D). We also made edits to the text to describe these results (Methods p. 21 and Results section). In addition, while we can detect behavioral changes using our imaging method even in young L1 worms across conditions and genotypes (described in Stern et al. 2017 and this manuscript), as the reviewer correctly pointed out, we may miss some milder behavioral effects due to lower spatial imaging resolution in younger worms. We are now referring to this spatial resolution limitation in the revised manuscript (discussion part). Lastly, in the revised manuscript we added clearer and more direct analyses of changes in inter-individual variation in multiple PC dimensions following early stress, by directly comparing variation between starved and unstarved individuals within the mutant and wild-type populations (Figure 6; Figure 6 - figure supplement 1). These analyses show significant changes in inter-individual variation within specific PC individuality dimensions following early stress. Also, we made textual changes along the manuscript to increase the clarity of presentation of these results.

    1. Author Response

      Reviewer 1 (Public Review):

      In this paper, Reato, Steinfeld et al. investigate a question that has long puzzled neuroscientists: what features of ongoing brain activity predict trial-to-trial variability in responding to the same sensory stimuli? They record spiking activity in the auditory cortex of head-fixed mice as the animals performed a tone frequency discrimination task. They then measure both overall activity and the synchronization between neurons, and link this ’baseline state’ (after removing slow drifts) of cortex to decision accuracy. They find that cortical state fluctuations only affect subsequent evoked responses and choice behavior after errors. This indicates that it’s important to take into account the behavioral context when examining the effects of neural state on behavior.

      Strengths of this work are the clear and beautiful presentation of the figures, and the careful consideration of the temporal properties of behavioral and neural signals. Indeed, slowly drifting signals are tricky as many authors have recently addressed (e.g. Ashwood, Gupta, Harris). The authors are well aware of the difficulties in correlating different signals with temporal and cross-correlation (such as in their ’epoch hypothesis’). To disentangle such slow trends from more short-lived state fluctuations, they remove the impact of the past 10 trials and continue their analyses with so-called ’innovations’ (a term that is unusual, and may more simply be replaced with ’residuals’).

      The terms ‘innovations’ and ‘residuals’ are sometimes used interchangeably. We used innovations because that’s how they were introduced in the signal processing literature (i.e., Kailath, T (1968). ”An innovations approach to least-squares estimation–Part I: Linear filtering in additive white noise.” IEEE transactions on automatic control). We try to be explicit in the text about the formal definition of this quantity, to avoid problems with terminology.

      I do wonder if this throws out the baby with the bathwater. If the concern is statistical confound, the ’session permutation’ method (Harris) may be better suited. If the concern is that short-term state fluctuations are more behaviorally relevant (and obscured by slow drifts), then why are the results with raw signals in the supplement (Suppfig 8) so similar?

      The concern was statistical confound, although this concern is ameliorated when using a mixed model approach and focusing on fixed effects. However, our approach allowed us to assess the relative importance of slow versus single-trial timescales in the predictive relationship between cortical state (and arousal) and behavior, revealing that, in the conditions of our experiment, only the fast timescales are relevant. Because of this, we think that the baby wasn’t thrown out with the bathwater as, qualitatively, no new phenomenology was revealed when the slow components of the signals were included. In hindsight, it is true that the results we obtained suggest that maybe the effort we made to isolate the fast component of the signals was unjustified. However, this can only be known after both options have been tried, as we did. Moreover, we started using innovations based on the results in Figure 2 where, as we show, the use of innovations does make a difference, even at the level of fixed effects in a mixed model. We agree that we could have used the ‘session permutation’ method, but given the depth at which we have explored this issue in the manuscript already, and the clarity of the results, we think that adding a third method would only make reading the manuscript more difficult without adding any substantially new content.

      While the authors are correct that go-nogo tasks have drawbacks in dissociating sensitivity from response bias, they only cursorily review the literature on 2AFC tasks and cortical state. In particular, it would be good to discuss how the specific method - spikes, EEG (Waschke), widefield (Jacobs) and algorithm for quantifying synchronization may affect outcomes. How do these population-based measures of cortical state relate to those described extensively with slightly different signals, notably LFP or EEG in humans (e.g. work by Saskia Haegens, Niko Busch, reviewed in https://doi.org/10.1016/j.tics.2020.05.004)? This review also points out the importance of moving beyond simple measures of accuracy and using SDT, which would be an interesting improvement for this paper too.

      We thank the reviewer for pointing us towards the oscillation-based brain-state literature in humans. We have expanded the paragraph in the discussion where we compare our results with previous work in order to (i) elaborate on the literature on 2AFC tasks, (ii) specifically address the literature linking alpha power in the pre-stimulus baseline and psychophysical performance, and (iii) mention different methods for assessing desynchronization. Our view is that absence of low-frequency power is a robust measure which can be assessed using different types of signals (spikes, imaging, LFP, EEG). That said, the relationship between desynchronization and behavior appears subtle and variable, especially within discrimination paradigms. These issues are discussed in the paragraph starting in line 527 in the text.

      Regarding the use of SDT, we had already established that our main finding could be expressed as a significant interaction between FR/Synch and the stimulus-strength regressor, when predicting choice after errors (Supplementary Fig. 4A in original manuscript), which is equivalent to a cortical state-dependent increase in d′ after the mice made a mistake. In order to consider a possible effect of cortical state on the ‘criterion’ (i.e., an effect on the bias of the mice towards either response spout), we re-ran this GLMM, adding the cortical state regressors as main effects. The results show that the FR-Synch predictor is only significantly greater than zero as an interaction after errors (p = 0.0025). As a main effect, it is not significantly different from zero either after errors (p = 0.28) or after correct trials (p = 0.97). We have included this analysis as Figure 3-figure supplement 1B (replacing the previous Supplementary Fig. 4A) and commented on it in the text (lines 222-225).

      Reviewer 2 (Public Review):

      The relationship between measures of brain state, behavioral state, and performance has long been speculated to be relatively simple - with arousal and engagement reflecting EEG desynchronization and improved performance associated with increases in engagement and attention. The present study demonstrates that the outcome of the previous trial, specifically a miss, allows these associations to be seen - while a correct response appears less likely to do so. This is an interesting advance in our understanding of the relationship between brain state, behavioral state, and performance.

      This is probably just a typo, but we would like to clarify that the relevant outcome in the previous trial is not a miss, but an incorrect choice in an otherwise valid trial (i.e., a trial with a response within the allowed response window).

      While the study is well done, the results are likely to be specific to their trial structure and states exhibited by the mice. To examine the full range of arousal states, it needs to be demonstrated that animals are varying between near-sleep (e.g. drowsiness) and high-alertness such as in rapid running. The fact that the trials occurred rapidly means that the physiological and neural variables associated with each trial will overlap with upcoming trials - it takes a mouse more than a few seconds to relax from a previous miss or hit, for example. Spreading the rapidity of the trials out would allow for a broader range of states to be examined, and perhaps less cross-talk between adjacent trials. The interpretation of the results, therefore, must be taken in light of the trial structure and the states exhibited by the mice.

      We thank the reviewer for the positive assessment of our work and also for raising this point in particular. This motivated us to look more carefully at this issue, with results that, we believe, strengthen our study.

    1. Author Response

      Reviewer #1 (Public Review):

      In this work, Roche et al. study a 13-year-long time series of microbiome samples from wild baboons from Kenya. The data used in this work challenge a previous finding from the same authors that temporal dynamics in microbiome changes are largely individualized. Using a multinomial logistic-normal modeling approach, the authors detect that co-variance in temporal dynamics in microbial pair-wise associations among individuals occurs more frequently between relatives. Furthermore, the authors identify that microbial phylogenetic proximity is associated with consistent co-abundance changes over time and that their metric of universal microbial relationships is robust across hosts and is detected even in human longitudinal data. The authors conduct a thorough statistical revision of publicly available results, highlighting this time (e.g. compared to Björk et al, doi: 10.1038/s41559-022-01773-4) the consistently shared microbial properties between individuals, rather than the individual microbial signatures highlighted in their previous work.

      Thank you for this summary. We would like to briefly clarify that we do not see the current work as inconsistent with our prior finding in Björk et al. that microbiome taxonomic compositions are idiosyncratic and asynchronized. However, this new analysis, which focuses on abundance correlations between pairs of taxa, indicates that the personalized compositions and dynamics we observed in Björk et al. are probably not attributable to personalized microbiome ecologies. In other words, Björk et al. showed that microbial taxa found in the guts of different baboons can be quite distinct (and remain so over time, giving rise to semi-stable individual signatures). The current study shows that, despite this taxonomic individuality, the correlations between pairs of microbes in the baboon gut are often quite consistent. To give a basic example, hot weather and ice cream, when observed, are often observed together (positively correlated), but while some places have a lot of both, some have little of either. This idea is discussed in more detail below (see response R6) and in the revised Discussion section (lines 572 to 586).

      Strengths:

      This work is foundational in its compelling effort to generate a rigorous method to evaluate co-abundance dynamics in longitudinal microbiome data. The approach taken will likely inspire developments that will sharpen the capacity to extract co-varying microbial features, taking into account seasonality, diet, age, relatedness, and more. To the best of my understanding, their hierarchical model integrated into the Gaussian process to analyze microbial dynamics is reasonably robust and they clearly explain the implementation. Furthermore, this work introduces and defines the concept of a universality score for microbial taxon pairs. Overall, the work presented is clear and convincing and provides tools for the community to benefit from both methods and results. Furthermore, conceptually, this work stresses the value of consistent and shared microbial dynamics in groups, which enriches our understanding of host-associated microbial ecology, otherwise understood to be largely dependent on external fluctuations.

      Weakness:

      It is not entirely clear the extent to which the presented results revise, refute, or support the previously published analysis performed by the authors on the same dataset (doi: 10.1038/s41559-022-01773-4), which was more focused on individuality.

      We agree the relationship between Björk et al. and the current manuscript was unclear in our original submission. We now elucidate the relationship between these papers in the Discussion (lines 572 to 586). Briefly, Björk et al. found that microbiome taxonomic compositions are idiosyncratic and asynchronized. The current analysis finds that pairwise bacterial abundance correlations are predominantly shared and not highly personalized. We think the most likely explanation is that, as mentioned by Reviewer 2 below, the current analyses do not account for the role that environmental gradients play in the gut. If these environments differ asynchronously across hosts, it could lead to shared abundance correlations, but individualized microbiome compositions and individualized single-taxon dynamics. We discuss this possibility and other potential explanations in the revised Discussion (lines 572 to 586).

      Reviewer #2 (Public Review):

      The authors of this paper identify a knowledge gap in our understanding of the generalizability of ecological associations of gut bacteria across hosts. Theoretically, it is possible that ecological associations between bacteria are consistent within a host organism but differ between hosts, or that they are universal across hosts and their environmental gradients. The authors utilize longitudinal data with a unique temporal resolution, on Amboseli baboons, 56 individuals who were sampled for gut microbiome hundreds of times over a decade. This data allows disentangling ecological dynamics within and across individuals in a way that as far as I know has never been done before. The authors show that ecological relationships among baboon gut bacteria, measured through a correlation based on covariation, are largely universal (similar within and across host individuals) and that the most universally covarying taxa are almost always positively associated with each other. They also compare these results with two sets of human data, finding similar patterns in one human data set but not in the other.

      The main aim of this paper is to establish whether gut microbial ecologies are universal across hosts, and this the authors generally show to be true in a thorough and convincing way. However, some re-assessment or re-assurance on the solidity of their chosen method of estimating co-variation would be needed to fully assess the robustness of subsequent results. Specifically, the authors measure the correlation between microbial taxa from data on their abundance co-variation across samples. While necessary steps have been taken to validate the estimates against spurious correlations due to the compositional nature and autocorrelation structures present in the data, I worry that the sparsity of the data might influence the estimation of positive and negative correlations in a slightly different manner. There exist more microbial taxa than samples in the data and some taxa are present in as few as 20% of the samples, meaning that the covariation data will have a large amount of 0-0 pairs. I worry that the abundance of 0-0 pairs in the data might inflate the measures of positive co-variation, making taxa seem highly positively correlated in abundance when they in fact are missing from many samples. Of course, mutual absence is also a form of biologically meaningful covariation but given the larger number of taxa than samples and the inability of sequencing technology to detect all low-abundance taxa in a sample, I am currently not convinced that all of the 0-0 pairs are modeled in a realistic and balanced way as a continuum of the other non-zero co-variation between taxa in the data. This may become problematic when positive and negative relationships are compared: The authors state that even though most associations between taxa were negative, the most universally correlated taxa pairs (taxa pairs with strongest correlations in abundance both within and between hosts) were enriched in positive associations. It may be possible that this is influenced by the fact that zero inflation in the data lends more weight to positive links than negative links. Whether these universal positive correlations are driven by positive non-zero abundance covariation or just 0-0 links in the data is currently unclear.

      Thank you for pointing out this weakness in our original analyses. As described in response R1 above, your hunch was correct: zero inflation biased our correlation patterns such that taxa pairs with a high frequency of joint zero observations (i.e., where both members of the pair had very low or zero abundances) tended to be positively correlated (Fig. R1). Consequently, as you suggested, zero inflation in the data lent more weight to positive links than negative links in our data set. To address this problem in the revised manuscript, we now restrict our analyses to taxon pairs whose joint zero-abundance observations were less than 5% of all samples across hosts (pairs to the left of the dashed vertical line in Fig. R1 above). We also restricted our analyses to taxa observed in at least 50% of all samples. The first of these criteria was the most restrictive. As described above, our new filtering procedure retained 1,878 of the original 7,750 ASV-ASV pairs; 57 of the original 66 phylum-phylum pairs; and 473 of the original 666 class/order/family-level pairs.
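      As an illustration only, the two filters described in this response could be sketched as follows. This is not the authors' code: the function name, the data layout (one row per sample, one column per taxon), and the treatment of "zero" as a count of exactly 0 are all assumptions made for the example, while the 5% and 50% thresholds are taken from the response above.

```python
from itertools import combinations


def filter_taxon_pairs(counts, max_joint_zero=0.05, min_prevalence=0.50):
    """Return taxon index pairs that pass both filters.

    counts: list of samples, each a list of non-negative abundances
            (hypothetical layout: rows are samples, columns are taxa).
    max_joint_zero: a pair is kept only if the fraction of samples in
            which BOTH taxa are zero is strictly below this value.
    min_prevalence: a taxon is kept only if it is observed (count > 0)
            in at least this fraction of samples.
    """
    n_samples = len(counts)
    n_taxa = len(counts[0])

    # Presence/absence of each taxon across samples.
    present = [[row[t] > 0 for row in counts] for t in range(n_taxa)]

    # Filter 1: taxon-level prevalence.
    keep = [t for t in range(n_taxa)
            if sum(present[t]) / n_samples >= min_prevalence]

    # Filter 2: pair-level joint-zero frequency.
    pairs = []
    for i, j in combinations(keep, 2):
        joint_zero = sum(
            1 for a, b in zip(present[i], present[j]) if not a and not b
        ) / n_samples
        if joint_zero < max_joint_zero:
            pairs.append((i, j))
    return pairs
```

      Note that the two filters are not redundant: two taxa can each be present in well over half of the samples and still share enough joint absences to exceed the 5% cutoff, which is consistent with the statement that the joint-zero criterion was the more restrictive one.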

      Another result that would benefit from clearer context is the result that taxa correlation patterns were more similar between phylogenetically close taxa and between genetically close host individuals. The former notion is to be expected if taxa abundances are driven by environmental (or host physiology-related) selective forces that favor bacteria with similar phenotypes. This yields more support to the idea that covariation is environmentally driven rather than driven by the ecological network of the bacteria themselves, and this could be more clearly emphasized. The latter notion of covariation being more similar in genetically related hosts is currently impossible to disentangle from the notion that covariation patterns were more similar with individuals harboring a more similar baseline microbiome composition since microbiome composition and genetic relatedness were apparently correlated. To understand if something about relatedness was actually influential over correlation pattern similarity, one would need to model that effect on top of the baseline similarity effect. Currently, it is not clear if this was done or not.

      We agree that shared responses to environmental gradients within hosts—especially immune profiles and pH—could explain both of these findings. These ideas are now described in the Discussion in lines 559 to 562.

      We also now report partial Mantel tests to control for baseline similarity in microbiome composition when testing for shared microbial correlation patterns among genetic relatives. Controlling for baseline similarity had little effect on the results, and we now report the statistics for this partial Mantel (Fig. 5B; Table S7; r2=0.009; partial Mantel p-value=0.002). See lines 391-392.

      The authors also slightly overemphasize the generalizability of their results to humans, given that only one of the human data sets they compare their results to shows similar patterns. While they mention that the other human data set (that was not similar in patterns to theirs) was different in some key aspects (sampling frequency was much higher), the other human data set was also dissimilar to the other two (it only contained infants, not adults). Furthermore, to back up the statement that higher sampling frequency would be the reason this data set had dissimilar covariation between taxa, one would need to show that the temporal variation in this data set was different from the baboon one and show that these covariation patterns were sensitive to timescale by subsampling either data to create mock data sets with different sampling frequency and see how this would change the inference of ecological associations.

      We have revised the text to tone down the generalizability of our results to humans. For instance, the abstract (line 58) now states that “universality in baboons was similar to that in human infants, and stronger than one data set from human adults” but does not state that our results are generalizable to humans.

      We also considered sub-sampling the data set from Johnson et al., from daily to monthly scales, but unfortunately that data set is only 17 days long, so doing so is impossible. This is now stated in the Discussion in line 619, which states, “However, without the ability to subsample Johnson et al. [7] to monthly scales (this data set is only 17 days long), it is impossible to test this prediction.”

      To the extent that the results are robust, particularly regarding the main result of the universality of gut microbial ecological associations, the impact of this paper is not small. This question has never been so thoroughly and convincingly addressed, and the results as they stand have the power to strongly influence the expectations of gut microbial ecology across many different systems. Moreover, as the authors point out, evidence for universal gut microbial ecology is important for the future development of probiotics. An important point here, underemphasized by the authors, is that universal gut microbe ecologies will allow specific interventions that use gut microbe ecology to manipulate emergent community properties of microbiomes to be more beneficial for the host, rather than just designing compositional cocktails that should fit all. In addition to the main finding of this study, the unique data set and the methods developed as part of this study (e.g. the universality score, the enrichment measures, the model of log-ratio dynamics, the assessment of covariation from time-ordered abundance trajectories) will doubtlessly be translatable to many other studies in the future.

      Thank you for these suggestions. We now mention these implications in the introduction (line 82-84) and in the discussion in lines 537-539 and line 630.

      Reviewer #3 (Public Review):

      This is a well-executed study, offering thorough analysis and insightful interpretations. It is well-written, and I find the conclusions interesting, important, and well-supported.

      Thank you for your supportive comments.

      References

      1. Silverman JD, Roche K, Holmes ZC, David LA, Mukherjee S. Bayesian Multinomial Logistic Normal Models through Marginally Latent Matrix-T Processes. Journal of Machine Learning Research. 2022;23:1-42.
      2. Quinn TP, Richardson MF, Lovell D, Crowley TM. propr: An R-package for Identifying Proportionally Abundant Features Using Compositional Data Analysis. Scientific Reports. 2017;7:16252.
      3. Cao Y, Lin W, Li H. Large covariance estimation for compositional data via composition-adjusted thresholding. J Am Stat Assoc. 2019:759-72.
      4. Friedman J, Alm EJ. Inferring correlation networks from genomic survey data. PLoS Comput Biol. 2012;8(9):e1002687. doi: 10.1371/journal.pcbi.1002687. PubMed PMID: 23028285; PubMed Central PMCID: PMCPMC3447976.
      5. Risely A, Schmid DW, Muller-Klein N, Wilhelm K, Clutton-Brock TH, Manser MB, et al. Gut microbiota individuality is contingent on temporal scale and age in wild meerkats. Proc Biol Sci. 2022;289(1981):20220609. Epub 20220817. doi: 10.1098/rspb.2022.0609. PubMed PMID: 35975437; PubMed Central PMCID: PMCPMC9382201.
      6. Wilmanski T, Diener C, Rappaport N, Patwardhan S, Wiedrick J, Lapidus J, et al. Gut microbiome pattern reflects healthy ageing and predicts survival in humans. Nat Metab. 2021;3(2):274-86. Epub 20210218. doi: 10.1038/s42255-021-00348-0. PubMed PMID: 33619379; PubMed Central PMCID: PMCPMC8169080.
      7. Johnson AJ, Vangay P, Al-Ghalith GA, Hillmann BM, Ward TL, Shields-Cutler RR, et al. Daily Sampling Reveals Personalized Diet-Microbiome Associations in Humans. Cell Host & Microbe. 2019;25(6):789-802. Epub 2019/06/14. doi: 10.1016/j.chom.2019.05.005. PubMed PMID: 31194939.
      8. Franzosa EA, Huang K, Meadow JF, Gevers D, Lemon KP, Bohannan BJM, et al. Identifying personal microbiomes using metagenomic codes. Proceedings of the National Academy of Sciences. 2015;112(22):E2930-E8. doi: 10.1073/pnas.1423854112. PubMed PMID: WOS:000355832200014.
      9. Faith JJ, Guruge JL, Charbonneau M, Subramanian S, Seedorf H, Goodman AL, et al. The long-term stability of the human gut microbiota. Science. 2013;341(6141):1237439. Epub 2013/07/06. doi: 10.1126/science.1237439. PubMed PMID: 23828941; PubMed Central PMCID: PMC3791589.
      10. Bik EM, Costello EK, Switzer AD, Callahan BJ, Holmes SP, Wells RS, et al. Marine mammals harbor unique microbiotas shaped by and yet distinct from the sea. Nat Commun. 2016;7:10516. Epub 20160203. doi: 10.1038/ncomms10516. PubMed PMID: 26839246; PubMed Central PMCID: PMCPMC4742810.
      11. Caporaso JG, Lauber CL, Costello EK, Berg-Lyons D, Gonzalez A, Stombaugh J, et al. Moving pictures of the human microbiome. Genome Biology. 2011;12(5):R50. doi: 10.1186/gb-2011-12-5-r50. PubMed PMID: ISI:000295732700014.
      12. Costello EK, Lauber CL, Hamady M, Fierer N, Gordon JI, Knight R. Bacterial community variation in human body habitats across space and time. Science. 2009;326(5960):1694-7. doi: 10.1126/science.1177486. PubMed PMID: ISI:000272839000053.
      13. Dolinsek J, Goldschmidt F, Johnson DR. Synthetic microbial ecology and the dynamic interplay between microbial genotypes. Fems Microbiology Reviews. 2016;40(6):961-79. doi: 10.1093/femsre/fuw024. PubMed PMID: WOS:000387995000010.
      14. Louca S, Polz MF, Mazel F, Albright MBN, Huber JA, O'Connor MI, et al. Function and functional redundancy in microbial systems. Nat Ecol Evol. 2018;2(6):936-43. Epub 2018/04/18. doi: 10.1038/s41559-018-0519-1. PubMed PMID: 29662222.
      15. Rainey PB, Quistad SD. Toward a dynamical understanding of microbial communities. Philos Trans R Soc Lond B Biol Sci. 2020;375(1798):20190248. Epub 2020/03/24. doi: 10.1098/rstb.2019.0248. PubMed PMID: 32200735; PubMed Central PMCID: PMCPMC7133524.
      16. Martiny JB, Jones SE, Lennon JT, Martiny AC. Microbiomes in light of traits: A phylogenetic perspective. Science. 2015;350(6261):aac9323. doi: 10.1126/science.aac9323. PubMed PMID: 26542581.
      16. Debray R, Herbert RA, Jaffe AL, Crits-Christoph A, Power ME, Koskella B. Priority effects in microbiome assembly. Nat Rev Microbiol. 2022;20(2):109-21. Epub 20210827. doi: 10.1038/s41579-021-00604-w. PubMed PMID: 34453137.
      17. Gloor GB, Reid G. Compositional analysis: a valid approach to analyze microbiome high-throughput sequencing data. Can J Microbiol. 2016;62(8):692-703. Epub 2016/06/18. doi: 10.1139/cjm-2015-0821. PubMed PMID: 27314511.
      18. Joseph TA, Pasarkar AP, Pe'er I. Efficient and Accurate Inference of Mixed Microbial Population Trajectories from Longitudinal Count Data. Cell Syst. 2020;10(6):463-9 e6. Epub 20200624. doi: 10.1016/j.cels.2020.05.006. PubMed PMID: 32684275.
      19. Aijo T, Muller CL, Bonneau R. Temporal probabilistic modeling of bacterial compositions derived from 16S rRNA sequencing. Bioinformatics. 2018;34(3):372-80. doi: 10.1093/bioinformatics/btx549. PubMed PMID: 28968799; PubMed Central PMCID: PMCPMC5860357.
      20. Coyte KZ, Rao C, Rakoff-Nahoum S, Foster KR. Ecological rules for the assembly of microbiome communities. PLoS Biol. 2021;19(2):e3001116. Epub 20210219. doi: 10.1371/journal.pbio.3001116. PubMed PMID: 33606675; PubMed Central PMCID: PMCPMC7946185.
      21. Coyte KZ, Schluter J, Foster KR. The ecology of the microbiome: Networks, competition, and stability. Science. 2015;350(6261):663-6. doi: 10.1126/science.aad2602. PubMed PMID:
      22. Palmer JD, Foster KR. Bacterial species rarely work together. Science. 2022;376(6593):581-2. Epub 20220505. doi: 10.1126/science.abn5093. PubMed PMID:
      23. Reese AT, Pereira FC, Schintlmeister A, Berry D, Wagner M, Hale LP, et al. Microbial nitrogen limitation in the mammalian large intestine. Nat Microbiol. 2018. Epub 2018/10/31. doi: 10.1038/s41564-018-0267-7. PubMed PMID: 30374168.
      24. Firrman J, Liu L, Mahalak K, Tanes C, Bittinger K, Tu V, et al. The impact of environmental pH on the gut microbiota community structure and short chain fatty acid production. FEMS Microbiol Ecol. 2022;98(5). doi: 10.1093/femsec/fiac038. PubMed PMID: 35383853.
      25. de Vos WM, Tilg H, Van Hul M, Cani PD. Gut microbiome and health: mechanistic insights. Gut. 2022;71(5):1020-32. Epub 20220201. doi: 10.1136/gutjnl-2021-326789. PubMed PMID: 35105664; PubMed Central PMCID: PMCPMC8995832.
      26. Tamames J, Sanchez PD, Nikel PI, Pedros-Alio C. Quantifying the Relative Importance of Phylogeny and Environmental Preferences As Drivers of Gene Content in Prokaryotic Microorganisms. Front Microbiol. 2016;7:433. Epub 20160331. doi: 10.3389/fmicb.2016.00433. PubMed PMID: 27065987; PubMed Central PMCID: PMCPMC4814473.
      27. Gloor GB, Macklaim JM, Pawlowsky-Glahn V, Egozcue JJ. Microbiome datasets are compositional: and this is not optional. Front Microbiol. 2017;8:2224. Epub 2017/12/01. doi: 10.3389/fmicb.2017.02224. PubMed PMID: 29187837; PubMed Central PMCID: PMCPMC5695134.
      28. Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, et al. Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proceedings of the National Academy of Sciences. 2011;108:4516-22. doi: 10.1073/pnas.1000080107. PubMed PMID: ISI:000288451300002.
    1. Motivation

      Reviewer 3: Wyeth Wasserman

      SYNOPSIS The manuscript describes an updated release of the ReMM regulatory variant mutation scoring system. The paper presents the performance of an updated version of the system and describes how it was applied to the most current release of the reference human genome.

      OVERALL PERSPECTIVE This is a valuable resource for the community of researchers and clinicians working on the interpretation of genetic variants in the human genome. The work appears to be thoughtfully done and appropriate assessments have been provided. The use of the random forest models to weigh the contributions of features was particularly noted for the insights it provided into how features contribute to prediction. My biggest concerns are stylistic, which falls outside the scientific quality of the work. I provide these comments for the authors to consider and do not expect that my stylistic preferences will be uniformly accepted. A fair amount of justification of the manuscript focuses on the value of having a release for version 38 of the human genome, pointing to the field as not having done so broadly. I think this is misguided, as by the time people are reading the manuscript such points will have lost relevance. I suggest a focus on the science be given, as there is no need to justify things based on where other resources have progressed in releasing their version 38 updates. Points below include language/text clarifications that can be assessed by the authors. Writing styles differ, so stylistic comments should be optional.

      MAJOR POINTS None. Well done and clearly presented.

      MINOR POINTS

      1. The word "various" is vague and often shows up when people are too busy to provide an accurate statement. Starting the manuscript with it makes a bad impression on this reader. You do not have to change it, but I thought you might appreciate knowing this impression. You could delete it with no harm to the sentence. (Not to get carried away, but the next sentence starting with "some" heightens the impression of 'hand waving'.)
      2. I think I understand ", we apply cytogenic band-aware cross-validation using ten folds" but I encourage the authors to provide clearer wording for this point.
      3. I would allow the reader to make their own judgement of performance. So please remove "excellent" from "we achieve an excellent performance".
      4. "Rather than using ReMM scores for ranking, some users need to specify score thresholds" is confusing. I would change 'need to' to 'choose to'.
      5. "with lots of false positives" is a bit informal. I suggest "with a high false positive rate".
      6. I am confused by "from three genomic regions (genic content and not overlapping with assembly gap changes)" as the brackets include two items, not three.
      7. "maybe due to better mapping" - "maybe" should be "may be".
      8. I think language like "seems to be the only tool directly trained on training data and features derived from GRCh38." is not particularly valuable long term. This is a useful contribution, but many tools are being updated to 38 and by the time this appears and is read, such statements decline in relevance. I would focus on providing this valuable resource, and not try to justify it based on a transient perception of where the field stands in updating versions.
      9. "It is worth noting that in the context of extremely unbalanced data…" - you do note it. So I would change the wording to "In the context of extremely unbalanced data…"

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Detailed Answer to the Reviewers

      Reviewer #1

      Summary

      The authors used a novel imaging technique to monitor glutamate release and correlated these measurements with gold standard electrophysiological measurements. The genetically encoded glutamate reporter, iGluSnFR, was expressed in mouse spiral ganglion neurons using the approach described in Ozcete and Moser (2021, EMBO J). The iGluSnFR signals and the postsynaptic currents were measured at the endbulb of Held synapse. A small effect of the expression of iGluSnFR on the mEPSC kinetics was found (but see comment 1). Furthermore, deconvolution of the iGluSnFR signals was performed enabling the comparison of some presynaptic properties assessed with either iGluSnFR or electrophysiology.

      We thank the reviewer for her/his appreciation of our work and for the comments that have helped/will help us further improve our manuscript.

      Major comments

      1. The central finding of the study is the prolonged decay time constant of the mEPSC. The difference is small but astonishingly significant (0.172 ± 0.002 and 0.158 ± 0.001, P=0.003). The SEM is about 100 times smaller than the measured time constant. This is biologically not plausible. Therefore, I am skeptical about the statistical significance of the results.

      We appreciate the feedback of the reviewer. We agree that our presentation of the data was easy to misunderstand and we changed it (see below). We modeled the statistical relationship of kinetic parameters with a mixed effects model (as described in methods). Since the presentation of regression parameters for this kind of data is not very usual in synaptic neuroscience (nor very informative in this study), we instead opted to report SEM and a p-value derived from the fit of the linear mixed effects model. For the SEM, there is no clear way to take into account the clustered nature of the data, so we calculated the SEM over all observations. Since the SEM is proportional to 1/sqrt(n) and the number of recorded mEPSCs is very large, this does indeed yield a very small SEM. We agree that reporting the SEM over all observations is unusual and leads to misunderstandings in this case. We now instead report the mean and SEM for all parameters, recalculated over the median values per cell. We also changed the presentation of the other values reported in the MS, in all tables and the relevant parts of the main text. We note that the summary statistics do not directly influence the further statistical modeling.
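      To make the difference between the two summaries concrete, a minimal sketch is given below. It is an illustration under stated assumptions, not the analysis code used in the study: the function names and the dictionary layout (cell identifier mapped to that cell's list of mEPSC decay times) are hypothetical.

```python
from math import sqrt
from statistics import mean, median, stdev


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))


def pooled_summary(cells):
    """Mean and SEM over all events, ignoring which cell they came from.

    With thousands of events, n is large and the SEM becomes very small,
    even though events from the same cell are not independent.
    """
    pooled = [x for events in cells.values() for x in events]
    return mean(pooled), sem(pooled)


def per_cell_summary(cells):
    """Mean and SEM over one median value per cell.

    Here n is the number of cells, so the SEM reflects cell-to-cell
    variability rather than the number of recorded events.
    """
    medians = [median(events) for events in cells.values()]
    return mean(medians), sem(medians)
```

      Both functions return the same kind of summary, but only the per-cell version uses the number of recorded cells as the sample size, which is why its SEM is typically much larger for clustered data.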

      1. Analysis of the size of RRP with electrophysiology and iGluSnFR is potentially interesting but iGluSnFR recordings could not resolve the spontaneous fusion of single vesicles. Therefore, it is not possible to estimate RRP with these iGluSnFR recordings. This limitation of the approach should be emphasised more clearly.

      Yes, we think the inability to resolve single vesicles is one of the major limitations of the study and we note this in the introduction and in the relevant section of the discussion. We agree that it should be clear in the relevant section that we are not able to measure RRP size without resolving single vesicle release and modified the wording of the relevant results section to reflect this better (line 267, 497). We still believe that the cumulative release analysis is potentially interesting to researchers in the field, as RRP size is not the only parameter that can be estimated in this way. In particular, an estimation of the release probability in resting conditions is possible by dividing the amplitude of the first response (i.e. response to a single stimulus) by the RRP estimate even without knowing the exact number of vesicles that comprise either.
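
      The ratio described above can be sketched as follows, assuming a standard back-extrapolation of the cumulative response: a line is fitted to the late, steady-state part of the cumulative train response and its y-intercept is taken as the RRP estimate. The function name, the number of steady-state points, and the synthetic train are hypothetical. Because the release probability is a ratio of two amplitudes in the same units, no conversion to vesicle numbers is needed.

```python
import numpy as np

def rrp_and_release_probability(amplitudes, n_steady=10):
    """Back-extrapolate the cumulative response to estimate RRP and Pr.

    amplitudes: response amplitude per stimulus (any consistent unit).
    n_steady:   number of late responses assumed to be at steady state.
    """
    amplitudes = np.asarray(amplitudes, dtype=float)
    cumulative = np.cumsum(amplitudes)
    stim_index = np.arange(1, amplitudes.size + 1)
    # Linear fit to the steady-state part; the intercept is the RRP estimate.
    slope, intercept = np.polyfit(stim_index[-n_steady:], cumulative[-n_steady:], 1)
    # Release probability: first response divided by the RRP estimate.
    return intercept, amplitudes[0] / intercept

# Synthetic depressing train (illustrative values only).
train = 1.0 + 4.0 * np.exp(-np.arange(30) / 3.0)
rrp, p_release = rrp_and_release_probability(train)
print(f"RRP estimate: {rrp:.2f}, release probability: {p_release:.2f}")
```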

      1. The control conditions (no surgery/no virus injection) are not the correct conditions for comparison with the experimental conditions (surgery/virus injection and sensor expression). The control group should be operated and injected with saline or ideally with a virus expressing GFP at the extracellular membrane. The authors addressed this issue by citing their previous work (Özcete and Moser, 2021). However, I am not convinced that surgery does not induce subtle changes that could explain the small differences in mEPSCs.

      This is an excellent point that should be addressed in further research. A slowed decay would be consistent with the idea that iGluSnFR affects glutamatergic transmission by buffering glutamate, but we cannot rule out subtle changes due to the postnatal surgery or AAV-mediated transgene expression. In response to the reviewer’s comment, we modified the text to reflect the possibility of surgery and / or other parts of the expression system being responsible for the changes. We also discuss further control experiments (line 408). Finally, we believe that our comparison is still relevant for researchers using iGluSnFR in the system, as they will be asking if introducing a measurement system affects the underlying quantity.

      Minor comments

      The supplementary figures are not listed in the order in which they appear in the main text.

      We now list the supplementary figures in the order in which they appear in the main text.

      Figure 2B and 3 are not referenced in the main text.

      We now reference the figures in the text.

      The PPR in Figure 3 shows a PPR that cannot be evaluated because of the unusual plot with lines that are too thick.

      We updated Fig. 3 and chose a more straightforward way to display the PPRs.

      Line 105: "...while simultaneously monitoring currents in postsynaptic cells". This sentence is not correct given that the EPSCs have not been shown yet at this point of the manuscript.

      We removed this part of the sentence.

      Line 110: "SV and are not cause are cause by spontaneous action potentials...". The sentence does not make sense.

      We corrected the sentence.

      Line 168-9: "...we did not find significant differences in amplitude and kinetics...". According to Table 2 (2mM Ca2+ condition), both Imax and Q appear to be almost twice as high in iGluSnFR as in control (2.05 ± 0.06 and 1.34 ± 0.03, respectively; P=0.241). Is this not a significant difference?

      The difference was not significant. The misunderstanding likely stems from the same problem in the presentation of the values as for the mEPSCs. We replaced the SEMs with the SEMs of the cell median to avert this.

      Table 4. 2mM Ca2+ condition. The Rrefill parameter is about an order of magnitude smaller in the iGluSnFR-expressing group. Is this correct or just a typo?

      Thank you for spotting this: it was an error with regards to unit conversion. The value for the control group was off by a factor of 10. We corrected this mistake.

      Referees cross-commenting

      I also agree with the comments of the other reviewers.

      Significance

      General assessment

      This topic is currently of interest because iGluSnFR techniques are widely used. However, the study is preliminary. The scientific progress in terms of quantity and quality is limited. For example, Figs. 1 and 5 show only images and traces with little scientific significance.

      Advance

      The main advance of the study is the implementation of the deconvolution of the iGluSnFR signal and the comparison of the back extrapolation with the first stimulus (Fig. 6). This comparison was similar between electrophysiology and iGluSnFR when deconvolution of the iGluSnFR data was performed. These data therefore argue against saturation of iGluSnFR, as expected from a large number of previous analyses of iGluSnFR.

      There are few methodological improvements compared with the group's previous study (Ozcete and Moser, 2021 EMBO J). In this earlier study, a different synapse was analyzed but the same iGluSnFR was injected into the scala tympani of the right ear through the round window in the same way as in this study. Surprisingly, the authors do not refer to Ozcete and Moser (2021) in the relevant methods section.

      Thank you for spotting this omission. We now cite Özçete and Moser (2021) in the appropriate place in the methods section as well.

      Reviewer #2

      Summary

      In the manuscript 'Optical measurement of glutamate release robustly reports short-term plasticity at a fast central synapse' the authors present a careful analysis of whether direct measurements of transmitter release using the genetically-encoded indicator iGluSnFR, are suitable for assessing changes in transmitter release at the spiral ganglion neuron end bulbs of Held in the mouse cochlear nucleus. What sets this study apart from other studies, which have demonstrated the utility of iGluSnFR measurements, is the use of a camera-based fluorescence readout as opposed to confocal or 2P microscopy methods and that it is performed in the cochlear nucleus.

      The primary methodology is the comparison of electrophysiological measurements of excitatory postsynaptic currents from bushy cells with fluorescence changes in the end bulbs of iGluSnFR expressing auditory nerve fibers with and without stimulation of the auditory nerve fibers. The experiments are technically demanding and introducing genetically encoded indicators in neurons of the cochlea is no small accomplishment. An important observation is that mEPSCs are slightly modified (prolonged) due to expression of iGluSnFR in the presynaptic end bulbs. This is perhaps not surprising as iGluSnFR binds glutamate and may act as a buffer to reduce the peak and slightly prolong the increase in cleft glutamate concentration after release from synaptic vesicles. To my knowledge, others have not reported iGluSnFR effects on mEPSCs. Perhaps earlier studies have not checked as carefully, alternatively previous studies had a too-low fraction of presynaptic terminals expressing iGluSnFR (or less expression of iGluSnFR) to detect a change in EPSC parameters, or this is a synapse-specific phenomenon. However, the authors demonstrate that EPSCs evoked by electrical stimulation of the auditory nerve fibers are unaffected by expression of iGluSnFR in the presynaptic neurons. Further findings are that the determined decay time constant is substantially longer than at other synapses (~16 ms at hippocampal synapses, Dürst et al., 2018). Synaptic depression was robustly reported by iGluSnFR at this synapse, but determination of single quantal events and thus quantal analysis was not really possible at this synapse using iGluSnFR in conjunction with the imaging and analysis techniques presented. The manuscript is carefully written and presented.

      We thank the reviewer for her/his appreciation of our work and for the comments that have helped/will help us further improve our manuscript.

      Major points

      1) The ROIs are selected to be 'outer bounds' of the glutamate spread from the synapses being studied. My concern is that these generously-sized ROIs include signal from many iGluSnFR molecules which are distal to the release sites and thus will be reached only slowly by low concentrations of glutamate or be contributing only noise and no changes in fluorescence. I suggest the temporal resolution could be improved by restricting the analysis of fluorescence changes to fewer pixels within the ROIs with the fastest rising/highest amplitude responses.

      Thanks for this helpful comment: The data in our data set should be well-suited to perform this analysis in addition to the presented analysis and so we added this new analysis to the Revision Plan.

      2) The observation that despite a 2 fold increase in eEPSCs when changing from 2 mM to 4 mM extracellular calcium there is no change in iGluSnFR peak is curious as pointed out by the authors but not really discussed.

      We don’t currently have an obvious explanation but consider saturation of the iGluSnFR peak response likely to contribute. In response to the comment of the reviewer, we have added the analysis of integrated iGluSnFR data, which we previously found to be more robust toward saturation than the peak, to the revision plan. We plan to add the relevant discussion along with the new analysis.

      Are the traces presented in Figure 5 examples from the same recording?

      Traces in Fig. 5 are grand averages (wording modified for clarity). Unfortunately, it was not possible to routinely measure iGluSnFR responses from the same cell in 2 mM Ca2+ and 4 mM Ca2+: the time needed for the protocols was rather long, which affected cell stability, and imaging conditions would deteriorate during the exchange of the bath solution. We therefore think that a direct comparison of the absolute iGluSnFR responses at different extracellular Ca2+ levels is not quite possible.

      Assuming the examples are from one cell I first assumed the lack of change of peak was saturation of iGluSnFR but the larger fluorescence change with 100 Hz stimulation suggests otherwise. How many endbulbs are contacting one BC? Do you capture iGluSnFR responses from only one or several? In the previous point I suggested that restricting analysis to the soonest reacting pixels might improve temporal resolution but in the case of detecting the peaks with higher and normal calcium, these fastest reacting signals are probably also more likely to be saturated with glutamate.

      The eEPSCs elicited by this stimulation paradigm are monosynaptic (see methods / electrophysiology section), but there might be other iGluSnFR expressing endbulbs on the same bushy cell. Since we reduce the current just enough such that any further reduction leads to a complete failure to elicit an EPSC, we believe these additional endbulbs are not releasing glutamate. We cannot, however, exclude the possibility that iGluSnFR on neighboring structures captures any potential spillover glutamate.

      Minor points

      • mEPSCs are usually recorded in tetrodotoxin, I didn't find any mention in methods/results

      In this system, sEPSCs are not affected by TTX (Oleskevich and Walmsley, 2002) and thus usually recorded without adding TTX. We discuss this more explicitly and added a clarification to reflect this assumption (on line 111).

      • the large numbers of abbreviations make it difficult in places to follow the manuscript please at least define them again in the figure legends (e.g. BC, AVCN in figure 1, Q, FWHM in figure 2 etc.)

      We went over the manuscript again and removed some abbreviations or redefined them in captions.

      • it is a bit unusual to report results of a Wilcoxon test and at the same time mean and SEM instead of medians, if different tests were used then it is important to indicate this where the p values are given or make the sentence in the methods more definitive

      We agree that the initial presentation of the data was ambiguous. We changed the presentation to reflect this (see also answer to reviewer 1).

      • the liquid junction potential is reported as 12 mV, pretty sure it should be -12 mV (unless the QX-314 or some other of the more exotic ingredients in the extracellular solution is having a dramatic effect on the LJP).

      We follow the usual conventions of P. H. Barry, Methods in Enzymology, Vol. 171, p. 678, as described in E. Neher, Methods in Enzymology, Vol. 207, p. 123, in which the LJP is defined as the potential of the bath solution with respect to the pipette solution. We subtracted this positive potential (+12mV) in the end to obtain the membrane potential which therefore was more hyperpolarized than the nominal potential.
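
      The sign convention above amounts to one subtraction, sketched here. The +12 mV LJP is taken from the response; the nominal holding potential of -70 mV is a hypothetical value used only for illustration, and the function name is an assumption.

```python
def corrected_potential_mv(nominal_mv, ljp_mv=12.0):
    """Subtract the LJP (bath relative to pipette) from the nominal potential."""
    return nominal_mv - ljp_mv

nominal_mv = -70.0  # hypothetical nominal holding potential, not from the manuscript
print(corrected_potential_mv(nominal_mv))  # more hyperpolarized than nominal
```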

      I wonder if one of the faster/lower affinity iGluSnFR variants would be better suited for studying this synapse.

      We agree with the reviewer that future studies should explore the potential of faster/lower affinity iGluSnFR variants for studying the endbulb synapse. The reasons why we employed the original version include: i) sharing the same mice for studies of cochlear ribbon synapses (Özçete & Moser, 2021) and cochlear nucleus synapses (this MS) for the sake of reducing animal experiments, ii) good signal to background facilitating our first study establishing the recording in brainstem slice, iii) less signal to background and shorter signals with the new variants (as found in preliminary recordings from cochlear ribbon synapses) that would make the endbulb recordings more challenging. We have added the following statement to the discussion on line 430: “Future imaging studies of glutamate release at calyceal synapses should explore the potential of new iGluSnFR variants with lower affinity that provide more rapid signal decay. This will ideally go along with imaging at higher framerate and might require stronger intensities of the excitation light to boost the fluorescence signal.”

      The paper would benefit from a careful reading to shorten the text and to check for clarity. For instance page 15 line 436 I don't understand how 'the results can reduce the likelihood of biologically relevant changes'. I think the authors meant something different

      Thank you for spotting this. We reworded the sentence (now on line 399): "The data on hand suggests that this is not the case. Firstly, even if a larger sample size may uncover more subtle effects on neurotransmission of evoked events, our measurements suggest a small effect size. Secondly, even as we did find changes in mEPSC, it is probable that the biological significance is limited."

      • page 5 'width' is misspelled

      Fixed.

      • page 18 'strychnine' is misspelled

      Fixed.

      • on many of the figures there is text that is much too small

      We went through the manuscript and increased the text size in the figures, where appropriate.

      Referees cross-commenting

      I agree with all the comments of the other reviewers - both raise the point that there should be a 'control' AAV injected for comparison of the mEPSCs which I missed but is of course quite important. See https://pubmed.ncbi.nlm.nih.gov/24872574/ for a study of AAV serotype-dependent effects on presynaptic release.

      We now added a section on other possible factors influencing the results, citing the study above.

      Significance

      The main audience for this paper will be fairly specialized. Researchers interested in properties of presynaptic release and some specialists in synaptic transmission in the auditory system will be the main readers/citers of this work.

      The work is an important technical/methodological report. It highlights an important effect of expressing iGluSnFR and also demonstrates that the effect is overall not very problematic. Additional problems using iGluSnFR are also indicated.

      I am an electrophysiologist, studying synaptic transmission and plasticity with experience using a wide range of optogenetic tools

      Reviewer #3

      Summary

      In the present manuscript, the authors explore the information that can be obtained using optical measurement of glutamate release with iGluSnFR on synaptic dynamics in the endbulb of Held.

      They virally express iGluSnFR in presynaptic terminals, patch the postsynaptic cells and combine high-frame-rate optical recordings with electrophysiological measurements. Their first finding is that mEPSCs are prolonged when presynaptic cells express the glutamate indicator, which they interpret as buffering of extracellular glutamate by the indicator. Next, they repeated the experiment, this time with stimulating evoked EPSCs. In contrast to the previously observed effects, iGluSnFR did not affect the time course or the amplitude of the evoked EPSCs. The authors then asked whether iGluSnFR signals can be used to study synaptic dynamics, specifically, synaptic depression. In these experiments, the authors observed a change in the paired-pulse ratio with ISI of 10ms, but not longer intervals. They analyzed presynaptic release and did not find statistically significant differences.

      Can iGluSnFR signals be used for the analysis of synaptic release? When stimulated at a low frequency of 10Hz (allowing the fluorescence to return close to baseline levels in between pulses), iGluSnFR dynamics were somewhat comparable to postsynaptic signals. At higher frequencies, the slow time course of the indicator prevented the identification of individual responses and the resulting fluorescence had a very different shape. To resolve this problem, the authors used deconvolution analysis (fig 6). This analysis revealed a linear relationship between the optical readout and the patch-clamp data.

      I find the manuscript to be clearly written, the findings are well presented and discussed and are novel and of substantial interest to neuroscientists in the field. I do have a number of questions about experiments and analysis that may have an effect on the conclusions of this work.

      We thank the reviewer for her/his appreciation of our work and for the comments that have helped/will help us further improve our manuscript.

      1. In experiments comparing the effects of iGluSnFR expression on release dynamics (figure 1-4), the authors compare infected presynaptic cells to control (uninfected). The assumption is that synaptic buffering by iGluSnFR may affect glutamate diffusion in the synaptic cleft. However, it is possible that viral infection itself changes presynaptic properties. The authors should compare release from cells infected with GFP or a comparable indicator.

      We agree that this is an important control experiment to be done in the future and that causal attribution is not in the scope of this study. A slowed decay would be consistent with the idea that iGluSnFR affects glutamatergic transmission by buffering glutamate, but we cannot rule out subtle changes due to the postnatal surgery or AAV-mediated transgene expression. In response to the reviewer’s comment, we modified the text to reflect the possibility of surgery and / or other parts of the expression system being responsible for the changes. We now also discuss further control experiments (line 408). Finally, we believe that our comparison is still relevant for researchers using iGluSnFR in the system, as they will be asking if introducing a measurement system affects the underlying quantity.

      1. Analysis of pool parameters presented in table 4 indicates almost doubling of RRP size with iGluSnFR with 2 mM Ca++. While not significant, this result may indicate a real effect that may have been missed due to low power (N=3 and 7 for these experiments). I do not believe the authors did a power analysis in this study. How was the number of experiments determined? I would suggest increasing the number of experiments to avoid type II errors.

      We thank the reviewer for this critical comment. Indeed, we would also have liked to have a greater statistical power for these experiments, but had to face the situation that establishing the method required more animals than expected and the animal license did not offer further animals for the analysis. Moreover, we note that the obtained RRP size estimates were generally lower compared to previous estimates of our lab for the endbulb synapse (e.g. Butola et al., 2021: ~20 nA for 2 mM Ca2+ in Fig. 5). This can partially be attributed to the use of cyclothiazide in previous studies, which we avoided given reports of presynaptic effects of cyclothiazide. As the series resistances of the included recordings were below 8 MOhm (mean series resistances: 2mM Ca, injected: 5.58 MOhm; 2mM Ca, control: 5.93 MOhm; 4mM Ca, injected: 5.75 MOhm; 4mM Ca, control: 6.0 MOhm) and series resistance compensation was set to 80%, we do not expect clamp quality to contribute to the smaller estimates in the present data set.

      We have now added a statement noting the preliminary nature of these results and indicated that further experiments will be required to more certainly conclude on potential effects of iGluSnFR or the manipulation on endbulb transmission: “Our preliminary train stimulation analysis of vesicle pool dynamics in the presence and absence of AAV-mediated iGluSnFR expression in SGNs has not revealed significant differences between the two conditions. Further experiments, potentially involving faster versions of iGluSnFR and employing trains of different stimulation rates for model based analysis of vesicle pool dynamics (Neher and Taschenberger, 2021) will help to assess the value and impact of iGluSnFR in the analysis of transmission at calyceal synapses.” on line 381.

      1. The deconvolution analysis assumes an instantaneous rise time. Yet previous work (Armbruster et al., 2020) that took into account diffusion, suggested potentially slower rise time dynamics. More importantly, the deconvolved waveforms do not match the shapes of the EPSCs (figures 5 and supp 6-2). What is the aim of the deconvolution? It was not clear from the text, but I assume it shows iGluSnFR binding to glutamate - in which case the slow waveforms are indicative of extrasynaptic iGluSnFR activation.

      The deconvolution analysis was mainly used to recover the average responses to stimuli in the train without contamination by previous responses (see also Taschenberger et al. 2016, their figure 6).

      We did also try to use the average singular response instead of the exponential fit as a kernel for the (Wiener) deconvolution analysis, which more closely resembled the observed (fast) rise. Unfortunately, this led to markedly worse results, likely because of the noise levels in the measurements. We believe that it would be beneficial to model the rise of the signal more precisely if glutamate imaging data is acquired at higher framerates.
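
      A Wiener deconvolution with an exponential-decay kernel, as referred to above, can be sketched as follows. This is a minimal illustration on synthetic data and not the authors' analysis code: the decay constant (in frames), the noise-to-signal constant, the noise level, and the release train are all assumptions.

```python
import numpy as np

def wiener_deconvolve(signal, kernel, noise_to_signal=0.01):
    """Frequency-domain Wiener deconvolution of `signal` by `kernel`.

    noise_to_signal is an assumed constant regularizer; larger values
    suppress noise amplification at the cost of smoothing.
    """
    n = signal.size
    K = np.fft.rfft(kernel, n)
    S = np.fft.rfft(signal, n)
    G = np.conj(K) / (np.abs(K) ** 2 + noise_to_signal)
    return np.fft.irfft(S * G, n)

# Synthetic example: one stimulus every 10 frames, depressing amplitudes.
n_frames, tau_frames = 200, 5.0              # illustrative values only
release = np.zeros(n_frames)
release[[20, 30, 40, 50, 60]] = [1.0, 0.7, 0.5, 0.4, 0.35]

# Kernel with instantaneous rise and exponential decay.
kernel = np.exp(-np.arange(n_frames) / tau_frames)

# Simulated fluorescence: release convolved with the kernel plus small noise.
rng = np.random.default_rng(1)
fluorescence = np.convolve(release, kernel)[:n_frames]
fluorescence += rng.normal(0.0, 0.01, n_frames)

recovered = wiener_deconvolve(fluorescence, kernel)
print(np.round(recovered[[20, 30, 40, 50, 60]], 2))
```

      In this sketch the overlapping responses are separated again after deconvolution, which is the purpose stated above: recovering the response to each stimulus in the train without contamination by the preceding ones.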

      The broad waveforms may be due to extrasynaptic binding of glutamate, but we also note that each frame corresponds to ~10 ms and there are only ~10 data points between stimuli, so the responses are unlikely to be as sharp as eEPSCs.

      However, I suppose that the more interesting question is whether iGluSnFR could be deconvolved to reveal the underlying release events, similar to how calcium signals can be used to inform about single action potentials.

      We agree that it would be particularly interesting to use a "mini iGluSnFR" signal to deconvolve the resulting traces. Unfortunately, we failed to detect iGluSnFR signals reporting individual release events at this time, preventing this kind of analysis.

      1. I suggest referencing and discussing (Aggarwal et al., 2022; Srivastava et al., 2022). These highly relevant papers analyzed iGluSnFR to probe synaptic release.

      References:

      Aggarwal, A., Liu, R., Chen, Y., Ralowicz, A. J., Bergerson, S. J., Tomaska, F., Hanson, T. L., Hasseman, J. P., Reep, D., Tsegaye, G., Yao, P., Ji, X., Kloos, M., Walpita, D., Patel, R., Mohr, M. A., Tilberg, P. W., Mohar, B., Team, T. G. P., . . . Podgorski, K. (2022). Glutamate indicators with improved activation kinetics and localization for imaging synaptic transmission. bioRxiv, 2022.02.13.480251. https://doi.org/10.1101/2022.02.13.480251

      Armbruster, M., Dulla, C. G., & Diamond, J. S. (2020). Effects of fluorescent glutamate indicators on neurotransmitter diffusion and uptake. Elife, 9. https://doi.org/10.7554/eLife.54441

      Srivastava, P., de Rosenroll, G., Matsumoto, A., Michaels, T., Turple, Z., Jain, V., Sethuramanujam, S., Murphy-Baum, B. L., Yonehara, K., & Awatramani, G. B. (2022). Spatiotemporal properties of glutamate input support direction selectivity in the dendrites of retinal starburst amacrine cells. Elife, 11. https://doi.org/10.7554/eLife.81533

      We thank the reviewer for the suggestions. Some of these studies were not available when we first drafted the manuscript. We now added a section discussing these studies starting on line 466:

      Optimizing the imaging technique may reduce noise level, while the development of improved GEGIs could improve the signal to a level at which spontaneous release events can be identified reliably in the cochlear nucleus. In retinal slices, where quantal events have been reliably observed with two-photon imaging, temporal deconvolution was successfully employed to estimate release rates from iGluSnFR signal (Srivastava et al., 2022; James et al., 2019). Subcellular targeting of iGluSnFR variants to the postsynaptic membrane may reduce measurement errors introduced by contributing extrasynaptic iGluSnFR signal and improve spatial resolution of glutamate imaging data (Hao et al., 2023; Aggarwal et al., 2022).

      Referees cross-commenting

      I also agree with the comments made by other reviewers!

      Significance

      Overall, this study addresses an important problem in basic neuroscience research. With the developing reliance on optical measurement of neuronal function, it is important to understand the impact of the indicators on physiological function and the limitations of the technique. The study is well-executed and will be informative to neuroscientists performing optical glutamate recording to study single-cell and circuit function in and beyond the auditory system.

    1. Thank you so much for your paper! Metabolism of amino acids is extremely important to study but also very complex. It's also a really vast field so I really appreciate it when scientists decide to take a deep dive and uncover the existing metabolic pathways. Kudos for that! As excited as I am about L-amino acids, I'm even more excited to understand the metabolism of D-amino acids. I was wondering if you have considered applying your experimental approach to understand the metabolic pathway for D-arginine? Maybe also other D-amino acids? I think we know little about the metabolism of D-amino acids in B. subtilis and about the regulation of the metabolic enzymes. Thank you for your time!

    1. Author Response

      Reviewer #2 (Public Review):

      Granell et al. investigated genetic factors underlying wheezing from birth to young adulthood using a robust data-driven approach with the aim of understanding the genetic architecture of different wheezing phenotypes. The association of 8.1 million single nucleotide polymorphisms (SNPs) with wheeze phenotypes derived from birth to 18 years of age was evaluated in 9,568 subjects from five independent cohorts from the United Kingdom. This meta-genome-wide association study (GWAS) revealed the suggestive association of 134 independent SNPs with at least one wheezing subtype. Among these, 85 genetic variants were found to be potentially causative. Indeed, some of these were located nearby well-known asthma loci (e.g., the 17q21 chromosome band), although ANXA1 was revealed for the first time to play an important role in early-onset persistent wheezing. This was strongly supported by functional evidence. One of the top ANXA1 SNPs associated with wheezing was found to be potentially involved in the regulation of the transcription of this gene due to its location at the promoter region. This polymorphism (rs75260654) had been previously evidenced to regulate the ANXA1 expression in immune cells, as well as in pulmonary cells through its association as an eQTL. Protein-protein network analyses revealed the interaction of ANXA1 with proteins involved in asthma pathophysiology and regulation of the inflammatory response. Additionally, the authors conducted a murine model, finding increased anxa1 levels after a challenge with house dust mite allergens. Mice deficient in anxa1 showed decreased lung function, increased eosinophilia, and Th2 cell levels after allergen stimulation. These results suggest the dysregulation of the immune response in the lungs, eosinophilia, and Th2-driven exacerbations in response to allergens as a result of decreased levels of anxa1. This coincides with evidence of lower plasmatic ANXA1 levels in patients with uncontrolled asthma, suggesting this locus is a very promising candidate as a target of novel therapeutic strategies.

      Limitations of this piece of work that need to be acknowledged:

      (1) the manual and visual inspection of Locus Zoom plots for the refinement of association signals and identification of functional elements does not seem to be objective enough;

      This is an important observation and we have now added the following text in the Discussion, which can be found on lines 400-402 of the Revised Main Manuscript:

      “Finally, the manual and visual inspection of Locus Zoom plots for the refinement of association signals and identification of functional elements was an objective approach which might have undermined the findings.”

      (2) the sample size is limited, although the statistical power was improved by the assessment of very accurate disease sub-phenotype;

      This point was already mentioned as a limitation and it can now be found on lines 349-365 of the Revised Main Manuscript:

      “By GWAS standards, our study is comparatively small and may be considered to be underpowered. The sample size may be an issue when using an aggregated definition (such as “doctor-diagnosed asthma”) but is less likely to be an issue when primary outcome is determined by deep phenotyping. This is indirectly confirmed in our analyses. Our primary outcome was derived through careful phenotyping over a period of more than two decades in five independent birth cohorts, and although comparatively smaller than some asthma GWASs, our study proved to be powered enough to detect previously identified key associations (e.g. chr17q21 locus). Precise phenotyping has the potential to identify new risk loci. For example, a comparatively small GWAS (1,173 cases and 2,522 controls) which used a specific subtype of early-onset childhood asthma with recurrent severe exacerbations as an outcome, identified a functional variant in a novel susceptibility gene CDHR3 (SNP rs6967330) as an associate of this disease subtype, but not of doctor-diagnosed asthma(51). This important discovery was made with a considerably smaller sample size but using a more precise asthma subtype. In contrast, the largest asthma GWAS to date had a ~40-fold higher sample size(7), but reported no significant association between CDHR3 and aggregated asthma diagnosis. Therefore, with careful phenotyping, smaller sample sizes may be adequately powered to identify larger effect sizes than those in large GWASs with broader outcome definitions(52).”

      (3) association signals with moderate significance levels but with strong functional evidence were found;

      We do not think of this as a limitation but as a strength. We were able to support our genetic results with evidence from experimental mouse models.

      (4) no direct replication of the findings in independent populations including diverse ancestry groups was described.

      This point was already mentioned as a limitation and it can now be found in lines 375-391 and 392-399 Revised Main Manuscript.

      “We are cognisant that there may be a perception of the lack of replication of our GWAS findings. We would argue that direct replication is almost certainly not possible in other cohorts, as phenotypes for replication studies should be homogenous(56). However, there is a considerable heterogeneity in LCA-derived wheeze phenotypes between studies, and although phenotypes in different studies are usually designated with the same names, they differ between studies in temporal trajectories, distributions within a population, and associated risk factors(57). This heterogeneity is in part consequent on the number and the non-uniformity of the timepoints used, and is likely one of the factors responsible for the lack of consistent associations of discovered phenotypes with risk factors reported in previous studies(58). This will also adversely impact the ability to identify phenotype-specific genetic associates. For example, we have previously shown that less distinct wheeze phenotypes in PIAMA were identified compared to those derived in ALSPAC(59). Thus, phenotypes that are homogeneous to those in our study almost certainly cannot readily be derived in available populations. This is exemplified in our attempted replication of ANXA1 findings in PIAMA cohort (see OLS, Table E12). In this analysis, the number of individuals assigned to persistent wheezing in PIAMA was small (40), associates of this phenotype differed to those in STELAR cohorts, and the SNPs’ imputation scores were low (<0.60), which meant the conditions for replication were not met.”

      “Our study population is of European descent, and we cannot generalize the results to different ethnicities or environments. It is important to highlight the under-representation of ethnically diverse populations in most GWASs(9). To mitigate against this, large consortia have been formed, which combine the results of multiple ethnically diverse GWASs to increase the overall power to identify asthma-susceptibility loci. Examples include the GABRIEL(6), EVE(60) and TAGC(7) consortia, and the value of diverse, multi-ethnic participants in large-scale genomic studies has recently been shown(61). However, such consortia do not have the depth of longitudinal data to allow the type of analyses which we carried out to derive a multivariable primary outcome.”

      Nonetheless, the robustness and consistency of the findings supported by different analytical and experimental layers is the major strength of this study.

      The authors successfully achieved the aims of the study, strongly supported by the results presented. This study not only provides an exciting novel locus for wheezing with potential implications in the development of alternative therapeutic strategies but also opens the path for better-powered research of asthma genetics, focused on accurate disease phenotypes derived by innovative data-driven approaches that might speed up the process to disentangle the missing heritability of asthma, making use of still useful GWAS approaches.

    1. Author Response

      Reviewer #2 (Public Review):

      The manuscript by Mohebi et al. examines a critical open question regarding the interaction of cholinergic interneurons of the striatum and transmitter release from dopaminergic axons in behaving animals. Activation of cholinergic interneurons in the striatum can evoke dopamine release in brain slices and in vivo as measured with voltammetry. However, it remains an open question in what context and to what extent this acetylcholine-mediated dopamine occurs in behaving animals. Here, the authors argue that CIN activity triggers dopamine release in the nucleus accumbens which encodes the motivation to obtain a reward through increasing "ramps" of dopamine release. Their data suggest that the ramps are not reflected in the firing of dopaminergic neurons. Rather, they provide compelling evidence that the ramps of dopamine release correlate with ramps in cholinergic interneuron activity as measured with GCaMP6. What's more, the authors show that ACh-mediated dopamine release has no paired-pulse depression, a striking result that differs from all prior ex vivo brain slice data. The manuscript is extremely well written and the data are of very high quality. Overall, this study represents an important step forward in our understanding of how ACh-mediated dopamine release regulates behavior, and more broadly how axons can generate behaviors independently from somatic activity.

      Major comments

      1) The complete absence of any short-term plasticity in CIN-mediated dopamine release is a striking result that is important for the field. The authors should strengthen this result with additional quantitative analysis demonstrating the lack of STP. They have analyzed paired-pulse ratios, but they should analyze this for stimuli at the higher frequencies (4 Hz, etc) that are more physiologically relevant. For example, Fig 1e shows a CIN-evoked DA release at many optically-stimulated frequencies. The authors should quantify short-term plasticity by generating fits of the single stimulus signal and comparing the mathematical sum predicted from 4 stim DA signals at different frequencies to the recorded data. A similar analysis has been done with Ca signals (Koester and Sakmann, 2000).

      Thank you for this very helpful suggestion. We have performed this analysis as recommended, and now confirm the lack of STP even at the higher frequencies (see new Supplementary Figure 1).
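
      The linear-summation check the reviewer describes can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the difference-of-exponentials kernel, its time constants, and the names `single_pulse`, `predicted_train` and `plasticity_index` are all assumptions, and the "recorded" trace is synthetic.

```python
import math

def single_pulse(t, amplitude=1.0, tau_rise=0.05, tau_decay=0.4):
    """Hypothetical fitted response to one stimulus (difference of exponentials)."""
    if t < 0:
        return 0.0
    return amplitude * (math.exp(-t / tau_decay) - math.exp(-t / tau_rise))

def predicted_train(times, n_pulses, freq_hz):
    """Linear prediction: sum of single-pulse responses shifted by the inter-pulse interval."""
    interval = 1.0 / freq_hz
    return [sum(single_pulse(t - k * interval) for k in range(n_pulses)) for t in times]

def plasticity_index(recorded, predicted):
    """Ratio of recorded to predicted peak: ~1 no STP, <1 depression, >1 facilitation."""
    return max(recorded) / max(predicted)

times = [i * 0.01 for i in range(300)]                # 3 s sampled at 100 Hz
predicted = predicted_train(times, n_pulses=4, freq_hz=4.0)
recorded = [0.6 * v for v in predicted]               # synthetic trace with built-in depression
index = plasticity_index(recorded, predicted)
```

      An index close to 1 at each stimulation frequency would be consistent with an absence of short-term plasticity; comparing whole traces rather than peaks alone would be a stricter variant of the same idea.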

      2) The authors show that optical activation of CINs results in DA release as measured by dLight. To clearly establish that these signals are generated by DA release driven by nicotinic receptors (and not a partial effect of some unknown artifact), it would be useful to show that the optical CIN-evoked dLight signals shown in Fig. 1 are inhibited by nicotinic receptor antagonists such as DHbE. This control experiment would significantly strengthen the result shown here.

      We agree that combining drug manipulations with photometry would be useful, but as noted above this is not a methodology in our current technical repertoire.

      3) Similarly, the authors show clear correlations between CIN activity and DA release during behavior. The authors should consider determining whether CINs play a causal role in triggering DA release during behavior. For example, does infusion of DHbE in the NAc prevent the light-mediated DA release during behavior? As an alternative hypothesis, some groups have been suggesting that CIN activity has almost no direct influence over DA. Therefore, testing whether a causal relationship exists between CINs and DA release would be an important experiment in addressing these two opposing viewpoints.

      As noted above we are not currently able to combine drug manipulations with photometry in behaving animals.

      4) The ramps that are described in this manuscript are an order of magnitude faster (increasing over 100s of milliseconds) than ramps described in other studies that occur over seconds. In fact, the two signals may be completely different functionally. Discussion of this topic would be helpful.

      Dopamine ramps have indeed been reported over multiple different time scales, and as discussed in Berke 2018, this seems to reflect the duration of the approach behavior. We think further discussion of this topic is better saved for another paper, especially as we are now actively studying ramping over longer time scales (Krausz et al. 2023).

      Reviewer #3 (Public Review):

      This report by Mohebi et al. provides new answers to old questions by showing that the activity of striatal cholinergic interneurons (CINs) escalates progressively during specific reward-related behaviors and that this correlates with previously observed ramps in dopamine (DA) release in the nucleus accumbens core. The report is strong and provides evidence for the authors' hypothesis that DA ramps are independent of DA neuron activity, but are instead the result of CIN activity and corresponding acetylcholine (ACh) release. The authors further demonstrate that the fidelity of CIN activation and consequent driving of DA release is even more robust in vivo than observed in ex vivo slice preparations, which is fundamental for understanding the role of ACh-DA interactions in behavior. The findings complement the authors' previous evidence that ventral tegmental area (VTA) DA neuron firing patterns do not show a ramping pattern; the previously reported VTA data are appropriately included here (in Fig. 3) to illustrate the absence of VTA firing during the time-locked increases in CIN activity and DA release. The present studies stop short of showing a direct link between CIN activity and DA release, however, which would require examining DA release during behavior in the presence of an antagonist of nicotinic ACh receptors. The authors also extend the understanding of the regulation of DA release by acetylcholine (ACh) by showing that optical activation of CINs in vivo promotes DA release responses that do not attenuate with repetitive stimulation. This contrasts with previous results in ex vivo striatal slices in which ACh-evoked DA release has been found to decline progressively from rundown and/or receptor desensitization. The authors propose that in vivo, AChE may be more effective in curtailing local ACh levels than in slices because of the slightly lower temperature typically used for slice studies, as well as the use of superfusion that might facilitate some AChE washout (AChE inhibitors are still effective in slices, of course). Overall, the report not only provides evidence for the cellular substrate for DA ramps but also shows the robustness of ACh-driven DA release in vivo. A few points to strengthen the report are listed below.

      1) The authors give a few details about how CINs were activated at the beginning of the results, but say only that DA dynamics were monitored using fiber photometry. Given that the methods are at the end, a brief summary should be given here to indicate whether this means direct monitoring of DA or indirect via GCaMP, for example. It would be helpful to note the sensor used in the abstract, as well. In this light, as it were, RdLight1 should be described upon the first mention.

      We have now clarified in both abstract and text that we are using the direct DA sensor RdLight1.

      2) The authors show that infusion of DHbE in the NAc reduced the likelihood of decisions to approach the center port, as did antagonism of DA receptors. This supports the authors' argument that ramping of CIN activity and consequent ACh release underlies observed ramps in DA release. However, to show a causal interaction requires testing whether the observed DA ramps are absent after DHbE infusion in the NAc, under the same conditions that attenuated behavior.

      As noted above we are not currently able to combine drug manipulations with photometry in behaving animals.

      3) In Fig. 3, the y-axis title for the upper panels should specify VTA, not simply "rate". This is stated in the legend, but should also be specified in the figure panel.

      We have updated the y-axis titles in this figure.

      4) A recent preprint in BioRxiv by AC Krok, NX Tritsch et al. shows a related correlation between ACh and DA release in vivo in a reward task, as well as differences in other conditions. This report shows also that cortical input to CINs indeed plays a role, as suggested in the concluding sections of the present report. Consideration of the data in the preprint in the context of the present results could be valuable for the field.

      We have also noted those pre-prints with interest, even though they investigated different brain regions using different approaches. There are established differences between CIN-DA interactions in dorsal vs. ventral striatum that we suspect are relevant here. But given the rapid pace of developments in this subfield, we prefer not to speculate too much at this point and instead review the overall body of work once it is published.

    1. Author Response

      Reviewer #1 (Public Review):

      This is a simulation study comparing the performance of two major approaches for dealing with “population structure” when carrying out Genome-wide Association Studies - Principal Component Analysis and Linear Mixed-effects Models - a subject of considerable practical importance. The author correctly notes that previous comparisons have been quite limited. In particular, any study not concluding that LMM was superior has relied on very simple models of structure.

      The paper is clearly written and beautifully reviews the theoretical underpinnings (albeit in a manner that will be difficult to penetrate without deep knowledge of several fields). The simulations are well-designed and far better than previous studies. From a theoretical point of view, the work is somewhat limited by being strongly anchored in a very classical quantitative genetics framework that is focused on allele frequencies and inbreeding coefficients, and totally ignores coalescent theory, but this is a minor quibble. The simulations are limited by utilizing ridiculously small sample sizes by the standards of modern human GWAS. And of course, they do not include all the complexities of real data.

      The quantitative genetics framework we used was ideal for motivating and interpreting LMMs in particular, since they model relatedness with a kinship matrix which consists of IBD probabilities, all of which arose from quantitative genetics.

      We also added the following text to the discussion: “However, our conclusions are not expected to change with larger sample sizes, as cryptic family relatedness will continue to be abundant in such data, if not increase in abundance, and thus give LMMs an advantage over PCA (Henn et al., 2012; Shchur et al., 2018; Loh et al., 2018).”

      The main conclusion of the study is that LMMs really are generally superior - as expected on theoretical grounds. However, the authors do not address whether switching to LMM really is practicable given the sample size and lack of data sharing that characterize human genetics. Nor is it clear whether the difference in performance matters in real life given that the entire framework used is an idealized one - the fact that real human data suffers from environmental confounders that are correlated with “ancestry” is not addressed, to take the most obvious example. That said, it is surely important to note that the approach routinely used by the majority of users (PCA with 10 PCs) is mostly used for historical reasons and has little theoretical or empirical justification.

      We added simulations with environment effects correlated with ancestry, which we hope will make our study even more relevant as it does make our evaluations even more realistic than before. In the presence of environment effects, LMM without PCs remains among the best approaches, although occasionally LMM with PCs or PCA will perform slightly better. However, modeling environment directly (with the true variables) improves performance much more than by using PCs to model environment indirectly, so we believe that is not a strong reason for continuing to use PCs (in LMMs or otherwise) unless there is no choice.

      We also added the following text to the discussion: “However, recent approaches not tested in this work have made LMMs more scalable and applicable to biobank-scale data (Loh et al., 2015; Zhou et al., 2018; Mbatchou et al., 2021), so one clear next step is carefully evaluating these approaches in simulations with larger sample sizes.” As stated earlier, we believe that the difference in performance between LMM and PCA will remain in larger sample sizes because cryptic relatedness is more prevalent in that setting.

      We excluded the “lack of data sharing” point from our discussion because it does not align well with the goals of our manuscript. The current solution to the lack of data sharing is meta-analysis, but its use does not give PCA or LMM an inherent advantage, since it can be applied to the summary statistics of either (or even a combination of models, in theory). There is interesting recent work on “federated” PCA and LMM association (both versions exist), that allow a single model to be fit jointly to separate datasets (residing in different buildings across the world) as if they were combined into a single dataset. Thus, these issues do not explain or motivate why PCA or LMM should be used.

      Reviewer #2 (Public Review):

      Yao and Ochoa present a very nice paper examining the age-old question of whether LMM or PCA is a better way to adjust for structure (population, family, admixture). The authors provide a very nice and detailed overview of the previous research addressing this question, summarizing it in a table. They find that LMMs are generally better at accounting for population structure. However, I feel there are a couple of important factors that are missing. One is the consideration of environmental structure. Another is that the relationship between PCA and LMM is usually a bit more complicated in practice than depicted here, where the devil really lies in the details. Also, I think there are a couple of key reasons why LMMs haven’t been adopted as quickly as one might have expected, including case-control imbalance and cohort meta-analyses, which I feel the authors could point out. In fact, I believe LMMs have become fairly popular in recent years (e.g. Japan Biobank GWAS results).

      We added environment simulations, which we agree was an important shortcoming of the previous version of our work.

      We now discuss how the PCA and LMM connection can be more complicated in practice, but as the main difference is in how LD is handled, once that is correctly adjusted, PCs and random effects are still mostly modeling the same relatedness signals. Ultimately, our main conclusion is unchanged, namely that only LMMs can model family relatedness, which is their key advantage.

      We briefly commented on case-control imbalance in our discussion (now made more clear), but since this involves binary traits, which we did not explicitly test in this work, it is out of scope.

      Cohort meta-analysis does not influence whether to use PCA or LMM, since it can be performed with summary statistics from either model (and in theory even a combination of different models per cohort). The broad use of meta-analysis does not in itself prevent users from using PCA or LMM within individual cohorts. The use of meta-analysis is very interesting in its own right, but it is outside the scope of this work.
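
      To illustrate why meta-analysis is agnostic to the per-cohort association model, here is a minimal sketch of a fixed-effect, inverse-variance-weighted combination of summary statistics for a single variant. The function name `fixed_effect_meta` and the effect sizes are hypothetical; the only inputs are per-cohort effect estimates and standard errors, whichever model (PCA or LMM) produced them.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis for one variant.

    betas: per-cohort effect estimates; ses: their standard errors.
    Returns the pooled effect, its standard error, and the z-score.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se

# Hypothetical summary statistics: cohort A analysed with an LMM, cohort B with PCA.
beta, se, z = fixed_effect_meta(betas=[0.10, 0.14], ses=[0.02, 0.04])
```

      Nothing in the computation refers to how each cohort's estimate was obtained, which is the sense in which the choice between PCA and LMM is independent of the decision to meta-analyse.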

      Reviewer #3 (Public Review):

      This paper examines the relative performance of linear mixed models (LMMs), principal components (PCA), and their combination (PCA-LMM) for genetic association studies in human populations. The authors claim that previous papers examining this question are inadequate and that: (i) there remains confusion on which method is best and in which context, (ii) that the metrics used in previous evaluations were insufficient, and (iii) that the simulation settings used in previous papers were not comprehensive. To fix these problems the authors perform an extensive set of simulations within several frameworks and suggest two new metrics for evaluating performance.

      Strengths:

      The simulation framework used in this paper and the extensive number of simulations provide an opportunity to examine the relative properties of the three approaches (LMM, PCA, PCA-LMM) in a variety of contexts.

      The parameters of the simulation framework are based on highly diverged populations, which is an increasingly common analysis choice that has not been examined in detail via simulation previously.

      The evaluation metrics used in this paper are AUC and a test of the uniformity of the p-value distribution under the null. This is an improvement over some previous analyses which did not examine power and relied on less sensitive tests of type I error.
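
      As a rough illustration of what a test of null p-value uniformity measures, the sketch below computes a Kolmogorov-Smirnov distance between the empirical distribution of p-values and Uniform(0, 1). This is a stand-in chosen for simplicity, under the assumption that any uniformity statistic would serve the illustration; the manuscript's own metric may be defined differently, and the function name and input values are hypothetical.

```python
def ks_uniform_statistic(pvalues):
    """Largest gap between the empirical CDF of the p-values and the Uniform(0, 1) CDF."""
    ps = sorted(pvalues)
    n = len(ps)
    d = 0.0
    for i, p in enumerate(ps):
        # Compare the uniform CDF (equal to p) with the empirical CDF just after and just before p.
        d = max(d, (i + 1) / n - p, p - i / n)
    return d

n = 100
calibrated = [(i + 0.5) / n for i in range(n)]   # evenly spread p-values
inflated = [p ** 2 for p in calibrated]          # excess of small p-values
d_calibrated = ks_uniform_statistic(calibrated)
d_inflated = ks_uniform_statistic(inflated)
```

      A well-calibrated method yields a small distance on null loci, whereas inflation (too many small p-values) or deflation pushes the distance up.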

      Weaknesses:

      This paper has a limited set of population frameworks just like all papers before it. The breakdown of which method is best (LMM, PCA, PCA-LMM) will be a function of the simulation framework chosen.

      Ameliorating this issue, we added additional simulations with low heritability and with environment effects. We are pleased to report that all of our conclusions hold at low heritability (h2 = 0.3), and for the most part under environment effects (which occasionally give LMM with PCs and PCA a small advantage, but often LMM with no PCs remains best, and we show PCs are no replacement for directly modeling these environment effects).

      The frameworks chosen for this paper are certainly not comprehensive in contemporary human genetic studies. In fact, the authors make a number of unusual choices. For example, the populations in the simulated study have extremely large Fsts. While this is also a strength, the lack of more standard study designs is a weakness. More importantly, there is no simulation of family effects, which is the basis of many of the PCA-LMM papers reported in Table 1.

      We now better motivate in the introduction our focus on association studies of multiethnic and admixed individuals, which are nowadays very common and which have greater FST values than earlier studies. In reference to higher simulated FSTs, we also now cite our recent work, which has found that many previous FST estimates are downwardly biased (Ochoa and Storey, 2021, 2019). We simulated data that was fit to each of our three real datasets using our unbiased methods, so those values that (understandably) appear high are actually more correct (for multiethnic populations such as those in 1000 Genomes, HGDP, etc) than previous estimates in the literature. In our previous work we also determined that only previous pairwise FST estimators are unbiased (under some conditions), and using a previous pairwise FST estimator (from Bhatia et al., 2013) we obtained equally high values between the most diverged human populations (values from a revised version of Ochoa and Storey, 2019 that isn’t on bioRxiv yet): In HGDP, the largest pairwise FST is 0.479, between Pima and PapuanSepik; In Human Origins, the largest estimate is 0.396, between Cabecar and Baining_Malasait; Lastly, in 1000 Genomes, the largest estimate is 0.135, between YRI and JPT. (1000 Genomes was generally less structured than HGDP and Human Origins, because the latter include more diverse populations.) Several previous estimates from the literature, all between one hunter-gatherer Sub-Saharan African subpopulation and one non-African subpopulation resulted in values of about 0.25 (Bowcock et al., 1991, Henn et al., 2011, Bergstrom et al., 2020). FST estimates are also greater from whole-genome sequencing versus array data (revised version of Ochoa and Storey, 2019).
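
      For readers unfamiliar with pairwise FST estimation, a minimal sketch of the Hudson estimator in the ratio-of-averages form discussed by Bhatia et al. (2013) is given below. It is an assumption that this is the specific estimator meant in the response; the function name is hypothetical, and `n1`, `n2` are taken here to be the number of sampled chromosomes in each population.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson pairwise FST as a ratio of sums over loci.

    p1, p2: per-locus sample allele frequencies in populations 1 and 2.
    n1, n2: sample sizes (assumed here to be numbers of sampled chromosomes).
    """
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for sampling variance in each population.
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        denominator += a * (1 - b) + b * (1 - a)
    return numerator / denominator

fst_fixed = hudson_fst([1.0], [0.0], n1=20, n2=20)      # fixed difference between populations
fst_equal = hudson_fst([0.5], [0.5], n1=101, n2=101)    # identical sample frequencies
```

      Summing numerator and denominator across loci before dividing, rather than averaging per-locus ratios, is the design choice that keeps the estimate stable when many loci have low minor allele frequency.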

      Family (household) effects is a case where PCA is not expected to outperform LMM, though standard LMMs do not model this effect explicitly either and may not do much better. As this is a feature of family studies that ought to be absent in population studies (as usually only siblings are in the same household, and not more distant relatives), it is also not entirely relevant to the majority of our simulations. In these ways, including such a feature in our simulations does not align with the goals of this present work, but we agree this is an important framework that deserves more attention in future evaluations.

      The discussion (and simulations) of LMM vs PCA, particularly LMMs with PCs as fixed effects misses the critical distinction of whether PCs are in-sample (in which case including PCs as fixed effects effectively serves as a preconditioner for the kinship matrix, speeding up iterative methods such as BOLT), or projections of individuals onto out-of-sample principal axes. There is also no discussion of LOO methods to address “proximal contamination”, also quite relevant in evaluating power as a function of the number of PCs.

      We added the following to our discussion concerning out-of-sample PC projections: “We do not consider the case where samples are projected onto PCs estimated from an external sample (Prive et al., 2020), which is uncommon in association studies, and whose primary effect is shrinkage, so if all samples are projected then they are all equally affected and larger regression coefficients compensate for the shrinkage, although this will no longer be the case if only a portion of the sample is projected onto the PCs of the rest of the sample.”

      We also added the following to the discussion concerning the LOCO approach: “Similarly, the leave-one-chromosome-out (LOCO) approach for estimating kinship matrices for LMMs prevents the test locus and loci in LD with it from being modeled by the random effect as well, which is called “proximal contamination” (Lippert et al., 2011, Yang et al., 2014). While LOCO kinship estimates vary for each chromosome, they continue to model family relatedness, thus maintaining their key advantage over PCA.”
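
      The LOCO idea can be sketched in a few lines. This is an illustration only: it uses a common standardized-genotype kinship estimator as a stand-in, not the authors' kinship estimator, and the function name, toy genotypes and chromosome labels are hypothetical.

```python
def loco_kinship(genotypes, chromosomes, left_out):
    """Kinship matrix from standardized genotypes, excluding loci on chromosome `left_out`.

    genotypes: list of loci, each a list of 0/1/2 allele counts per individual.
    chromosomes: chromosome label of each locus, in the same order.
    """
    kept = [g for g, c in zip(genotypes, chromosomes) if c != left_out]
    if not kept:
        raise ValueError("no loci remain after leaving out the chromosome")
    n = len(kept[0])
    kin = [[0.0] * n for _ in range(n)]
    used = 0
    for g in kept:
        p = sum(g) / (2 * n)              # sample allele frequency at this locus
        if p == 0.0 or p == 1.0:
            continue                      # monomorphic loci carry no information
        scale = 2 * p * (1 - p)
        for i in range(n):
            for j in range(n):
                kin[i][j] += (g[i] - 2 * p) * (g[j] - 2 * p) / scale
        used += 1
    if used == 0:
        raise ValueError("all remaining loci are monomorphic")
    return [[v / used for v in row] for row in kin]

genotypes = [[0, 1, 2, 1], [2, 1, 0, 1], [0, 0, 2, 2]]
chromosomes = [1, 1, 2]
kin_without_chr2 = loco_kinship(genotypes, chromosomes, left_out=2)
```

      When testing a locus on chromosome 2, the LMM would use `kin_without_chr2`, so that the tested locus and its neighbours in LD do not also enter the random effect.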

      The same new discussion paragraph closes with the following thoughts concerning LOCO and related approaches: “LD effects must be adjusted for, if present, so in unfiltered data we advise the previous methods be applied. However, in this work, simulated genotypes do not have LD, and the real datasets were filtered to remove LD, so here there is no proximal contamination and LD confounding is minimized if present at all, so these evaluations may be considered the ideal situation where LD effects have been adjusted successfully, and in this setting LMM outperforms PCA. Overall, these alternative PCs or kinship matrices differ from their basic counterparts by either the extent to which LD influences the estimates (which may be a confounder in a small portion of the genome, by definition) or by sampling noise, neither of which are expected to change our key conclusion.”

      Lastly, we added the following to a different discussion paragraph: “A different benefit for including PCs were recently reported for BOLT-LMM, which does not result in greater power but rather in reduced runtime, a property that may be specific to its use of scalable algorithms such as conjugate gradient and variational Bayes (Loh et al., 2018).”

      There is no discussion/simulation of spatial/environmental effects or rare vs common PCs as raised in Zaidi et al 2020. There are some open questions here regarding relative performance the authors could have looked at. Same for LMMs with multiple GRMs corresponding to maf/ld bins and thresholded GRMs. For example, it would be helpful to know if multiple-GRM LMMs mitigate some of the problems raised in the Zaidi paper.

      We added simulations with environment effects, which are based on a two-level hierarchy of population labels so they are spatial to the extent that these labels capture spatial relationships between populations. However, our small sample size data are not well suited to study rare variants and their structure, so this is out of scope. (The sample size limitation is also covered in a new discussion paragraph.) We hope to tackle this very interesting question in future work.

      We added the following paragraph to our discussion: “Another limitation of this work is ignoring rare variants, a necessity given our smaller sample sizes, where rare variant association is miscalibrated and underpowered. Using simulations mimicking the UK Biobank, recent work has found that rare variants can have a more pronounced structure than common variants, and that modeling this rare variant structure (with either PCA and LMM) may better model environment confounding, improve inflation in association studies, and ameliorate stratification in polygenic risk scores (Zaidi and Mathieson, 2020). Better modeling rare variants and their structure is a key next step in association studies.”

  4. Apr 2023
    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      Manuscript number: RC-2022-01586

      Corresponding author(s): Hammond, Gerald

      1. General Statements

      Our manuscript details a novel homeostatic feedback loop for the master plasma membrane regulatory molecule, PI(4,5)P2. In this loop, the PIP4K family of PI(4,5)P2-synthesizing enzymes act in a novel, non-enzymatic capacity: they sense PI(4,5)P2 levels and directly inhibit the lipid’s synthesis by inhibiting the major enzyme involved in the terminal step of synthesis, PIP5K. The three reviewers seem largely convinced of our data, and provided detailed, insightful and plausible suggestions for revision, which we have now comprehensively provided. This includes substantial new experimental work, including the generation of genomically tagged cell lines to localize all endogenous PIP4K isoforms.

      However, all three reviewers questioned the paper’s novelty and significance based on recent studies in the literature demonstrating PIP5K inhibition by PIP4Ks [refs 25 & 53 in the manuscript]. We feel that this is an inaccurate and somewhat unfair assessment of our findings, since it does not consider our central (and completely unprecedented) finding that PIP4Ks directly sense PI(4,5)P2 levels through low-affinity binding. As well as being a novel finding, this places the previously observed inhibition of PIP5K by PIP4Ks into a completely new paradigm consisting of a complete, enclosed homeostatic feedback loop. This was not demonstrated previously in the literature.

      Of course, the reviewers’ convergent opinions almost certainly reflect a deficit in our articulation of the novel findings in the original manuscript. We have therefore revised the current version to more clearly emphasize our novel findings.

      2. Point-by-point description of the revisions

      Reviewer #1

      Summary: In this manuscript, the authors address how PIP4K regulates tonic plasma membrane (PM) PI(4,5)P2 levels, which are generated by the major PI(4,5)P2 synthesis enzyme, PIP5K, by using PIP4K- and PIP5K-overexpressing cells or by acutely manipulating PM PI(4,5)P2 levels with the chemically induced dimerization (CID) system. Additionally, the authors assessed the effect of direct interaction between PIP4K and PIP5K by using supported lipid bilayers (SLBs) and purified PIP4K and PIP5K. The authors were also successful in monitoring the dynamics of endogenous PIP4K by using a split fluorescent protein approach. Through this study, the authors propose a model of a PI(4,5)P2 homeostatic mechanism in which PIP4Ks sense PM PI(4,5)P2 elevated by PIP5Ks, are recruited to the PM, and bind to PIP5Ks to inhibit PIP5K activity.

      #1.1: Although authors mention methods of statistical analysis in materials and methods, they did not present the results of statistical analysis in the figures. The quantitative data should be presented with statistical analysis data, which is important for showing where convincing differences between treatment groups are found.

      We agree that statistics are important to fully interpret the data; we have now included the results of statistical tests (non-parametric statistics were used, as the data are not normally distributed) with correction for multiple comparisons. Significant changes are denoted using asterisk notation in figs. 1A-C, 2B, 5B & 7A. The full results are now reported as tables:

      Fig 1A = table 1; Fig 1B = table 2; Fig. 1C = table 3; Fig. 2B = table 4; Fig. 5B = table 5; Fig 7A = tables 6 & 7.

      #1.2a: Fig. 1D. Fig. 1D and Fig. 3A should be presented together because these are exactly the same set of cells and information of each PIP4K and PIP5K membrane localization could be important for understanding mechanisms of inhibitory effect of PIP4Ks.

      We struggled when writing the manuscript to reconcile these data into a single figure. The manuscript flows from showing inhibition of PIP5Ks by PIP4Ks in living cells (figs. 1 & 2), then showing low affinity PI(4,5)P2 binding by endogenous PIP4Ks (figs. 3-6) and finally to a direct interaction between PIP4K and PIP5K (fig. 7). We therefore felt that reconciling the data showing attenuated PI(4,5)P2 synthesis with the interaction between PIP4Ks and PIP5Ks, despite being demonstrated in the same experiments, would disrupt the flow of the paper. We therefore request to leave the data in Figs. 2B and 7A, whilst remaining explicit that the data derive from a single experiment.

      #1.2b: Authors claimed that over-expression of all three PIP4K isoforms were able to attenuate the elevated PM PI(4,5)P2 levels caused by PIP5K over-expression. However, in Fig. 3A, PIP4K2A was recruited to PM by both PIP5K1A and PIP5K1C but looks only attenuated PIP5K1A, but not PIP5K1C, overexpression mediated PM PI(4,5)P2 elevation (Fig. 1D). PIP4K2C was less recruited to the PM than PIP4K2A and 2B in PIP5K1A overexpressing cell (Fig. 3A) but PIP4K2A, B and C isoforms equally attenuated increase of PM PI(4,5)P2 in PIP5K1A overexpressing cell (Fig. 1D). It is likely that efficiency of inhibitory effect of each PIP4K isoform is different by co-overexpressed PIP5K isoform. These images should be more carefully documented with Fig. 1D and Fig. 3A together.

      As the reviewer suggests, we have now expanded our description of these data in both results and discussion; firstly, for the attenuating effects on PI(4,5)P2 synthesis, we write on the 3rd paragraph of p4: “We also reasoned that co-expression of PIP4K paralogs with PIP5K might attenuate the elevated PI(4,5)P2 levels induced by the latter. Broadly speaking, this was true, but with some curious paralog selectivity (fig. 2B, statistics reported in table 4): PIP4K2A and PIP4K2B both attenuated PI(4,5)P2 elevated by PIP5K1A and B, but not (or much less so) PIP5K1C; PIP4K2C, on the other hand, attenuated PIP5K1A and was the only paralog to significantly attenuate PIP5K1C’s effect, yet it did not attenuate PIP5K1B at all.”

      On the relative ability of PIP5Ks to localize PIP4Ks we focus on the key result, writing on the 2nd paragraph of p7: “When co-expressing EGFP-tagged PIP5Ks and TagBFP2-tagged PIP4K2s, we found that PIP5K paralogs’ PM binding is largely unaffected by PIP4K over-expression (fig. 7A, upper panel and table 6), whereas all three paralogs of PIP4K are strongly recruited to the PM by co-expression of any PIP5K (fig. 7A, lower panel and table 7)…”

      And finally, we provide a more nuanced discussion of the possible implications for differential inhibition of PIP5K isoforms by PIP4Ks in the discussion, starting in the first paragraph on p. 11: “Despite minor differences in the ability of over-expressed PIP5K paralogs to recruit over-expressed PIP4K enzymes (fig. 7A), we observed major differences in the ability of PIP4K paralogs to inhibit PI(4,5)P2 synthesis when over-expressed alone (fig. 1C) or in combination with PIP5K (fig. 2B). It is unclear what drives the partially overlapping inhibitory activity, where each PIP5K paralog can be attenuated by 2 or 3 PIP4Ks. This is however reminiscent of the biology of the PIPKs, where there is a high degree of redundancy among them, with few unique physiological functions assigned to specific paralogs [49]. There may be hints of paralog-specific functions in our data; for example, enhanced PI(4,5)P2 induced by over-expressed PIP5K1C is only really attenuated by PIP4K2C (fig. 2B). This could imply a requirement for PIP4K2C in regulating PI(4,5)P2 levels during PLC-mediated signaling, given the unique requirements for PIP5K1C in this process [50,51]. Regardless, a full understanding of paralog selectivity will need to be driven by a detailed structural analysis of the interaction between PIP4Ks and PIP5Ks - which is not immediately apparent from their known crystal structures, especially since PIP4Ks and PIP5Ks employ separate and distinct dimerization interfaces [49].”

      #1.3: Fig. 1F. It seems that PIP4K2A accelerated PIP5K, but not Mss4, dependent PI(4,5)P2 generation before PI(4,5)P2 reaches 28,000 lipids/um2. Is this significant? If so, why did this happen?

      We have answered this question with a sentence added to the 1st paragraph on p 8: “The ability of PIP4K to bind to PIP5K on a PI(4,5)P2-containing bilayer also potentially explains the slightly accelerated initial rate of PI(4,5)P2 synthesis exhibited by PIP5K1A that we reported in fig. 2C, since PIP4K may initially introduce some avidity to the membrane interaction by PIP5K, before PI(4,5)P2 reaches a sufficient concentration that PIP4K-mediated inhibition is effective.”

      #1.4: Fig. 3B. In this figure, authors only presented images after Rapa treatment. Therefore, it is not clear what these results mean. Before Rapa treatment, where did bait proteins and NG2-PIP4K2C localize? If ePIP4K2C delta PM intensity (ER:PM/PM) increase, does that mean increase in ER:PM intensity or decrease in PM intensity? According to Figure legend, PI(4,5)P2 indicator TubbycR332H was co-transfected, but those images are not shown in the figure. Images of PI(4,5)P2 indicator also should be presented to show whether after Rapa treatment PI(4,5)P2 increased at ER-PM contact sites, because that could be critical for the conclusion that "The use of Mss4 ruled out an effect of enhanced PI(4,5)P2 generation at contact sites, since this enzyme increases PI(4,5)P2 as potently as PIP5K1A (Fig. 1A), yet does not cause recruitment of PIP4K2C". Is this conclusion consistent with Fig. 2F and G?

      These data now appear in Fig. 7B. We have added images showing the pre-rapamycin state to the revised figure. The reference to TubbycR332H co-expression was an error. In fact, the cells expressed the ER:PM contact site marker MAPPER, which allowed us to quantify ER:PM contact site localization before and after rapamycin-induced capture of the baits at these sites. The revised figure appears as follows:

      The failure of Mss4 to recruit endogenous PIP4K2C is entirely consistent with the old Fig. 2F and G (now 5A and C), since these show PIP4K interaction with PI(4,5)P2 containing lipid bilayers (in Fig. 5C, the PI(4,5)P2 was synthesized by Mss4). We demonstrated that Mss4 is unable to interact with PIP4K2A in Fig. 7D.

      #1.5: Fig. 3C and D. Based on results of Fig. 3C and D, authors concluded that "PIP4K2C binding to PI(4,5)P2-containing SLBs was greatly enhanced by addition of PIP5K to the membranes, but not Mss4". I don't think Fig. 3C and D are comparable because experimental conditions are different. While lipid composition of SLB used in Fig. 3C was 2% PI(4,5)P2, 98% DOPC, in Fig. 3D, it was 4% PI(4,5)P2, 96% DOPC. And also, in Fig. 3C, PIP5K1A was added to SLB at the time about 50 sec, whereas in Fig. 3D, Mss4 was added at 600 sec. It seems that in Fig. 3D, PIP4K2A was already saturated on SLB before adding Mss4. These two experiments must be performed under the same conditions.

      We have repeated these experiments (which now appear in Fig. 7C & D) under identical conditions, with the same result.


      #1.6: Overall results discussed in the text are very compressed referring readers to the 4 multi-panel complex figures with elaborate figure legends. While it is possible to figure out what the authors' studies and results are, it is quite a laborious process.

      We have revised the manuscript to be less compressed and easier to read, with the data now organized as eight figures and the results section split into four sub-sections.

      Minor comments:

      #1.7: Fig. 2D. The purified 5-phosphatase used in Fig. 2D is INPP5E but described in figure legend and materials and methods as OCRL. Which one is correct?

      Purified OCRL was indeed used in the supported lipid bilayer experiments. The figure (now Fig. 4A) and legend have been corrected – thank you for spotting the error.

      #1.8: Fig. 3B. Indicate which trace represents PIP5K1A, Lyn11 or Mss4.

      The data now appear in Fig. 7B, with the traces split into separate graphs for greater clarity (see response to #1.4).

      #1.9: Fig. 4C. X-axis label. Is "Time (min)" correct? Or should it be "Time (sec)".

      Thank you for spotting this typo. It should have indeed been seconds, and this is corrected in the new fig. 8C.

      Reviewer #1 (Significance (Required)): The finding that PIP4K itself is a low-affinity PI(4,5)P2 binding protein and sense increases of PM PI(4,5)P2 generated by PIP5K to control tonic PI(4,5)P2 levels by inhibiting PIP5K activity is a novel concept. However, inhibition of PIP5K by PIP4K and importance of the inhibitory effect of PIP4K in PI3K signaling pathway have previously been reported (ref 24). This reduces the novelty of the current work somewhat however, the authors do provide evidence for dual interactions of PIP4K (PIP2, PIP5K), which the previous report did not.

      We appreciate the reviewer’s insightful comments and overall appreciation of our work. We agree that previous studies did not detect the dual interaction of PIP4Ks with PIP5Ks and PI(4,5)P2; as we argue strongly in the general comments, we think this actually fits as a complete, enclosed homeostatic feedback loop – which is a significant and novel finding.

      Reviewer #2

      Summary: This paper proposes that the enzyme PIP4K2C is a negative regulator of the synthesis of PI(4,5)P2 and that it does so by dampening the activity of PIP5K which is the enzymatic activity responsible for producing the major pool of PI(4,5)P2 in cells.

      Reviewer #2 (Significance (Required)): Although the findings of the paper are presented as a major new advance, the observation that PIP4K might act as a negative regulator of PIP2 synthesis has been presented in two previous publications. The significance of this paper is that it also shows the same point in another model system.

      PIP4Ks Suppress Insulin Signaling through a Catalytic-Independent Mechanism

      Diana G Wang 1, Marcia N Paddock 2, Mark R Lundquist 3, Janet Y Sun 3, Oksana Mashadova 3, Solomon Amadiume 3, Timothy W Bumpus 4, Cindy Hodakoski 3, Benjamin D Hopkins 3, Matthew Fine 3, Amanda Hill 3, T Jonathan Yang 5, Jeremy M Baskin 4, Lukas E Dow 6, Lewis C Cantley 7

      PMID: 31091439; PMCID: PMC6619495; DOI: 10.1016/j.celrep.2019.04.070

      and

      Phosphatidylinositol 5 Phosphate 4-Kinase Regulates Plasma-Membrane PIP3 Turnover and Insulin Signaling.

      Sharma S, Mathre S, Ramya V, Shinde D, Raghu P. Cell Rep. 2019 May 14;27(7):1979-1990.e7. doi: 10.1016/j.celrep.2019.04.084. PMID: 31091438

      Both of these studies show that in cells lacking PIP4K, during signalling the levels of PIP2 rise much more than in wild type cells. Indeed the Cantley lab paper (Wang et al.) has shown that this is likely due to an increase in PIP5K activity, using an in vitro assay. They have further disrupted the interaction between PIP4K and PIP5K and demonstrated the importance of this interaction in the enhanced levels of PIP2.

      Respectfully, we disagree with this assessment, because we believe it doesn’t consider the novel, central findings we report: that PIP4Ks sense PI(4,5)P2 levels through direct interaction with the lipid, and that this is what facilitates PIP5K inhibition. These findings were not reported in the prior studies. Nonetheless, the studies are foundational for ours and were cited in our original manuscript (and are still, as refs 25 and 53).

      #2.1: Likewise, although the authors have claimed that there are no mechanisms reported to sense and downregulate PIP2 resynthesis, it is suggested that they read and consider the following recent paper, which studies PIP2 resynthesis during GPCR triggered PLC signalling.

      Kumari A, Ghosh A, Kolay S and Raghu P*. Septins tune lipid kinase activity and PI(4,5)P2 turnover during G-protein-coupled PLC signalling in vivo. Life Sci Alliance. 2022 Mar 11;5(6):e202101293. doi: 10.26508/lsa.202101293. Print 2022 Jun.

      We have now included a full discussion of this paper in the discussion starting on the last paragraph of p 9: “Since this paper was initially submitted for publication, another study has reported a similar homeostatic feedback loop in Drosophila photoreceptors, utilizing the fly homologue of septin 7 as the receptor and control center [38]. This conclusion is based on the observation that cells with reduced septin 7 levels have enhanced PIP5K activity in lysates, and exhibit more rapid PI(4,5)P2 resynthesis after PLC activation. However, changes in septin 7 membrane localization in response to acute alterations in PI(4,5)P2 levels, as well as direct interactions between PIP5K and septin 7, have yet to be demonstrated. Nevertheless, septin 7 has distinct properties as a potential homeostatic mediator; as a foundational member of the septin family, it is essential for generating all major types of septin filament [39]. Therefore, a null allele for this subunit is expected to reduce the prevalence of the septin cytoskeleton by half. Given that septin subunits are found in mammalian cells at high copy number, around 10^6 each [29], and the fact that septins bind PI4P and PI(4,5)P2 [40,41], it is likely that septin filaments sequester a significant fraction of the PM PI4P and PI(4,5)P2 through high-avidity interactions. In addition, membrane-bound septins appear to be effective diffusion barriers to PI(4,5)P2 and other lipids [42]. We therefore speculate that septins may play a unique role in systems such as the fly photoreceptor with extremely high levels of PLC-mediated PI(4,5)P2 turnover: The septin cytoskeleton can act as a significant buffer for PI4P and PI(4,5)P2 in such systems, as well as corralling pools of the lipids for use at the rhabdomeres where the high rate of turnover occurs. This is in contrast to the role played by the PIP4Ks, where PI(4,5)P2 levels are held in a narrow range under conditions of more limited turnover, as found in most cells.”

      #2.2: Likewise there are other earlier papers in the literature which have studied possible PIP2 binding proteins as sensors for this lipid.

      We are only aware of a single, specific example of a similar negative feedback, which is discussed in the 3rd paragraph of p 10: “Curiously, although phosphatidylinositol phosphate kinases are found throughout eukarya, PIP4Ks are limited to holozoa (animals and closely related unicellular organisms) [47]. Indeed, we found the PIP5K from the budding yeast, Saccharomyces cerevisiae, does not interact with human PIP4Ks (fig. 7) and cannot modulate PI(4,5)P2 levels in human cells without its catalytic activity (fig. 1). This begs the question: how do S. cerevisiae regulate their own PI(4,5)P2 levels? Intriguingly, they seem to have a paralogous homeostatic mechanism: the dual PH domain containing protein Opy1 serves as receptor and control center [48], in an analogous role to PIP4K. Since there is no mammalian homolog of Opy1, this homeostatic mechanism appears to have appeared at least twice through convergent evolution. Combined with hints of a role for septins in maintaining PI(4,5)P2 levels [38], the possibility arises that there may yet be more feedback controls of PI(4,5)P2 levels to be discovered.”

      Technical standards: The work is done to a high technical standard.

      #2.3: Does catalytically dead isoform of PIP4K2B and 2C also yield the same result as a catalytically dead version of PIP4K2A in Fig 1B?

      In a word: yes. We have added these experiments, which are now presented in Fig. 2A:

      The results are described in the results in the 2nd paragraph of p. 4: “To directly test for negative regulation of PIP5K activity by PIP4K in cells, we wanted to assay PI(4,5)P2 levels after acute membrane recruitment of normally cytosolic PIP4K paralogs. To this end, we triggered rapid PM recruitment of cytosolic, FKBP-tagged PIP4K by chemically induced dimerization (CID) with a membrane targeted FRB domain, using rapamycin [27]. As shown in fig. 2A, all three paralogs of PIP4K induce a steady decline in PM PI(4,5)P2 levels within minutes of PM recruitment. Catalytically inactive mutants of all three paralogs produce identical responses (fig. 2A).”

      #2.4: The labelling on y-axis for PI(4,5)P2 biosensor intensity ratio is PM/cell at some places, PM/Cyt or PM/Cyto in some places. It is recommended to make it uniform across all the panels.

      PM/Cyto was a typo, now corrected to PM/Cyt. PM/Cell and PM/Cyt are two subtly different metrics used to normalize PM fluorescence intensity across varying transient expression levels. This is clarified in the methods in the 3rd paragraph on p.22: “For confocal images, the ratio of fluorescence intensity between specific compartments was analyzed as described previously [59]. In brief, a custom macro was used to generate a compartment of interest specific binary mask through à trous wavelet decomposition [68]. This mask was applied to measure the fluorescence intensity within the given compartment while normalizing to the mean pixel intensity in the ROI. ROI corresponded to the whole cell (denoted PM/Cell ratio) or a region of cytosol (PM/Cyt), as indicated on the y axis of individual figures.”
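      To illustrate how the two normalizations differ, the following is a minimal sketch of the ratio calculation only. It assumes the compartment and ROI masks already exist; the wavelet-based mask generation from the methods is not reproduced, and the function names, toy image and hand-drawn masks are hypothetical, not taken from the authors' macro.

```python
def mean_intensity(image, mask):
    """Mean of the pixels in `image` where `mask` is True (both are lists of rows)."""
    values = [
        px
        for img_row, mask_row in zip(image, mask)
        for px, keep in zip(img_row, mask_row)
        if keep
    ]
    if not values:
        raise ValueError("mask selects no pixels")
    return sum(values) / len(values)


def compartment_ratio(image, compartment_mask, roi_mask):
    """Compartment intensity normalized to the mean pixel intensity of the ROI."""
    return mean_intensity(image, compartment_mask) / mean_intensity(image, roi_mask)


# Toy 3x3 "cell": a bright PM ring (4.0) around a dim cytosolic pixel (1.0).
image = [
    [4.0, 4.0, 4.0],
    [4.0, 1.0, 4.0],
    [4.0, 4.0, 4.0],
]
pm_mask = [
    [True, True, True],
    [True, False, True],
    [True, True, True],
]
cell_mask = [[True] * 3 for _ in range(3)]   # ROI = whole cell
cyt_mask = [
    [False, False, False],
    [False, True, False],
    [False, False, False],
]                                            # ROI = a region of cytosol

pm_cell = compartment_ratio(image, pm_mask, cell_mask)  # PM/Cell
pm_cyt = compartment_ratio(image, pm_mask, cyt_mask)    # PM/Cyt
```

      In this toy case the whole-cell ROI includes the PM pixels themselves, so PM/Cell is compressed towards 1 relative to PM/Cyt, which is one way the two metrics can differ for the same image.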

      #2.5: The claim that PI(4,5)P2 production is sufficient to recruit PIP4K2C to the PM can be ascertained further if one is able to do an experiment where PI(4,5)P2 is ectopically expressed in some compartment of the cell which is non-native to PI(4,5)P2 and as a consequence of this PIP4K2C is recruited to this non-native compartment.

      We have now removed the assertion that PI(4,5)P2 is sufficient to localize PIP4Ks to the membrane, since our conclusion is that the coincident presence of PI(4,5)P2 and PIP5Ks in the PM is what ultimately localizes the PIP4Ks. We did not detect recruitment of endogenous PIP4Ks to lysosomes when ectopic PI(4,5)P2 synthesis was induced, although fluorescence levels are so low as to be inconclusive, and therefore not appropriate for inclusion in the manuscript.

      #2.6: In the entire figure 2, to establish that PI(4,5)P2 is necessary and sufficient for PM localisation of PIP4K, PIP4K2C is used as the PIP4K isoform on the basis that it is highly abundant in HEK293 cells. But PIP4K2A is localised mainly at the plasma membrane and here we are discussing about PI(4,5)P2 regulation at the PM . Can experiments be done with isoforms 2A and 2B as well? Can acute depletion of PI(4,5)P2 lead to the membrane dissociation of the isoform 2A as well? This will help us in understanding if there is an isoform specific difference in sensing PI(4,5)P2 levels which will help us in targeting specific isoform as therapeutic targets.

      We have now generated endogenously tagged PIP4K2A and PIP4K2B; these cell lines are characterized in the revised fig. 3:

      With the dependence on PI(4,5)P2 for PM binding for all isoforms shown in fig. 4:

      32 cells that were imaged across three independent experiments. (E) Depletion of PI(4,5)P2 causes NG2-PIP4K2C to dissociate from the membrane. As in C, NG2-PIP4K2C (blue) cells were transfected with FKBP-tagged proteins, TubbyC (orange) and Lyn11-FRB, scale bar is 2.5 µm; cells were stimulated with 1 µM rapa, as indicated. TubbyC traces represent mean change in fluorescence intensity (Ft/Fpre) ± s.e. The NG2-PIP4K2C traces represent the mean change in puncta per µm2 ± s.e. of > 38 cells that were imaged across three independent experiments.

      And increased binding by elevated PI(4,5)P2 levels shown in fig. 5B:

      The results are described in the accompanying results text “PIP4K are low affinity sensors of PM PI(4,5)P2”, pp.4-7. In short, endogenous PIP4K isoforms behave similarly with respect to PI(4,5)P2-dependent PM recruitment.

      #2.7: In Figure 1A, it is shown that overexpression of a catalytically dead PIP5K 1A/1B/1C is still able to increase PI(4,5)P2 levels. In the figure 2E, expression of homodimeric mutant of PIP5K domain which is a way to increase catalytic activity of PIP5K, increases PI(4,5)P2 levels which is consistent with the inferences from Fig, 1 , but what is surprising is a catalytically dead variant not being able to do so. Why is there a discrepancy between Fig. 1A and Fig. 2E? If the homodimeric mutant is the reason, then it is not clear in the explanation.

      We have added the following clarification to the results on the second paragraph of p.6: “We next tested for rapid binding to acutely increasing PI(4,5)P2 levels in living cells, using CID of a homodimeric mutant PIP5K domain (PIP5K-HD), which can only dimerize with itself and not endogenous PIP5K paralogs [34]. This domain also lacks two basic residues that are crucial for membrane binding [35], and only elevates PM PI(4,5)P2 when it retains catalytic activity (fig. 5D), unlike the full-length protein (fig. 1A).” We currently do not fully understand why these well characterized residues of PIP5Ks are necessary for PM binding and inhibition by PIP4K. This is a focus of ongoing studies in the lab for the structural basis of PIP5K inhibition by PIP4K.

      #2.8: Show the loading control in Fig 2A western.

      We have added the loading control using alpha tubulin in the revised fig. 3B.


      #2.9: In the figure 2D, in the legend OCRL is written. So, the labelling in the panel should also be changed to OCRL from INPP5E. It is intermixed.

      Reviewer 1 also spotted this inconsistency (#1.7): Purified OCRL was indeed used in the supported lipid bilayer experiments. The figure (now Fig. 4A) and legend have been corrected – thank you for spotting the error.

      #2.10: In the figure 2E, can the labelling be changed from HD to something more self-explanatory for homodimeric mutant of PIP5K domain?

      We prefer to keep the “HD” notation in the revised figure 5D for brevity’s sake, but now define the abbreviation in the text in the second paragraph of p.6: “…a homodimeric mutant PIP5K domain (PIP5K-HD)…”.

      #2.11: In Fig. 2E, PIP5K expression is acute and in Fig. 2F Mss4 expression is chronic, both of which is able to recruit PIP4K2C to the plasma membrane. How can a likewise argument be drawn out of these two experiments when one is acute and the other one is a chronic expression? It is suggested to do an FRB-FKBP experiment for Mss4 as well.

      We agree with the reviewer that an FKBP-Mss4 would have been an excellent experiment. As can be seen from Fig. 1A, Mss4 is constitutively PM localized in mammalian cells. However, we were unable to identify a truncation of Mss4 that lost constitutive membrane binding whilst retaining catalytic activity. Therefore, we could only perform chronic overexpression as shown in fig. 5B. The lack of an acute demonstration is why we went on to develop the PIP5K-HD constructs, results of which are reported in fig. 5D.

      #2.12: In the text, Fig. 2G and 2H is written for PIP4K2C, but in the corresponding panels and legends, it is an assay for purified PIP4K2A on SLBs. Kindly resolve the discrepancy.

      We thank the reviewer for spotting this discrepancy. PIP4K2A is the protein that was used in the SLB experiments now reported in fig. 5A & C and the accompanying results on pp.5-6. This is now corrected in the manuscript.

      #2.13: Kindly explain a bit in detail why the baits were now targeted to ER-PM contact sites. It is not self-explanatory.

      We have now added a more detailed description to the third paragraph of p. 7: “We therefore sought to distinguish between a direct PIP5K-PIP4K binding interaction versus PI(4,5)P2-induced co-enrichment on the PM. To this end, we devised an experiment whereby a bait protein (either PIP5K or control proteins) could be acutely localized to subdomains of the PM, with the same PI(4,5)P2 concentration. This was achieved using CID of baits with an endoplasmic reticulum (ER) tethered protein, causing restricted localization of the bait protein to ER-PM contact sites – a subdomain of the PM (fig. 7B).”

      #2.14: The conclusions for Fig. 3 most likely hints towards the possibility of PIP4K and PIP5K interaction being independent of PI(4,5)P2 levels. Well, Fig. 3C and 3D does suggest a direct interaction, but can other protein-protein interaction assays be used to establish the direct interaction of PIP4K with PIP5K such as FRET or Yeast two hybrid as assays scoring for interaction?

      We respectfully diverge from the reviewer’s assessment of the data, presented in the revised fig. 7. Figs. 7A & B show PIP4K and PIP5K interacting in the context of a PI(4,5)P2 replete PM; fig. 7C shows this in the context of a PI(4,5)P2 replete SLB. Therefore, we make no assertion that the PIP4K/PIP5K interaction is independent of PI(4,5)P2 levels. We also contend that the latter experiment is a more direct demonstration than a Y2H assay, or even FRET (which can occur among non-interacting proteins localized to a membrane surface, see e.g. 10.1074/jbc.m007194200).

      #2.15: Conceptually a direct interaction can be explained to some extent from Fig. 3 but extending it to be an inhibitory interaction is not right without a direct experiment. Can an experiment be done with PI4P enriched SLB, wherein you put just PIP5K purified protein vs PIP5K+PIP4K combination and measure the % mol of PI(4,5)P2 produced using a probe. That will be suggestive of a negative interaction.

      This is a great experiment, the results of which are reported in fig. 2C, described in the third full paragraph of p. 4: “To more directly examine inhibition of PIP5K by PIP4K, we tested activity of purified PIP5K1A on PI4P-containing supported lipid bilayers (SLBs). Addition of PIP4K2A exhibited delayed inhibition of PIP5K1A activity (fig. 2C): Once PI(4,5)P2 reached approximately 28,000 lipids/µm2 (~2 mol %), PIP5K dependent lipid phosphorylation slowed down, which doubled the reaction completion time (fig. 2C, right). In contrast, we observed no PIP4K dependent inhibition of Mss4 (fig. 2C, inset). These data recapitulate the prior finding that PIP4K only inhibited purified PIP5K in the presence of bilayer-presented substrate [25]. We therefore hypothesized that inhibition of PIP5K by PIP4K requires recruitment of the latter enzyme to the PM by PI(4,5)P2 itself.”
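      As a rough check on the equivalence between 28,000 lipids/µm2 and ~2 mol % quoted above, the conversion can be sketched as follows. The area per lipid of 0.72 nm2 is an assumed, approximate value for a fluid DOPC-rich bilayer and is not taken from the manuscript; the function name is hypothetical.

```python
NM2_PER_UM2 = 1_000_000  # 1 µm2 = 10^6 nm2


def density_to_mol_percent(lipids_per_um2, area_per_lipid_nm2=0.72):
    """Convert a surface density of one lipid species into mol % of a leaflet.

    area_per_lipid_nm2 is an ASSUMED average footprint per lipid, not a
    value reported by the authors.
    """
    total_lipids_per_um2 = NM2_PER_UM2 / area_per_lipid_nm2
    return 100.0 * lipids_per_um2 / total_lipids_per_um2


mol_percent = density_to_mol_percent(28_000)
print(f"{mol_percent:.1f} mol %")
```

      Under this assumption the result is close to 2 mol %; a different assumed lipid footprint shifts the figure proportionally.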

      #2.15: In Figure 3B, the FRB tagged constructs are magenta coded and PIP4K2C is cyan. Kindly change the labelling of the FRB constructs on the y axis to magenta so that it goes with what is written in the legend. It will also be appreciated to show a colocalization quantification between the magenta (FRB constructs) and cyan (PIP4K2C) post rapamycin addition and not just the intensity for ER-PM recruited PIP4K2C.

      These modifications and some additional points have been added in response to reviewer 1’s #1.4 to the revised fig. 7B. Note, we quantified the co-localization with an ER-PM contact site marker, MAPPER. Co-localization with the FRB-tagged construct would be misleading, because this construct is localized across the membrane at the start of the experiment and would thus have a high degree of co-localization. As can be seen from the inset graphs in the new analysis, however, all FRB-tagged constructs co-localize with MAPPER after rapamycin addition, but only FRB-PIP5K1A causes endogenous PIP4K2C to increase co-localization with this compartment.

      # 2.16: Again, in the text, the description is written for PIP4K2C but in the result panel and legend (Fig. 3C and Fig. 3D), PIP4K2A is mentioned. Kindly resolve the discrepancy.

      We have corrected the results text on the last paragraph of p. 7: “Finally, we also demonstrate that PIP4K2A binding to PI(4,5)P2-containing supported lipid bilayers was greatly enhanced by addition of PIP5K to the membranes (fig. 7C), but not by Mss4 (fig. 7D).”

      # 2.17: In the Fig. 4B, it will be appreciated to show statistical significance in terms of R2 value for commenting on the linear response.

      “Linear response” was not the best description of what we were trying to articulate in the revised fig. 8B; we have now amended the results in the 2nd paragraph of p.8 to read: “Of these, Tubbyc showed the largest degree of change in PM localization across all changes in PI(4,5)P2 levels (fig. 8B).”

      #2.18: Discussion can be in general a bit more detailed which is suggestive of future experiments to do that can shed more light on the interaction such as which residues in PIP4K interacts with PIP5K to negatively regulate it.

      The revised manuscript contains a greatly expanded discussion, as described in detail in our responses to comments #1.2b, #2.1 and #2.2.

      #2.19: In the discussion, more light can be shed on the fact that Mss4 in spite of being a 5- kinase is not negatively regulated by PIP4K and the fact that PIP4K is present only in metazoans suggests that this fine tuning of PI(4,5)P2 levels is specific to metazoans. Another insight could be in the direction, that Fig 4. tells PI3K, but not calcium signaling is modulated by this fine tuning and interestingly class I PI3K is also an enzyme specific to metazoans. Hence, unlike yeast, metazoans rely on growth factor signalling processes, hence regulation of PI(4,5)P2 by PIP4K and hence Class I PI3K and PI(3,4,5)P3 could be a process relevant to metazoans.

      We have addressed the restriction of PIP4K to holozoa as described in our response to #2.2, wherein we describe a previously proposed paralogous mechanism in fungi. The reviewer’s point about the homeostatic process being related to class I PI3K signaling in growth control of multicellular organisms is interesting, but the presence of the PIP4Ks in some unicellular organisms complicates this view. We are of the view that a discussion of this important topic is too nuanced for inclusion in the current manuscript.

      Reviewer #3

      Summary: Using state of the art imaging techniques the authors try to address how cells sense PI(4,5)P2 levels and regulate PIP5Ks to maintain an optimal level since any dysregulation of PI(4,5)P2 levels can have significant effects on the functioning of the cell and lead to numerous disease states, such as cancers.

      The key conclusions are convincing and importantly validate previous disputed findings made by Wang et al. (Cell Reports 2019) using different and more rigorous methods, however unfortunately due to the Wang et al publication the overall novelty of this study is lacking. A suggestion to the authors is to state/explain with text more clearly how their findings are more precise and higher quality than the previous report and why their findings are necessary and significant to drive the field forward.

      We have revised the manuscript to more clearly state our novel finding that PIP4Ks are PI(4,5)P2 sensing proteins that inhibit PIP5Ks on the membrane in a PI(4,5)P2-dependent manner, which was not previously described in the literature.

      Further, experiments in the study were performed in vitro in cultured cells using overexpression methods making the physiological significance a bit unclear and the enthusiasm of the main discovery dampened. With that being said these findings are worthy of publication in order to advance the field and understanding of how the PIP kinase families are regulated and maintain PIP2 homeostasis which is important for life.

      We feel that this assessment is slightly unfair, since most of the key experiments have been validated using purified proteins in supported lipid bilayers, and endogenous proteins were studied using genomic tagging approaches, rather than over-expression.

      The authors should perform the following minor and easily addressable experiments. Many of these experimental issues can easily go in the supplemental materials.

      #3.1: Include western blots for the constructs to compare expression levels.

      We agree that it is important to take into account differences in expression levels for the experiments presented in fig. 1. However, since these are single cell assays, Western blotting of whole populations of transiently transfected cells is not the best control. Instead, having acquired the images under consistent excitation and detection parameters, we compared the fluorescence intensity, expressed as relative expression in Fig. 1A and C, which is discussed in the results text in the first two paragraphs of the results on p. 3: “Notably, expression of the catalytically inactive mutants was usually somewhat less strong compared to the wild-type enzymes, yet effects on PI(4,5)P2 levels were similar (fig. 1A).” and “Again, differences in expression level between isoforms do not explain differences in activity, since all achieved comparable expression levels as assessed by fluorescence intensity (fig. 1C).”

      #3.2: For Figure 1A, what is the source of the observed increase in PI(4,5)P2, how do the authors take into account the role of endogenous PIP5Ks?

      We added a new experiment in the revised Fig. 1B showing that the increased PI(4,5)P2 occurs at the expense of PM PI4P.

      This is described in the first paragraph of the results on p.3: “PI(4,5)P2 levels are expected to increase at the expense of PM PI4P levels when over-expressing any of the three isoforms of human PIP5K (A-C) or the single paralog from the budding yeast, Saccharomyces cerevisiae (Mss4). Indeed, this was precisely what we observed (fig. 1A and B, statistics reported in tables 1 and 2).”

      The role of endogenous PIP5Ks is clarified in the sentence that spans pp. 3-4: “We therefore reasoned that saturation of endogenous, inhibitory PIP4K molecules by PIP5K over-expression, regardless of catalytic activity of the PIP5K, would free endogenous, active PIP5K enzyme from negative regulation (fig. 1D).”

      #3.3: For Figure 1B, could the authors comment on the intracellular distribution of PI(4,5)P2. How are they able to reliably distinguish their signal between plasma membrane and intracellular localizations and conclude that PIP2 on the plasma membrane is decreased?

      As detailed in the now expanded methods section covering image analysis on p. 22, our analysis specifically quantifies fluorescence in the plasma membrane.

      #3.4: Please include statistics for all image- based quantitation analysis.

      We have added details of statistical analysis and tabulated the results, as detailed in our response to #1.1.

      #3.5: Could the authors comment on the ability of PIP4K to have affinity for its own product? How does PIP4K sense membrane PI(4,5)P2 since these kinases are mostly cytoplasmic?

      We have added a comment to the 1st paragraph of the Discussion on p.9: “PIP4K’s low affinity and highly co-operative binding to PI(4,5)P2 makes it an excellent sensor for tonic PI(4,5)P2 levels. It is poised to sense PI(4,5)P2 generated in excess of the needs of the lipids’ legion effector proteins, ensuring these needs are met but not exceeded. Nevertheless, the relatively low PIP4K copy number of around 2.5 × 10^5 per cell [29] is a small fraction of the total PI(4,5)P2 pool, estimated to be ~10^7 [33], ensuring little impact on the capacity of the lipid to interact with its effectors.”
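
      As a rough check of the copy-number argument quoted above, the sketch below computes the largest fraction of the PI(4,5)P2 pool that PIP4K could occupy. It assumes, purely for illustration, that each PIP4K molecule binds at most one lipid. The variable names are hypothetical; the two numbers are the estimates cited in the quoted passage.

```python
# Estimates quoted in the response above (molecules per cell).
PIP4K_COPIES = 2.5e5
PIP2_MOLECULES = 1e7

# Assumption for illustration only: at most one PI(4,5)P2 bound per PIP4K.
max_fraction_bound = PIP4K_COPIES / PIP2_MOLECULES
print(f"{max_fraction_bound:.1%}")
```

      Under that one-to-one assumption the upper bound is 2.5% of the pool, which is in line with the "small fraction" described in the quoted text.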

      #3.6: Do the authors have any other experiments to substantiate the binding of the two PIP kinases, similar to the Wang et al findings? Is the N-term motif required? Is it possible to disrupt that interaction and show the phenotype?

      We do not have additional, conclusive experiments to share at this time, and believe that characterization of the inhibitory interaction is beyond the scope of the current manuscript. We do however add a comment on this topic to the 1st paragraph of p. 11: “Regardless, a full understanding of paralog selectivity will need to be driven by a detailed structural analysis of the interaction between PIP4Ks and PIP5Ks - which is not immediately apparent from their known crystal structures, especially since PIP4Ks and PIP5Ks employ separate and distinct dimerization interfaces [50].”

      #3.7: With the overexpression studies in Figure 1, do the authors see any changes in signaling when they just overexpress PIP5Ks versus in combination with PIP4Ks to show that the changes in plasma membrane PI(4,5)P2 can affect downstream signaling?

      We agree with the reviewer that attenuating PIP5K-mediated PI(4,5)P2 increases with PIP4K should affect downstream signaling. However, we believe that such experiments would not add insight beyond the experiments already included (fig. 8), in which signaling output in response to graded changes in PI(4,5)P2 levels was investigated.

      Reviewer #3 (Significance (Required)): Overall, as mentioned above because of the 2019 Wang et al report the novelty is diminished, however using completely alternate methods and sophisticated microscopy this body of work indeed advances the field and provides further believable evidence of the PIP kinase families communicating in higher organisms which is required to maintain PIP2 levels shedding light on many of the findings that were previously unexplained surrounding the PIP4K studies. Further, the use of biosensors to describe these findings are new and will enable others in the field to begin to use such tools to investigate potential crosstalk between other lipid kinases.

      As we argued in the general comments, we do feel that this evaluation misses the key finding that PIP4Ks are PI(4,5)P2 sensors, and that this sensing controls PIP5K activity as part of a feedback loop.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2022-01723

      Corresponding author(s): Daphne Avgousti, Srinivas Ramachandran

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary This study by Lewis et al. examines the role of heterochromatin in the nuclear egress of herpesvirus capsids. They show that heterochromatin markers macroH2A1 and H3K27me3 are enriched at specific genome regions during the infection. They also show that when macroH2A1 is removed or H3K27me3 is depleted (both of which reduce the amount of heterochromatin at the nuclear periphery), the capsids are not able to egress as effectively. This is interesting since it could be argued that heterochromatin acts as a hindrance to the transport of viral capsids to the nuclear envelope and that the loss of it would allow capsids to reach the nuclear envelope more easily. However, this paper seems to show that heterochromatin formation, on the contrary, is necessary for efficient egress. Overall, the study seems comprehensive. The methodology is solid, and the experiments are very well controlled. However, some issues need to be addressed before publication.

      Major comments

      1) In line 49, the authors state, "Like most DNA viruses, herpes simplex virus (HSV-1) takes advantage of host chromatin factors both by incorporating histones onto its genome to promote gene expression and by reorganizing host chromatin during infection". In addition, HSV1 expression can be hindered by the host's interferon response via histone modifications. Ref. Johnson KE, Bottero V, Flaherty S, Dutta S, Singh VV, Chandran B. IFI16 restricts HSV-1 replication by accumulating on the HSV-1 genome, repressing HSV-1 gene expression, and directly or indirectly modulating histone modifications. PLoS Pathog. 2014 Nov 6;10(11):e1004503. doi: 10.1371/journal.ppat.1004503. Erratum in: PLoS Pathog. 2018 Jun 6;14(6):e1007113. PMID: 25375629; PMCID: PMC4223080.

      We agree with the reviewer and have amended our text and added the reference. See line 57.

      2) Reference 5 is misquoted in the sentence, "This redistribution of host chromatin results in a global increase in heterochromatin". In that reference, the amount of heterochromatin is not analyzed in any way. However, that particular paper shows that the transport of capsid through chromatin is the rate-limiting step in nuclear egress, which is important considering this study. Further, the article by Aho et al. shows that when the infection proceeds capsids can more easily traverse from the replication compartment into the chromatin, which means that infection can modify chromatin for easier capsid transport. For that reason, the article is an important reference, but it needs to be cited correctly.

      We agree with the reviewer that this citation was misquoted and have corrected the citation. See lines 55 and 62-64.

      3) The term heterochromatin channel at lines 54, 102, and 303 is misleading since the channels seen in the original referred paper are less dense chromatin areas. Also, this term is not used in the original paper where the phenomenon was first described. These less dense interchromatin channels were found by soft-X-ray tomography imaging and analyses, not by staining.

      We thank the reviewer for pointing out this discrepancy and have amended the text to accurately describe the methods used in the appropriate citations. See lines 65, 115, and 383.

      4) It is difficult to visualize chromatin using TEM microscopy. The values of peripheral chromatin thickness given in Figure 1e (5-15 nm) do not seem realistic given that the thickness of just one strand of histone-wrapped DNA is 11 nm. Why are the two values for WT different? If you can get so different values for WT, it is a bit worrisome (switching the WT results between the top and bottom parts of Fig. 1e would for example result in very different conclusions on the effect of macroH2A1 KO for the thickness of the chromatin layer).

      We agree with the reviewer that it is difficult to visualize chromatin by TEM. It is also important to note that comparisons can only be made between samples treated on the same day in the same way. Taking this into account, we chose to compare macroH2A1 KO cell stains to controls done at the same time, and likewise to compare H3K27me3-depleted conditions to DMSO-treated cells prepared for EM at the same time. Visually, it is apparent that the staining in the macroH2A1 KO control cells is somewhat different from that of the H3K27me3-depleted control cells, which represents the inherent variability of this method. It is also true that one nucleosome is around 11nm; however, since the cells contain highly compacted chromatin with many other proteins present, this measurement is not appropriate to apply. Adding up the millions of nucleosomes that make up the chromosomes at 11nm each would result in a space much larger than the nucleus, therefore we focus on comparing between control and experimental conditions restricted to this assay as a relative qualitative comparison. Nevertheless, we agree with the reviewer that the notion of changing chromatin is difficult to quantify by EM and so we have taken an additional approach to test our hypothesis and confirm EM interpretations (discussed lines 391-393). We have utilized live capsid trafficking to visualize capsid movement in nuclei in the presence or absence of macroH2A1. The results from these new experiments are presented in new Figure 5 and EV5 and support our model.

      5) In lines 134-137 it says that "The enrichment of macroH2A1 and H3K27me3 was observed as large domains that were gained upon viral infection (Fig 2a), suggesting that the host landscape is altered upon infection. These gains were reflected in an increase in total protein levels measured by western blot (Fig 2b)." However, the protein levels of H3K27me3 do not seem to increase during infection. In other presented data as well (Figs. 2a, 2b, 2c, S2a) it is difficult to justify the statement that H3K27me3 is enriched in infection. When this is the case, the conclusion that the amount of heterochromatin increases in the infection (the quotation above and the one in line 315) is not supported. The statement in line 315 is also not specific since it is unclear what "newly formed heterochromatin increases" means.

      We agree with the reviewer that our original description was misleading. We now have edited the text to clarify that there is redistribution of macroH2A1 and H3K27me3. In the revised manuscript, we have also included mass spectrometry data mined from Kulej et al. that show peptide counts that reflect increases in the heterochromatin markers described (see new Figure EV1a). Despite this quantitative measure, upon rigorous replicates of our western blots as requested by Reviewer 2, we concluded that the increases originally described are somewhat inconsistent by western blot. This discrepancy between mass spectrometry data and western blot is likely due to the non-linear nature of antibody binding and developing of western blots by the ECL enzymatic reaction. Therefore, our revised manuscript focuses on this redistribution as a reaction to infection and stress responses instead of a global increase as the original manuscript stated. See lines 174, 182, 196, 397 and Fig EV4d in main text and discussion sections.

      6) Quantitation of viral capsid location in H3K27me3-depleted cells seems somewhat arbitrary. It would have been more robust to calculate the number of capsids per unit length of the nuclear envelope with and without depletion.

      We agree with the reviewer that the quantification of capsids in the H3K27me3-depleted conditions was arbitrary. In our revised manuscript, we have now repeated this quantification to accurately measure the phenotype observed, that is the chains of capsids lined up at the inner nuclear membrane. To do this, we used two measures: 1) the distance from the INM as less than 200nm and 2) the distance from other capsids as less than 300nm. Taking into account these two measures, we quantified the frequency with which multiple capsids lined up at the INM in WT and H3K27me3-depleted conditions. This is represented in the new Figure 5d. In the WT setting, we observe most often 1 single capsid at the INM, with a small fraction of 2 capsids. However, in the H3K27me3-depleted condition, we observe much greater numbers of capsids at the INM more frequently, as many as 16 at a time, leading to an average of 2-3 capsids at any single location. The source data for this figure are also provided. See lines 589 and Fig5d.
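
      The two-cutoff counting described in this response can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the input layout (x, y, distance to INM, all in nm), the function name, and the single-linkage grouping of neighbouring capsids are hypothetical choices; only the 200nm and 300nm cutoffs come from the response.

```python
from math import hypot

INM_CUTOFF_NM = 200.0       # capsid must lie closer than this to the INM
NEIGHBOR_CUTOFF_NM = 300.0  # capsids closer than this join the same group


def capsid_group_sizes(capsids):
    """Return the sizes of capsid groups at the INM, largest first.

    `capsids` is a list of (x_nm, y_nm, dist_to_inm_nm) tuples; this layout
    is hypothetical and not taken from the manuscript's source data.
    """
    # Keep only capsids that satisfy the INM distance cutoff.
    near = [(x, y) for x, y, d in capsids if d < INM_CUTOFF_NM]
    parent = list(range(len(near)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of retained capsids closer than the neighbour cutoff.
    for i in range(len(near)):
        for j in range(i + 1, len(near)):
            dx = near[i][0] - near[j][0]
            dy = near[i][1] - near[j][1]
            if hypot(dx, dy) < NEIGHBOR_CUTOFF_NM:
                parent[find(i)] = find(j)

    # Count members per group.
    sizes = {}
    for i in range(len(near)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)
```

      Tallying these group sizes per condition would give a frequency distribution of the kind the response describes for Figure 5d. Re-running the function with different cutoff constants is one simple way to check how sensitive the counts are to the chosen thresholds.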

      7) In lines 300-302 it says "Elegant electron microscopy work showed that HSV-1 infection induces host chromatin redistribution to the nuclear periphery2,8." However, the redistribution data in reference 8 is based on soft x-ray tomography and not on electron microscopy.

      We have amended the text to accurately describe the methods used in the citations. See line 384.

      8) The authors bundle together the effects of macroH2A1 removal and H3K27me3 depletion by saying that they both decrease the amount of heterochromatin at the nuclear periphery and therefore hinder capsid egress. This seems overly simplistic and macroH2A1 and H3K27me3 seem to act very differently, which is manifested in the drastic difference in nuclear capsid localization between the two cases. This difference needs to be discussed more.

      We agree with the reviewer that there is a nuanced difference in the effect on nuclear egress in the absence of the two heterochromatin marks. Specifically, that macroH2A1 loss results in greater numbers of capsids dispersed throughout the nucleus, whereas depletion of H3K27me3 results in capsids reaching the INM and not escaping. To examine these differences further, we have carried out live imaging of capsid trafficking in macroH2A1 KO cells compared to control and found that capsids move much more slowly, consistent with our model, see new Figure 5h-I and EV5h-i. Conversely, H3K27me3 depletion does not prevent the capsids from reaching the INM, raising the question of whether they are successfully able to dock at the nuclear egress complex (NEC). To investigate this further, we obtained an antibody against the NEC component UL34 and probed during infection in our heterochromatin disrupted conditions. We found that UL34 levels are unchanged upon loss of macroH2A1 or depletion of H3K27me3, suggesting the levels of UL34 do not account for the decrease in titers. These data are now presented in new Figure EV3g-h. Furthermore, we have amended our model to include the two different scenarios upon loss of different types of heterochromatin (see new Figure 6) and discussion of these differences. See line 428.

      Minor comments
      Line 45: Nuclear replicating viruses -> Nuclear-replicating viruses
      Line 56: is -> are
      Line 64: 25kDa -> 25 kDa
      Line 159: macroH2A1 cells -> macroH2A1 KO cells
      Line 289: The term gDNA is rarely used for viral DNA. Replace gDNA with viral DNA.
      Line 405: 8hpi -> 8 hpi
      Line 449: mm2 -> μm2
      "Scale bar as indicated" words can be removed in the figure legends or at least should not be repeated many times within one figure legend.

      We have amended the text to address these comments. See lines 52, 68, 76, 179, 334, 513, and 585.

      Reviewer #1 (Significance (Required)):

      These findings would appeal to a broad audience in the field of virology, specifically researchers in the fields of virus-cell and virus-nucleus interactions. This manuscript analyses herpesvirus-induced structural changes in the chromatin structure and organization in the nucleus that are also likely to affect the intranuclear transport of viral capsids.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript "HSV-1 exploits heterochromatin for egress" describes the effects of heterochromatin at the nuclear periphery, macroH2A1 or H3K27me3 on HSV-1 replication and egress. Knocking out macroH2A1 or depleting H3K27me3 with high concentrations of tazemetostat depleted heterochromatin at the nuclear periphery, may not have affected HSV-1 protein expression and modestly inhibited the production of cell-free infectivity and HSV-1 genomes. macroH2A1 deposition was affected by infection, creating new heterochromatin domains which did not correlate directly with the levels of expression of the genes in them. The authors conclude that heterochromatin at the nuclear periphery dependent on macroH2A1 and H3K27me3 are critical for nuclear egress of HSV-1 capsids.

      The experiments leading to the conclusion that HSV-1 capsids egress the nucleus through channels in the peripheral chromatin confirm previously published results (https://doi.org/10.1038/srep28844). The previously published EM micrographs show a much larger number of nuclear capsids, more consistent with the images in the classical literature, even in conditions when nuclear egress was not inhibited. Figures 1 and 4 show scarce nuclear capsids, even under the conditions when nuclear egress should be inhibited according to the model and analyses. The large enrichment in nuclear capsids in KO cells predicted by the model is not reflected in figure 4a, which shows only a modest increase in nuclear capsid density (the total number of nuclear capsids would be more informative). The number or density of nuclear capsids is not shown in H3K27 "depleted" cells. The robustness of the analyses of the number of capsids at the membrane in H3K27 "depleted" cells is unclear. For example, the analyses could be repeated with different cut offs, such as 2 or 4. If they are robust, then the conclusions will not change when the cutoff value is changed.

      We appreciate the reviewer’s observation that the number of capsids we show differs from that in the publication by Myllys et al. (Sci Rep 2016 PMID 27349677). It is important to note there are several differences between our study and that of Myllys et al. that explain the difference. First, as reviewer 1 pointed out, the Myllys et al. study used three-dimensional soft X-ray tomography combined with cryogenic fluorescence and electron microscopy to observe capsids in 3D rendered nuclei. Since our method uses only single ultrathin 50nm slices of cells, we cannot visualize the total number of capsids per nucleus, rather only per slice, which is why we have averaged slices of many nuclei to generate a statistical comparison between macroH2A1 KO or H3K27me3-depleted and control cells treated at the same time (see response to reviewer 1). Furthermore, these other methods are specialized techniques for 3D imaging that are beyond the scope of our study. Second, the Myllys et al. paper used B cells, which are much smaller than HFFs, lending themselves to better tomography studies but not commonly used to study HSV-1 biology. Third, the Myllys et al. paper also used a different MOI and time point than we have. Taken together, these differences account for the disparity in visualizing capsids, which is why we quantified capsid number across many images.

      We agree with the reviewer that our quantification in the H3K27me3-depleted cells compared to control was somewhat arbitrary. As stated in the response to Reviewer 1 above, in our revised manuscript we have now repeated this quantification to accurately reflect the phenotype observed, that is the chains of capsids lined up at the inner nuclear membrane. To do this, we used two measures: 1) the distance from the INM as less than 200nm and 2) the distance from other capsids as less than 300nm. Taking into account these two measures, we quantified the frequency with which multiple capsids lined up at the INM in WT and H3K27me3-depleted conditions. This is represented in the new Figure 5d. In the WT setting, we observe most often 1 single capsid at the INM, with a small fraction of 2 capsids. However, in the H3K27me3-depleted condition, we observe much greater numbers of capsids at the INM more frequently, as many as 16 at a time, leading to an average of 2-3 capsids at any single location. The source data for this figure are also provided. See lines 589 and Fig 5d.

      Furthermore, we have now also carried out live-imaging analysis of single capsids during infection which show the appropriate number of capsids expected when the full nucleus is visible. These results are presented in the new Figure 5 and EV5.

      The quantitation of the western blots presents no evidence of reproducibility and/or variability. The number of biologically independent experiments analyzed must be stated in each figure and the standard deviation must be presented. As presented, the results do not support the conclusions reached. The quality of western blots should also be improved. It is unclear why figure 2b shows viral gene expression in wild-type cells only, and not in KO or H3K27me3 depleted cells, which are only shown in the supplementary information. These blots presented in Figure S5a and S5b are difficult to evaluate as the signal is rather weak and the controls appear to indicate different loading levels. These blots do not appear to be consistent with the conclusions reached. Some blots (VP16, ICP0 in HFF) appear to indicate a delay in protein expression whereas others (VP16, ICP0 in RPE) appear to indicate earlier expression of higher levels. The claimed "depletion" of H3K27me3 is not clear in figure S5d, in which the levels appear to be highly variable in all cases, without a consistent pattern, with no evidence of reproducibility and/or variability, and using a mostly cytoplasmic protein as loading control. All western blots should be repeated to a publication level quality, the number of independent experiments must be clearly stated in each figure, and the reproducibility and/or variability must be indicated by the standard deviation.

      As reviewer 1 also pointed out, we appreciate that there is some variability with respect to the stated ‘increase’ in these heterochromatin marks during infection. As stated in response to reviewer 1, in our revised manuscript we have included a deeper analysis of these marks from global mass spectrometry that indicates an increase in total levels. Please see response to reviewer 1.

      In the revised manuscript, we have now included mass spectrometry data mined from Kulej et al. that show peptide counts that reflect increases in the heterochromatin markers described (see new Figure EV1a). Despite this quantitative measure, upon rigorous replicates of our western blots as requested by Reviewer 2, we concluded that the increases originally described are somewhat inconsistent by western blot. This discrepancy between mass spectrometry data and western blot is likely due to the non-linear nature of antibody binding and developing of western blots by the ECL enzymatic reaction. Nevertheless, our genome-wide chromatin profiling showed consistent, reproducible, and statistically significant redistribution of macroH2A1 and H3K27me3 upon HSV-1 infection. Therefore, our revised manuscript now focuses on this redistribution as a reaction to infection and stress responses instead of a global increase as the original manuscript stated. See lines 174, 182, 196, 397 and Fig EV4b-c.

      With respect to viral protein levels, although there is slight variation in the levels of VP16 or ICP0 in RPEs compared to HFFs, we do not feel that this difference is biologically significant as several other measures of viral infection progression are unchanged (viral RNA, viral genome accumulation within infected cells). Furthermore, the significant difference in titers we observe is not explained by slight differences in ICP0 or VP16. Nevertheless, to document this variability in western blot and assuage any concern about an impact on infection progression, we have repeated each western blot presented in the paper three separate times and used these blots to quantify each relevant protein. Graphs of western blot quantitation can be found in each figure accompanying a western blot as follows:

      Western blots:

      Figures 3b-c, 4a-b, EV1b, EV5a

      Quantitation of western blots:

      Figures 3d, 4c, EV1c, EV5b-f

      An enhanced analyses of the RNA-seq data, analyzing all individual genes rather than pooling them together, would provide better support to these conclusions. Then, the western blots are useful to show that the changes in mRNA result in changes in the levels of selected proteins.

      We appreciate the reviewer’s interest in the RNA-seq data; however, we feel that the reviewer has not understood the analysis we presented in the initial submission. To clarify, we calculated fold changes for individual genes and did not pool RNA-seq data anywhere in the manuscript. We show boxplots of log2 fold changes of individual genes. Boxplots enable summarization of the salient features of a distribution while still representing individual gene analysis. Here, the distribution being plotted is the log2 fold change of individual genes that intersect with macroH2A1 domains that change due to infection. As such, clusters 1-3 of macroH2A1 domains feature a loss in macroH2A1 due to infection and the boxplots show that the majority of genes are upregulated. To highlight this point further, in our revised manuscript we have included volcano plots of genes intersecting with each cluster, also showing the split between the number of genes significantly upregulated and downregulated in each cluster at each time point (see new Figure EV3c). As expected from the boxplots, clusters 1-3 feature a much higher fraction of genes that are significantly upregulated, whereas cluster 5 features a higher fraction of genes downregulated with a concomitant increase in macroH2A1 due to infection. Taken together with the gene ontology analysis (new Figure Sd), these results support our model in which macroH2A1 is deposited in active regions to block transcription and promote heterochromatin formation. To further support these conclusions, we have also carried out analysis of 4sU-RNA data generated upon salt stress or heat shock and found that the regions defined by gain of macroH2A1 (i.e. clusters 5 and 6) also exhibit significant decreases in new transcription at just 1-2 hours after treatment. These data, which are presented in new Figure EV3b-c, strongly support our model in which macroH2A1 is deposited in genes downregulated upon stress response to generate new heterochromatin.

      Figure S1 raises some questions about the specificity of the macroH2A1 antibody used for CUT&Tag. As expected CUT&Tagging the cellular genome in the KO cells with the specific antibody results in lower signal than with the IgG control antibody. In contrast, viral DNA is CUT&Tagged as efficiently in the KO as in the WT cells, and in both cases significantly above the IgG controls. The simplest interpretation of these results is that the antibody cross-reacts with a protein that binds to HSV-1 genomes. The manuscript must experimentally address this possibility.

      We agree with the reviewer that there is a possibility that antibodies cross react. However, we are confident that this is not the case in this scenario for the following reasons:

      1 – We have carried out immunofluorescence analysis of macroH2A1 or H3K27me3 during HSV-1 infection and observe no overlap with ICP8 staining. We have included these images together with a histogram documenting the lack of overlap in the new Figure EV2f-g.

      2 – CUT&Tag relies on the Tn5 transposase to insert barcodes into accessible regions of the genome. An inherent limitation of this method during viral infection is that the replicating viral genome is very dynamic and accessible, leading to easier and less specific insertion by the transposase. This is evidenced by the pattern of signal across the viral genome that is completely overlapping in the macroH2A1, H3K27me3 and IgG conditions. Snapshots of the full viral genome are now included in the new Figure EV2c-d.

      Furthermore, using CUT&Tag with macroH2A1 antibody, we expect the transposition rate to be identical between WT and macroH2A1 KO conditions for the E. coli and viral genomes. This is because we assume that the transposition in these two genomes is non-specific since there is no macroH2A1 present. Then, we expect the spike-in normalized CUT&Tag enrichment on the viral genome to be the same between WT and macroH2A1 KO conditions. Since IgG should not be affected by macroH2A1 KO, we expect the IgG enrichment to be the same between WT and macroH2A1 KO conditions. Thus, non-specific background would result in higher enrichment in an apparent signal on the viral genome in the macroH2A1 KO condition.

      Combined with this expectation for background transposition and the following: 1) the distribution of the CUT&Tag signal across the viral genome is virtually identical between IgG, macroH2A1, and H3K27me3 CUT&Tag signal in WT and macroH2A1 KO cells (see new Figure EV2c-d), 2) that there is no colocalization between macroH2A1 or H3K27me3 with viral genomes by immunofluorescence (see new Figure EV2f-g), and 3) the whole genome correlation of the signals across CUT&Tag samples on the viral genome, but not the host, are virtually identical as presented in a heat map (see new Figure EV1g vs EV2e), we conclude that the viral CUT&Tag signal is noise. Therefore, any analysis of the signal on the viral genomes would not be biologically meaningful.

      Also, Figure S1 shows that the viral genome is CUT&Tag'ed with H3K27me3 antibody as efficiently in macroH2A1 WT and KO cells, and in both cases above the background signal from IgG control antibody. The authors conclude that the signal with the specific antibody "mirrors" that of the control antibody, but "mirroring" is not defined and the actual data show that there is a large increase in signal with the specific antibody. Not surprisingly, the background signal also increases, as the number of genomes increases while infection progresses. The authors conclude that "these results indicated that there was a significant background signal from the viral genome that could not be accounted for", but no evidence supporting this conclusion is presented. The data show clear signal above the background from the viral genome and that this signal is not affected by the presence or absence of macroH2A1. This section of the manuscript has to be thoroughly re-analyzed as there is clear H3K27 signal.

      *We agree with the reviewer that as presented in the current manuscript it seems as though there is a real H3K27me3 signal. However, as stated in the above comment, the pattern of this signal matches that of all other conditions, including IgG, suggesting it is not a real signal, cross-reacted or otherwise, but rather an artifact of the methodology. See new Figure EV2. *

      The concentration of tazemetostat used is high. Normally, concentrations of around 1 µM are used in cells, and 10 µM is often cytotoxic (for example, https://doi.org/10.1038/s41419-020-03266-3; https://doi.org/10.1158/1535-7163.MCT-16-0840). The effects on H3K27me3 presented in figure S1b appear to be normalized to mock-infected treated cells. If so, they do not allow the effectiveness of the treatment to be evaluated. Cell viability after the four-day treatment must be evaluated, the claimed "depletion" of H3K27me3 must be clearly demonstrated (the blots in figure S5 are not sufficient as presented), and levels of different histone methylations must be tested to support the claimed specificity of tazemetostat for H3K27me3 at the high concentrations used.

      *While we agree with the reviewer that the cytotoxicity of any inhibitor is an important aspect to take into account, in this instance the reviewer is incorrect. The reviewer has cited papers that highlight the potential use of tazemetostat as a cancer-cell-specific treatment for colorectal and B-cell cancers. In both of these cases, the primary conclusion is that tazemetostat’s cytotoxic property is largely correlated with mutation in EZH2. In fact, cells with WT EZH2 had a more “cytostatic” response, which shows that tazemetostat is not toxic in the presence of WT EZH2 (Brach et al. Mol Cancer Ther. 2017, PMID 28835384), as is the case in our system. Furthermore, the Tan et al. study uses a non-transformed human fibroblast line (CCD-18co) and an embryonic colon epithelial line (FHC) as “healthy controls” for their work in colorectal cancer cell lines in Figure 1D. These two cell lines, which are comparable to the WT HFF cells we used, show no reduction in viability at a log-fold greater concentration than the 10 µM used in our paper.*

      *Nevertheless, we agree with the reviewer that cytotoxicity should be formally ruled out. In our original experiment, we recorded cell counts at harvest for the mock, 4, 8, and 12 hpi samples and found no difference in the number of cells over the course of infection (see new Figure EV3e). We also used trypan blue staining as a measure of cell viability upon tazemetostat treatment and found no toxicity. These results are presented in new Figure EV3f.*

      Furthermore, we agree with the reviewer that total H3 levels by western blot should be included in any comparison of H3 modification. While these were included in some figures, they were unintentionally omitted in others. In our revised manuscript we have now included these blots together with quantification of triplicate biological samples of H3K27me3 levels normalized to total H3. See new Figures 3, 4, EV1, and EV5.

      Minor comments. Reference No.27 is misquoted in lines 250-251, which state that it shows that "HSV-1 titers, but not viral replication, where reduced upon EZH2 inhibition." The reference actually shows inhibition of HSV-1 infectivity, DNA levels and mRNA for ICP4, ICP22 and ICP27. This reference uses much shorter treatments (12 h and only after infection). It also shows that inhibition of EZH2/1 up regulates expression of antiviral genes.

      *We appreciate that the reviewer has pointed out a discrepancy between our results using an EZH2 inhibitor (tazemetostat) and those from reference 27 (Arbuckle et al., mBio, 2017 PMID 28811345) that requires clarification. The reviewer states that the treatments were 12 hours after infection; however, this is incorrect. In the Arbuckle et al. study, the authors used multiple different inhibitors at high doses for short treatments before infection and noted that this caused an upregulation of antiviral genes that blocked infection progression of multiple viruses including HCMV, Ad5 and ZIKA. Importantly, these genes include multiple immune signaling and interferon-stimulated genes. In our study, we specifically used a much lower dose of EZH2 inhibitor, with respect to the IC50 value, and waited 3 days to ensure a steady state. In our system, any initial burst of immune response from the inhibitor would likely have subsided by the time we do our infection. Furthermore, Supplemental Figure 1 of the Arbuckle et al. study states that EZH1/2 inhibitors do not affect nuclear accumulation of viral genomes and suppress HSV-1 IE expression in an MOI-independent manner. These results in fact support our conclusions that it is not any antiviral effect of inhibition of EZH2 that causes the decrease in titers that we observe.*

      To clarify, the IC50 values of the inhibitors used in the Arbuckle et al. study are 10 nmol/L (GSK126) and 4 nmol/L (GSK343). The IC50 is the amount of drug needed to inhibit a biological process by 50% and is commonly used in pharmacology to compare drug potency. In the Arbuckle et al. study, GSK126 was used at a concentration range of 15-30 µM, that is, 1500-3000x the IC50 level (after converting nmol/L to µM), and GSK343 was used at a concentration range of 20-35 µM, that is, 5000-8750x the IC50 level, to see changes in viral mRNA levels. The IC50 value for tazemetostat is 11 nmol/L, which means that one would need to use a much higher concentration of tazemetostat, at least 28 µM (roughly 2500x the IC50 value), to achieve biological changes comparable to those of the inhibitors used in the Arbuckle et al. study. Thus, we are confident that the 10 µM concentration used in our study is an appropriate and non-toxic amount that would not impact antiviral responses at the dose and times that we used. As shown above and reported in multiple studies (for example: Knutson et al. Molecular Cancer Therapeutics 2014 PMID 24563539, Tan et al. Cell Death and Disease 2020 PMID 33311453 cited above, and Zhang et al. Neoplasia 2021 PMID 34246076, among others), the concentration of tazemetostat that we used is not toxic to the cells. Importantly, it was also reported that a global decrease in H3K27me3 by EZH2 inhibition using a 10 µM concentration of tazemetostat (there referred to by the identifier EPZ6438) did not impact HSV-1 RNA transcript accumulation measured by bulk sequencing (Gao et al. Antiviral Res 2020 PMID 32014498), consistent with our findings.
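The fold-over-IC50 comparison above is simple unit arithmetic, and a short calculation makes it easy to check. The sketch below is illustrative only: the IC50 values and working concentrations are the ones quoted in this response, while the helper name `fold_over_ic50` and the dictionary layout are hypothetical.

```python
# Illustrative arithmetic only. IC50 values (nM) and working concentrations (µM)
# are those quoted in the response; the helper name is a hypothetical choice.

def fold_over_ic50(conc_um: float, ic50_nm: float) -> float:
    """Return how many times a working concentration (µM) exceeds an IC50 (nM)."""
    return conc_um * 1000.0 / ic50_nm  # 1 µM = 1000 nM


inhibitors = {
    "GSK126": {"ic50_nm": 10.0, "doses_um": (15.0, 30.0)},
    "GSK343": {"ic50_nm": 4.0, "doses_um": (20.0, 35.0)},
    # 10 µM is the dose used in this study; 28 µM is the comparison dose named above.
    "tazemetostat": {"ic50_nm": 11.0, "doses_um": (10.0, 28.0)},
}

folds = {
    name: tuple(fold_over_ic50(dose, info["ic50_nm"]) for dose in info["doses_um"])
    for name, info in inhibitors.items()
}

for name, (low, high) in folds.items():
    print(f"{name}: {low:.0f}x to {high:.0f}x the IC50")
```

Under this arithmetic, 10 µM tazemetostat corresponds to roughly 900x its IC50, below the 1500x to 8750x range computed for GSK126 and GSK343, and 28 µM corresponds to roughly 2545x, which the text rounds to 2500x.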

      In our revised manuscript, we have now included a discussion of these important points. See lines 409-428.

      HFF are primary human cells but they are fibroblasts whereas the primary target of HSV-1 replication is epithelial cells. The wording used "they represent a common site of infection in humans" must be edited

      We agree with the reviewer and have updated the text. See line 109.

      Disruption of macroH2A (1 and 2) results in general defects in nuclear architecture, not just peripheral chromatin (https://doi.org/10.1242/jcs.199216; see also figures 1c and 5a, presenting invaginated and lobulated nuclei). The manuscript would benefit from including a broader discussion of the effects of macroH2A defects on the general nuclear architecture.

      We agree with the reviewer and our revised manuscript now includes a more in-depth discussion of the impact of macroH2A and other heterochromatin marks on nuclear structure. See lines 373-374 and 394.

      The title should be edited, as "egress" in virology is commonly used to refer to the egress of virions from the cell, not to the nuclear egress of capsids. Adding the words nuclear and capsid should be sufficient to address this issue.

      *We agree with the reviewer and will update the title to read “HSV-1 exploits host heterochromatin for nuclear egress”. Given that we are measuring multiple aspects of infection, we feel that adding the word ‘capsid’ is not necessary. *

      It is unclear why preferential changes in expression of housekeeping genes would indicate "stress responses to infection". The rationale for this conclusion must be fully articulated and supported.

      We agree with the reviewer that it may not be immediately clear as to why changes in house-keeping gene expression represent a stress response. In a recent study that we cite in our manuscript, Hennig et al. (PLOS Path 2018 PMID 29579120) demonstrate that changes in chromatin accessibility and gene transcription during HSV-1 infection resemble those that occur upon heat shock or salt stress. These results strongly support the model that global transcription changes caused upon stress (heat, salt, infection etc.) result in dramatic alterations to chromatin structure. In support of this notion, in our revised manuscript we now include analysis of these datasets based on our macroH2A1-defined clusters. Importantly, we found that the regions defined by gain of macroH2A1 (i.e. clusters 5 and 6) also exhibit significant decreases in new transcription at just 1-2 hours of exposure to salt and heat stress. These data, which are presented in new Figure EV3b-c, strongly support our model in which macroH2A1 is deposited on active genes to generate heterochromatin as a response to the stress of infection. We also discuss these results further in the revised manuscript, see lines 210-220, 233-236, and 424-426.

      Statistical methods must be fully described in materials and methods and the number of biologically independent experiments must be stated in each figure.

      *We agree with the reviewer and have included these details in each figure legend. *

      Reviewer #2 (Significance (Required)):

      The major strengths of the manuscript lie in the comprehensive analyses of the effects of knocking out histone macroH2A on nuclear architecture and chromatin organization. These analyses indicate that peripheral heterochromatin is defective in the KO. Another strength lies in the analyses of the new heterochromatin domains in HSV-1-infected cells. The relationship between the lack of correlation between the changes in gene expression and global heterochromatin domains defined by macroH2A1 and the main conclusion is less clear.

      The major weakness is that the data presented do not strongly support the conclusions. Additional experiments are required to support the main conclusion that the effects in peripheral heterochromatin result in a biologically significant effect on capsid egress. The authors should also consider that the additional experimentation may not support the conclusion that macroH2A or H3K27me3 play critical roles in the nuclear egress of capsids.

      *To support our conclusions, we have carried out an entirely different set of experiments to track capsid movement. Bosse et al. PNAS 2015 PMID 26438852 and Aho et al. PLOS Path 2021 PMID 34910768 use live-imaging and single-particle tracking to characterize capsid motion relative to host chromatin. These approaches allowed the authors to discover that infection-induced chromatin modifications promote capsid translocation to the INM. They showed that 1) HSV-1 infection alters host heterochromatin such that open space is induced at heterochromatin boundaries, termed "corrals", in which viral capsids diffuse and 2) the movement of viral capsids through the host heterochromatin is the rate limiting step in HSV-1 nuclear egress. *

      To test our hypothesis that macroH2A1-dependent heterochromatin specifically is required, we collaborated with Dr. Jens Bosse to carry out these same experiments in our macroH2A1 KO and paired control cells. We imaged RFP-VP26 using spinning-disk confocal live imaging to track individual capsid movement within the nucleus. We found that capsids in cells lacking macroH2A1 traveled much shorter distances on average. This is represented graphically by the mean-square displacement (MSD) of capsid movement, which plateaus at ~0.4 µm² in macroH2A1 KO cells vs 0.6 µm² in WT cells; the plateau represents the size of the “corral”, or space through which capsids diffuse. The average corral size in macroH2A1 KO cells is ~300 nm less than the average corral size in WT cells (two-thirds the size). These results are consistent with the finding that macroH2A1 limits chromatin plasticity both in vitro (Muthurajan et al. J Biol Chem 2011 PMID 21532035) and in cells (Kozlowski et al. EMBO Rep 2018 PMID 30177554). These data strongly support our hypothesis that macroH2A1-dependent heterochromatin is critical for the translocation of HSV-1 capsids through the host chromatin to reach the INM. Furthermore, these data support the model in which macroH2A1 allows for the increase of open space induced during infection. Loss of this open space restricts the movement of capsids in the nucleus, as quantified by our live-imaging experiments. These data are now included in the new Figures 5 and EV5 and described in lines 348-372 and 1011-1037.
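For readers unfamiliar with MSD analysis, the following is a minimal sketch of how a time-averaged MSD curve is computed from a single track and why confinement produces a plateau. It is not the tracking pipeline used for the experiments described here: the track is a simulated random walk clamped inside a square box (a toy stand-in for a corral), and the step size, box half-widths, and function names are all hypothetical values chosen for illustration.

```python
import random


def mean_square_displacement(track, max_lag):
    """Time-averaged MSD of one 2D track for lags 1..max_lag (same units squared)."""
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


def confined_walk(n_steps, step_um, half_width_um, seed):
    """Gaussian random walk whose coordinates are clamped to a square box (toy corral)."""
    rng = random.Random(seed)
    x = y = 0.0
    track = [(x, y)]
    for _ in range(n_steps):
        x = min(max(x + rng.gauss(0.0, step_um), -half_width_um), half_width_um)
        y = min(max(y + rng.gauss(0.0, step_um), -half_width_um), half_width_um)
        track.append((x, y))
    return track


# Hypothetical box sizes: a smaller and a larger confinement region.
narrow = mean_square_displacement(confined_walk(5000, 0.1, 0.3, seed=1), 200)
wide = mean_square_displacement(confined_walk(5000, 0.1, 0.5, seed=1), 200)

print(f"plateau (narrow box): {narrow[-1]:.3f} µm²")
print(f"plateau (wide box):   {wide[-1]:.3f} µm²")
```

In this toy model the MSD rises at short lags and then levels off, and the smaller box gives the lower plateau. That is the qualitative behavior the comparison between KO and WT cells relies on; the simulated numbers are not meant to reproduce the measured values.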

      NOTE: These experiments were done in a separate lab using the same cells and MOI we used for our TEM studies. It is important to note that because this was done by live imaging where the full nucleus and cell are visible, the appropriate number of capsids is apparent.

      Another major weakness is that the results of CUT&Tag of the viral genome are dismissed without proper justification. The authors conclude that the results invalidate the assays, but the results are consistent with cross-reactivity of the macroH2A1 antibody with another protein that interacts with the viral genomes and with H3K27me3 being associated with the viral genomes irrespectively of macroH2A1.

      *We agree with the reviewer that as presented the viral genome reads were dismissed without thorough justification. As stated above, we are confident that the patterns we detected do not represent a biologically relevant signal but rather an artifact of the experimental set up. Furthermore, it is well known in the field that normalizing replicating viral genomes during lytic infection in any kind of chromatin profiling technique is fraught with inconsistencies as each cell may have a different copy number of viral genomes at any given time point. Therefore, we feel strongly that any analysis of the viral genome chromatin profile during a lytic replication at this point in time would require single cell sequencing which is beyond the scope of this study. We appreciate that this was not clearly presented in the original manuscript and in our revised submission we have included a full supplemental figure documenting the negative data that support our conclusions (see new Figure EV2). *

      If the authors had additional data supporting the claim that these results do not reflect cross-reactivity or association with the viral genomes, these data must be presented. Without that additional data, the conclusions are not supported and these discussions must be removed from the manuscript. The authors may still opt to not analyze any association with the viral genomes, but they should not dismiss them as artifactual without actual evidence to support this claim. Previously published literature is also misquoted.

      This study makes an incremental contribution to the previously published evidence showing that HSV-1 capsids egress the nucleus through channels in between the peripheral chromatin. It shows that disruption of the heterochromatin at the nuclear periphery, and the nuclear architecture in general, may have a modest effect on capsid egress. This information may be of interest mostly to a specialized audience focused on the egress of nuclear capsids.

      While we agree with the reviewer on many points as stated above, we respectfully disagree that our study is merely an incremental contribution of interest only to a specialized audience focused on nuclear egress. As reviewer 2 states earlier, the strength of our study lies in the “comprehensive analyses of the effects of knocking histone macroH2A in the nuclear architecture and chromatin organization”, which would be of interest to a broad chromatin audience as well as virologists. Together with the new data presented here and a revised manuscript, we feel that our study would be of interest to a broad audience in the chromatin and virology fields as reviewers 1 and 3 also pointed out. Chromatin is generally analyzed in the context of how it might affect gene expression and the impact of chromatin on biological processes such as viral infections, and its structural role in the nucleus is not commonly considered. Here, we demonstrate an important example of the glaring effects of chromatin structure on the biological nuclear process of infection.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Lewis et al. reveal an unexpected role for heterochromatin formation in remodeling the nucleus to facilitate egress of the nuclear-replicating virus HSV1. By performing TEM in HSV1-infected primary human fibroblasts, the authors show that capsids accumulate at the inner nuclear membrane in regions of less densely stained heterochromatin, in agreement with studies in established cell lines. The authors go on to reveal that heterochromatin in the nuclear periphery of HSV1-infected primary fibroblasts is dependent on the histone variant macroH2A1 and is enriched with H3K27me3. CUT&Tag was used to profile macroH2A1 over time during lytic HSV1 infection and showed that both macroH2A1 and H3K27me3 were enriched over newly formed heterochromatic regions 10s-100s of kb in length in active compartments. Remarkably, loss of macroH2A1 or H3K27me3 reduced released, cell-free infectious virus progeny and increased intranuclear capsid accumulation without detectably impacting the proportion of mature genome-containing capsids, or virus genome or protein accumulation. Their finding that newly remodeled heterochromatin forms in HSV-infected cells and is a critical determinant for the association of capsids with the inner nuclear membrane is consistent with a critical role in egress.

      I have only relatively minor editorial suggestions listed below to improve the manuscript:

      Line 92: This subtitle should be revised to more precisely state the findings shown in the Fig 1 data. While the first part of the statement "HSV1 capsids associate with regions of less dense chromatin" is consistent with what is shown, the final phrase "...to escape the nucleus" is an interpretation of the data inferred from the static image.

      We agree with the reviewer and have amended our text to more accurately describe the figure. See lines 138-139.

      Line 96: I am not sure the statement that fibroblasts represent a "common" site of infection is supported by ref 15. Fibroblasts do, as indicated in ref 15, express the appropriate receptor(s) for virus entry and in culture support robust virus productive growth. However, in human tissue, infection of dermal fibroblasts appears rare, suggesting it may not be a "common" site of infection (PMCID: PMC8865408). Maybe simply revise the wording to indicate that fibroblasts represent "a site of infection" or "can be infected in tissue"?

      We agree with the reviewer, as was also pointed out by reviewer 2, and have amended the text. See line 109.

      Line 126-127: As written it states that "....regions of the host genome that increase during infection", implying these genome regions are amplified (increase). I think the authors mean that infection increases binding of mH2A1 and H3K27me3 to broad regions of the host genome. Please clarify.

      We agree with the reviewer that this was written ambiguously. As was pointed out by reviewers 1 and 2, the increase in these marks depends on the type of measurement. Therefore, we have modified the text in the revised manuscript to focus instead on the redistribution of these marks during infection. See lines 138-139.

      Fig S1a-d: please indicate that 4, 8, 12 denote hpi (correct?), and state in the legend that M indicates Mock.

      This is correct and we have updated this in the figure legend. See lines 625-627.

      Line 197: "active compartments". Do the authors mean transcriptionally active compartments? Please clarify

      This is correct, and we have clarified this in the text. See line 248.

      Line 232: please replace "productive" with "infectious"

      We agree with the reviewer and have amended our text. See line 295.

      Line 233 - The authors conclude mH2A1 is important for egress, ruling out assembly before even bringing it up. As I read on, it is clear the authors addressed this important issue later on in the manuscript. That said, it was a bit jarring to conclude egress is important without addressing the assembly possibility at this juncture in the manuscript. One way to remedy this would be to move the Fig S6 assembly/capsid type data (lines 286-297, Fig S6) and surrounding text earlier to support the conclusion that mH2A1 did not detectably influence assembly, but is important for egress.

      *We agree with the reviewer that the order of presentation makes it difficult to follow. Our revised manuscript now includes these important data within the same figure. See new Figure 5. *

      Line 244: "progeny production" - it would be helpful to specify "cell free or released infectious virus progeny"

      Line 248: change "produced" to "released"

      Line 273 replace "productive" with "infectious virus progeny released from infected cells"

      Fig S5c: Was the plaque assay performed on cell free supernatants? This should be indicated.

      We agree with the reviewer and have made all these changes in the text. See lines 285-287.

      Reviewer #3 (Significance (Required)):

      The experiments are well executed, the data are solid with appropriate statistical analysis and their analysis sufficiently rigorous, and the manuscript is clearly written. Moreover, the finding that HSV manipulates host heterochromatin marks to facilitate nuclear egress is significant and exciting. The work reveals an unexpected role for newly assembled heterochromatin in egress of nuclear replicating viruses like HSV1.

    1. Author Response:

      We thank the editors and reviewers for their assessment of our manuscript, and their agreement that we present compelling evidence for post-transcriptional regulation of AURKA through the 3’UTR.

      In response to Reviewer 1, we acknowledge that much of our study is performed exclusively in U2OS cells, and that study of alternative polyadenylation in additional cell lines would serve to further generalize our findings. However, as U2OS cells are a well-known model cell line for cell cycle studies, we believe our demonstration of cell cycle regulation of AURKA through its 3’UTR offers a depth of understanding that is perhaps of greater interest than confirming the existence of alternative AURKA 3’UTRs in additional cell lines using our methods. We note that the recent rapid growth in RNA-seq data resources allows easy confirmation of the broad existence of alternative polyadenylation events on a genome-wide scale. For example, AURKA-specific data extracted from a recent benchmark study of Nanopore long-read RNA sequencing (Chen et al., 2021) clearly show the existence of two distinct AURKA 3’UTRs differentially expressed between a number of different cancer cell lines. In addition, a recent study investigating the landscape of APA at single-cell resolution detected AURKA APA isoforms in HeLa and MDA-MB-468 cell lines (Wang et al., 2022). That study further identifies AURKA among genes showing a negative correlation between the generalized distal polyA site usage index (gDPAU) and expression levels, meaning a preference for the proximal polyA site when expression levels increase, and includes AURKA in the gene cluster showing a slight increase in usage of the distal polyA site from G1 to M phase (Wang et al., 2022). Both studies support the evidence presented in our manuscript.

      We agree with Reviewer 2 that better information on translation rates would improve our understanding of the impact of translation regulation on AURKA levels. Some insight on the translation rate of AURKA in the cell cycle can be derived from inspection of the ribosome profiling dataset published by Tanenbaum et al., 2015. From their analysis, translation efficiency of AURKA mRNA in G2 is 1.59 times that in G1 and in G1 it is 0.69 times that in M phase, whilst in G2 it is 1.10 times higher than in M. Such data reveal a reversible increase in translation of AURKA mRNA, alongside other mitotic regulators, in preparation for M phase (Tanenbaum et al., 2015). These results are in accordance with our findings that translation rates contribute modestly to cell cycle changes in AURKA levels in normal cells, and we concur with Reviewer 3’s comment that the contribution of increased translation rate to AURKA levels at mitosis is less than the change in mRNA levels at this point in the cell cycle.
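The three translation-efficiency ratios quoted above should be mutually consistent, since (G2/G1) × (G1/M) = G2/M. The short check below uses only the ratios stated in this response; the variable names are hypothetical and the snippet is a sanity check, not a re-analysis of the ribosome profiling dataset.

```python
# Ratios as quoted in the response (derived there from Tanenbaum et al., 2015).
g2_over_g1 = 1.59
g1_over_m = 0.69
g2_over_m_reported = 1.10

# Chain the first two ratios to recover the third: (G2/G1) * (G1/M) = G2/M.
g2_over_m_implied = g2_over_g1 * g1_over_m

# Express each phase relative to M phase (M = 1.0) for easier comparison.
relative_to_m = {"G1": g1_over_m, "G2": g2_over_m_implied, "M": 1.0}

print(f"implied G2/M: {g2_over_m_implied:.2f} (reported: {g2_over_m_reported:.2f})")
print(relative_to_m)
```

The implied G2/M ratio agrees with the reported value to two decimal places, so the three figures are internally consistent.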

      We think the significance of the regulatory mechanism we describe lies rather in the large effect it has on AURKA levels in interphase (when AURKA expression is normally repressed at both mRNA and translation rate). We hypothesise that it is interphase regulation that may be relevant to roles of AURKA in cancer (and to the association of APA with cancer) (Bertolin and Tramier, 2020; Naso et al., 2021). It is indeed the case that (i) AURKA regulation by miRNA, (ii) cooperation between APA and translation and (iii) cell-cycle dependent control of AURKA at the translation level, are already known. We believe the novelty of our study lies in drawing together these elements to provide new insight into AURKA regulation, using tools that allow similar investigation of other APA events, and contributing new ideas for future therapeutic interventions for disease proteins regulated via APA.

    1. These findings suggest that ToM-like ability (thus far considered to be uniquely human) may have spontaneously emerged as a byproduct of language models’ improving language skills.

      How can we be sure that ToM is uniquely human?

      What kinds of tests have been administered to chimps, dolphins, etc.? We shouldn't conclude that they're incapable of ToM just because they can't tell us what they think some other being is thinking (a language barrier).

      Moreover, ToM ability probably breaks down for humans if they have to infer what a member of another species is thinking (e.g., try to get a human to tell you what a chimp, dolphin, or bat is thinking).

    1. He proposed the Memex in As We May Think.

      Could the Memex be considered the precursor of Google? Although it was never actually built, it starts from the basic idea of linking many texts at the same time.

    1. Can we devise solutions that aren’t reactive and ad hoc, and aren’t bogged down by accusations of partisan bias? One idea is to treat fake news as a distribution problem, treating it more like spam. Spam is something the platforms already understand and deal with.

      I think this is an extremely interesting aspect of this article. Another way of understanding fake news is to realize that it is a form of spam. Spam emails, calls, and texts are almost always labeled as spam and wind up in our spam folders. If social media companies created better filter systems for weeding out fake news and flagging it as spam-like, the circulation of these fake stories might decrease.

    1. In universal design, the goal is to make environments and buildings have options so that there is a way for everyone to use it [22]. For example, a building with stairs might also have ramps and elevators, so people with different mobility needs (e.g., people with wheelchairs, baby strollers, or luggage) can access each area. In the elevators the buttons might be at a height that both short and tall people can reach. The elevator buttons might have labels both drawn (for people who can see them) and in braille (for people who cannot), and the ground floor button may be marked with a star, so that even those who cannot read can at least choose the ground floor.

      I think this approach is the most widely applicable solution for the disability community. It contrasts with strategies such as expensive assistive devices or attempts to make disabled people "normal", which try to change the group itself. If the changes are instead made on the designer's side, this strategy can protect the disability community to the broadest extent, because we need to respect people as they are, not force them to change in order to fit in.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2022-01784

      Corresponding author(s): Felipe Court

      1. General Statements [optional]

      We submit a revision plan for our manuscript “Senescent Schwann cells induced by aging and chronic denervation impair axonal regeneration after peripheral nerve injury” by Fuentes-Flores et al. from the groups of Felipe Court, Judith Campisi, Jose Gomez, and Ahmet Hoke.

      One of the greatest challenges in the field of peripheral nerve regeneration is the decrease in the nerve regenerative capacity in aged patients or after delayed repair, a condition also known as chronic denervation. For the last two decades, several research groups have focused on understanding this phenomenon, but the main drivers of unsuccessful regeneration and poor functional recovery have been elusive, remaining an important clinical problem.

      In the work described in this manuscript, we found an unexpected property of Schwann cells in denervated nerves. Aged and chronically denervated Schwann cells are not just passive participants in the impaired regeneration process; they actively inhibit the regeneration of peripheral axons. Using a combination of morphological, behavioral and molecular techniques in a collaborative multi-lab approach, we demonstrate for the first time that senescent Schwann cells accumulate in aged or chronically denervated peripheral nerves, modifying the nerve environment and increasing proinflammatory and regeneration-inhibitory factors. Elimination of senescent Schwann cells, either by systemic intervention with senolytics or by genetically targeting p16-positive senescent cells, greatly improves axonal regeneration in both chronic denervation and aging conditions. Importantly, the enhanced axonal regeneration observed after senescent cell elimination is accompanied by improved functional recovery after chronic denervation. Chronic denervation and aging are the main clinical problems associated with peripheral nerve injuries. Our approach, using FDA-approved drugs currently in clinical trials for their application as senotherapeutics, effectively broadens the spectrum of their clinical use and effectiveness.

      We foresee this work will be of interest to a wide audience, including experts in nerve regeneration, senescent cells, aging and those studying the effect of chronic insults in regenerative medicine.

      We have now received the comments from two reviewers and we are prepared to address the issues raised experimentally. We thank them for their criticism and suggestions, as well as their very enthusiastic comments. We are extremely pleased that both reviewers recognized the important implications of this work. From reviewer 1:

      “The findings reported in this manuscript are very interesting and will move the field of nerve repair forward. This paper will be of interest for basic science audience in the fields of aging and neurobiology and has also potential interest to the broader clinical and translational fields. Indeed, this paper provides data that Schwann cells entering a senescent stage not only fail to support axon regeneration in aged animals, but actively inhibit axon regeneration…. Furthermore, the use of an FDA approved drug, currently in clinical trials for its application as senotherapeutics, to increase axon regeneration in aged and chronic denervation conditions will provide new avenues for clinical applications”.

      Which is backed up by reviewer 2:

      “Overall, this is an interesting study that undertake fundamental question in the field of nerve physiopathology and also could open a good opportunity in developing therapeutic strategies for translational research”.

      We understand the reviewers have raised issues associated with the manuscript format, and we are prepared to edit the manuscript thoroughly as suggested. In addition, after discussing the experimental issues raised by the reviewers, we are prepared to perform all the experiments and controls suggested (some of them are currently underway), including new animal experiments and in vitro work. This information is detailed in the point-by-point revision plan below.

      Thank you in advance for your consideration; we look forward to hearing from you in due course. Please do not hesitate to contact me if you want to discuss anything related to the manuscript and the revision plan.

      2. Description of the planned revisions

      Point-by-point responses and revision plan in blue

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary This manuscript by Fuentes-Flores et al reports that elimination of senescent Schwann cells by systemic senolytic drug treatment or genetic targeting improves nerve regeneration and functional recovery in aging and chronic denervation. This improved regeneration is associated with an upregulation of c-Jun expression. Mechanistically the authors provide data to show that senescent Schwann cells secrete factors that are inhibitory to axon regeneration. These findings are very interesting and move the field forward beyond the notion that Schwann cells fail to support axon regeneration in aged animals and identify potential targets to enhance nerve repair. The use of a senolytic drug to increase regeneration in aged and chronic denervation conditions provides new avenues for clinical interventions. However, some of the claims are overstated since this is not the first characterization of senescent Schwann cells and the manipulations used in the study are not entirely specific for Schwann cells. The manuscript is also poorly written and difficult to follow, given the complex set of surgeries and terminology, and lack of explanation of the rationale for the surgery model used. Figures are poorly labeled and difficult to follow without figure legends, and figure legends do not match the figures.

      We thank the reviewer for the positive comments. We also acknowledge the problems detected in the manuscript format, including the lack of a clear explanation of the complex procedures used. We are prepared to work carefully on the format, including clear explanations of the procedures and new schemes to complement the text. As detailed below, we are also prepared to perform all the experimental work proposed by the reviewer, which will strengthen the conclusions of this manuscript.

      Below are suggestions for improvements:

      Major comments:

      • In the axon regeneration assays, how is the reconnection site defined in longitudinal images stained for SCG10? Of particular concern is that Figure 1B "adult chronic dmg" nerve section image appears to be identical to the image in Figure 5A "Vehicle adult (47dpi)". However, the reconnection site is located at different sites along the nerve. Also, the scale bar appears identical but the legend states different sizes. In Figure 1 chronic damage is 42dpi, and figure 5 is 47dpi, yet with what appears as the same image.

      Revisions incorporated in the transferred manuscript, see section 3, below.

      • The authors need to provide a rationale for the choice of this complex injury model and what are the advantages over other models. Please clearly describe the timepoints for each experiment and why the time points were chosen for analysis. Provide the scheme of injury in the main Figures to ease comprehension. A scheme is provided in what appears to be Figure S2, but the legend of Figure S2 does not match. Please compare same time points between aged and adults. Days post injury is sometimes referred to 7 or 42, and it is difficult to follow if it is days post initial transection of the tibial nerve or days post reconnection of transected tibial to peroneal. Revise all Figure legends and supplementary Figure legends to match figures.

      We thank the reviewer for this comment. In a revised manuscript we will provide a clear explanation for the injury model used, as well as references, including one from the group of Tessa Gordon that first described this model in rats (PMID: 30215557), and one from the groups of Rhona Mirsky and Kristjan Jessen that applied the model to the mouse (PMID: 33475496). Briefly, this model has two advantages. First, it allows chronic denervation to be maintained for months, after which the distal denervated stump can be connected to a proximal one without the need for a nerve bridge. Second, neurons in groups with different denervation times (1 week versus 6 weeks) are all damaged at the same time, eliminating variability associated with chronic axonal damage. We will include this information in the results section of a revised manuscript along with the above references.

      We will include schemes of the injuries performed in each experiment in each figure, together with a timeline. This is an excellent suggestion for making the different procedures clear. We will also check all figures and legends, including correcting the problem detected by the reviewer (the legends of Figures 2 and 5 were swapped). We understand the problem with referring to days post-injury, so we will introduce a new way of referring to the initial transection and to the experiment, which includes reconnection. Adding a scheme to each figure will also help the reader follow the different timelines used in different experiments, including those using the senolytics.

      As detailed above, we will perform a very careful revision of the text and legends for consistency.

      • The authors' main conclusion is that Senescent Schwann cells inhibit axon regeneration. The authors need to tune down this statement and acknowledge that their manipulations are not entirely Schwann cells specific. While the data nicely shows a contribution of senescent Schwann cells, it does not sufficiently acknowledge the possibility that other senescent cells in the nerve contribute to this effect. First, the authors refer in the discussion that 60% of the senescent cells are SOX10 negative, and thus represent other cells beyond Schwann cells. This quantification needs to be shown in Figure 1. Second, the genetic and pharmacology manipulation eliminate all senescent cells, including Schwann cells. Third, while the culture experiment may be Schwann cells specific, the authors need to provide detailed information on how they purify these cells, how they induce repair Schwann cells (rSC) as claimed in Figure 3, and demonstrate whether these are pure Schwann cells. This is important because other cells in the nerve contribute to nerve repair, including mesenchymal cells. Finally, the claim that using Mpz-cre will lead to c-jun overexpression only in SC also needs to be demonstrated, since Mpz is also expressed in satellite glial cells in the DRG.

      We thank the reviewer for these comments and suggestions. We will tone down the statement that senescent Schwann cells are the only cell candidates for modulating regeneration. We discussed this in the original manuscript, but we agree that we need to revise this statement in light of the new data detailed below.

      We will include the data requested by the reviewer (60% SOX-10 negative senescent cells) in a new graph in Figure 1. Also, we are currently performing new experiments and quantifications using specific markers for macrophages, epithelial cells, and fibroblasts, to determine the identity of the 40% SOX-10 negative senescent cells in aging and chronic denervation.

      Regarding the in vitro experiments, we will provide detailed methods for Schwann cell purification and induction of the rSC phenotype. As for the purity of these cultures, in past experiments we have obtained Schwann cell purities of 95-98%; we will repeat these experiments and quantifications for this manuscript and include the data in the methods section of a revised text.

      Regarding c-jun overexpression in the Mpz-cre mice, we agree with the reviewer that there is probably overexpression in satellite glial cells in the DRG. Satellite glial cells (SGCs) are a subset of cells in the Schwann cell (SC) lineage that express several early myelination markers, such as Mbp, Mag, and Plp, and the transcription factor Sox10. SGCs express early SC markers, such as CDH19, and are transcriptionally and morphologically similar to SCs, even in the absence of axonal contact. The possibility that SGCs contribute to the enhanced regeneration seen in mice with c-jun overexpression was partly addressed by Wagstaff et al. (PMID: 33475496), who showed that in this mouse strain axonal regeneration increased equally in sensory neurons, which are in contact with SGCs, and in motor neurons, which are not associated with SGCs. This observation suggests that the effect is associated with c-jun-overexpressing Schwann cells in the distal stump. In addition, in our work, the changes in senescent cells observed in the c-jun-overexpressing mouse were found in the distal nerve stump, which was mechanically separated from the proximal region. We agree it is important to include this discussion, and we will do so in the revised manuscript.

      • In Figure 5, c-jun is shown after denervation (42 dpi). The results describe 28 days of denervation, 5 days of GCV and 7 days post reconnection, which makes 40 days. If that is not the case, results need to better explain timeline of this procedure. Also, what is the basal c-jun expression in p16-3MR mice? In addition to the number of c-jun positive cells shown in Figure 5G, the authors need to quantify the percent of c-jun puncta that co-localize with Sox10. The size of the c-jun puncta appears different in size in vehicle and GCV, is that an expected phenotype?

      As expressed above, we will include schemes and timelines for all the surgical experiments in a revised manuscript, including detailed information for the experiments using the p16-3MR mice.

      Regarding c-jun expression in the p16-3MR mice, we are currently performing the suggested control experiment, which is important for drawing conclusions from this work. We will use immunofluorescence, and will also include western blots as an additional analytical method in this and other experiments. All this information will be included in a revised manuscript.

      The apparent difference in the size of c-jun puncta is intriguing. It is not an expected phenotype, but it will be important to check whether there is a quantifiable change in the pattern of expression. We will quantify this in the different groups and include the results in a revised Figure 5.

      Minor comments

      • Improve labeling of Figures or at the very least describe in the Figure legend. For example: Figure 5B-C, which of the graphs is from adult mice and which is from aged mice?

      We are sorry about the lack of clear labeling in the figures; we will carefully review all figures in the manuscript and their corresponding legends, adding better labeling. Labelling of Figure 5B-C has been corrected.

      • The authors need to carefully describe where the high magnification images were taken in the injured nerve and keep the comparison at same site between groups. Please check the scale bar for each image. For example, the images in Figure 1D/F/I/K used same scale, but the cell size and cell morphology are different. The images for split individual channels need to match the merge channel images. For example, the individual channel and merge images are not properly aligned in Figure 4C, ABY-263 group.

      As suggested by the reviewer, we will show the regions from which the high-magnification images were taken. All quantifications were performed at comparable sites, but we will include information in a revised manuscript to describe the methodology clearly. We will also check all scale bars in a revised manuscript.

      The alignment problems have already been corrected in the transferred manuscript; see section 3 below.

      • The analysis method used to quantify axon regeneration should be consistent throughout. For example, in Figure 1C, number of axons/nerve width(um) was used for regeneration assay, but axon density (width corrected) was used in Figure 5B-C in regeneration assay.

      Revisions incorporated in the transferred manuscript, see section 3, below.

      **Referees cross-commenting**

      I agree with all Referee #2's comments. Both sets of comments are important, complementary and point to the same major concerns that need to be addressed. Agree as well that both reviewer think this is an interesting and relevant study for the field of nerve repair, if revised appropriately.

      Reviewer #1 (Significance (Required)):

      Significance

      The findings reported in this manuscript are very interesting and will move the field of nerve repair forward. This paper will be of interest for basic science audience in the fields of aging and neurobiology and has also potential interest to the broader clinical and translational fields. Indeed, this paper provides data that Schwann cells entering a senescent stage not only fail to support axon regeneration in aged animals, but actively inhibit axon regeneration. While this reviewer raises questions on whether only senescent Schwann cells or other senescent cells in the nerve contribute to this effect, the identification of potential targets to enhance nerve repair is highly significant. Furthermore, the use of an FDA approved drug, currently in clinical trials for its application as senotherapeutics, to increase axon regeneration in aged and chronic denervation conditions will provide new avenues for clinical applications.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary The reported study by Fuentes-Flores et al. shows that Schwann cells (SCs) in peripheral nerves undergo senescence with aging or in chronic denervation. This senescent SC phenotype correlates with downregulation of c-Jun expression and axon regeneration capacity and consequently, affecting functional recovery. The study has been undertaken by using in vivo mice model of chronically denervated sciatic nerve and in vitro of rat-primary cell cultures (Schwann cell, DRG-explants, and their coculture). Schwann cell exosome manipulation was also included to exploit their released factor as media for cell cultures.

      Major comments:

      1. Chronic denervated nerve model and Schwann cell phenotype

      *Tibial nerve transection and chronic denervated nerve: As the referee have no information about this model (no reference is cited), a detailed description should be provided highlighting the interest of such model compared to standard sciatic nerve lesion model.

      We thank the reviewer for this comment. We will provide a clear explanation for the injury model used, as well as references, including one from the group of Tessa Gordon that first described this model in rats (PMID: 30215557), and one from the groups of Rhona Mirsky and Kristjan Jessen that applied the model to the mouse (PMID: 33475496). Briefly, this model has two advantages. First, it allows chronic denervation to be maintained for months, after which the distal denervated stump can be connected to a proximal one without the need for a nerve bridge. Second, neurons in groups with different denervation times (1 week versus 6 weeks) are all damaged at the same time, eliminating variability associated with chronic axonal damage. We will include this information in the results section of a revised manuscript along with the above references.

      *Should be shown, histological analysis of denervated tibial branch prior reconnection with freshly cut proximal peroneal branch with specific immunostainings of rSCs v.s. sSC associated with dapi nuclear staining (as used along this study). Specific staining for other cells should be also provided (i.e., macrophage and endothelial cells). In simple words, "how chronically denervated nerve looks like and what is his cellular content? This is necessary to responds to the following main referee question: does the increase/decrease of rSCs or sSC under specific condition all through the study concerns the SCs that have migrated from freshly cut peroneal branch into denervated tibial distal branch, or resident SCs that have survived in chronically denervated tibial distal branch. In other words, whether rSCs that migrate (and accompanying regenerating axons) into chronically denervated branch nerve undergo phenotype change into sSC because of the environment of chronically denervated nerve. This is not clearly described or discussed, and remain confusing for reader.

      We are sorry about the lack of clarity in the text and figures. The immunostaining analysis of the denervated tibial branch prior to reconnection is included in the original manuscript, specifically in Figure 1D-K and Figure 4. In a revised manuscript we will include schemes in the figures to show the regions analyzed in each case.

      We thank the reviewer for the suggestion to include staining for other cell types. We are currently performing these experiments and analyses for macrophages, endothelial cells and fibroblasts, together with staining for cell senescence (p16), in both aged and chronically denervated conditions. We will include these data in a revised manuscript.

      Regarding the reviewer's specific question: “does the increase/decrease of rSCs or sSC under specific condition all through the study concerns the SCs that have migrated from freshly cut peroneal branch into denervated tibial distal branch, or resident SCs that have survived in chronically denervated tibial distal branch”. Our data demonstrate that senescent Schwann cells appear in the distal nerve stump in aged mice and after chronic denervation. The distal stump is physically disconnected from the proximal part of the nerve. Therefore, after reconnection, the regenerating axons encounter a tissue that is already populated with senescent cells. We will add text to the results and discussion sections to clarify these findings.

      Support of up- and down-regulation of gene expression illustrated in fig.4

      The conclusions and statements on up- or down-regulation of c-jun, yH2AX, beta-gal, P16, arise from quantitative and qualitative analysis from immunostaining of these specific markers by determining the number of positive cells rSCs vs. sSC. For these strong statements appropriate methods for quantification of protein levels such as western blots are required. For example, the statement of down regulation of c-jun expression, the quantitative graph shows strong increase in c-jun cell number under ABT263 treatment but the histological photo does not illustrate such decrease in number of the cell. It shows rather an increase in brightness of c-jun staining. Thus, only appropriate method for protein level quantification could be conclusive. This would also remove the doubt that some photos are under- or over- exposed as it appears in the figure. For example, in aged animal under vehicle condition, there is no variability in staining intensity. Accordingly, one question to the authors: for quantification of cell number, are weakly stained cells considered as positive cell?

      We agree with the reviewer that a more quantitative method is required to complement the immunofluorescence data. As the reviewer correctly states, the quantification of the immunofluorescence data corresponds to cells positive for the specific marker, expressed as the percentage of cells positive for that marker. Therefore, we will perform western blots for c-jun, yH2AX, and p16 for the different models, including treatments with senolytics.

      Regarding the method of quantification, we performed all these quantifications using Imaris software, setting the same threshold across all conditions for a given antibody marker. In addition to the quantitative western blot analysis, we will include a graph representing the distribution of the labelling (intensity histogram) for all cell number quantifications from immunofluorescence data, comparing the control with the experimental condition. Finally, the methods used for quantification will be expanded in a revised manuscript.
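
      The quantification described above (one fixed threshold per marker, reported as the percentage of positive cells, plus an intensity histogram) can be illustrated with a minimal sketch. It is not the Imaris workflow itself: the function names, the per-cell intensity values, the threshold and the bin width are all hypothetical and chosen only for illustration.

```python
def percent_positive(intensities, threshold):
    """Percentage of cells whose marker intensity is at or above the threshold."""
    if not intensities:
        raise ValueError("no cells measured")
    positive = sum(1 for value in intensities if value >= threshold)
    return 100.0 * positive / len(intensities)


def intensity_histogram(intensities, bin_width, max_value):
    """Count cells per intensity bin; bin i covers [i*bin_width, (i+1)*bin_width)."""
    n_bins = -(-max_value // bin_width)  # ceiling division
    counts = [0] * n_bins
    for value in intensities:
        # Values at or above max_value are placed in the last bin.
        index = min(int(value // bin_width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical per-cell mean intensities (arbitrary units), not real data.
vehicle = [10, 40, 55, 120, 200]
treated = [60, 90, 130, 180, 220]
THRESHOLD = 50  # the same threshold is applied to every condition

vehicle_pct = percent_positive(vehicle, THRESHOLD)
treated_pct = percent_positive(treated, THRESHOLD)
vehicle_hist = intensity_histogram(vehicle, bin_width=64, max_value=256)
treated_hist = intensity_histogram(treated, bin_width=64, max_value=256)
```

      Showing the histogram next to the percentage makes it visible whether weakly stained cells sit just above or just below the threshold, which is the concern the reviewer raised.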

      The method section should be revised in general

      Methods could be described in brief only when are supported by provided refs in which the reader could find details. Several refs are missing, i.e., 4.4 for thermal allodynia, 4.5 for ABT263 gavage administration; senescence induction, ...

      Quantification methods should be more detailed, several information are missing and not found in result section or legends (i.e., number of nerve section per animal, neurite length, ...).

      We completely agree with the reviewer that the method section was not developed adequately in the original manuscript. As described in previous responses, we will detail the methods used, especially those associated with quantifications performed. We will also include references for the different methods used, including those detailed by the reviewer.

      The use of rat for in vitro DRG and SC culture while in vivo study is undertaken on mice The switch of species from in vivo to in vitro (mice vs. rat), is not justified as mice DRG and SC culture are also commonly used. In addition, the use of transgenic mice (used here only for in vivo) could also be exploited to address specific and reinforce the data.

      We agree with the reviewer that using mouse SC in the in vitro experiments will be a better approximation to support our findings. We have used rat SC in this and other publications because they were the standard model for in vitro experiments. Nevertheless, as the reviewer states, there are now suitable methods for culturing mouse SC, which we have incorporated in our lab. Therefore, we will perform key in vitro experiments using mouse SC together with mouse DRGs.

      Regarding the use of the transgenic mice (3MR), we thank the reviewer for this suggestion; we will perform new experiments using SC derived from 3MR mice in order to demonstrate induction of senescence (by expression of the red fluorescent protein in this transgenic line) and senolysis in vitro.

      Use of conditioned exosome/media

      Should be explained why the use of exosomes directly in cell culture was not tested. This would be close to physiological condition, regarding the concentration of released factors.

      This is an important point that was not explored. As we have plenty of experience using SC-derived exosomes, we will perform the suggested experiments comparing the effect of exosomes from conditioned media from senescent-induced SC and include the results from these experiments in a revised manuscript.

      The statement on the effect of rSC vs. sSC cell on growth cone dynamic

      The provided data illustrated in fig. 3 are not in support that sSC affect growth cone dynamics. Only what would be "suggested" is that the decrease in neurite length could be associated to changes of growth cone morphology, on fixed tissue, that appeared to be affected. If such statement has to be maintained, time-laps is required. The image does not reflect a retracting neurite nor collapsed growth cone. In addition, other mechanisms could be at the basis of observed decrease in neurite length, which are not evaluated here. This is an important point to address as the authors state that sSC release inhibitory factors.

      We completely agree with the reviewer: we are not exploring growth cone dynamics. We will change the way these results are presented, as our results do not demonstrate a dynamic process. We prefer to modify the text associated with these experiments rather than perform a time-lapse analysis at this stage. Time-lapse imaging is part of a future line of work that will take some time to develop, and we consider that it currently lies outside the scope of the present work. Included in section 4, below.

      other comments

      • The surgical description is complicated, also annotation to be added in supp fig #2A; provide

      We will work on the description of the model, including references to other papers using this nerve anastomosis model for assessing regenerative potential. As stated above, we will also include schemes in all figures to help the reader follow the surgical procedure and the different timelines used.

      • M&M 4.2: lign #8, error referring to Fig 2A, correct by. Supp fig 2A

      Revisions incorporated in the transferred manuscript, see section 3, below.

      • Review refs list, ref#17, full info needed

      Revisions incorporated in the transferred manuscript, see section 3, below.

      • Fig 3H would be interesting only if contain a column of the 21 proteins exclusively expressed in senescent-induced SCs

      Revisions incorporated in the transferred manuscript, see section 3, below.

      • The immunostaining of Lamin B1 positive nuclear invaginations in fig S4 should better described in results for non-familiar reader

      We will better describe this staining pattern in the results section of the revised manuscript.

      • Title of Fig 3, the expression "neuronal growth" is not appropriate here (neurite outgrowth)

      Revisions incorporated in the transferred manuscript, see section 3 below.

      **Referees cross-commenting**

      I agree with referee#1's comments. He/she has raised complementary and important points that should be taken into account by the authors as well as we share the same major concerns. Furthermore, we both expressed the interest of such study if revised appropriately.

      Reviewer #2 (Significance (Required)):

      Significance

      Overall, this is an interesting study that undertake fundamental question in the field of nerve physiopathology and also could open a good opportunity in developing therapeutic strategies for translational research. However, additional investigations are needed to support the main conclusions.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer 1

      Major comments

      • In the axon regeneration assays, how is the reconnection site defined in longitudinal images stained for SCG10? Of particular concern is that Figure 1B "adult chronic dmg" nerve section image appears to be identical to the image in Figure 5A "Vehicle adult (47dpi)". However, the reconnection site is located at different sites along the nerve. Also, the scale bar appears identical but the legend states different sizes. In Figure 1 chronic damage is 42dpi, and figure 5 is 47dpi, yet with what appears as the same image.

      We thank the reviewer for detecting this issue. The problem arises because the data shown in Figure 1a correspond to the controls (vehicle) of Figure 5a. The data in Figure 1 reflect a known phenomenon in the field of peripheral regeneration (i.e., decreased regeneration in aged animals as well as in chronically denervated nerves); nevertheless, with a broader scientific audience in mind, we decided to present these data at the start of the manuscript to clearly show the reader the baseline decrease in axonal regeneration in these two conditions. To make this very clear, we have used the same image for the controls in Figures 1 and 5 and stated this in the legends of both figures in the uploaded manuscript.

      We have checked the scale bars and corrected the one in Figure 5, which was wrong. We have also corrected the nomenclature for days post injury in this image as well as in the corresponding legend.

      In addition, for transparency, we have uploaded in a public repository (EBI BioStudies database, https://www.ebi.ac.uk/biostudies/) all the microscopy images used in this work, which is detailed in the uploaded manuscript in a new section named data availability.

      Regarding the localization of the reconnection site, this is identified using the whole z-stack of the nerve and not a single section. The region can be recognized in the z-stack using two parameters: the difference in diameter between the proximal and distal stumps, and the filament used to suture both stumps. We have included a description of this procedure in the methods section of the revised manuscript. As the reconnection site is not always a perpendicular line, we now use an arrowhead to denote it in the revised figures.

      We are sorry for the confusion generated by the labeling of some images. We will review the text and figures and fix errors. In a revised manuscript we will also add schemes for several figures in order to explain better the experimental procedure and timelines.

      Minor comments

      • Improve labeling of Figures or at the very least describe in the Figure legend. For example: Figure 5B-C, which of the graphs is from adult mice and which is from aged mice?

      We have included the suggested labelling for Figure 5B-C.

      • The images for split individual channels need to match the merge channel images. For example, the individual channel and merge images are not properly aligned in Figure 4C, ABY-263 group.

      We thank the reviewer for spotting the error in the split channels. We have now fixed this in Fig 4C and have also corrected other alignment problems detected in Fig 4A and 5F.

      • The analysis method used to quantify axon regeneration should be consistent throughout. For example, in Figure 1C, number of axons/nerve width(um) was used for regeneration assay, but axon density (width corrected) was used in Figure 5B-C in regeneration assay.

      We have included the procedure used to quantify axonal regeneration in the methods section of the uploaded manuscript; the same procedure is used throughout the manuscript. We are sorry for the different axis labels in the graphs of Figure 1 and Figure 5. In the first version of the figures we used the term “axon density (width corrected)”, but we then decided to change it to “number of axons/nerve width (mm)”, which is more precise. Unfortunately, the axis label in Figure 5 was left unchanged by mistake. We have now fixed this in the revised version.
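
      As an illustration of the "number of axons/nerve width (mm)" measure named above, the following minimal sketch normalizes an axon count by the nerve width. The function names, the assumption that widths are measured in micrometres, and the example values are hypothetical; the actual procedure is the one described in the methods section of the manuscript.

```python
def axons_per_mm(axon_count, nerve_width_um):
    """Axon count normalized by nerve width, returned as axons per millimetre.

    Assumes the width was measured in micrometres (1 mm = 1000 um).
    """
    if nerve_width_um <= 0:
        raise ValueError("nerve width must be positive")
    return axon_count * 1000.0 / nerve_width_um


def mean_axons_per_mm(sections):
    """Average the normalized value over (axon_count, nerve_width_um) pairs."""
    if not sections:
        raise ValueError("no sections provided")
    densities = [axons_per_mm(count, width) for count, width in sections]
    return sum(densities) / len(densities)


# Hypothetical measurements from two sections, not real data.
single = axons_per_mm(120, 400)
average = mean_axons_per_mm([(120, 400), (90, 450)])
```

      Normalizing by width in this way lets nerves of different diameters be compared on the same axis, which is why a single label should be used in every regeneration graph.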

      Reviewer 2

      • M&M 4.2: line #8, error referring to Fig 2A; correct to Supp Fig 2A

      We have corrected this error in the uploaded manuscript.

      • Review refs list, ref#17, full info needed

      We have fixed this reference in the uploaded manuscript.

      • Fig 3H would be interesting only if it contained a column of the 21 proteins exclusively expressed in senescent-induced SCs

      We are sorry about this omission in Figure 3H. We included a list of all the identified proteins of repair and senescent-induced SCs in Supplementary Table 4 (Table S4) of the original manuscript, including the identity of the 21 proteins exclusively expressed in senescent-induced SCs. In the revised version, we have incorporated the information of the 21 proteins in Figure 3H as suggested by the reviewer.

      • Title of Fig 3, the expression "neuronal growth" is not appropriate here (neurite outgrowth)

      We thank the reviewer for detecting this error. We have changed the expression as suggested in the uploaded manuscript.

      Other changes included in the revised manuscript and Figures

      Numeric data for graphs.

      We have included a new supplementary Excel file, Supplementary Table 8, containing the data for each replicate associated with the graphs of the main-text and supplementary figures.

      Revision Figure 5G.

      While reviewing all the individual replicates of the manuscript data for upload to BioStudies, the first author noticed that the data in the graph of Figure 5G corresponded to a pilot experiment performed to set up the protocol, and that the final experiment had not been included. We have therefore uploaded the correct images and included the final quantification. The data are comparable, and the statistical differences remain.

      Revision Figure 3A.

      We have modified the pseudocolors of the S100 antibody channel from magenta to green for ease of visualization. The image and quantification remain exactly the same.

      4. Description of analyses that authors prefer not to carry out

      Reviewer 2

      The statement on the effect of rSC vs. sSC cell on growth cone dynamic

      The data illustrated in Fig. 3 do not support the claim that sSC affect growth cone dynamics. What is at most "suggested" is that the decrease in neurite length could be associated with changes in growth cone morphology, observed on fixed tissue, that appeared to be affected. If such a statement is to be maintained, time-lapse imaging is required. The image does not reflect a retracting neurite nor a collapsed growth cone. In addition, other mechanisms, not evaluated here, could underlie the observed decrease in neurite length. This is an important point to address, as the authors state that sSC release inhibitory factors.

      We completely agree with the reviewer: we are not exploring growth cone dynamics. We will change the manner in which these results are presented, as we are not demonstrating a dynamic process in our results. We prefer to modify the text associated with these experiments rather than perform a time-lapse analysis at this moment. This is part of a future exploration that will take some time to develop and that we consider, at this moment, to lie outside the scope of the present work.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      General Statements

      Thank you for providing an initial assessment of our manuscript. We went through all the raised comments and suggestions aiming to improve our manuscript. Our manuscript will benefit from addressing them.

      Our main impression is that the concerns regarding the novelty of our work by Reviewers #1 and #3 come from the fact that we apply a known flexible statistical framework (group factor analysis) to novel applications in single-cell data analysis, namely the estimation of multicellular programs and sample-level unsupervised analysis. The core methodology of our work is indeed based on the popular tool Multi-omics factor analysis (MOFA). We see the novelty of our study in the formulation of these relatively new applications within this framework, and in the demonstration of the added value that this formulation provides by building on MOFA’s strengths, in particular by expanding the possibilities of downstream analysis of single-cell data, including the meta-analysis of distinct single-cell patient cohorts and its integration with complementary bulk and spatial data modalities.

      The simultaneous estimation of multicellular programs together with sample-level unsupervised analysis is currently only possible with a single available tool, scITD, which is limited by its modeling strategy based on tensor decomposition: with tensor decomposition, multicellular programs cannot be estimated from distinct feature sets across cell-types, making this method less flexible and more sensitive to technical effects, such as background expression. We compared our proposed methodology with scITD and showed the benefits of using group factor analysis as implemented in MOFA for this task. Moreover, as of now, no other methodology is able to estimate multicellular programs and perform sample-level unsupervised analysis simultaneously in multiple independent single-cell atlases. We also showed how multicellular programs are traceable in bulk transcriptomics data and that they are better suited to classifying heart failure patients than classic cell-type deconvolution approaches.

      Altogether, we believe that our current manuscript complements existing literature and puts forward an approach with distinct features to analyze single-cell atlases. We will edit the text to make more explicit the novelty and advantages of our proposed methodology, and we will emphasize that our work does not mean to propose a new method, but rather to demonstrate how group factor analysis can be used for novel sample-level analysis of single-cell data. We plan to incorporate the suggestions by Reviewer #1 regarding the inclusion of additional datasets, model validations, and novel applications involving a direct modeling of cell-compositions and spatial organization of cells. Moreover, we plan to discuss perspectives on how cell communications can be incorporated in the analysis of multicellular programs as suggested by Reviewer #2. Additionally, we will correct all the figure and text typos identified by the reviewers. Finally, we will provide an R package (https://github.com/saezlab/MOFAcellulaR) and Python implementations (https://liana-py.readthedocs.io/en/latest/notebooks/mofacellular.html) that facilitate the use of our approach.

      Please find below the point-by-point response to the reviewers in blue, numbered for convenience.

      __Reviewer #1 (Evidence, reproducibility and clarity (Required)): __

      Remark to authors

      Flores et al. present a pipeline in which they leverage the MOFA framework, a matrix factorization algorithm, to infer multi-cellular programs (MCPs). Learning and using MCPs has already been proposed by others. Yet, the authors pursue a similar goal by using MOFA, providing a cell*sample matrix for different cell types as different views (instead of multiple modalities/views) as the input. They later apply MOFA using this data format on a series of applications to analyze acute and chronic human heart failure single-cell datasets using MCPs. The authors further try to expand their analysis by incorporating other modalities.

      Major points:

      1.1 As briefly outlined in the remarks, the current manuscript needs novel findings and methodology to grant a research article, which I cannot see here. The underlying matrix factorization is the original MOFA (literally imported in the code) with no modification to further optimize the method toward the task. While I appreciate and acknowledge the authors' efforts resulting in a detailed analysis of heart samples, I think all of these could have been part of MOFA's existing tutorials.

      Response 1.1 As the reviewer correctly states, we used the framework and code of MOFA. The novelty lies in its application for the unsupervised analysis of samples from cross-condition single-cell data and the inference of MCPs. MOFA is a statistical framework implementing a generalization of group factor analysis with fast inference, and its current version fits the task of MCP inference and unsupervised analysis of samples across cell-types, providing a more flexible modeling alternative than currently available methods (as presented in Table 1 of the manuscript). Current work on MCP inference is based on the premise of multi-view factorization with distinct statistical modeling alternatives. As mentioned in the discussion of our manuscript, three main points distinguish our discussed methodology from present alternatives and provide evidence about its relevance and uniqueness over available tools:

      Simultaneous unsupervised analysis of samples across cell-types and inference of MCPs, together with comprehensive interpretable descriptions of the reconstruction of the original multi-view dataset. This is currently only possible with scITD (Mitchel et al, 2022), which is compared in the manuscript. DIALOGUE (Jerby-Arnon & Regev, 2022) is limited to the generation of MCPs, and Tensor-cell2cell (Armingol et al, 2021) is only focused on cell-communications with limited interpretability.

      Flexible non-overlapping feature set that better handles technical effects such as background expression, as discussed in section “2.2 Multicellular factor analysis for an unsupervised analysis of samples in single-cell cohorts”. Moreover, as mentioned by the reviewer in a later point (Reviewer comment 1.2), this enables joint modeling of distinct aspects of the tissue, such as cell compositions, cell communications (preliminary work: https://liana-py.readthedocs.io/en/latest/notebooks/mofatalk.html) and spatial organization.

      Joint-modeling of independent atlases that enables meta-analysis at the sample level of cross-condition single-cell data. No currently available methodology is capable of performing similar modeling.

      For these reasons, we believe that our work is worth being discussed and presented to the community as a research article. We will modify the discussion to put more emphasis on the added value of group factor analysis as implemented in MOFA.

      Moreover, we now provide an R package (https://github.com/saezlab/MOFAcellulaR) and Python implementations within our analysis framework LIANA (https://liana-py.readthedocs.io/en/latest/notebooks/mofacellular.html) that facilitate the usage of our proposed methodology. The R and Python implementations are compatible with current Bioconductor and scverse pipelines, respectively.

      Application of our methodology to heart failure datasets also revealed novel knowledge about heart disease processes:

      In myocardial infarction, we found that our MCPs associated with cardiac remodeling capture cell-state-independent gene expression changes. This provides a novel understanding of the effect of disease contexts on the expression profiles of specialized cells. This finding was not reported in the original atlas publication.

      In chronic heart failure, we identified a conserved MCP of cardiac remodeling across patient cohorts and etiologies, suggesting a common chronic phase between distinct initial causes of heart failure.

      Moreover, we showed that deconvoluted chronic heart failure MCPs from bulk transcriptomics better classify patients in comparison to classic cell-type composition deconvolution of bulk data. To our knowledge, this finding was not presented in any of the manuscripts of other methodologies focused on MCPs.

      Altogether, our current work shows a novel application of group factor analysis for the simultaneous estimation of MCPs and the sample-level unsupervised analysis of cross-condition single-cell data. We showed its unique features compared to currently available tools. Distinct post-hoc analyses in combination with other data modalities show the biological relevance of our proposed methodology to complement the tissue-centric knowledge of disease.

      1.2 How can you explain that the results in donor-level analyses are not due to technical artifacts (batch variation)? Can this be used to infer a new patient similarity map? For example, I would test this by leaving out a few patients from training, projecting them, and seeing where they would end up in the manifold or classifying disease conditions for new patients and explaining the classification by MCPs responsible for that condition.

      Response 1.2 When knowledge of the technical batches is available, it is possible to test for association between these labels and the factors encoding MCPs, as shown in Figure 2.

      In our current applications, we additionally showed the biological relevance of our estimated MCPs by mapping them to spatial and bulk data sets, which is a direct way of testing how generalizable our findings are:

      In the application of MOFA to human myocardial infarction data, we mapped the gene loadings conforming the MCP associated with cardiac remodeling to paired spatial transcriptomics datasets. We showed that in general, the cell-type specific expression of the MCP of cardiac remodeling encompassed larger areas in ischemic and fibrotic samples compared to myogenic (control) samples.

      In the application of MOFA to chronic human end-stage heart failure data, we mapped the gene loadings conforming the MCP associated with cardiac remodeling to 16 independent bulk transcriptomics datasets of heart failure. There we showed that the cell-type specific expression of the MCP of cardiac remodeling separates heart failure patients from control individuals.

      Regarding the generation of new patient similarity maps, it is possible to estimate the positions of new samples in the manifold formed by the factors representing the MCPs. As suggested by the reviewer, we will show this by classifying heart failure single-cell samples using MCPs of two independent patient cohorts (presented in section 2.7).

      1.3 The bulk and spatial analyses are used post hoc after running MOFA. Since MOFA can use non-overlapping feature sets, I think it would be interesting to see if deconvoluted bulk or ST data can be encoded as another view (one view from scRNAseq data for each cell-type and another view from bulk RNA-seq or ST; you can get normalized expression per spot (for ST) or per sample (for bulk) and use them as input).

      Response 1.3 Thanks for the suggestion. We agree that the possibility of using non-overlapping features opens options for complex models that include the cell-type compositional and organizational aspects of tissues. However, these features must be quantified in the same sample; thus, this is limited to samples profiled simultaneously at different scales.

      We will present the results of a sample-level joint model of multicellular programs together with cell-proportions and spatial dependencies using the myocardial infarction dataset presented in section 2.2. For this dataset based on our previous work we have the compositions of major cell-types and their spatial relationships based on spatially contextualized models (Kuppe et al, 2022). We will run a MOFA model and show how it can be used to find factors associated with structural and molecular features of tissues.

      __Minor: __

      1.4 Some figure references are not correct (e.g., "the single-cell data into a multi-view data representation by estimating pseudo bulk gene expression profiles for each cell-type across samples (Figure 1b)." should be figure 2b)

      Response 1.4 Thanks for pointing this out. We apologize for these mistakes and we will adjust all labels correctly.

      1.5 The paper is well written, but there could be some more clarifications about what authors consider as cell-type and cell-state, condition, MCPs which I think is critical to current analysis (see here https://linkinghub.elsevier.com/retrieve/pii/S0092867423001599) for the reader not familiar with those concepts.

      Response 1.5 We agree with the reviewer that it is important to introduce these concepts in more detail to avoid confusion. We will adapt the current manuscript to incorporate these definitions in the introduction.

      __Reviewer #1 (Significance (Required)): __

      1.6 While I find the concept of MCPs interesting, the current work seems like a series of vignettes and tutorials by simply applying MOFA on different datasets (the authors rightfully state this). However, it needs to be clarified what the novelty is, since there is no algorithmic improvement to current MCP methods (because there is no new method) nor novel biological findings. Additionally, even in the current form, the applications are limited to the heart, and the generalization of this proposed analysis pipeline to other tissues and datasets is not explored. Overall, the paper lacks focus and novelty, which is required to grant a publication at this level.

      Response 1.6 As mentioned in response 1.1, we show that group factor analysis as implemented in MOFA has advantages given its flexibility of the feature space, the joint-modeling of independent datasets, and the interpretability of the model. We will make these advantages clearer in the discussion, and we will explicitly mention the disadvantages and lack of functionalities of available methods.

      The applications were mainly done in heart data for consistency, although they represent four distinct single-cell datasets, one spatial transcriptomics dataset, and 16 independent bulk transcriptomics datasets. For completeness, as suggested by the reviewer, we will show the application of our methodology to peripheral blood mononuclear cell data of lupus samples (preliminary results: https://liana-py.readthedocs.io/en/latest/notebooks/mofacellular.html).

      __expertise: Computational biology, single-cell genomics, machine learning __

      __Reviewer #2 (Evidence, reproducibility and clarity (Required)): __

      Summary:

      The authors use MOFA, an unsupervised method to analyze multi-omics data, to create multicellular programs of cross-condition multi-sample studies. First, for each cell-type, a pseudobulk expression matrix per sample is created. The cell-type now functions as the separate view, typically reserved for the different omics layers in MOFA. This then results in a latent space with a certain number of factors across samples. The factors, representing coordinated gene expression changes across cell-types, can then be checked for associations with covariates of interest across the samples.

      MOFA is well-suited for this task, as it can handle missing data and it is a linear model facilitating the interpretation of the factors. Users should be aware that MOFA can estimate the number of factors, but the pseudobulk profiles require a rigorous selection of cell-type specific marker genes. The result will be most suited for downstream analysis if there is a clear association with one factor and a clinical covariate of interest. In a final step, a positive or negative gene signature can be created by setting a cut-off on the gene weights for that specific factor.
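      The two data-handling steps in this summary (one pseudobulk view per cell type, then a signature from a cut-off on factor weights) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function names, the input layout, and the choice to pseudobulk by summing counts are assumptions made for the example, and the factor weights are taken as already estimated by MOFA.

```python
from collections import defaultdict


def pseudobulk_views(cells):
    """Build one view per cell type from single-cell counts.

    cells: iterable of (sample, cell_type, {gene: count}) tuples (assumed layout).
    Returns views[cell_type][sample][gene] = summed count over that sample's cells.
    Summing counts is one common pseudobulk choice, assumed here for illustration.
    """
    views = defaultdict(lambda: defaultdict(lambda: defaultdict(float)))
    for sample, cell_type, counts in cells:
        for gene, count in counts.items():
            views[cell_type][sample][gene] += count
    # Convert nested defaultdicts to plain dicts for a stable return type.
    return {
        cell_type: {sample: dict(genes) for sample, genes in by_sample.items()}
        for cell_type, by_sample in views.items()
    }


def signature_from_weights(weights, cutoff):
    """Split genes into positive and negative signatures by a weight cut-off.

    weights: {gene: weight} for one factor in one cell-type view (hypothetical input).
    cutoff: non-negative threshold applied to the absolute weight.
    """
    positive = sorted(g for g, w in weights.items() if w >= cutoff)
    negative = sorted(g for g, w in weights.items() if w <= -cutoff)
    return positive, negative


cells = [
    ("s1", "CM", {"A": 1, "B": 2}),
    ("s1", "CM", {"A": 3}),
    ("s1", "Fib", {"A": 5}),
    ("s2", "CM", {"B": 4}),
]
views = pseudobulk_views(cells)
pos, neg = signature_from_weights({"A": 0.8, "B": -0.6, "C": 0.1}, cutoff=0.5)
```

      Each cell type keeps its own gene set, so views need not share features, which is the flexibility discussed in the responses.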

      The method is applied on 3 separate data sets of heart disease, each time demonstrating that at least one of the factors is associated with a disease covariate of interest. The authors also compare the method to a competitor tool, scITD, and explore to what extent a factor mainly captures variance associated with (i) a general condition covariate or rather (ii) specific cell states.

      The multicellular programs are also mapped to spatial data with spot resolution. Though this analysis does not bring any novel biological insight in the use case, it does support the claim that the programs are associated with the covariate of interest.

      The most interesting applications of MOFA are in my opinion the potential for meta-analysis of single-cell studies and validation of cell-type specific gene signatures with publicly available bulkRNAseq data sets.

      The authors provide various data sets and data types to support their claims and the paper is well written. The relevant code and data has been made available.

      We thank the reviewer for the positive comments to our work.

      __Major comments __

      2.1 What is the added value of the gene signatures obtained from MOFA compared to e.g. a naive univariate approach? In theory, a similar collection of genes or gene signature could be obtained by running a differential gene expression analysis across the samples for each cell-type (e.g. myogenic vs ischemic ) and applying a set of relevant cut-offs or filters on the results. In other words, does MOFA detect genes that would otherwise be missed?

      Response 2.1 Thank you for the relevant comment. The original motivation of our work is the unsupervised analysis of samples based on a manifold formed by a collection of multicellular molecular programs. We envisioned that this unsupervised analysis would be relevant in situations where a clear histological or clinical classification of samples is not reliably possible. As mentioned by Reviewer #1 in comment 1.2, one advantage of these approaches is that they create patient similarity maps, which have been shown to be useful to stratify patients in a recent analogous work in multiple sclerosis (Macnair et al, 2022). The cell-type signatures obtained from relevant factors explaining the patient stratification reduce the risk of “double dipping”, since they do not require a direct differential expression analysis between newly formed groups.

      In our applications, the generation of cell-type signatures (here called multicellular programs) associated with a specific clinical covariate (e.g. control vs perturbation) is a post-hoc analysis of the generated manifold. As the reviewer correctly points out, these signatures should be similar to the results of a direct differential expression analysis between those patient conditions. In the related work of scITD (Mitchel et al, 2022), the authors showed high concordance between the cell-type signatures and the results of differential expression analysis. For completeness, we will similarly quantify the degree of overlap between the genes of our generated signatures and those coming from differential expression analysis.
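      One simple way to quantify such an overlap is the Jaccard index between the two gene sets. The sketch below is only an illustration: the response does not state which overlap measure will be used, and the gene names are placeholders.

```python
def jaccard(genes_a, genes_b):
    """Jaccard index between two gene collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    # Two empty sets have no defined overlap; 0.0 is returned by convention here.
    return len(a & b) / len(union) if union else 0.0


signature_genes = ["A", "B", "C"]   # placeholder: genes from a factor signature
de_genes = ["B", "C", "D"]          # placeholder: genes from differential expression
overlap = jaccard(signature_genes, de_genes)
```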

      It is relevant to mention that in complex experimental designs with multiple conditions, our approach facilitates patient ordering, which allows the understanding of one condition in the context of all the others, avoiding the need of multiple testing and the definition of multiple contrasts, as mentioned in the text.

      We will incorporate these points in the discussion section of the manuscript.

      2.2 Could scITD also be used for meta-analysis or could the obtained gene signatures of that method also be mapped to bulkRNAseq data? If so, it would be interesting to show the relative performance with MOFA. If not, this specific advantage should be highlighted.

      Response 2.2 Thank you for pointing this out. scITD does not provide a group-based model to perform meta-analysis, and this feature is one of the main advantages of group factor analysis as currently implemented in MOFA. We will highlight this feature in Table 1 and in the discussion.

      Although scITD signatures of a single study could be mapped to bulk transcriptomics data, the stringent tensor representation leads to the generation of signatures that may be influenced by technical effects as shown in the manuscript section 2.2. Thus we believe that the flexibility of the feature space in MOFA is an advantage for this task. We will add this observation to the discussion.

      2.3 Users need to specify gene set signatures based on the weights for a factor of interest. This might suggest a limitation to categorical covariates of interest. If the authors see potential for a continuous covariate of interest, this should at least be highlighted in the text and if possible demonstrated on a use case.

      Response 2.3 In our applications we limited ourselves to categorical variables, however, it is possible to associate factors to continuous variables. An implementation of the association with continuous variables is already available in our newly created R package “MOFAcellulaR”: https://github.com/saezlab/MOFAcellulaR/blob/main/R/get_associations.R.

      The datasets we analyzed have no continuous clinical covariates to showcase this functionality, but as suggested by the reviewer we will highlight this feature in the text.

      __Minor comments __

      2.4 In Figure 2c the association between factor 2 and the technical factor shows a very strong outlier. Please verify that the association is still significant after applying a more robust statistical test (e.g. non-parametric test as Wilcoxon).

      Response 2.4 Thanks for the observation, we will test these differences with a non-parametric test.
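      To illustrate why a rank-based test is robust to a single extreme value, the sketch below computes the Mann-Whitney U statistic (equivalent to the Wilcoxon rank-sum statistic) in plain Python. The factor scores are made-up numbers, and the function is an illustration rather than the test implementation the authors will use; it returns only the statistic, not a p-value.

```python
def mann_whitney_u(x, y):
    """U statistic of x versus y.

    Counts the (x_i, y_j) pairs with x_i > y_j; ties contribute 0.5.
    Only the ordering of values matters, not their magnitude.
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical factor scores for two technical batches.
batch_a = [0.1, 0.2, 0.3]
batch_b = [0.4, 0.5, 0.6]
batch_b_outlier = [0.4, 0.5, 60.0]  # same ordering, one extreme value

u_plain = mann_whitney_u(batch_a, batch_b)
u_outlier = mann_whitney_u(batch_a, batch_b_outlier)
```

      The statistic is identical in both cases because the outlier does not change the ranks, whereas a mean-based t statistic would shift.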

      2.5 For mapping the cell-type specific factor signatures to bulk transcriptomics, the exact performed comparison or model is unclear. There are seven cell-type signatures for each sample in every study. Was there a t-test run for each cell-type or was a summary measure taken across the cell-types? The thresholding is also rather lenient (adj. p-val 0.1).

      Response 2.5 We are sorry for not being clear about our procedure. After identifying the multicellular program associated with heart failure estimated from the two meta-analyzed single-cell studies, we calculated the weighted mean expression of the seven cell-type signatures independently for every sample of the 16 bulk studies. In other words, each sample within each bulk study is represented by a vector of 7 values representing the relative expression of each cell-type specific signature (Figure 6D-left). For each bulk transcriptomics study, we first centered the gene expression data before calculating the weighted mean.
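      The scoring procedure described above can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions, not the authors' implementation: the data layout and function names are hypothetical, all samples of a study are assumed to share the same genes, and the weighted mean is normalized by the sum of absolute weights so that signed loadings do not cancel in the denominator.

```python
def center_genes(expr):
    """Center each gene across the samples of one bulk study.

    expr: {sample: {gene: expression}}, all samples sharing the same genes (assumed).
    """
    samples = list(expr)
    genes = list(expr[samples[0]])
    means = {g: sum(expr[s][g] for s in samples) / len(samples) for g in genes}
    return {s: {g: expr[s][g] - means[g] for g in genes} for s in samples}


def weighted_mean_score(sample_expr, weights):
    """Weighted mean of one sample's expression over one cell-type signature.

    weights: {gene: loading}; normalizing by sum(|w|) is an assumption of this sketch.
    """
    shared = [g for g in weights if g in sample_expr]
    denom = sum(abs(weights[g]) for g in shared)
    if denom == 0:
        return 0.0
    return sum(weights[g] * sample_expr[g] for g in shared) / denom


def score_samples(expr, signatures):
    """Return {sample: {cell_type: score}} for one study, centering genes first."""
    centered = center_genes(expr)
    return {
        s: {ct: weighted_mean_score(centered[s], w) for ct, w in signatures.items()}
        for s in centered
    }


# Toy study with two samples and one cell-type signature (placeholder values).
expr = {"hf": {"A": 3.0, "B": 1.0}, "ctrl": {"A": 1.0, "B": 3.0}}
signatures = {"CM": {"A": 1.0, "B": -1.0}}
scores = score_samples(expr, signatures)
```

      With seven signatures, each sample would be mapped to a vector of seven such scores, which can then be compared between heart failure and control samples within each study.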

      In supplementary figure 4-e we show the results of performing a t-test of the cell-type scores between heart failure and control samples within each study. Given the relatively low sample size of most of the studies (affecting the power of the test), we chose a less stringent adjusted p-value. For completeness, we will show the results of a more classical threshold (adj. p-value

      2.6 typo in abstract: In sum, our framework serves as an exploratory tool for unsupervised analysis of cross-condition single-cell ***atlas*** and allows for the integration of the measurements of patient cohorts across distinct data modalities

      Response 2.6 Thanks for pointing out this typo. We will modify the text.

      2.7 In Figure 4a it is not clear to me why on the one hand we see marker enrichment vs loading enrichment with healthy and disease.

      Response 2.7 We apologize, this is a typo after editing the labels. Both should contain the marker enrichment label. We will fix this.

      2.8 In Figure 4b it would help if the same color scheme were maintained throughout the paper (here now black and white) and if, for the cell states, the boxplots were connected per condition, emphasizing the (absence of) change across cell states within a condition.

      Response 2.8 We thank the reviewer for the suggestion. We will reorganize the panels showing the gene expression per condition and fix the color scheme.

      __Reviewer #2 (Significance (Required)): __

      __General assessment: __

      2.9 MOFA is well-suited for detecting multicellular programs because it can handle missing data and allows for easy interpretation of the factors as a linear method. It might have particular potential for meta-analysis across multiple studies and reevaluating bulkRNAseq data sets, but in the current manuscript it is unclear to what extent this is a specific advantage of MOFA or could also be done with competitors. The authors show how the obtained results and associations with clinical covariates can be validated across multiple data types. How the resulting multicellular programs can provide additional biological insight or form the starting point for additional downstream analysis (e.g. cell communication) is not covered in the paper.

      Response 2.9 We thank the reviewer for highlighting the methodological advantages of group factor analysis for the estimation of multicellular programs and the unsupervised analysis of samples from cross-condition single-cell atlases. As mentioned in responses 1.1 and 2.2, the added value of our methodology is the flexibility of feature views (which goes beyond gene expression) and the simultaneous modeling of independent single-cell datasets, a feature not present in any of the currently available methods, which facilitates the meta-analysis of datasets across modalities.

      While we interpret the presented multicellular programs in the context of cellular functions and the division of labor of cell states, it is true that we did not attempt to provide mechanistic hypotheses, for example, via cell-cell communication, on how this coordination across cell-types emerges.

      Previous work of the related tool Tensor-cell2cell (Armingol et al, 2021) has presented the idea of the estimation of multicellular programs from cell-cell communications and group factor analysis can also be used for this task (preliminary work: https://liana-py.readthedocs.io/en/latest/notebooks/mofatalk.html). We will discuss in the text perspectives on how the estimation of multicellular programs can be linked to the inference of cell communications from single-cell data together with analysis alternatives previously proposed by scITD and Tensor-cell2cell. However, we believe that this question requires further work and it is out of scope of our current manuscript.

      __Audience: This paper will be mainly of interest to a specialized public interested in unsupervised methods for large scale multi-sample and multi-condition studies. __

      __Reviewer: main background in the analysis of scRNAseq data. __

      __Reviewer #3 (Evidence, reproducibility and clarity (Required)): __

      This manuscript by Saez-Rodriguez and colleagues proposes to repurpose Multi-Omics Factor Analysis for the use of single cell data. The initial open problem stated by the paper is the need for a framework to map multicellular programs (such as derived from factor analysis) to other modalities such as spatial or bulk data. The authors propose to repurpose MOFA for use in single cell data. Case studies involve human heart failure datasets (and focuses on spatial and bulk comparisons).

      There are particular issues with clarity regarding the key methodological contribution (and assessment of it), discussed under significance.

      __Reviewer #3 (Significance (Required)): __

      3.1 I am very puzzled by the repeated claims the manuscript makes that their central methodological contribution and innovation is to use MOFA for single cell data. One of their citations for MOFA is to MOFA+, which is precisely that (in a relatively popular manuscript published by the original authors of MOFA and not overlapping with the present authors). I am left to wonder what I missed.

      Response 3.1 We apologize for the misunderstanding, as mentioned in the response to review 1.1 and explained by reviewer 2’s summary, the main objective of our work is to use the statistical framework of group factor analysis for the inference of multicellular programs and the sample-level unsupervised analysis of cross-condition single-cell data, which is a distinct task to multimodal integration (Argelaguet et al, 2021).

      While it is true that MOFA+ introduced expansions to the model for the modeling of single-cell data, namely fast inference and group-based modeling, the main focus in their applications is the multimodal integration of data, where each cell is represented by a collection of distinct sets of features (e.g. chromatin accessibility and gene expression). Unlike multimodal integration, here we propose a different approach to analyze single-cell data at the sample level instead of the cell level, without modifying the underlying statistical model (see section 2.1 of the manuscript).

      In detail, what we assume is that samples of single-cell transcriptomics data (e.g. tissue from a patient) can be represented by a collection of independent vectors collecting the gene expression information of cell types composing the tissue analyzed. Decomposition of these multiple views with group factor analysis produces a manifold that captures multicellular programs (coordinated expression processes across cell-types), or shared variability across cell-types simultaneously. Altogether, this represents a novel usage of group factor analysis in an application for the inference of multicellular programs, where the main focus is not at the cell-level but at the patient level.
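The sample-level representation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it assumes that each cell-type view is built by summing counts over the cells of that type within a sample (a pseudobulk-style aggregation), and the function name `build_views`, the input layout and the toy gene names are hypothetical. The group factor analysis step itself (performed with MOFA in the manuscript) is not shown.

```python
from collections import defaultdict


def build_views(cells, genes):
    """Aggregate single cells into one profile per (cell type, sample).

    cells: iterable of (sample_id, cell_type, counts), where counts maps
           gene name -> count for one cell (hypothetical input layout).
    genes: ordered list of gene names defining the columns of every view.

    Returns (samples, views). views[cell_type] is a list with one row per
    sample (same order as `samples`); a row is None when that cell type
    was not observed in that sample, i.e. the view is missing.
    """
    sums = defaultdict(lambda: defaultdict(float))
    samples, cell_types = set(), set()
    for sample_id, cell_type, counts in cells:
        samples.add(sample_id)
        cell_types.add(cell_type)
        profile = sums[(cell_type, sample_id)]  # created even for empty cells
        for gene, value in counts.items():
            profile[gene] += value  # assumption: views are summed counts

    samples = sorted(samples)
    views = {}
    for cell_type in sorted(cell_types):
        rows = []
        for sample_id in samples:
            key = (cell_type, sample_id)
            if key in sums:
                rows.append([sums[key].get(g, 0.0) for g in genes])
            else:
                rows.append(None)
        views[cell_type] = rows
    return samples, views
```

Each entry of `views` is a samples-by-genes matrix for one cell type, so the unit of analysis is the sample (for example a patient) rather than the individual cell; these matrices are what a multi-view factor model would then decompose jointly.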

      As a side note, Britta Velten, one of the main developers of MOFA and coauthor of both the MOFA and MOFA+ papers, is a contributor and coauthor of this manuscript, and Ricard Argelaguet, who also led both versions of MOFA, gave us helpful feedback and is acknowledged as such in this work.

      3.2 Multimodal integration methods are fairly numerous and even if they're not all exactly factor analyses, it's strange to argue that MOFA fills some unique conceptual gap. I agree it fills something of an interesting gap (except for MOFA+ already filling it), but it's not like the quite popular spatial to single-cell integration approaches aren't doing similar things. If this is a methods paper (as it is presented) then there would have to be very substantially more comparative evaluation to these other approaches.

      Response 3.2 As presented in the previous response (3.1) our current work is not focused on multimodal integration, but rather the inference of multicellular programs and the sample-level unsupervised analysis of single-cell data. Given this, in the current manuscript we compared our proposed methodology with the only three other available methods that address at least partially the inference of multicellular programs (see Table 1 in our manuscript). In response 1.1 and 3.2 we discussed the advantages of our proposed methodology compared to available methods. In the manuscript section 2.2 we compared group factor analysis with tensor decomposition and showed that the former better deals with technical artifacts and better identifies known patient groups.

      We will distinguish our work from multimodal integration explicitly in the introduction and in section 2.1 of the manuscript to avoid confusion.

      3.3 The biological use cases are comparatively interesting and dominate the manuscript (but are still presented principally as use cases rather than a compelling biological narrative of their own).

      Response 3.3 The focus of our manuscript was the reintroduction of group factor analysis for the novel applications of the inference of multicellular programs and the sample-level unsupervised analysis from single-cell data. Given the distinct possibilities of post-hoc analyses, we mainly used acute and chronic heart failure data to showcase the utility of MOFA to connect spatial and bulk modalities with single-cell data.

      That said, as discussed in response 1.1, our analyses allowed us to generate novel hypotheses from these datasets:

      In myocardial infarction, we found that our estimated multicellular programs associated with cardiac remodeling capture cell-state-independent gene expression changes. This provides a novel understanding of the effect of disease contexts in the expression profiles of specialized cells. In other words, we found that cell-states, regardless of their specialized function, share a common response in the tissue context.

      In chronic heart failure, we identified a conserved multicellular program of cardiac remodeling across patient cohorts and etiologies, suggesting a common chronic phase between distinct initial causes of heart failure, which again may be linked to the dominating response to the tissue context that is shared across etiologies.

      These two results support the observation that deconvoluted chronic heart failure multicellular programs from bulk transcriptomics better classify patients in comparison to classic cell-type composition deconvolution of bulk data. To our knowledge, this finding was not presented in any of the manuscripts of other methodologies focused on MCPs. We summarize these results in the third paragraph of the discussion in the manuscript:

      “In an application to a collection of public single-cell atlases of acute and chronic heart failure, we found evidence of dominant cell-state independent transcriptional deregulation of cell-types upon myocardial infarction. This may suggest that while certain functional states within a cell-type are more favored in a disease context, most of the cells of a specific type have a shared transcriptional profile in disease tissues. If part of this shared transcriptional profile is interpreted as a signature of the tissue microenvironment that drives cells in tissues towards specific functions, this result may also indicate that a major source of variability across tissues, besides cellular composition, is the degree in which the homeostatic transcriptional balance of the tissue is disturbed. By combining the results of multicellular factor analysis with spatial transcriptomics datasets, we explored this hypothesis and identified larger areas of cell-type-specific transcriptional alterations in diseased tissues. Given these observations on global alterations upon myocardial infarction, we meta-analyzed single-cell samples from two additional studies of healthy and heart failure patients with multiple cardiomyopathies. Here, we found a conserved transcriptional response across cell-types in failing hearts, despite technical and clinical variability between patients. Further, we could find traces of these cell-type alterations in independent bulk data sets. These observations suggest that our approach can estimate cell-type-specific transcriptional changes from bulk data that, together with changes in cell-type compositions, describe tissue pathophysiology. Altogether, these results highlight how MOFA can be used to integrate the measurements of independent single-cell, spatial, and bulk datasets to measure cell-type alterations in disease.”

      To fully assess the relevance of these observations, they should be investigated in more datasets and analyses, where shared functional cell-states across distinct heart failure etiologies are identified and then compared at their compositional and molecular level. This, in our opinion, represents an independent study on its own.

      3.4 Altogether, I found the framing of this manuscript very puzzling. It is possible the result would be more clearly presented if the use case was the major focus rather than the more conceptual point about factor analysis.

      Response 3.4 Thanks for the suggestion. The major aim of this manuscript is to highlight the versatility of the generalization of group factor analysis as implemented in MOFA for novel applications in single-cell data analysis, beyond multimodal integration of single cells. The definition of multicellular programs from single-cell data and its sample-level unsupervised analysis are relatively new analyses in the field, and thus we believe that it is timely to show how a known statistical framework can be used for these applications.

      We believe that a detailed analysis of single-cell datasets of heart failure deserves its own focus and it is out of scope of our current objective with this manuscript. We apologize for the apparent misunderstanding of the objective of our methodology. We will add these distinctions in the introduction of the manuscript.

      References

      Argelaguet R, Cuomo ASE, Stegle O & Marioni JC (2021) Computational principles and challenges in single-cell data integration. Nat Biotechnol 39: 1202–1215

      Armingol E, Baghdassarian H, Martino C, Perez-Lopez A, Knight R & Lewis NE (2021) Context-aware deconvolution of cell-cell communication with Tensor-cell2cell. BioRxiv

      Jerby-Arnon L & Regev A (2022) DIALOGUE maps multicellular programs in tissue from single-cell or spatial transcriptomics data. Nat Biotechnol 40: 1467–1477

      Kuppe C, Ramirez Flores RO, Li Z, Hayat S, Levinson RT, Liao X, Hannani MT, Tanevski J, Wünnemann F, Nagai JS, et al (2022) Spatial multi-omic map of human myocardial infarction. Nature 608: 766–777

      Macnair W, Calini D, Agirre E, Bryois J, Jaekel S, Kukanja P, Stokar-Regenscheit N, Ott V, Foo LC, Collin L, et al (2022) Single nuclei RNAseq stratifies multiple sclerosis patients into three distinct white matter glia responses. BioRxiv

      Mitchel J, Gordon MG, Perez RK, Biederstedt E, Bueno R, Ye CJ & Kharchenko P (2022) Tensor decomposition reveals coordinated multicellular patterns of transcriptional variation that distinguish and stratify disease individuals. BioRxiv

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      The work presented here examined the combined contribution of intermediate gray matter spinal interneurons of the spinal lumbar enlargement (L2-L4) to locomotion in rats. By targeting this region with kainic acid, we were able to produce a specific locomotor signature that was not compensated for over time, indicating the need for cellular replacement therapies in the treatment of such spinal cord injuries leading to the loss of spinal enlargement intermediate gray matter. Further, the newly developed techniques of a combinatorial behavioral assessment using Random Forest classification and a machine learning intermediate gray matter neuronal loss assessment established in this work add an unbiased, in-depth approach that we are making available to others.

      The reviewers have critically evaluated our work and highlighted points of weakness either in the research itself or in connecting with our audience. Below is our detailed response to all the comments as well as our revision plan for submission. We believe we have been able to sufficiently address the concerns that were voiced to strengthen our manuscript and express our gratitude for the feedback.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)): In this paper, Kuehn and colleagues report on the analysis of functional impairments following intermediate gray matter lesion with kainic acid. The images convincingly show throughout the paper that a mostly pure grey matter lesion can be achieved. The authors took care to do a battery of well-designed behavioral tests and sophisticated analysis in order to assess functional impairment. They then correlate their behavioral assessment to lesion size, the number of NeuN positive cells in layers V-VII epicenters as well as motoneuron numbers and the percentage of white matter. Overall, the manuscript is well written, nicely framed in the existing literature, very clear and the experiments are simple but well designed. The behavioral testing and evaluations including random forest ranking are well performed. The methodology is complete and would allow reproducing the experiments. Statistics are used appropriately. We have however some reservations and comments on some of the results and interpretations. Addressing these comments would not involve new experiments but re-analysis of the existing datasets.

      Major comments:

      While the claim that grey matter lesions trigger major behavioral impairments is convincing, in particular with the refined behavioral experiments performed, the key claim that only interneuron loss in layer V-VII mediates those deficits is currently not supported by the presented data. In particular, we would suggest that the lesions performed, in contrast to the claims, are not purely and selectively impacting layer V-VII but might also impact layers VIII-IX. We think that presenting neuronal counts based on NeuN staining separately for layer I-IV, V-VII, VIII-IX and comparing control vs KA is necessary. Only with these data can conclusions be supported either in the direction suggested by the authors or otherwise.

      • Although our lesion model primarily targets laminae V-VII, we realize that it does not do so exclusively. We understand the value of what you request and are retraining our computer models to be able to do the additional neuronal quantification in laminae I-IV, VIII, IX. We will then combine lamina VIII with laminae V-VII to make up the intermediate gray matter NeuN counts. Completion of all manually validated new analysis is ongoing and will be finished shortly. We plan on adding this additional analysis to the paper, which means much of Figure 6 and Supplementary Figure 3, and part of Figure 7, will be altered, but we won’t know exactly how until we finish the analysis. Tracked changes are shown in the updated manuscript PDF and highlighted text may change depending on results of this analysis.

      Another claim relative to the lack of involvement of motoneurons in the related behavioral deficits is also difficult to resolve with the current data. Motoneurons have been identified based on NeuN staining and size. While this is not the state of the art (ChAT staining would have been preferable), it remains acceptable. However, the data presented in Figures 7 and 8 show a very wide range in the motoneuron count (15 to 50), indicating either motoneuron loss or a count performed at different lumbar levels in the animals. This raises questions on the model (is it really involving only layers V-VII?) or on the interpretation of the data. Therefore we believe that motoneuron counts need to be presented separately (see above) in control vs KA groups and data need to be discussed in this perspective. Authors should also tone down the specificity of the model and involvement of motoneurons accordingly (page 20 for example).

      • Although we agree with the reviewers that ChAT staining would have been preferable, we had a limited amount of tissue available. Our unbiased, machine-learning-based analysis of neuronal loss by NeuN required much of the existing tissue. However, neuronal staining has been previously established to identify motoneurons based on size inclusion (Hadi et al., 2000; Wen et al., 2015), as we have used here. Additionally, we will be including total neuronal analysis from lamina IX as requested (please see answer to previous comment).

      • With the Controls included alongside the KA rats, we postulate that the wide range of motoneuron numbers is due to natural individual variation as well as to variation at each spinal level, and not to the KA lesion, as the KA animals have a range of motoneuron counts, sometimes even greater than the controls (Figures 7 and 8). However, as requested, we have split L2, L3 and L4 (graphs below) and still do not see a correlation with behavioral performance (BBB and inclined beam). The variation due to spinal level may partially be explained by the fact that there are different numbers of motoneurons at each spinal level, dependent upon the number of muscles each spinal level is responsible for and the number of motor columns at a given level (Mohan et al., 2015; Nicolopoulos-Stournaras & Iles, 1983). These counts are taken from a given section and not the entirety of the spinal level, adding further possible variation. Moreover, we have removed the controls as suggested (graphed below) for motoneuron analysis and still do not see a correlation between the number of motoneurons and behavioral performance (BBB and inclined beam). We do not consider this the correct way to represent the data graphically, because it prevents the reader from seeing the natural number of motoneurons at each spinal level and its variation, and from recognizing that this variation is not an injury effect that correlates with behavioral differences. We would therefore like to keep the graphs with controls in the manuscript.

      We have toned down the specificity of the model and involvement of motoneurons as requested on pages 20-21.

      Most of the conclusions rely on correlations that include control animals (injected with saline hence with no lesions and no behavioral deficits; Fig 6 and 7). This artificially skews the correlations as those animals show no lesions and good performance in the behavioral tests. These correlations need to be performed only with KA injected animals to determine the respective involvements of interneurons and motoneurons.

      • To address your concern, we first did as you asked and removed the controls and performed the correlation analysis for Figure 6, shown below. There are no significant correlations between neurons at each spinal level and behavior. We would further argue that unlike a contusion injury where control animals only receive a laminectomy, our control animals have very minor neuronal loss due to the saline injection itself and therefore do have a minor lesion. An example of this is seen in Figure 6 for the control animal at spinal level L2 where the pipette track is visible. Therefore, to show that the observed behavioral deficits are from the kainic acid and not the injection itself, we would argue that it is important that the control animals remain in the correlation analysis.
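The comparison at issue here, the same correlation computed once with all animals and once with KA animals only, can be sketched as follows. This is a minimal illustration and not the analysis used in the manuscript: the correlation method is not stated in this response, so Pearson's r is an assumption, and the function names and the `(group, neuron_count, behavior_score)` record layout are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def correlation_with_and_without_controls(records, lesion_group="KA"):
    """Return (r_all_animals, r_lesion_group_only).

    records: list of (group, neuron_count, behavior_score) tuples; the
    group labels and this layout are assumptions made for the sketch.
    """
    r_all = pearson_r([r[1] for r in records], [r[2] for r in records])
    lesioned = [r for r in records if r[0] == lesion_group]
    r_lesioned = pearson_r([r[1] for r in lesioned], [r[2] for r in lesioned])
    return r_all, r_lesioned
```

Reporting both coefficients side by side makes it visible whether a correlation is carried mainly by the separation between control and lesioned animals or also holds within the lesioned group.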

      The long-term study (Fig 8) is performed with very few animals and hence, drawing conclusions from these animal numbers is difficult. All correlations are performed including control animals, which is even more of a problem here than in Figures 6 and 7 due to the low number of animals. The authors should either add animals or remove the figure. When control animals (injected with saline) are removed (as they do not show any lesion and perform accurately in the behavior), one would actually see a correlation between the number of motoneurons and the behavioral performance (Fig. 8E,F) but not with the lesion size (Fig. 8C,D).

      • The long-term study was planned with more animals, but due to exclusion criteria by lesion length, the numbers remain low. We discussed extensively whether to include these data in the manuscript, and decided for several reasons to include them within the main figures. First, they demonstrate that once these interneurons are lost, there are no compensatory mechanisms that restore function, which is quite striking given that the animals that lose weight support by 2 weeks do not regain it over a 3-month observational period. This further indicates that lumbar gray matter interneurons are essential to locomotor function of the hindlimbs and should be targeted in SCI replacement therapeutics. However, we do not agree with removing controls to examine the motoneuron number, as there is motoneuron number variation within the lesion area and the motoneuron number from the KA animals is within the Control motoneuron range, which can be seen in the graph including the Controls. We can provide the individual spinal lesion level correlations, but this does not provide the entire picture as one level alone has not been found to be essential to the behavioral deficits. We are currently processing these animals to also provide NeuN numbers from laminae I-IV, V-VIII and IX.

      Minor comments:

      Figure 1A: if lesions are bilateral, it would be nice to illustrate this on the schematic.

      • This has been fixed.

      Figure 1B-D: scale bars are missing

      • This has been fixed.

      Figure 3H: What represents the y-axis? % of completion or number of completion?

      • This has been fixed.

      Figure 4 Table: Please specify what the acronym pLDA stands for.

      • This has been clarified in the figure legend.

      Figure 6 A: scale bars are missing

      • This will be fixed when the data for the analysis is finished and the figure is redone.

      Figure 6B/C/D: please add the spinal level analyzed directly on the graphs. This will ease the comprehension.

      • This has been adjusted.

      Figure 7 and Figure 8: While it is quite convincing that the model is purely a grey matter injury (panel C and D), the data are very much spread out for the number of motoneurons per mice (see major comments above). We would suggest to plot those data to present the number of neurons (interneurons in layer I-IV, V-VII and motoneurons) control vs KA.

      • Thank you for the suggestion. We will plan on presenting the additional neuronal quantification data mentioned above by comparing Controls and KA animals.

      Dots are missing on those figures (probably superimposed on top of each other). This should be changed to see all data points

      • Thank you for the observation. They were superimposed but we have fixed this.

      Figure 8E,F: the number of motoneurons is very low also in controls. How is this explained?

      • Depending on where the section was taken at each spinal level, there is variation in the number of motoneuron columns innervating targeted muscles (Mohan et al., 2015; Figure 6). Therefore, it is not surprising to see a range of motoneurons. In addition, we would like to clarify that these motoneuron counts are taken from only three sections across the lesion (from the three lesion injection epicenters), not the whole lumbar section. Often the motoneuron number in the KA group was equal to or greater than that of the Control group, which points to natural variation more often than to motoneuron loss. Regardless, motoneuron numbers do not correlate with the observed behavioral deficits.

      Reviewer #1 (Significance (Required)):

      This paper by Kuehn and colleagues reports on the functional impairments that follow intermediate gray matter lesions using kainic acid. This work is largely confirmatory of previous studies (Magnuson et al., 1999; Hadi et al., 2000) with modern behavioral evaluation. After revision, it would provide a description of the functional impairments following those specific lesions. The paper would be informative for a specific audience, in particular scientists in the field of spinal cord injury and spinal interneurons. Our field of expertise is spinal cord injury, inflammation, behavior and axon outgrowth.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This manuscript reports on a pair of well-designed and well-carried out studies investigating a Kainic Acid (KA)-mediated gray matter lesion in the lumbar enlargement of adult female SD rats. The investigators demonstrate, using NeuN immunohistochemistry, that the KA lesion reduces NeuN positive cells along the length of the lumbar spinal cord from rostral to L2 to slightly caudal to L4 following 6 separate injections made on the right and left sides of the spinal cord at L2, L3 and L4. The investigators made significant efforts to avoid depleting neurons in the dorsal and ventral horns, and the evidence provided suggests they were successful. The methodology described is sound and sufficient details are provided to allow the reader to fully understand the studies. It is outstanding that the study was done while following all of the PREPARE and ARRIVE guidelines. A second major component of the work is the use of multiple outcome measures and efforts (using a Random Forest analysis) to develop a relatively quick, accurate and efficient system to screen or classify the injuries in individual animals within 2 weeks of the injury, so that subsequent treatments could be done on animals which received injuries of sufficient severity (within a relatively narrow range) and with balanced experimental groups. Again, with this effort the investigators were largely successful. The KA lesion results in persistent locomotor and sensorimotor deficits that plateau early, without substantial sensory dysfunction.

      Major Comments:

      Introduction: Overall, the rationale presented and the review of the pertinent literature are solid, with the following exception: The authors state that their model should allow them to thoroughly investigate the behavioral readout of premotor IN loss. It is generally accepted that the designation of premotor interneurons refers to those directly connected to motor neurons, and while the chosen KA lesion certainly targets some premotor neurons, it also targets many other interneurons that do not directly contact motoneurons. Please revise how the lesions are referred to. In the very next paragraph the targets are defined somewhat differently as "INs and propriospinal INs in laminae V-VII in spinal levels L2-L4".

      • We agree that our wording does cause confusion to the reader and to avoid this we have now made the change from premotor INs to SpINs (pages 3-5).

      • On a side note, we would like to state these are adult female Fischer rats and not adult female SD rats, also described in the methods.

      Spared white matter. In many (but not all) labs, spared white matter at the epicenter is an important measurement because it presumably represents all the spared axons, such that any/all rostrocaudal communication is represented. Thus, it is the single point (or section, in this case) that has the smallest number of axons represented as stained white matter. So, to indicate that you assessed "three epicenters per spinal cord" doesn't make sense in this context, even if you are referring to three separate KA injection sites (L2, L3 and L4). Thus, averaging three sections also doesn't really make sense because the actual epicenter should be represented by the single cross section that has the smallest area of stained white matter. Also related to spared white matter, in many labs they calculate %SWM based on a section from a control animal, and this should reduce variability because some cords shrink (injured gray matter) more than others after the injury, whether it be a contusion or mild excitotoxic injury. Please either re-calculate your SWM or provide additional justification for your current method.

      • We agree with the reviewer that normally only the epicenter of the lesion needs to be examined for white matter damage, as once the connection is severed it does not matter what is rostral or caudal to this site. However, in our case we do not find any significant differences in white matter between the Controls and the KA groups. To be certain, we looked at all three lesion epicenters where the damage occurred. If you examine the graphs below, you will notice that in fact the KA animals have a higher % white matter of the cross-sectional area (CSA) than the Controls. This follows from how the analysis is done: in the KA animals the loss of gray matter causes a collapse that makes it appear as though the white matter covers more of the CSA than it normally does. Even if we were to normalize to the Controls, you would see the same as what you already observe in these graphs.

      For this reason, we have compared the average area of white matter at the three lesion epicenters between the Control and KA groups and did not find significant differences (new Figure 7C). We also evaluated average area of white matter at the individual spinal levels (L2-L4) and did not find significant differences between the two groups and therefore averaged them. This indicates that we are not seeing any white matter alterations with our lesion model.
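The arithmetic behind this argument can be made explicit with a small sketch. It assumes, for illustration only, that the cross-sectional area is simply the sum of the white and gray matter areas; the function name and the numbers in the usage note are hypothetical and are not measurements from the study.

```python
def percent_white_matter(white_area, gray_area):
    """White matter as a percentage of the cross-sectional area (CSA).

    Assumption for this sketch: CSA = white_area + gray_area, with both
    areas given in the same units.
    """
    total = white_area + gray_area
    if total <= 0:
        raise ValueError("cross-sectional area must be positive")
    return 100.0 * white_area / total
```

With an unchanged white matter area of 3.0 units, a section with 2.0 units of gray matter gives `percent_white_matter(3.0, 2.0) == 60.0`, whereas a section whose gray matter has collapsed to 1.0 unit gives `percent_white_matter(3.0, 1.0) == 75.0`. The percentage rises although no white matter was gained, which is why comparing absolute white matter area between groups is the more direct test of white matter sparing.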

      Results: Within the results (and elsewhere) there are a number of un-supported statements that should be removed, softened or supported. For example, on page 18 the authors talk about how the CatWalk "further investigates the role of propriospinal INs connecting the cervical and lumbar enlargements" and no reference is provided.

      • The requested references have now been added.

      It is important to note that two animals were not included overall because they were unable to perform the CatWalk assessment. Additional information about these animals might be helpful to further characterize the KA lesions, for example, when they are too large.

      • Yes, we have looked into this. Lesion size appears to play a role (Figure 2C) but does not appear to be the only determining factor, as two animals (KA#6, KA#7), one with and one without weight support, had the same lesion length (10,325 µm). We predict this is due to the amount of neuronal loss; KA#7 had greater neuronal loss in all three levels compared to KA#6.

      Figure 6 brings up a number of questions including how the three "epicenters" were determined and how some KA lesioned spinal cords appear to have more than 100% the number of neurons in the control spinal cords. Yes, there is variability in normal animals, but still this seems unlikely. Is it possible that the KA injection sites were not accurate in these animals? I know it is unlikely, however, the large number of neurons in some animals at L2 is bothersome. Did the investigators always inject L2, L3 and L4 in that order? Pipettes tend to wick up liquid thus diluting the drug/cells/whatever at the tip.

      • We understand your concern about values greater than 100%; we have therefore changed the normalization to the greatest control value rather than the average of controls (except for lesion size, which is normalized to the largest lesion size overall; new Supplemental Figure 3 and Figures 6 and 7 will be altered once our new analysis is finished).

      • The result for animal KA #1, which you are referring to, could reflect a technical error during the injections, but it is hard to say at this point, as we have found nothing in our surgery records that indicates why this animal would be different from the others. Yes, bilateral injections were always performed in the same order (L2, L3, L4). However, we think it is unlikely that this created a significant drug dilution problem, as we see animals with more damage in L4 than L3 or L2 (KA #3, #6 and #7 in new Supplemental Figure 3). But clearly L2 in animal KA #1 is not significantly damaged.
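The two normalization choices discussed in this response (average of controls versus greatest control value) can be compared with a short sketch. This is an illustration only, not the authors' analysis code; the function names are hypothetical and any counts passed in would come from the reader's own data.

```python
def normalize_to_reference(values, reference):
    """Express each value as a percentage of a positive reference value."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return [100.0 * v / reference for v in values]


def normalize_to_control_mean(ka_counts, control_counts):
    """Percent of the mean control count (values above 100% are possible)."""
    mean_control = sum(control_counts) / len(control_counts)
    return normalize_to_reference(ka_counts, mean_control)


def normalize_to_control_max(ka_counts, control_counts):
    """Percent of the greatest control count."""
    return normalize_to_reference(ka_counts, max(control_counts))
```

Normalizing to the control mean places any animal above that mean over 100%, including some controls. Normalizing to the greatest control value caps the controls at 100%, but a lesioned animal whose count exceeds every control would still come out above 100%.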

      Also for Figure 6, I am not convinced that the color coding is really very useful here. I think what might be more useful would be some higher magnification images of the intermediate gray matter. This figure also appears to show pipette tracks in some sections suggesting that the KA was leaking up the track either during injection or when the pipette was withdrawn. This is not a serious issue, but might be worth mentioning as a confound.

      • First, we would like to clarify that Figure 6 is already a higher magnification image of only laminae V-VII, not the entire gray matter (please see figure legend). Figure 1 is a lower magnification, but here in Figure 6 we wanted to highlight the region of interest that was analyzed for neuronal loss. Pipette tracks were also observed in the Controls and are not thought to be due to KA leakage, as we don’t see neuronal damage beyond the injection tracks in the dorsal horns. For the new figure, we will decide whether to keep the color coding depending on the space available.

      Finally, for Figure 6, the correlations shown are quite poor, and would be even worse if control animals were not included. Too much strength is given to these findings.

      These issues with Figure 6 become even more serious as we move to Figure 7. Here, looking at the correlation to loss of MNs is weak because this reviewer is not convinced that looking at the "three epicenters" is a valid approach. Were the epicenters identified by particular criteria? Also, I think images showing how MNs were identified and counted would be important, in particular since you did not use ChAT staining but relied on NeuN and size.

      • These epicenters were chosen after reviewing all coronal sections in a 1:7 series of the lumbar cord (T12-L5). The three epicenters were the three coronal slices with the greatest neuronal loss (methods, page 12). This is supported by the inflammatory response in these sections (not shown).
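
      The selection rule described above (take the three coronal sections with the greatest neuronal loss) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the dictionary layout and the loss values are hypothetical and are not taken from the manuscript.

```python
from typing import Dict, List


def pick_epicenters(loss_by_section: Dict[int, float], n: int = 3) -> List[int]:
    """Return the n section indices with the greatest neuronal loss, most damaged first."""
    ranked = sorted(loss_by_section, key=lambda s: loss_by_section[s], reverse=True)
    return ranked[:n]


# Hypothetical percent neuronal loss per section of a 1:7 series (made-up values).
example_loss = {1: 5.0, 2: 40.0, 3: 72.5, 4: 65.0, 5: 12.0}
epicenters = pick_epicenters(example_loss)
print(epicenters)
```

      Ranking by a single loss measure keeps the choice reproducible, although it does not by itself address the reviewer's question of whether three sections are representative.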

      • Please see the schematic below, which explains the motoneuron analysis performed in our work and detailed in the methods. Briefly, the soma area of NeuN+ cells in lamina IX was measured in ImageJ. NeuN+ cells with an area greater than 916 µm² were included in the motoneuron analysis (Wen et al., 2015).
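
      The size criterion above amounts to a simple threshold on measured soma area. A minimal sketch, assuming the areas have already been exported from the image analysis as a list of values in µm²; the function name and the example measurements are hypothetical:

```python
from typing import Iterable

# Threshold stated in the response (Wen et al., 2015): soma area greater than 916 µm².
MOTONEURON_MIN_AREA_UM2 = 916.0


def count_motoneurons(soma_areas_um2: Iterable[float]) -> int:
    """Count NeuN+ lamina IX cells whose soma area exceeds the motoneuron threshold."""
    return sum(1 for area in soma_areas_um2 if area > MOTONEURON_MIN_AREA_UM2)


# Hypothetical soma areas (µm²) for one section; not real data.
areas = [300.0, 916.0, 917.0, 1500.0]
print(count_motoneurons(areas))
```

      The comparison is strict, so a cell measuring exactly 916 µm² is not counted, matching the wording "greater than".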

      • We agree that in Figure 6 the correlations for each spinal level, although significant, are moderate. This is because no single spinal level was found to be responsible for the behavioral deficits. This is supported by our correlation with lesion length: the lesion must span multiple levels to produce the behavioral deficit. Finally, the correlations may change when we add lamina VIII, but we will not know until the analysis is finished.

      • As for Figure 7, we agree that we do not see correlations, and our argument is that motoneuron and white matter area are not responsible for the behavioral deficits we observe (new Figure 7). You are therefore reading those correctly: these are not significant correlations.

      Discussion: Yes, interneurons in the intermediate gray matter throughout the lumbar enlargement "regulate lower motoneurons" but they also do other things, most notably communicating both intra and intersegmentally (short and long propriospinals). Please adjust this statement.

      • We appreciate this detailed feedback and have adjusted this statement to the following:

      “Damage to this area, which includes regulation of lower motoneurons leads not only to gross motor deficits (BBB score), but rhythmic and skilled walking (even and uneven horizontal ladders), coordination (BBB subscore), balance (inclined beam) and gait deficits (CatWalk), as well.” (page 25)

      On page 25, you talk again about premotor SpINs. I understand that you are using this term/nomenclature to distinguish these INs from motoneurons, but this is problematic because many if not most of your readers will assume the premotor SpINs synapse directly onto MNs, which of course many of the INs that are eliminated by KA do not. Calling them simply SpINs would be sufficient and still distinguish them from MNs.

      • We have adjusted this to the term “SpIN and premotor circuitry” on pages 26 and 27.

      On page 27 you talk about the RI, and while there is a statistically significant drop in RI, it must be admitted that the RI remains above 90% (0.9) which means that 9 out of 10 steps use a normal sequence. Thus, I think it is misleading to indicate that this indicates a difference for the KA animals. In fact, I think it is more important to consider how these animals were able to maintain an RI in excess of 90% despite the loss of substantial numbers of INs.

      • Thank you for the comment, we have adjusted this in the discussion:

      “In addition to gait rhythm changes, we also saw significant differences in pattern generation. The regularity index (RI) measures correctly sequenced footsteps and is used to analyze recovery in mild to moderate injuries and coordination (Koopmans et al., 2005; Kuerzi et al., 2010; Shepard et al., 2021). While KA-animals have a significantly lower RI in comparison to the controls, the RI remains above 90% which is still relatively high given the amount of neuronal loss. However, we would argue that a single parameter is not the defining factor of gait/coordination, but a combination of parameters and tests provides a more comprehensive picture, as we have seen with our pLDA analysis and Random Forest classification approaches.” (Pages 28-29)
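
      For reference, the regularity index is commonly defined for CatWalk data as RI = (NSSP × 4 / PP) × 100%, where NSSP is the number of normal step sequence patterns and PP is the total number of paw placements. The sketch below assumes this definition; the function name and the example numbers are illustrative and are not taken from the manuscript.

```python
def regularity_index(nssp: int, paw_placements: int) -> float:
    """Regularity index in percent: 100 * (NSSP * 4) / PP.

    nssp: number of normal step sequence patterns (four paw placements each).
    paw_placements: total number of paw placements in the run.
    """
    if paw_placements <= 0:
        raise ValueError("paw_placements must be positive")
    return 100.0 * nssp * 4 / paw_placements


# Hypothetical run: 9 normal step sequence patterns within 40 paw placements.
print(regularity_index(9, 40))
```

      Under this definition an RI above 90% means that more than nine in ten paw placements belong to a normal step sequence pattern, which is the point the reviewer raises.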

      The rationale for determining classification prior to histological analysis is somewhat weak, and I think it would be worthwhile strengthening this rationale at the beginning of this paragraph...it becomes more obvious later why this classification is important. Is the variability of the KA model greater than an NYU or IH contusion model? If so, why? The early functional plateau is key to this argument.

      • We postulate that less severe SCIs and our milder KA lesion tend to have more variability than more severe SCI models. In the contusion models this is due to the delayed natural compensatory functional recovery plateau, which can take up to 5-6 weeks to reach. In the KA model, however, variability arises from titrating down the KA dose and adding multiple injection sites, which increases the variability in success rate per injection. In the KA model, the early functional plateau at two weeks allows animals to be correctly excluded or classified into equally lesioned groups prior to treatment with our Random Forest Eco model. We agree that we need to clarify this reasoning in the results and have now done so on page 22. “To test the efficacy of experimental SCI therapies, it is important to effectively evaluate recovery performance through the combination of behavioral tests. In addition to carefully classifying groups at the end of the study, there is a need to provide exclusion criteria and equal sorting of variability between groups prior to treatment (after deficits have stabilized at two weeks).” (page 22)

      Minor Comments:

      Heatmap Analysis: The term "lesion size" is insufficiently accurate to be used in this context. Do you mean lesion length?

      • This term has now been adjusted to lesion length throughout the manuscript and figures.

      Kainic Acid injuries are known to be accompanied by cell division and neurogenesis in the brain, and if that kind of thing is happening in the presented model, it could be an interesting confound/addition to the alluded to cellular replacement therapies.

      • KA has been shown to be accompanied by cell division and neurogenesis in the brain. However, from our own work and previous work with KA in the spinal cord, if this occurs it is not at a level that is relevant to functional recovery, as evidenced in our long-term study. A previous study by Magnuson et al. compared E14 cerebral rat precursor cell transplantation 40 minutes and 4 weeks post-KA injury and did not find significant differences in cell survival/division (Magnuson et al., 2001). Therefore, we do not believe this would hamper or confound our future work with cellular replacement therapies. In addition, cell transplantation would take place 2 weeks post-KA injury, when KA would no longer be able to hamper the transplanted cells.

      Reviewer #2 (Significance (Required)):

      Overall, this is a well-designed and performed set of studies that takes the KA lesion model into new territory, well set-up to perform delayed (sub-acute or early chronic) neuron replacement studies. The work characterizes a multi-segment but mild KA injury model that demonstrates persistent dysfunction that plateaus early, and a rapid and efficient system to classify the injury with a high predictability of long-term dysfunction by 2 weeks post-injury.

      This model should be of interest because it focuses on gray-matter specific tissue loss and functional deficits that should be amenable to neuron replacement strategies without the complications of white-matter dependent functional losses.

      My expertise: I have been using a variety of spinal cord injury models, in rats, for many years including contusions, lacerations and excitotoxic (KA) lesions. I have a lot of experience with locomotor, motor and sensory outcome measures. However, I have very limited experience with the Random Forest analysis employed and am not an expert in statistics.

      References:

      Hadi, B., Zhang, Y. P., Burke, D. A., Shields, C. B., & Magnuson, D. S. (2000). Lasting paraplegia caused by loss of lumbar spinal cord interneurons in rats: no direct correlation with motor neuron loss. J Neurosurg, 93(2 Suppl), 266-275. https://doi.org/10.3171/spi.2000.93.2.0266

      Koopmans, G. C., Deumens, R., Honig, W. M., Hamers, F. P., Steinbusch, H. W., & Joosten, E. A. (2005). The assessment of locomotor function in spinal cord injured rats: the importance of objective analysis of coordination. J Neurotrauma, 22(2), 214-225. https://doi.org/10.1089/neu.2005.22.214

      Kuerzi, J., Brown, E. H., Shum-Siu, A., Siu, A., Burke, D., Morehouse, J., Smith, R. R., & Magnuson, D. S. (2010). Task-specificity vs. ceiling effect: step-training in shallow water after spinal cord injury. Exp Neurol, 224(1), 178-187. https://doi.org/10.1016/j.expneurol.2010.03.008

      Mohan, R., Tosolini, A. P., & Morris, R. (2015). Segmental Distribution of the Motor Neuron Columns That Supply the Rat Hindlimb: A Muscle/Motor Neuron Tract-Tracing Analysis Targeting the Motor End Plates. Neuroscience, 307, 98-108. https://doi.org/10.1016/j.neuroscience.2015.08.030

      Nicolopoulos-Stournaras, S., & Iles, J. F. (1983). Motor neuron columns in the lumbar spinal cord of the rat. J Comp Neurol, 217(1), 75-85. https://doi.org/10.1002/cne.902170107

      Pitzer, C., Kurpiers, B., & Eltokhi, A. (2021). Gait performance of adolescent mice assessed by the CatWalk XT depends on age, strain and sex and correlates with speed and body weight. Sci Rep, 11(1), 21372. https://doi.org/10.1038/s41598-021-00625-8

      Shepard, C. T., Pocratsky, A. M., Brown, B. L., Van Rijswijck, M. A., Zalla, R. M., Burke, D. A., Morehouse, J. R., Riegler, A. S., Whittemore, S. R., & Magnuson, D. S. (2021). Silencing long ascending propriospinal neurons after spinal cord injury improves hindlimb stepping in the adult rat. Elife, 10. https://doi.org/10.7554/eLife.70058

      Wen, J., Sun, D., Tan, J., & Young, W. (2015). A consistent, quantifiable, and graded rat lumbosacral spinal cord injury model. J Neurotrauma, 32(12), 875-892. https://doi.org/10.1089/neu.2013.3321

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Randhawa and co-authors have studied various aspects of the regulation of lignocellulose degradation by the filamentous ascomycete fungus Penicillium funiculosum. Over-expression of the well-known transcription factor clr2 (which regulates cellulase gene expression in Neurospora and other ascomycetes) in a delta-mig1 strain did not result in an increase in cellulase activity. However, when combined with an increased Ca2+ concentration the cellulase activity in the medium did increase. Using RNA-Seq, the authors have identified a candidate regulator: Snf1. Indeed, a knockout confirms that this gene is involved in the posttranscriptional regulation of cellulase production, specifically by regulating the secretion of the cellulases.

      Major comments:

      In general, the topic and results are interesting. There are a few issues that need to be addressed, however. The manuscript would benefit from some careful proofreading. For example, articles ('the', 'a') are frequently missing. Very informal language is sometimes used ('zilch effect'). Put a space between '1000bp', etc. It is 'kDa', not 'kD', etc.

      Response – Thank you very much for the encouraging remarks. We have thoroughly checked the manuscript and have added the articles at appropriate places. We have also improved the manuscript’s language and removed any informal language used.

      I am a bit puzzled by the choice of calcium source: CaCO3, up to 10 g/L. Calcium carbonate does not efficiently dissolve in water unless the pH is low. Fungi generally acidify their culture medium during growth. As such, calcium carbonate likely has a pH buffering effect. Therefore, the described effects may also be attributed to a more neutral pH of the medium, and not necessarily to an increase in calcium ions.

      Response – We completely agree with the reviewer and had the same thought, namely that the pH buffering effect of CaCO3 could be the reason for increased cellulase production. We ruled this out by using only CaCl2 (50 mg/l) in the remaining experiments (Fig. 3 onward). We have also mentioned this in the manuscript (lines 175-178).

      The authors have performed RNA-Seq, but as far as I can tell the data has not been made publicly available. At least, the raw reads should be deposited in the Short Read Archive of NCBI (or a similar repository), and preferably also the expression values in GEO of NCBI (or a similar repository).

      Response – We will comply and deposit the raw reads in the short read archive of NCBI. We will also be providing the differential analysis of transcription factors expressed under glucose and Avicel in NCIM1228 and ∆Mig1 in the supplementary information.

      P21. Very little information is provided in the M&M regarding the gene expression analysis. Provide references to all the tools, as well as the version numbers. Were any non-default parameters used?

      Response – We have added complete information on the tools and procedures used for RNA-seq data analysis. For differential expression profiling, all FPKM values were normalized to library size using the R package edgeR. The expression value for each transcript was calculated from the aligned reads, normalized to library size (total sequencing reads generated) and transcript length, giving the FPKM value (Fragments Per Kilobase of transcript per Million mapped reads) and the TPM value (Transcripts Per Million), which is regarded as the normalized expression value for a particular transcript. We took the number of reads aligned to the conserved transcripts (those present in both comparison groups, i.e. wild-type glucose and cellulose samples (S1, S2, S7 and S8) vs. MIG1 glucose and cellulose samples (S3, S4, S5 and S6)) and performed differential gene expression analysis between the two groups. The Excel sheet containing the differential expression profiling of transcription factors is available as supplementary data.
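
      The two normalizations named in this response follow standard definitions, sketched below. This is an illustration only: it assumes library size is the sum of the supplied counts, uses made-up numbers, and does not reproduce the authors' pipeline or edgeR's internal normalization.

```python
from typing import List, Sequence


def fpkm(counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM_i = count_i * 1e9 / (library_size * length_i_in_bp)
    Assumption: library size is the sum of the counts passed in.
    """
    library_size = sum(counts)
    return [c * 1e9 / (library_size * length) for c, length in zip(counts, lengths_bp)]


def tpm(counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """Transcripts Per Million: length-normalize first, then scale to sum to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    return [r / total_rate * 1e6 for r in rates]


# Hypothetical counts and transcript lengths (bp) for two transcripts.
counts = [100, 300]
lengths = [1000, 3000]
print(fpkm(counts, lengths))
print(tpm(counts, lengths))
```

      TPM values always sum to one million within a sample, which makes them easier to compare across samples than FPKM values; neither replaces the count-based statistical testing that a package such as edgeR performs.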

      The authors claim that SSP1 CaMKK phosphorylates SNF1 AMPK (last title of the Results section). I don't see any evidence for a direct interaction between these two proteins. I will believe that they are in the same pathway, but if the authors want to claim a direct interaction then additional experiments will be required. E.g. Y2H.

      Response – Ssp1 is known to phosphorylate SNF1 during nutritional stress in S. pombe and they were found to interact directly by Co-IP studies. Based on the literature, we planned to over-express Ssp1 in P. funiculosum.

      Minor comments:

      Please add line numbers to the manuscript, this facilitates the review process.

      Response - Line numbers have been added.

      P14 "in all yeasts and filamentous fungi". I doubt that all fungi have been tested.

      Response - The phrase has been modified.

      P18. "in diverse yeasts and fungi". Yeasts are also fungi.

      Response - The phrase has been modified.

      P16. "solves dual purpose". I think this is meant: "serves a dual purpose"?

      Response - The phrase has been modified.

      P17, first paragraph: this seems very speculative to me, so it should probably be labeled as such.

      Response - The phrase has been modified.

      P21. What reference genome is used? Please cite the paper.

      Response - We have our own reference genome in lab which is yet to be published.

      Fig 1B. These are reported as volcano plots, but to me it looks like an empty graph (no data points), only a number of genes.

      Response - The pictures have been changed.

      Fig 1D. What do the colors on the right represent?

      Response - The colors on the right represent k-means clustering of the transcription factor genes. The same has been added to the figure legend also.

      On various places in the manuscript the term "three times in triplicate" is used. What is meant here, three technical replicates of each of the three biological replicates?

      Response - Yes we mean the same and the phrase has been modified.

      P46. "We aimed to sought"

      Response - The phrase has been modified.

      Abstract: The sentence "Further, Ca2+-signaling" should be rewritten, because currently it seems to suggest that SSP1 downregulates the phospho-HOG1 levels.

      Response – As suggested by the Western blot in Fig. 4b, Snf1 is phosphorylated only when the dual signals of calcium and cellulose are present. Since we observed upregulated Ssp1 expression in Avicel (Fig. 4a), and increased Ssp1 expression could increase the level of phosphorylated Snf1 in the cell (Fig. 7i), our data suggest that Ssp1 phosphorylates Snf1 in a Ca2+-dependent manner. Further, Hog1 was found in a hyperphosphorylated state in ∆Snf1 (Fig. 6e); thus we believe Snf1 AMPK downregulates phospho-Hog1 levels.

      Reviewer #1 (Significance (Required)):

      In general, the topic and results are interesting. There are a few issues that need to be addressed, however. The manuscript would benefit from some careful proofreading.

      Response – We highly appreciate the encouraging words of the reviewer. We have addressed all the issues raised by the reviewer. The major ones concerned the language and readability of the text, which have been improved. We have replaced the volcano plot figures and will be uploading the RNA-seq data to the SRA database of NCBI, and the Excel sheet of the differential expression analysis of transcription factors will be added as a supplementary file to the manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      • Randhawa et al. study the effect of loss of function of Snf1 kinase and calcium on the production of enzymes related to cellulose degradation in the fungus Penicillium funiculosum.
      • The manuscript is well structured and the researchers have done an enormous amount of work in constructing a number of mutant strains in this fungus.
      • Transcriptomics and proteomics support the conclusions reached with the strains generated.

      Response - Thank you very much for showing confidence in our research work, we are highly obliged by positive remarks on the manuscript.

      The manuscript is long and suffers from an excess of results presented in figures. My main criticism focuses on the presentation of data on the cellular distribution of the ER and Golgi apparatus. The micrographs are inconclusive and it is not really clear what the authors are trying to show in these experiments. These results are not really necessary for the article and I suggest that they be removed from the article.

      Response – We agree with the reviewer's comments on the data on the cellular distribution of the ER and Golgi apparatus. We have removed the micrograph data on the cellular distribution of the ER and Golgi apparatus (earlier Figure 3j and Figure 4r).

      Reviewer #2 (Significance (Required)):

      The authors have done an excellent job in producing a large number of strains carrying null alleles. In addition, they have used two broad analysis techniques that allow them to establish coherent hypotheses and corroborate them with the results.

      Response – Thank you very much for the positive comments

      The manuscript is difficult to understand in some sections because of the excessive amount of data and panels in the figures. The names designating each strain and given in full length in the graphs do not help either.

      Response – Thank you very much for the valuable suggestion. We have reduced the number of graphs by including all enzymes assays in one concise graph in Figure 4. We have also shortened the names of strains and enzymes, in all the figures.

      This work is of interest to all researchers interested in the integrity of signaling and regulatory pathways on extracellular enzymes of biotechnological interest.

      My interests focus on the cell biology of filamentous fungi, in particular on the molecular mechanisms and subcellular localization of elements involved in intracellular transport, signaling against environmental stresses and changes in transcriptional regulatory patterns.

      Response – Thank you once again for the encouraging remarks.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript submitted by Randhawa et al. focuses on the mechanism of cellulase secretion, an important and fundamental question in filamentous fungi, particularly for cellulase biotechnology. As the authors note, previous studies of the molecular basis of cellulase production have mainly focused on regulation at the transcriptional level, while studies of the molecular mechanisms of cellulase translation and secretion are much rarer. I am therefore very impressed by the progress this work makes in the area. The data presented show that Ca2+ is critical for the regulation of cellulase secretion by SNF-1, SSP1 and HOG1. This regulation might be caused by effects on protein trafficking in the ER and Golgi: the manuscript reports changes in ER and Golgi development, detected by staining with ER-tracker and Bodipy, under different conditions and in different mutants. The manuscript constructs a model of the regulatory mechanism of Ca2+ at the level of cellulase translation and secretion. The present study is close to making significant progress in the field of cellulase regulation.

      Response – We appreciate the positive comments of the reviewer.

      Major comments: I am really impressed by the great work in the manuscript; however, I think more work is needed to support the conclusions of the paper.

      1. The dynamics of ER and Golgi development in the strains are critical data for the conclusion of the paper, yet the current data rely only on chemical staining. This is not robust and should be supported by other methods, for example GFP-labeling of an ER or Golgi marker.

      Response – The manuscript focuses on the signaling events governing cellulase production and secretion. Since the ER and Golgi are the sites of protein production and secretion, we hypothesized that if Ca2+ signaling affects post-transcriptional events, it must have some impact on the dynamics of these organelles, and the microscopy experiments suggested the same. In the next set of experiments, we proved our hypothesis with proteomics and functional analysis of Snf1, Ssp1, and Hog1 MAPK. The Hog1 MAPK pathway is known to regulate protein trafficking and secretion in yeast. Here we showed that Ca2+-dependent regulation of Hog1 MAPK and its downregulation by Snf1 AMPK is crucial to cellulase secretion.

      2. The authors also try to suggest that the cellulases were retained in the ER and did not enter the Golgi, and that the secretome protein therefore decreased. This is quite possible, but the evidence is not robust either; tracking GFP-labelled CBH1 might be a good experiment to clarify this.

      Response – Thank you very much for raising the query. The manuscript majorly focuses on the role of calcium signaling on cellulase translation and secretion. Further, we have studied two signaling proteins, Snf1 AMPK and Hog1 MAPK which are downstream to calcium signaling, and we found their crosstalk vital to cellulase secretion. We have not talked about cellulases being detained in the ER or Golgi, rather we focused on the signaling events regulating cellulase production and transport.

      Since we had already ruled out a role for calcium in cellulase transcriptional activation, and since the ER and Golgi are major sites of protein production in the cell, we performed microscopy experiments to see whether calcium signaling modifies ER and Golgi morphology during carbon stress. We found under-developed Golgi in the absence of calcium in the wild type. This experiment helped us build the hypothesis that calcium signaling might have a role in downstream events such as protein translation and secretion. The hypothesis was proved by functional analysis of signaling proteins, Western blot and proteomics experiments. The microscopy experiments further strengthened our observation that Snf1 AMPK is a downstream target of calcium signaling and has no role in cellulase translation, but rather in cellulase secretion.

      Considering that we are not focusing on the protein trafficking of cellulase, the confocal microscopy experiments are not decisive, rather build supporting evidence for our hypothesis, as suggested by the second reviewer. We have proved our hypothesis of Ca2+-dependent post-transcriptional regulation of cellulase by proteomics, and other biochemical experiments. Nevertheless, we plan to perform the confocal experiments again to achieve pictures with higher resolution.

      1.On page 9, please indicate the fold changes of the kinases genes talked about, snf1 and so on.

      Response – We have added the Fold change in the expression of Snf1 and Ssp1 (line number 221).

      2. The quality of the microscopy figures is not good; higher-resolution images should be provided. The authors could even consider presenting electron microscopy images to show the ER and Golgi dynamics discussed in the manuscript (optional).

      Response: We agree with the reviewer’s suggestion to add high resolution confocal images of mycelia in Fig 3j and Fig. 4o. We are in the process of repeating the confocal microscopy experiment. We will update the manuscript with improved microscopic pictures.

      3. The quality of the Western blots needs to be improved, particularly Figure 4f and Figure 7i; it is hard to draw conclusions from the images presented.

      Response – We have replaced the pictures of western blots (Fig 4f, and Fig 7i) with high resolution images.

      Reviewer #3 (Significance (Required)):

      The manuscript submitted by Randhawa et al. focuses on the mechanism of cellulase secretion, an important and fundamental question in filamentous fungi, particularly for cellulase biotechnology. As the authors note, previous studies of the molecular basis of cellulase production have mainly focused on regulation at the transcriptional level, while studies of the molecular mechanisms of cellulase translation and secretion are much rarer. I am therefore very impressed by the progress this work makes in the area. The data presented show that Ca2+ is critical for the regulation of cellulase secretion by SNF-1, SSP1 and HOG1. This regulation might be caused by effects on protein trafficking in the ER and Golgi: the manuscript reports changes in ER and Golgi development, detected by staining with ER-tracker and Bodipy, under different conditions and in different mutants. The manuscript constructs a model of the regulatory mechanism of Ca2+ at the level of cellulase translation and secretion. The present study is close to making significant progress in the field of cellulase regulation.

      Response - Thank you for the positive comments on the manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Randhawa and co-authors have studied various aspects of the regulation of lignocellulose degradation by the filamentous ascomycete fungus Penicillium funiculosum. Over-expression of the well-known transcription factor clr2 (which regulates cellulase gene expression in Neurospora and other ascomycetes) in a delta-mig1 strain did not result in an increase in cellulase activity. However, when combined with an increased Ca2+ concentration the cellulase activity in the medium did increase. Using RNA-Seq, the authors have identified a candidate regulator: Snf1. Indeed, a knockout confirms that this gene is involved in the posttranscriptional regulation of cellulase production, specifically by regulating the secretion of the cellulases.

      Major comments:

      In general, the topic and results are interesting. There are a few issues that need to be addressed, however. The manuscript would benefit from some careful proofreading. For example, articles ('the', 'a') are frequently missing. Very informal language is sometimes used ('zilch effect'). Put a space between '1000bp', etc. It is 'kDa', not 'kD', etc.

      I am a bit puzzled by the choice of calcium source: CaCO3, up to 10 g/L. Calcium carbonate does not efficiently dissolve in water unless the pH is low. Fungi generally acidify their culture medium during growth. As such, calcium carbonate likely has a pH buffering effect. Therefore, the described effects may also be attributed to a more neutral pH of the medium, and not necessarily to an increase in calcium ions.

      The authors have performed RNA-Seq, but as far as I can tell the data has not been made publicly available. At least, the raw reads should be deposited in the Short Read Archive of NCBI (or a similar repository), and preferably also the expression values in GEO of NCBI (or a similar repository).

      P21. Very little information is provided in the M&M regarding the gene expression analysis. Provide references to all the tools, as well as the version numbers. Were any non-default parameters used?

      The authors claim that SSP1 CaMKK phosphorylates SNF1 AMPK (last title of the Results section). I don't see any evidence for a direct interaction between these two proteins. I will believe that they are in the same pathway, but if the authors want to claim a direct interaction then additional experiments will be required. E.g. Y2H.

      Minor comments:

      Please add line numbers to the manuscript, this facilitates the review process.

      P14 "in all yeasts and filamentous fungi". I doubt that all fungi have been tested.

      P18. "in diverse yeasts and fungi". Yeasts are also fungi.

      P16. "solves dual purpose". I think this is meant: "serves a dual purpose"?

      P17, first paragraph: this seems very speculative to me, so it should probably be labeled as such.

      P21. What reference genome is used? Please cite the paper.

      Fig 1B. These are reported as volcano plots, but to me it looks like an empty graph (no data points), only a number of genes.

      Fig 1D. What do the colors on the right represent?

      On various places in the manuscript the term "three times in triplicate" is used. What is meant here, three technical replicates of each of the three biological replicates?

      P46. "We aimed to sought"

      Abstract: The sentence "Further, Ca2+-signaling" should be rewritten, because currently it seems to suggest that SSP1 downregulates the phospho-HOG1 levels.

      Significance

      In general, the topic and results are interesting. There are a few issues that need to be addressed, however. The manuscript would benefit from some careful proofreading.

    1. I hope that they pass something because the whole nation deserves some protections, not just certain states. There are improvements that they should make on transparency. For example, if Facebook’s privacy policy says, “We use information about groups you follow to personalize ads,” as opposed to if they say, “We use the content of your communications to personalize ads,” I think most consumers would think of those as different. Some people may be more comfortable with one rather than the other.

      The author expresses a desire for comprehensive protection for everyone in the country, not just certain states. Regarding transparency, the importance of clear and precise language in privacy policies is well accepted. Consumers should be able to understand how companies like Facebook use their data, and clear, simple language in privacy policies can help achieve that goal. The example highlights how perception differs depending on how the information is presented. Ultimately, it is important that consumers have access to transparent and meaningful information about how their data is used so they can make informed decisions about whether to share it.

    2. I think most consumers would think of those as different. Some people may be more comfortable with one rather than the other. Right now, we don’t know which one Facebook does, because all they say is “We use your personal information.”

      The companies are very sneaky and cheat us on purpose. They make their privacy information very long and hide it, so we have to expand it to read it all. Some of them do not have a decline button; we have to agree to the statement and accept it in order to continue.

    3. I think most consumers would think of those as different. Some people may be more comfortable with one rather than the other. Right now, we don’t know which one Facebook does, because all they say is “We use your personal information.”

      It is very sneaky how companies write their privacy information and policies/contracts. They purposefully make it long and difficult to understand, which is probably why it is such a struggle to make and maintain laws about privacy.

    1. we should not refer to persons in ways that may imply that they are essentially defined by something that they are, in fact, managing.

      I don't think that LGBT people are managing this, I think that they are using it to define themselves and make that almost their whole existence. It is not the same thing as having a disease.

    1. Author Response

      Reviewer #1 (Public review):

      1.0) This paper investigates the metabolic basis of a node, posterior cingulate cortex (PCC), in the default mode network (DMN). They employed sophisticated MRI-PET methods to measure both BOLD and CMRglc changes (both magnitude and dynamics) during attention-demanding and working memory tasks. They found uncoupling of BOLD and CMRglc in PCC with these different tasks. The implications of these findings are poorly interpreted, with a conclusion that is purely based on other work independent of this study. Various suggestions could allow them to place some speculations in line with a stronger interpretation of their results.

      This is one of several papers in recent years investigating the metabolic underpinnings of activated (or task-positive) and deactivated (or task-negative) cortical areas in the human brain. In this study, they used BOLD fMRI and glucose PET scan to examine the metabolic distinction of the default mode network (DMN), which is known to be deactivated during attention-demanding tasks, with different types of cognitively demanding tasks. Unlike the BOLD response in posteromedial DMN which is consistently negative, they found that CMRglc of the posteromedial DMN (a task-negative network) is dependent on the metabolic demands of adjacent task-positive networks like the dorsal attention network (DAN) and frontoparietal network (FPN). With attention-demanding tasks (like Tetris) the BOLD and CMRglc are both downregulated in DMN (specifically the posterior cingulate cortex, PCC, a task-negative node of DMN), but working memory induces a CMRglc increase in PCC, which is decoupled from the negative BOLD response in PCC.

      We thank the reviewer for the constructive feedback and the possibility to improve our manuscript. We agree that the interpretation of the results should be strengthened to provide a stronger focus on our data. Regarding the uncoupling of BOLD and CMRGlu during working memory, we acknowledge the need to further elaborate on this topic in our discussion. These suggestions and comments have been incorporated into the revised manuscript as outlined below.

      1.1) These complicated results are the main findings, and to provide a biological basis to these data they rather surprisingly, but without their own experimental evidence, conclude that the negative BOLD and negative CMRglc in PCC during attention-demanding tasks is due to decreased glutamate signaling (which was not measured in this study) and the negative BOLD and positive CMRglc in PCC during working memory is due to increased GABAergic activity (which was not measured in this study). It is rather surprising that without measurement, a conclusion is made which would at best be considered a hypothesis to be tested. Thus, independent of these hypothesized mechanisms, they need to summarize their results based on their own measurements in this study (see 3 for a hint).

      Thank you for bringing up this point and for the insightful suggestion concerning point 3. We have now explicitly stated that the interpretation regarding glutamate and GABAergic signaling is of speculative nature as these were not measured in the current work; moreover, we have substantially reduced this section. As such, we agree with the reviewer that this represents an interesting hypothesis to be tested in future work. For further details please see response to comments 1.3 and 1.4.

      Discussion, page 16, line 341:

      On the neurotransmitter level, one of the current hypotheses regarding BOLD deactivations proposes that CMRO2 and CBF are affected by the balance of the excitatory and inhibitory neurotransmitters, specifically GABA and glutamate (Buzsáki et al., 2007; Lauritzen et al., 2012; Sten et al., 2017). In the PCC, glutamate release prevents negative BOLD responses (Hu et al., 2013), whereas a lower glutamate/GABA ratio is associated with greater deactivation (Gu et al., 2019). As glutamate elicits proportional glucose consumption (Lundgaard et al., 2015; Zimmer et al., 2017), decreases in glutamate signaling in the pmDMN could indeed explain both, the decreased BOLD response and decreased CMRGlu during the Tetris® task. Conversely, increased GABA supports a negative BOLD response in the PCC (Hu et al., 2013), as do working memory tasks (Koush et al., 2021) and pharmacological stimulation with GABAergic benzodiazepines (Walter et al., 2016). In consequence, the observed dissociation between BOLD changes and CMRGlu during working memory could indeed result from metabolically expensive (Harris et al., 2012) GABAergic suppression of the BOLD signal (Stiernman et al., 2021). However, we need to emphasize that glutamate and GABAergic signaling was not measured in the current study, thus, the above interpretations are of speculative nature. Nonetheless, future work may test this promising hypothesis, e.g., using pharmacological alteration of GABAergic and glutamatergic signaling or optogenetic approaches modulating GABAergic interneuron activity.

      Furthermore, to maintain a more concise discussion that is closer aligned with the measured results, we have removed the following paragraph:

      Discussion, page 15, line 309:

      The associations of these metabolic demands between the DMN and task-positive networks is also reflected in their distance along a connectivity gradient, which is hierarchically organized from unimodal sensory/motor to complex associative functions and the DMN being at the end of the processing stream (Margulies et al., 2016; Smallwood et al., 2021). A corresponding decrease in pmDMN glucose metabolism was observed for tasks that activate unimodal networks and the DAN, but not for the FPN. The inverse influence of attention and control networks on the pmDMN may therefore suggest that connectivity gradients are supported by the underlying energy metabolism.

      1.2) It is mentioned that the FDG-PET scans allow quantitative CMRglc, both in terms of units of glucose use but also with high time resolution. Based on the method described, it isn't clear how this is possible. Important details of either prior work or their own work have been excluded that show how the time course of CMRglc (regardless of whether it's absolute or relative) can be compared with the BOLD time course. Furthermore, it is extremely difficult to conceive that quantitative CMRglc can be estimated without additional measurements (e.g., blood samples, etc). Significant methodological details have to be provided, which even should make their way to results given the importance of their BOLD-CMRglc coupling and decoupling in the same region.

      We thank the reviewer for this important comment and apologize for the lack of clarity. We would like to emphasize that in the current work only spatial patterns of CMRGlu and BOLD signal changes were compared, but not the time course of these signals. The manuscript was edited throughout to clarify this point.

      Introduction, page 5, line 110:

      Studies using simultaneous fPET/fMRI have shown a strong spatial correspondence between the BOLD signal changes and glucose metabolism in several task-positive networks and across various tasks requiring different levels of cognitive engagement (Hahn et al., 2020, 2016; Jamadar et al., 2019; Rischka et al., 2018; Stiernman et al., 2021; Villien et al., 2014).

      Introduction, page 5, line 123

      Specifically, it is unknown whether the observed dissociation between patterns of metabolism and BOLD changes in the DMN generalizes for complex cognitive tasks, and whether this in turn depends on the brain networks supporting the task performance and their interaction with the DMN.

      Results, page 7, line 143:

      From this dataset (DS1) we evaluated the spatial overlap of negative task responses in the cerebral metabolic rate of glucose (CMRGlu quantified with the Patlak plot) and the BOLD signal specifically in the pmDMN. […] After that, the distinct spatial activation patterns across different tasks were used to quantitatively characterize the CMRGlu response of the pmDMN in DS1.

      The method of functional PET (fPET) imaging indeed enables the evaluation of changes in glucose metabolism with a relatively high temporal resolution. That is, a conventional bolus application and subsequent quantification yield a single CMRGlu image per scan of about 60 min (typical frame length ~1-5 min) or a single SUV image from a static scan. In contrast, the constant infusion employed in fPET allows to assess baseline metabolism and changes induced by different tasks in a single scan by using a frame length currently down to 6-30 s (Rischka et al., 2018), where the latter was also used in the current study. A general description of the fPET approach is now also included in the manuscript.

      Introduction, page 5, line 99:

      In this context, functional PET (fPET) imaging represents a promising approach to investigate the dynamics of brain metabolism. fPET refers to the assessment of stimulation-induced changes in physiological processes such as glucose metabolism (Villien et al., 2014; Hahn et al., 2016) and neurotransmitter synthesis (Hahn et al., 2021) in a single scan. The temporal resolution of this approach of 6-30 s (Rischka et al., 2018) is considerably higher than that of a conventional bolus administration. This is achieved through the constant infusion of the radioligand, thereby providing free radioligand throughout the scan that is available to bind according to the actual task demands. Here, the term “functional” is used in analogy to fMRI, where paradigms are often presented in repeated blocks of stimulation, which can subsequently be assessed by the general linear model.

      Regarding the absolute quantification of CMRGlu, arterial blood samples were obtained from all subjects of DS1. These were used for absolute quantification of CMRGlu with the Patlak plot. Full details were already provided in the methods section and are now also mentioned in the results.

      Results, page 7, line 140:

      Simultaneous fPET/fMRI data and arterial blood samples were acquired from 50 healthy participants during the performance of the video game Tetris®, a challenging cognitive task requiring rapid visuo spatial processing and motor coordination (Hahn et al., 2020; Klug et al., 2022). From this dataset (DS1) we evaluated the spatial overlap of negative task responses in the cerebral metabolic rate of glucose (CMRGlu quantified with the Patlak plot) and the BOLD signal specifically in the pmDMN.

      Methods, page 19, line 399:

      For glucose metabolism, these changes are absolutely quantified in μmol/100g/min with the arterial input function and the Patlak plot.

      Methods, blood sampling, page 24, line 536:

      Before the PET/MRI scan blood glucose levels were assessed as triplicate (Gluplasma). During the PET/MRI acquisitions manual arterial blood samples were drawn at 3, 4, 5, 14, 25, 36 and 47 min after the start of the radiotracer administration (Rischka et al., 2018). From these samples whole-blood and plasma activity were measured in a gamma counter (Wizard2, Perkin Elmer). The arterial input function was obtained by linear interpolation of the manual samples to match PET frames and multiplication with the average plasma-to-whole-blood ratio.
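
      The input-function step described above can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the function names are hypothetical, the sample values in the usage note are invented, and holding the nearest sample value for frames outside the sampled time range is an assumption that the text does not specify.

```python
def interpolate_linear(sample_times, sample_values, query_times):
    """Piecewise-linear interpolation of manual samples onto query times.

    Assumption: outside the sampled range the nearest sample value is held.
    """
    result = []
    for t in query_times:
        if t <= sample_times[0]:
            result.append(sample_values[0])
            continue
        if t >= sample_times[-1]:
            result.append(sample_values[-1])
            continue
        for i in range(1, len(sample_times)):
            if t <= sample_times[i]:
                t0, t1 = sample_times[i - 1], sample_times[i]
                v0, v1 = sample_values[i - 1], sample_values[i]
                result.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
                break
    return result


def arterial_input_function(sample_times, whole_blood, plasma, frame_times):
    """Interpolate whole-blood activity to the PET frame times and scale it
    by the average plasma-to-whole-blood ratio across all manual samples."""
    ratios = [p / w for p, w in zip(plasma, whole_blood)]
    mean_ratio = sum(ratios) / len(ratios)
    whole_blood_at_frames = interpolate_linear(sample_times, whole_blood, frame_times)
    return [mean_ratio * value for value in whole_blood_at_frames]
```

      For example, `arterial_input_function([3, 4, 5, 14], [1.0, 2.0, 3.0, 12.0], [1.1, 2.2, 3.3, 13.2], [3.5, 9.5])` interpolates the (invented) whole-blood values to the two frame mid-times and scales them by the mean ratio of 1.1.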

      Methods, cerebral metabolic rate of glucose metabolism, page 25, line 561:

      Quantification was carried out with the Patlak plot (t* fixed to 15 min) and the influx constant Ki was converted to CMRGlu as CMRGlu = Ki * Gluplasma / LC * 100 with LC being the lumped constant = 0.89 (Graham et al. 2002, Wienhard 2002).
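
      The quantification step can be illustrated with a short sketch. It assumes the standard Patlak formulation (tissue-to-plasma ratio regressed on normalized integrated plasma activity for frames after t*), a first frame at time zero, and units of mL/cm3/min for Ki and mmol/L for plasma glucose so that the result is in μmol/100g/min. The function names and the ordinary least-squares fit are illustrative choices, not taken from the manuscript.

```python
def cumulative_trapezoid(times, values):
    """Running trapezoidal integral of values over times (0 at the first time)."""
    integral = [0.0]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        integral.append(integral[-1] + 0.5 * dt * (values[i] + values[i - 1]))
    return integral


def patlak_ki(times, plasma, tissue, t_star=15.0):
    """Estimate the influx constant Ki as the slope of the Patlak plot.

    Only frames with time >= t_star (minutes) enter the linear fit.
    Returns (slope, intercept).
    """
    integrated_plasma = cumulative_trapezoid(times, plasma)
    xs, ys = [], []
    for t, cp, ct, icp in zip(times, plasma, tissue, integrated_plasma):
        if t >= t_star and cp > 0:
            xs.append(icp / cp)  # normalized ("stretched") time
            ys.append(ct / cp)   # tissue-to-plasma ratio
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two frames after t_star")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cmrglu_from_ki(ki, glu_plasma, lumped_constant=0.89):
    """CMRGlu = Ki * Gluplasma / LC * 100, as stated in the methods."""
    return ki * glu_plasma / lumped_constant * 100.0
```

      With a hypothetical Ki of 0.03 and a plasma glucose level of 5.0, `cmrglu_from_ki(0.03, 5.0)` evaluates to roughly 16.85; these inputs are invented for illustration and are not results from the study.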

      1.3) It is surmised that the glutamatergic/GABAergic involvement of these metabolic differences in PCC is from another study, but what mechanism causes the BOLD signal to decrease in both stimuli? This is where the authors have to divulge the biophysical basis of the BOLD response. At the most basic level, the BOLD signal change (dS) can be positive or negative depending on the degree of coupling with changed blood flow (dCBF) and oxidative metabolism (dCMRO2) from resting condition. Unfortunately, neither CBF nor CMRO2 was measured in this study. In the absence of these additional measurements, the authors should at least discuss the basis of the BOLD response with regard to CBF and CMRO2. If we assume that both attention-demanding and working memory tasks decreased BOLD response in PCC in the same way, we have identical dCBF/dCMRO2 in PCC with both tasks, i.e., their results seem to suggest an alteration in aerobic glycolysis with different tasks. With attention-demanding tasks, CMRglc decreases similarly to CMRO2 decreases in PCC, whereas with working memory tasks, CMRglc increases differently from CMRO2 decreases. This suggests that the oxygen-to-glucose index (OGI = CMRO2/CMRglc) would rise in PCC with attention-demanding tasks, but fall in PCC with working memory tasks. This is obviously an implication rather than a conclusion as CBF or CMRO2 were not measured.

      1.4) Given the missing attention that gives rise to the BOLD contrast mechanism, it is almost necessary to discuss the biophysical basis of BOLD contrast and specifically how metabolic changes have been linked to both increases and decreases in neuronal activity in the past. Although this type of work has largely been conducted in animal models, it seems that this topic needs to be discussed as well.

      We would like to thank the reviewer for sharing these insightful ideas and for bringing up these aspects that indeed appear to be essential for the manuscript. Since the points 1.3. and 1.4 complement each other, we have combined them and created a shared response. To fully address the points, the following paragraphs were added to the manuscript.

      Discussion, page 15, line 310:

      Metabolic and neurophysiological considerations

      The distinct relationships between BOLD and CMRGlu signals that emerge during specific tasks highlight the different physiological processes contributing to neuronal activation of cognitive processing (Goyal and Snyder, 2021; Singh, 2012). While CMRGlu measured by fPET provides an absolute indicator for glucose consumption, the BOLD signal reflects deoxyhemoglobin concentration, which depends on various factors, such as cerebral blood flow (CBF), cerebral blood volume (CBV) and the cerebral metabolic rate of oxygen (CMRO2) (Goense et al., 2016). In simple terms, the BOLD signal relates to the ratio of ∆CBF/∆CMRO2. Assuming that the observed BOLD decreases during Tetris® and WM emerge from the same mechanisms, this would result in a comparable ∆CBF/∆CMRO2 in the pmDMN for both tasks. Given that these types of tasks (external attention and cognitive control) elicit a reduction in CBF in the pmDMN (Shulman 97, Zou 2011), CMRO2 also decreases albeit to a lesser extent (Raichle 2001). Therefore, the respective metabolic processes can be described by their oxygen-to-glucose index (OGI), the ratio of CMRO2/CMRGlu. Accordingly, our results suggest two distinct pathways underlying BOLD deactivations in the pmDMN that differ regarding their OGI. During Tetris® there is a BOLD deactivation with a high OGI, resulting from a larger decrease in CMRGlu than CMRO2. This metabolically inactive state is in line with electrophysiological recordings in humans (Fox et al., 2018) and in non-human primates showing a decrease of neuronal activity in the pmDMN that covaries with the degree of exteroceptive vigilance (Shmuel et al., 2006; Bentley et al., 2016; Hayden et al., 2009). Therefore, we suggest that the negative BOLD response during external tasks reflects a reduction of neuronal activity and their respective metabolic demands. 
      On the other hand, the relatively increased CMRGlu without the corresponding surge in CMRO2 hints at another kind of BOLD deactivation with a low OGI in the pmDMN during working memory, indicating energy supply by aerobic glycolysis (Vaishnavi et al., 2010; Blazey et al., 2019). Previous work in non-human primates has indeed suggested a differential coupling of neuronal activity to hemodynamic oxygen supply in this region (Bentley et al., 2016). Furthermore, tonic suppression of PCC neuronal spiking during task performance was punctuated by positive phasic responses (Hayden et al., 2009), which could indicate differences between both tasks also at the level of electrophysiologically measured activity.
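
      The direction of the two OGI changes described above can be checked with simple arithmetic. The sketch below uses invented baseline values and invented relative changes purely for illustration; neither CMRO2 nor any OGI value was measured in the study, and the function names are hypothetical.

```python
def ogi(cmro2, cmrglu):
    """Oxygen-to-glucose index: the ratio CMRO2 / CMRGlu."""
    if cmrglu <= 0:
        raise ValueError("CMRGlu must be positive")
    return cmro2 / cmrglu


def ogi_after_change(cmro2, cmrglu, rel_change_cmro2, rel_change_cmrglu):
    """OGI after applying relative changes (e.g. -0.05 for a 5% decrease)."""
    return ogi(cmro2 * (1.0 + rel_change_cmro2), cmrglu * (1.0 + rel_change_cmrglu))


# Hypothetical baseline in arbitrary but consistent units (not study data).
baseline = ogi(150.0, 30.0)

# Larger decrease in CMRGlu than in CMRO2: OGI rises (the pattern proposed for Tetris).
external_task = ogi_after_change(150.0, 30.0, -0.05, -0.15)

# CMRGlu increases while CMRO2 decreases: OGI falls (the pattern proposed for working memory).
working_memory = ogi_after_change(150.0, 30.0, -0.05, 0.10)
```

      Under these assumed numbers, `external_task` exceeds `baseline` and `working_memory` falls below it, which is the qualitative distinction the paragraph draws.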

      Reviewer #2 (Public Review):

      2.0) This paper provides an important and insightful investigation into patterns of activations that emerge in external task states. The authors use state-of-the-art methods and novel analytic approaches to establish that deactivations in the default mode network during external tasks are driven by activity in brain regions that are important in the current tasks (such as the visual or dorsal attention networks). It will be important in the future to understand whether this is a symmetrical phenomenon by studying this behaviour in states that maximize activity within the default mode network and also drive reductions in networks that are not relevant to these situations.

      We thank the reviewer for the encouraging feedback and the constructive comments on our manuscript. We particularly appreciate the interest in the research and the insightful suggestions for future work.

      Reviewer #3 (Public Review):

      3.0) The authors report a study where, using multiple datasets with [18F]FDG PET bolus + continuous infusion ("functional PET") and BOLD fMRI data, they re-evaluate the metabolic and hemodynamic properties of the default mode network (DMN) in a task-evoked context, with a focus on posteromedial DMN due to its relevance for across-network integration. They show how posterior DMN is differently engaged depending on the chosen task: while visual and motor tasks lead to BOLD deactivations and glucose metabolic decrease, specifically in the dorsal posterior cingulate cortex (PCC) area, working memory tasks produce BOLD deactivations but metabolic increases, specifically in ventral PCC, as shown in their previous paper (Stiernman et al. 2021, https://doi.org/10.1073/pnas.2021913118). This aims to solve the controversies elicited by findings of both increased and decreased glucose consumption in the presence of BOLD deactivation in the DMN.

      Additionally, they show how task-evoked glucose metabolism in posterior DMN seems to be shaped by that of the corresponding task-positive networks, with a positive link with dorsal attention and a negative link with frontoparietal network metabolism. This is explored using a type of directional connectivity analysis called "metabolic connectivity mapping", drawn from their previous work (Riedl et al. 2016, https://doi.org/10.1073/pnas.1513752113; Hahn et al. 2020, https://doi.org/10.7554/eLife.52443). They go on to speculate that concomitant BOLD deactivation and reductions in glucose expense might relate to decreased glutamatergic signaling, while BOLD deactivations accompanied by increased glucose consumption might depend on increased GABAergic neuronal activity.

      This is a relevant topic because it not only shows how the DMN is flexibly engaged in different tasks but also allows us to better understand the complex relationships between BOLD fMRI and [18F]FDG PET signals, which are still not fully characterized to this day. Of course, while in resting state the situation is further complicated by the more uncertain physiological meaning of the resting BOLD signal, task-evoked states are expected to provide a more interpretable intermodal link between metabolism and hemodynamics, due to the known major changes in blood flow, blood volume, and glucose metabolism - which underlie BOLD and [18F]FDG signal changes - in response to neural activation. However, even in task states, there is not always a strong association between the two responses, as previously shown by the authors themselves (Rischka et al. 2018, https://doi.org/10.1016/j.neuroimage.2018.06.079). This is something I think the authors should stress a little more, as they have previously done (Rischka et al. 2018, https://doi.org/10.1016/j.neuroimage.2018.06.079), both in the introduction and in reference to Figure 1, which shows clear differences between BOLD and [18F]FDG activations/deactivations (e.g., widespread negative responses in the cerebellum for [18F]FDG).

      Overall, the analyses reported in the manuscript are simple and seem mostly sound, drawing from well-established methods in PET and fMRI activation studies, with additional approaches previously developed by some of the authors themselves (e.g., "metabolic connectivity mapping", Riedl et al. 2016, https://doi.org/10.1073/pnas.1513752113). Moreover, a clear strength of the paper is the high number of subjects, at least from a PET perspective, i.e., n = 50 for the Tetris task, plus group averages of previously published data for working memory (Stiernman et al. 2021, https://doi.org/10.1073/pnas.2021913118) and motor tasks (Hahn et al. 2018, https://doi.org/10.1007/s00429-017-1558-0).

      The conclusions are in line with the results, and, though a little speculative, are potentially relevant for further exploration aimed at characterizing the neurotransmitter pathways underlying positive and negative BOLD and [18F]FDG responses. Moreover, the language is sufficiently clear to allow a proper understanding of the aims and the results, as well as the details of the analyses. As a side note, the title should probably be adjusted to "Task-evoked metabolic demands of the posteromedial default mode network are shaped by dorsal attention and frontoparietal control networks", to emphasize that the findings do not necessarily generalize to the resting state.

      In conclusion, I am overall quite positive about this manuscript, which seems to nicely position itself within the existing literature, making some additional contributions.

      We thank the reviewer for the thorough evaluation and the positive feedback on our manuscript; we appreciate the constructive and insightful suggestions. We agree that the differential spatial patterns of activation between the BOLD signal and CMRGlu response require further attention. To address this point in more detail, we have added the following information to the manuscript.

      Introduction, page 5, line 110:

      Studies using simultaneous fPET/fMRI have shown a strong spatial correspondence between the BOLD signal changes and glucose metabolism in several task-positive networks and across various tasks requiring different levels of cognitive engagement (Hahn et al., 2020, 2016; Jamadar et al., 2019; Rischka et al., 2018; Stiernman et al., 2021; Villien et al., 2014). […]. However, also regional differences in activation patterns have been observed previously between these modalities in these and previous studies (Wehrl et al., 2013). Moreover, a dissociation between BOLD changes (negative) and glucose metabolism (positive) has recently been observed even in the same region of the DMN during working memory (Stiernman et al., 2021), namely the posteromedial default mode network (pmDMN).

      Results, caption Figure 1, page 8, line 173

      White clusters represent the intersection of significant CMRGlu and BOLD signal changes, irrespective of direction. Note, that also relevant differences between both imaging parameters can be observed, such as decreased CMRGlu in the cerebellum (in both datasets), without changes in the BOLD signal.

      We appreciate the reviewer’s proposal for the title as it raises awareness that the activation patterns reflect task-specific inference.

      Title:

      Task-evoked metabolic demands of the posteromedial default mode network are shaped by dorsal attention and frontoparietal control networks

      We have limited the discussion of underlying neurotransmitter effects and explicitly mention that these are of speculative nature. For manuscript adaptation on this point, we would like to refer to points 1.1, 1.3, 1.4 that address this topic as well.

    1. Author Response

      Reviewer #1 (Public Review):

      The study tackles the topic of male harm (sexual selection favoring male reproductive strategies that incur a reduction of female fitness) from an interesting angle. The authors put emphasis on using wild-collected populations and studying them within their normal thermal range of reproductive conditions. Where previous studies have used temperature variation as a proxy for stressful environmental change, this approach should instead clarify what can be the role of male harm on female fitness in natural conditions. A minor caveat regarding this point is the fact the polygamy treatment also has a heavily male-biased sex ratio (3:1). The authors argue that this sex ratio is within the range of normal variation in that species, but it is likely that the average is still (1:1) in natural populations and using a male-biased sex ratio could magnify the intensity of male harm. This does not undermine the conclusions regarding the temperature sensitivity of sexual conflict but should be acknowledged.

      The authors find that varying temperature within a range found in natural conditions affects the reproductive interactions between males and females, particularly through male-harm mechanisms. Male harm, measured as a reduction in lifetime reproductive success (LRS) from monogamy to polygamy settings is present at 20C, stronger at 24, and absent or undetectable at 28C. Female senescence is always faster in the polygamy mating systems as compared to monogamy, but the effect appears strongest at 20C. Mating behaviors of males and females in these different settings are used to attempt to uncover underlying mechanisms of the sensitivity of male harm to temperature.

      A weakness of the manuscript in its current form is the lack of clarity about the experimental design, which makes understanding the results a long and involved procedure, even for someone who is familiar with the field. If the authors consider revising the manuscript, I suggest giving a better overview of the experimental design(s) earlier in the manuscript, perhaps supported by a diagram or flowchart. I also suggest structuring the results better to aid the reader (e.g., make clearer distinctions between results that come from the different experiments). Finally, some additional figures and statistical tests corrected for multiple testing would help get a better feel of some aspects of the dataset.

      I believe that the conclusions are generally justified and the results overall convincing. Overall, this is an impressive study with a lot of dimensions to it. Its complexity is a challenge and may require additional effort from the authors to make it easier to access. The core of the question is answered by LRS measures, but the authors have also provided a wealth of behavioral data as well as other fitness components. The manuscript could be greatly improved by putting more effort into linking the different metrics together to track down potential mechanisms for the observed variation in male-harm-induced reduction in female LRS. The discussion would also benefit from considering the female side of the sexual conflict coevolution arms race.

      We are thankful for the nice words and constructive appraisal of our work. As stated above, reviews like this are extraordinarily helpful. The reviewer mentions four main points that we have addressed:

      1. We now expand a bit on the justification to use a (3:1) male-biased sex ratio in the methods section (lines 150-155). We also acknowledge potential limitations of this design in the discussion (lines 563-571).
      2. To clarify the methods, we have placed this section before the results. This, in itself, has significantly improved the clarity of the manuscript. We have also substantially re-written the methods and results (including adding some tables) to streamline the text while providing all the necessary details, and have also included several diagrams to illustrate all our experiments (in the SM, see Figs. S1.1 to S1.5) along with a general schematic figure of the general design that we present early on in the main text (in the introduction, see Fig. 1).
      3. As suggested, we have re-run all analyses using the Benjamini-Hochberg procedure in order to correct for inflation of type I error rate due to multiple testing. We have also included in the SM a complementary set of models that also test for this via post hoc Tukey contrasts. Both these approaches corroborate our initial findings, and thus help strengthen our results.
      4. We now explicitly discuss the female side of things in the discussion (lines 636-647).
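
      For readers unfamiliar with the correction mentioned in point 3, the Benjamini-Hochberg step-up procedure can be sketched as below. This is a generic textbook illustration, not the authors' analysis code; the function name and the default false discovery rate of 0.05 are assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans (in the original order) marking which
    hypotheses are rejected while controlling the false discovery rate.

    Step-up rule: sort the p-values, find the largest rank k such that
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * alpha:
            largest_rank = rank
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_rank:
            rejected[index] = True
    return rejected
```

      Note that a p-value above its own threshold is still rejected when a larger p-value further down the sorted list meets its threshold; this is what distinguishes the step-up rule from a simple per-test cutoff.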

      Reviewer #2 (Public Review):

      Londoño-Nieto et al. investigated the influence of temperature on the form and intensity of sexual conflict in Drosophila melanogaster. They aimed to test the effect of naturally occurring temperature fluctuations on a wild population of Drosophila while disentangling pre- and postcopulatory episodes of sexual conflict. To this end, they exposed females to males under monogamy or polyandry, hence manipulating the degree of male harm experienced by females. The effect of temperature was explored by exposing these groups to 20, 24, or 28°C. They found that female fitness suffered from male harm most at 24°C and less at the other two temperatures. Interestingly, pre- and postcopulatory episodes of sexual conflict were affected differently by temperature. Overall, these data suggest that the relationship between sexual conflict and temperature can be strong and complex. Hence, these results can have important implications for the impact of sexual conflict on population viability, especially in light of the climate crisis.

      We want to thank the reviewer for the time invested in reading and reviewing our work. We are glad to read that the reviewer found our results interesting and considered our study to be of importance to the field.

      This paper tackles a highly relevant question using an established model organism for sexual conflict and contains a rich dataset obtained using a series of carefully planned experiments and analysed in an appropriate way. Importantly, the authors used biologically meaningful temperatures and mating treatments, which increases the relevance of the data. The main conclusions are well supported by the data. Nevertheless, the devil is in the detail, and given the way the authors frame their study (i.e. testing a natural population under naturally occurring temperature fluctuations) and their results (i.e. sexual conflict is buffered by temperature effects in the wild) there are some limitations to be considered:

      We appreciate the positive feedback! The reviewer identified potential limitations and made good suggestions that have only served to improve our manuscript considerably, for which we are very grateful. Details follow on how we have dealt with each specific comment.

      1) The authors frame their study as addressing the question of how sexual conflict reacts to naturally occurring temperature fluctuations in the wild. Nevertheless, the population used in this experiment had been kept for nearly 3 years in the laboratory prior to the experiment. Importantly, the authors ensured that the laboratory population maintained genetic diversity, by regularly crossing wild lines into it. Nevertheless, this population remained for some time in the laboratory under standardized conditions. The applied temperature fluctuations are in a biologically meaningful range (though only during the reproductive season), but it remains unclear if the applied fluctuations were in a standardized way (i.e. pre-programmed) or included random fluctuations (i.e. a more natural setting). This laboratory setup has certainly clear advantages, for example, it enables the exclusion of any effects other than the temperature on sexual conflict. Nevertheless, how these will then ultimately play out in the wild could be a different story.

      Agree. We clarify now that we meant pre-programmed fluctuations and acknowledge this limitation in the methods (lines 124-131).

      2) The authors highlight clearly that temperature fluctuations in the wild might play an important part in how sexual conflict plays out in natural populations. This very interesting and highly relevant point might lead the reader to assume that this is what was actually tested in the experiment. Nevertheless, in the experiments, different constant temperatures were applied to the flies, while only the stock population was kept at a fluctuating temperature regime. Hence, the influence of fluctuations during episodes of sexual conflict remains untested. While the present data show that sexual conflict can be modulated by temperature, the effect of naturally occurring fluctuations on the net cost of sexual conflict to a population remains unclear.

      Again, a fair point that we acknowledge in the current version (lines 571-575). “Second, our treatment temperatures were stable, designed to study how coarse-grain changes in temperature across the adult lifespan of flies may influence how sexual conflict unfolds in nature. Thus, future studies will need to encompass how fine-grained fluctuation (i.e., repeated variation of temperature across an adult’s lifespan) may affect male harm for a more comprehensive picture of temperature effects on sexual conflict in the wild”.

      3) The authors conclude that the effect of sexual conflict can be buffered by temperature in the wild. In general, I agree with this, although a more conservative way of framing this would be to say that temperature modulates or moderates sexual conflict instead of buffers it. If there really is a buffering effect of temperature in the wild remains to be tested, I believe. This will depend on how actual changes in temperature affect this dynamic (see point 2). In addition, I think another interesting open question is what the mechanism behind the observed differences might be. Are male and female interests really more aligned at different temperatures (i.e. males plastically reduce harm)? This would really buffer the harm of sexual conflict at those temperatures. Nevertheless, alternatively, males might not be perfectly adapted to manipulate the female optimally at lower or higher temperatures. This would mean that if the temperatures change, males might evolve to increase the manipulation of females, and hence the scope for sexual conflict might not change in the end under this scenario. Nevertheless, as the authors themselves state: 'An intriguing possibility is thus that SFPs are more effective at lowering female re-mating rates at warm temperatures, thereby buffering these costs.' Therefore, a temperature-dependent increase in the effectiveness of male manipulation might counterintuitively reduce sexual conflict in this species.

      We echo both points in the current version of the paper (see lines 633-655).

      4) In the end, the authors argue that the climate crisis might have 'unexpected positive consequences via its effect on male harm'. Sexual conflict is indeed widespread, but it takes many different forms (as has been nicely described in the introduction of this paper). Because the studied system seems to be quite a specific example, it is questionable how widespread this phenomenon is in nature. In addition, it remains unclear how male harm will evolve in response to the climate crisis (see point 3). Finally, the relative fitness of females increased in the present experiment, as the tested range was within the reproductive optimum of the species. Nevertheless, the relative importance of the positive effect of sexual conflict on fitness outside of optimal temperatures seems questionable.

      Agree. Altogether, we have tried to tone down our conclusions regarding the implication of our results for a climate change scenario, and acknowledge all the points highlighted by the reviewer in the current version of the manuscript (see lines 563-575).

      Nonetheless, I believe these results to be of exceeding interest to the scientific community and of importance to the field. It opens up many potential research directions and adds further data to the fascinating field of sexual conflict, SFPs, and male harm in Drosophila.

      We are thrilled to read that the reviewer found our study of exceeding interest.

      Reviewer #3 (Public Review):

      In this paper, the authors explore the effects of the environment, specifically temperature, on male harm to females. Male harm is the phenomenon where males reduce female fitness in polyandrous systems, where a single female may mate with multiple males. The selection of males to increase their reproductive success in male-male competition can lead to genetic conflict that increases male fitness at the expense of female fitness. Typically, male harm has been studied in single environments under optimal conditions. However, there is an increasing focus on the effect of the environment on fitness costs of male harm to females, as a way to better understand the effect of male harm on population fitness in more realistic ecological contexts. In this paper, the authors add to these studies by exploring the effect of temperature on male harm and female fitness, using the fruit fly Drosophila melanogaster, as a model system. They find that temperature affects the impact of male harm on female fitness, with male harm having the greatest effect at 24˚C relative to 20˚C and 28˚C. The authors then go on to disentangle how temperature affects the various components of male harm that impact female fitness (e.g. harassment, ejaculate toxicity). The paper demonstrates that male harm depends on ecological context, which has implications for understanding its impact on population fitness under realistic ecological scenarios, particularly with respect to climate change.

      The strength of the paper is that it demonstrates that male harm (presented as differences in female life reproductive success between monogamous and polyandrous matings) changes with temperature. The authors dissect this general observation by showing that different aspects of precopulatory reproductive behavior, for example, male-male aggression, copulation rate, and female rejection rate, also change with temperature. Further, they demonstrate that correlates for male ejaculate quality also change with temperature, suggesting that temperature also affects postcopulatory mechanisms of male harm.

      The weakness of the paper is that the methods and results sections are difficult to follow, which negatively impacts the interpretation of the data. The experiments are complex and need to be for what the authors are studying. Nevertheless, the paper is written in a way that makes it challenging for the reader to fully understand how precisely the experiments were conducted. Further, the authors do not explain clearly how some of the experiments relate to the phenomenon ostensibly being assayed. For example, a more detailed explanation of why mating duration and remating latency are assays for ejaculate quality in the context of sperm competition would be very helpful in interpreting the data. Further, a clearer explanation of the statistical analyses conducted

      Thank you for the positive, detailed and constructive review. We agree with all the weaknesses laid out and we have strived to address all of them in the current version. This includes a major rearrangement, restructuring and rewrite of the methods and results sections, as well as extra statistical analyses. Please find the details below.

    1. he looming presence of climate change, as a kind of techno-social disaster that has already begun and which will inundate the next couple of centuries as some kind of overdetermining factor, no matter what we do

      I think this statement highlights the severity and urgency of the threat posed by climate change. While it may create a sense of despair or helplessness, it can also serve as a call to action for individuals and societies to take concrete steps towards reducing our impact on the planet and mitigating the impacts of climate change.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors have used computational models and protein design to enhance antibody binding, which should have broad applications pending a few additional controls. The authors' new method could have a broad and immediate impact on a variety of diagnostic procedures that use antibodies as sensitivity is often an issue in these kinds of experiments and the sensitivity enhancement achieved in the two test cases is substantial. Affinity maturation is a viable approach, but it is laborious and expensive. If the catenation method is generalizable, it will open up opportunities for antibody optimization for cases where affinity maturation is either not feasible or otherwise impractical. Less clear is how this method might enhance therapeutic potency. Issues that arise when using therapeutic antibodies are often multifactorial and vary depending on the target and disease state. Many issues that occur with antibody-based therapies will not be rectified with affinity enhancement.

      We agree with the limitation.

      Reviewer #2 (Public Review):

      The paper presents an interesting design approach for obtaining homodimeric IgGs with higher binding affinity to antigens on a surface by fusing a weakly homodimerizing protein (a catenator) to the C-terminus of IgG. Combining homodimeric IgGs of likely enhanced antigen-binding ability with their stabilization by reversible catenation when bound to the surface is an interesting idea. With agent-based modeling (simulations based on Markov Chain Monte Carlo (MCMC) sampling) and proof-of-concept experiments, the authors show that the antigen-binding ability of the homodimeric IgGs is enhanced by many fold; the weak homodimerization of the catenator is indicated to play a central role, enabling proximity-driven catenation on antigen-bound surfaces. While the results demonstrate the enhanced binding affinity of the catenated homodimeric IgGs, the study would benefit from a more elaborate interpretation and discussion of the results.

      The following discussion is now stated in the revision (pages 19-20, in the revision); “While we demonstrated that dual catenator-fused heterodimeric IgGs can enhance binding avidity, the oligomer formation or potential intramolecular homodimerization of the catenator necessitates the development of a more robust catenator for application to conventional homodimeric IgGs. Specifically, the ideal catenator should geometrically disallow intramolecular homodimerization, exhibit fast association kinetics, and be able to withstand the standard low pH purification step. On the other hand, our demonstration indicates that this approach can be applied to bispecific antibodies employing a heterodimeric Fc.”

      One interesting basis for the discussion would be how the fusion of the catenator may affect the intrinsic binding behavior and/or the global structure of the IgGs (monomeric and homodimeric (catenated)) per se, beyond its proximity-driven contribution. Would it lead to a structure with more restricted mobility in the unbound state, so as to decrease the entropic cost of binding and thus increase the binding avidity/affinity (in addition to the external proximity-driven association)? In other words, what would be the role of entropy in the free energy of binding, given that the enthalpic contributions remain the same? Possible effects of the length of the catenator should also be related, in part, to the entropy. For example, if a longer and more flexible catenator is considered, what would the resulting observation be, experimentally and computationally?

      The binding site occupancy depends on [catAb]/KD. Figure 4-figure supplement 2 shows the binding site occupancy and (KD)eff as a function of (KD)catenator. In this simulation, [catAb] was fixed (10⁻⁹ M) while KD was varied (from 10⁻⁸ to 10⁻⁶). In the figure legend and in the main text, we now explicitly state that KD was varied from 10⁻⁸ to 10⁻⁶ (page 30, in the revision). To address this comment, we set KD = 10 nM (as used for simulation in Figures 3 and 4), and varied [catAb] from 0.1 to 10 nM. The binding site occupancy and (KD)eff as a function of [catAb] are plotted for three different set values of (KD)catenator (1 μM, 10 μM and 100 μM). The new figures are now presented as Figure 4-figure supplement 3. This simulation shows that the enhancement of (KD)eff by increasing the concentration of catAb is much less dramatic than that by increasing the affinity for catenator homodimerization at [catAb] > 10 nM.
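
      As a rough illustration of why occupancy depends only on the ratio [catAb]/KD, the sketch below evaluates the textbook one-site equilibrium isotherm at KD = 10 nM over the 0.1 to 10 nM concentration range mentioned above. This is not the agent-based simulation described in the response: it ignores catenation entirely and assumes that the free antibody concentration equals the total concentration. The function name `occupancy` is hypothetical.

```python
# Minimal sketch, NOT the authors' agent-based model: a simple one-site
# (Langmuir-type) binding isotherm with no catenation term.
# Assumption: free antibody concentration is approximately the total concentration.


def occupancy(ab_conc_nM: float, kd_nM: float) -> float:
    """Equilibrium fraction of binding sites occupied for simple 1:1 binding."""
    if ab_conc_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    ratio = ab_conc_nM / kd_nM  # occupancy depends only on this ratio
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    KD_NM = 10.0  # intrinsic antibody-antigen KD quoted in the response
    for conc_nM in (0.1, 1.0, 10.0):
        theta = occupancy(conc_nM, KD_NM)
        print(f"[Ab] = {conc_nM:5.1f} nM -> occupancy = {theta:.3f}")
```

      In this simplified picture, occupancy reaches one half only when the antibody concentration equals KD, which is why lowering the effective KD through catenation can matter more than raising the concentration.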

      On the other side, simple simulation approaches have a high value with a level of abstraction while still keeping the physical and biological relevance. In the simulations, i.e. in the sampling of various states, three main terms/rules to govern the behavior are implemented. One is a term favoring an increase in the ability to bind (preventing to unbinding) to the surface upon the catenation of IgGs. This may need to be substantiated for the simulations not imposing a preassumed ability to increase the binding (or decrease the unbinding) ability upon the catenation.

      We agree with the reviewer that the third rule favors the binding ability of catenated IgGs, because it assumes that catenated antibodies are not allowed to dissociate from the binding site. While this assumption is not exactly correct, we think that it is valid, considering the behavior of a multivalent ligand. When the IgG portion dissociates completely from the binding site, it is still anchored by the catenation arm, and thus it will rebind the same binding site immediately. This postulation agrees with the quantitative analysis showing that a multivalent ligand exhibits an orders-of-magnitude increase in binding likelihood when the ligand size is comparable to the stretch length of a conjugating linker [Liese, S. & Netz, R. R., ACS Nano, 12, 4140 (2018)].

      The weakly homodimerizing state of the catenator appears to be one of the important aspects of the proposed design strategy. Could the experimental observations also imply a higher binding ability of the catenator-fused IgG without homodimerization on the surface (due to the reduced entropic cost of binding)? Presenting evidence of the homodimerization of the catenator and of the catenated IgGs on the surface would strengthen the findings and discussion.

      To fully address this comment, we would need to consider the detailed molecular behavior of the IgG part, the catenator and the linker, probably using molecular dynamics simulation, which we think is outside the scope of the current work. We would like to describe qualitatively what we think about the issues raised. Fused to the C-terminus of Fc, the catenator won’t affect the complementarity-determining region (CDR) of Fab, which is located on the opposite side from the C-terminus of Fc. This notion is supported by the observation that the SDF-1α-fused antibodies exhibited association kinetics similar to those of the mother antibodies (Figure 5).

      Regarding the mobility of the structure, we presume that the fused catenator would not interact with the antibody portion and thus it would not affect the intrinsic structural mobility of the antibody.

      Since the catenator is fused to the C-terminus of Fc by a flexible linker, the homodimerization of catenator would decrease the entropy upon catenation. However, the enthalpic contribution would overcome the entropic loss, and result in negative free energy of the catenator homodimerization.
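
      The thermodynamic argument above can be written out with standard relations. This is a generic sketch rather than anything taken from the manuscript, and the subscript "cat" (for catenator homodimerization) is notation introduced here for illustration only:

```latex
% Free energy of catenator homodimerization at temperature T
\Delta G_{\mathrm{cat}} = \Delta H_{\mathrm{cat}} - T\,\Delta S_{\mathrm{cat}}

% Catenation restricts the flexible linker, so \Delta S_{\mathrm{cat}} < 0
% and the entropic term -T\,\Delta S_{\mathrm{cat}} is positive (unfavorable).
% Homodimerization is spontaneous only if the enthalpic gain outweighs it:
\Delta G_{\mathrm{cat}} < 0
\quad\Longleftrightarrow\quad
\lvert \Delta H_{\mathrm{cat}} \rvert > T\,\lvert \Delta S_{\mathrm{cat}} \rvert ,
\qquad \Delta H_{\mathrm{cat}} < 0 .
```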

      Figure 2-figure supplement 1 (in the revision) shows the simulation for five different values of the reach length (R), which is the sum of the linker length and half of the catenator length. The simulation results show that the likelihood of catenation decreases as the linker length increases over the distance (d) between the two adjacent catAb-2Ag complexes, while it is maximum when the reach length equals d. Since the catenator length is fixed, increasing the linker length (such that R > d) will lower the catenation effect.

      Reviewer #3 (Public Review):

      The authors proposed an antibody catenation strategy by fusing a homodimeric protein (catenator) to the C-terminus of IgG heavy chain and hypothesized that the catenated IgGs would enhance their overall antigen-binding strength (avidity) compared to individual IgGs. The thermodynamic simulations supported the hypothesis and indicated that the fold enhancement in antibody-antigen binding depended on the density of the antigen. The authors tested a catenator candidate, stromal cell-derived factor 1α (SDF-1α), on two purposely weakened antibodies, Trastuzumab(N30A/H91A), a weakened variant of the clinically used anti-HER2 antibody Trastuzumab, and glCV30, the germline version of a neutralizing antibody CV30 against SARS-CoV-2. Measured by a binding assay, the catenator-fused antibodies enhanced the two weak antibody-antigen binding by hundreds and thousands of folds, largely through slowing down the dissociation of the antibody-antigen interaction. Thus, the experimental data supported the catenation strategy and provided proof-of-concept for the enhanced overall antibody-antigen binding strength. Depending on specific applications, an enhanced antibody-antigen binding strength may improve an antibody's diagnostic sensitivity or therapeutic efficacy, thus holding clinical potential.

      Thanks for the favorable comments.

    1. “Our lessons, units, and courses should be logically inferred from the results sought, not derived from the methods, books, and activities with which we are most comfortable. Curriculum should lay out the most effective ways of achieving specific results… in short, the best designs derive backward from the learnings sought.”

      I'm actually a little bit surprised that this was a revolutionary idea - or that it had to be intentionally staked out as a new school of thought in learning design, where the roles of the learning objective and assessment are so foundational. I suppose tradition and inertia play a role here - certain topics have always been taught using specific instructional activities, and those learning activities are treated as a given by instructors and designers, even if they do not always lend themselves to observable and measurable assessments. I think of the role of the essay in humanities courses - where essays are treated as the established learning activity because of academic traditions, when we may find there are more effective ways to teach and assess a topic if we worked backwards from the learning objectives to find the best way for learners to demonstrate mastery. (I suspect the essay still wins out in many cases, but it's still worth interrogating).

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.


      Reply to the reviewers

      We thank all reviewers for their comments and suggestions. The revised manuscript includes the new experiments they suggested and extensive text edits. Our point-by-point response is shown in bold.

      Point-by-point description of the revisions

      —----------------------------------------------------------------------------------------------------------------

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary

      In this manuscript, Blank et al. propose a link between cell-cycle dependent changes in metabolic flux and corresponding changes in TORC1 activity in yeast cells. Based on their findings, the authors propose that Bat1-dependent leucine synthesis from glucose increases as cells progress through G1 and that this activates TORC1 to drive cell cycle progression. Although the existence of cell-cycle dependent synthesis of leucine is a novel and exciting finding, several aspects of the proposed model are not sufficiently supported by experimental evidence, in particular the fact that the increase in Leu synthesis is causing the increase in TORC1 activity in late G1.

      Major comments:

      1. To show that the increase in Leu biosynthesis in S-phase is activating TOR, one would ideally want to blunt this increase in biosynthesis and assay TORC1 activity. Admittedly, this is difficult. So, instead, the authors study bat1- cells which have strongly impaired synthesis of BCAA including Leucine. The relevance of these bat1- cells to the proposed cell-cycle dependent model, however, is questionable for two reasons: 1) Although the authors state that "exogenous supplementation of BCAAs in all combinations suppressed the growth defect of bat1- cells, especially when valine was present", the spot assays in Figure 3 show visible rescues only when valine is present either alone or in combination, while supplementation of leucine or isoleucine does not seem to have any effect. Hence it appears that the bat1- phenotype is mainly due to limiting valine levels, not leucine levels. 2) The relevance of these results for understanding TORC1 regulation is questionable, since valine does not typically activate TORC1. Does addition of Leu to bat1- cells increase TORC1 activity?

      RESPONSE: The reviewer’s comments were very valuable. We performed the suggested experiments, adding not only Leu but also Ile and Val to bat1 cells and measuring the phosphorylation of Rps6 (see new Figure 4D) and the DNA content of those cells (see new Figure 3C). We found that Leu weakly promotes cell cycle progression, compared to the addition of Val, which also leads to pronounced activation of TORC1 (>10-fold activation; see Figure 4D). We discuss these findings in the revised text.

      We also note, as published by others and now discussed in the text, that in WT cells, exogenous addition of Leu (or any other BCAA) does not lead to sustained activation of TORC1 (see new Figure 4D). This is not surprising. As reported by the Hall lab (see PMID: 25063813, which we now cite), the Gtr-dependent activation of TORC1 by BCAAs mentioned by the reviewer is very transient. Hence, our new data, showing sustained TORC1 activation and cell cycle effects upon Val addition in bat1 cells, are exciting. They argue that bat1 cells serve as a highly sensitized background of low TORC1 activity, enabling the display of effects that are difficult to measure in WT cells.

      TORC1 activity is known to depend on steady-state leucine concentrations in the cell rather than on leucine flux. Although the authors observe that the synthesis rate of leucine increases during G1 progression, this does not necessarily translate into increased leucine concentrations in the cell. To support the claim that the increase in TORC1 activity during G1 progression depends on leucine, the authors would need to show that, not only leucine synthesis, but also overall leucine levels in the cell increase during G1 progression.

      RESPONSE: We did this experiment and now report the data (see new Figure EV2), using the Edman degradation-based assay. We found that changes in the steady-state levels of BCAAs had a similar pattern, and those changes were most significant for valine (rising 30-40% from late G1 to G2/M). Nonetheless, we note also that the kinetics of amino acid synthesis measured by our isotope tracing experiment need not match the steady-state levels of amino acids. Steady-state levels are affected by a multitude of parameters, only one of which is the rate of synthesis, as we now discuss in detail in the manuscript.

      To test whether the increase in Leu biosynthesis in S-phase activates TORC1, a few different approaches could be tested: 1) Since leucine activates TORC1 through the Gtr proteins, the authors could test whether rendering TORC1 resistant to low leucine through expression of constitutively active Gtrs abolishes the cell-cycle dependence in TORC1 activity. 2) Leu could be added to the medium of wildtype cells in G1 to the amount necessary to cause an increase in intracellular Leu levels similar to those seen in S-phase to test whether this increases TORC1 activity.

      RESPONSE: We did the suggested experiments, which are now shown in the new Figure 5. Leucine and valine accelerated the rise in TORC1 activity in G1. However, there were no noticeable downstream consequences in the kinetics of cell cycle progression. As we discuss in the text:

      “A small acceleration of the rise in the levels of phosphorylated Rps6 was evident in both the leucine- and valine supplemented cells (Figure 5A,B). Nonetheless, there were no noticeable downstream consequences in the kinetics of cell cycle progression, in either the rate the cells increased in size or their critical size (Figure 5A; see values above the corresponding blots), consistent with the notion that TORC1 activity already is at a maximal level in these conditions…”

      In Fig 2B one sees that Leu biosynthesis peaks at 150min and then drops again. The p-RpS6 blot in Fig. 5D, however, only goes up to 140 min and shows that TORC1 activity increases up to 140 min, but it doesn't show timepoints beyond 150 min when Leu biosynthesis drops again, and hence one would expect TORC1 activity to drop. If TORC1 activity were to drop from 150min onwards, this would strengthen the correlation between Leu biosynthesis and TORC1 activity.

      RESPONSE: The reason for the drop in Figure 2 is trivial and does not affect the interpretation. As seen in Figure 1 (the experiment from which the data in Figure 2 are shown), by 180 min, the cells were entering a new cell cycle, evidenced by a reduction in cell size (Figure 1B) and in the fraction of budded cells (Figure 2B). At that point, there is a mix of mothers and daughters with very poor synchrony, making it impossible to conclude much about the drop in Leu synthesis (i.e., does it arise from the lack of new synthesis in mothers, daughters, or both?). The experiment in Figure 5 that the reviewer mentions (those panels have now moved to File S8 because we added more experiments to the figure) was terminated when peak budding was reached, at 140 min, within one cell cycle. Lastly, it is important to stress that every elutriation experiment is different. While the times are close, comparing various experiments on a time basis alone is inaccurate. Instead, the metric used in the field to compare different experiments is usually cell size, which we use in all other Figures except Figure 1 because, in that case, the experiment was a time-based, pulse-chase one.

      Minor concerns:

      1. In Figure EV4, the authors should highlight some of the metabolites that are significantly changed, in particular the BCAA. The figure is not very informative as currently presented.

      RESPONSE: We have now labeled the BCAAs, and a few more metabolites as suggested (note the Figure is now EV5).

      Fig 2 - are "expressed ratios" the best term for metabolite levels? Unlike genes, where such heat maps are often used, the metabolites are not 'expressed'. How about 'relative metabolite level' instead?

      RESPONSE: Good point. The axis now reads “relative abundance”.

      Page 8: "We also measured the MID values from the media of the same cultures used to prepare the cell extracts." Where are these data? We don't see them in File S2?

      RESPONSE: The data are in File S2 (there are many ‘sheets’ in the file). Sheets 3 and 4 contain the MID values and the analysis of metabolites in the media.

      Fig 4B - the x-axis labeling is missing for the bat1- cells

      RESPONSE: Corrected. Note that new DNA content measurements are now shown in Figure 3C.

      Although the authors state repeatedly that they show "for the first time in any system" that TORC1 activity is dynamic in the cell cycle, similar observations have already been made before, for instance showing high mTORC1 activity in the G1/S transition in the Drosophila wing disc or low mTORC1 activity during mitosis in mammalian cells (see PMIDs 28829944, 28829945, and 31733992). The text should be amended accordingly.

      RESPONSE: Thank you. Corrected.

      There are two entries for valine in File S1/Sheet8. Why?

      RESPONSE: The reason is that they were detected in both analytical pipelines (primary metabolites and biogenic amines; primary metabolites were measured with GC-TOF MS, while biogenic amines with HILIC-QTOF MS/MS), which were combined in the Table. We did not describe it adequately in the previous version. We do now, in the Methods. We also note that the raw data from each method are shown in the corresponding supplemental files. We combined them in the Table used in the Figure for display purposes. We also note that the amino acids were also measured by another method (PTH-based HPLC). Hopefully, the new edits in the Methods clarify these points.

      Reviewer #1 (Significance (Required)):

      Significance

      Despite the well-known effects of pharmacological or genetic manipulations of TORC1/mTORC1 on cell cycle progression, whether and how mTORC1 activity itself is physiologically coupled to cell cycle progression is still an insufficiently studied aspect. Hence this study provides an interesting link between cell-cycle dependent regulation of amino acid biosynthesis and TORC1 regulation. Importantly, the results of this study rely on centrifugal elutriation to obtain cell cycle synchronization, thus ruling out potential metabolic artifacts due to pharmacological methods. The observed changes in metabolic flux are therefore likely genuine and represent the major strength of the study. The major limitation is the lack of strong evidence supporting the notion that the increase in Leu biosynthesis at late G1 or S-phase is causing the increase in TORC1 activity.

      The major advance is conceptual - that amino acid biosynthesis rates are cell-cycle dependent.

      These results will be of interest to a broad audience of people studying the cell cycle, cell growth, TORC1 activity, cell metabolism and cancer.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This paper provides evidence that branched-chain amino acid (BCAA) synthesis in the G1 phase of the cell cycle, fueled by pyruvate generated by glucose catabolism, activates cell growth and allows cells to reach the critical size required for entry into S phase by activation of TORC1 signaling. Previous work had indicated that leucine supplementation of a bat1 bat2 mutant, lacking both enzymes that catalyze BCAA synthesis from the alpha-keto acid precursors and starved on minimal medium, led to TORC1 activation. This work is significant in suggesting that BCAA synthesis from glucose is responsible for a cyclic activation of TORC1 necessary for a normal rate of cell growth in the G1 phase of the cell cycle.

      The study employs metabolic flux analysis of metabolites derived from glucose following a pulse-chase with different isotopes of glucose in synchronized early G1 cells (obtained by elutriation) throughout one cell cycle. They claim that the only compelling changes in metabolites observed as the cell cycle proceeds were a decline in pyruvate containing only one heavy 13C carbon atom and a corresponding increase in Leu (M6) with 6 heavy carbon atoms, which is interpreted to indicate Leu synthesis from pyruvate that begins in early G1 and peaks at mitosis. They show that a bat1 mutant exhibits a slow-growth phenotype that can be mitigated only by valine (although they infer similar effects for Leu and Ile that I find unconvincing) and they observed reductions in all three BCAAs in different experiments that measure steady-state amino acid levels in different ways (although the results are compelling only for Val). They go on to show evidence that the bat1 mutation reduces birth and mean cell size and leads to an increased proportion of G1 cells in asynchronous cultures, and they claim that bat1 cells take much longer than WT to achieve the same size found when a synchronized WT culture reaches 50% budding (although they don't show the data for this last point). Interestingly, they find that deleting BAT1 suppresses sensitivity to the TORC1 inhibitor rapamycin (Rap), consistent with the idea that the bat1 mutation impairs TORC1 activity in the same manner as Rap and that BCAA are required to activate TORC1 in WT cells to the level that can be impaired by Rap, as summarized in the model in Fig. 5F. Consistent with this, they present evidence that the bat1 mutation reduces TORC1 signaling as judged by diminished Rps6 phosphorylation (although it was not shown that this effect could be reversed by Val addition). They also show that TORC1 signaling/Rps6-P increases as the cell cycle progresses using elutriated early G1 cells, suggesting that TORC1 activity is periodic in the cell cycle (although they don't establish this periodicity through a second cell cycle).

      General critique:

      The conclusion that BCAA synthesis from glucose is responsible for a cyclic activation of TORC1 necessary for a normal rate of cell growth in the G1 phase of the cell cycle is potentially of considerable significance. There are, however, a number of puzzling aspects of the data that seem to weaken this conclusion. As described in greater detail below, it is difficult to explain why only Leu is synthesized from glucose during the cell cycle, and why only Val shows a marked reduction in the bat1 mutant that appears to be responsible for the slow-growth phenotype. In addition, important controls are lacking that would show that a Val supplement can suppress the G1 delay and reduction in TORC1 signaling in the bat1 mutant. Finally, evidence that TORC1 activity is periodic in the cell cycle is lacking, and it needs to be shown that Rps6-P levels are periodic through at least a second cell cycle.

      Major comments:

      -Why don't they observe synthesis of Ile and particularly Val in the metabolic flux experiment of Fig. 1, especially considering that only Val appears to be critically required for normal cell growth in the bat1 mutant based on the results in Fig. 3B?

      RESPONSE: We now show the actual plots and the errors of all the measurements in Figure 2 (instead of a heatmap we had shown before). Valine (M5) levels show a very similar trend to leucine (M6). The variance in the measurements was higher, though, and statistically, the valine changes were less significant. Hence, it was more appropriate to highlight the leucine changes. Lastly, the new DNA content data (Figure 3C) show an effect upon the addition of leucine, albeit less significant than that of valine addition.

      -The data in Fig. 3B do not show a convincing increase in growth of the bat1 mutant with addition of Leu and Ile; and the stimulation by Val alone seems identical to that seen with Val in combination with Leu and Ile. Thus, it appears that the slow-growth of the bat1 mutant results only from reduced Val levels, not all 3 BCAAs, which is at odds with their interpretation of the data.

      RESPONSE: As mentioned above, the effect of valine is more pronounced than leucine's, but leucine does have consequences, best shown in the DNA content analysis (new Figure 3C). We also note that valine alone is insufficient to suppress the growth and cell cycle defects of bat1 cells. The latest data we have added (see Figures 3 and 5) are consistent with the interpretation that at least some de novo synthesis of BCAAs in the cell may be needed, explaining why exogenous BCAAs, including valine, are unable to correct the defects of bat1 cells fully.

      -They claim to see reductions in all three BCAAs in the bat1 mutant; however, no significant reduction was found for Leu in Fig. EV3, and only Val was altered by the 1.5-fold cut-off imposed on the MS metabolomics data in Fig. EV4 (which could be appreciated only by an in-depth examination of the supplementary data in File S1; the Val, Leu, and Ile dots should be labeled in Fig. EV4). In addition, the reductions in Ala and Gly shown in Fig. EV3 were not found in the MS analysis of Fig. EV4. It needs to be acknowledged that the metabolomics data show a marked reduction in the bat1 mutant only for valine, with little or no change in leucine levels. This result is difficult to explain with the simple models shown in Fig. 3A and 5F, which requires additional comment. The authors should acknowledge the much greater effect of the bat1 mutation on Val levels versus Leu and Ile, revealed both by measuring the levels of BCAAs in the mutant and by comparing the BCAAs for rescuing the slow growth of the mutant, and explain how this can be reconciled with the results in Fig. 2 where only Leu and not Val or Ile synthesis was detected.

      RESPONSE: The perceived discrepancy in the steady-state measurements could easily arise from the different analytical methods used in each case. The differences are less substantial than the reviewer implies. For steady-state measurements in BAT1 vs. bat1 cells, we used the PTH-based method (which only detects amino acids) and two different MS-based pipelines (which detect various metabolites). From the MS-based analyses, the drop for all BCAAs was statistically significant, although the magnitude of the drop was greater for valine (about 60% for valine vs. ~30% for isoleucine and leucine). Why is this a problem?

      As for the valine changes in the isotope tracing experiments, as we mentioned above, the trend for valine (M5) was similar to that of leucine (M6) (now, hopefully, that data is shown better in Figure 2). Furthermore, as we commented above (see response to Reviewer 1) and now stated in the text, our isotope tracing experiments measure only the rate of synthesis, which need not match the steady-state abundances. The latter are affected by a multitude of variables, including the turnover of proteins and amino acids, not to mention their partition into distinct intracellular pools.

      Lastly, please note that we have now added PTH-based measurements of amino acid levels in the cell cycle of wild type cells (new Figure EV2). As mentioned in our response to Reviewer 1, we found that changes in the steady-state levels of BCAAs had a similar pattern, and those changes were most significant for valine (rising 30-40% from late G1 to G2/M).

      -They need to add the data indicating that the bat1 mutant requires longer than WT cells to reach the ~35 fL volume at which 50% of WT cells are budded.

      RESPONSE: We added all that data (new Figure EV6) and discussed it better in the text. Note that our elutriation analyses allow accurate estimates of the G1 duration, which is at least 2x longer in bat1 vs. BAT1 cells.

      -It seems important to show that Val supplementation can suppress the overabundance of G1 cells in bat1 mutant cells shown in Fig. 4C, and can restore sensitivity to Rap and Rps6-P accumulation in bat1 mutant cells (in Figs. 5A & B).

      RESPONSE: Excellent suggestions. We now present the requested experiments. The DNA content data are in Figure 3C, and the phospho-Rps6 data in the new Figure 4D are discussed in the text. Briefly, exogenous valine, and to a lesser extent leucine, suppressed the G1 accumulation, but not to wild type levels. Exogenous valine also substantially increased TORC1 activity (>10-fold).

      -It seems important to show that Rps6-P will decline in M phase and increase during a second cell cycle to establish that TORC1 activity actually fluctuates in the cell cycle instead of just being reduced by the manipulations involved in collecting young G1 cells by elutriation.

      RESPONSE: The second cycle comment is not pertinent to our elutriation setup. The two-cycle approach should be used in arrest-and-release synchronizations to minimize arrest-related artifacts when cells continue to grow in size. This is why we used elutriation in the first place, as described in the text, to avoid such artifacts. In elutriations it is the first cycle, exclusively of daughter cells, that can be meaningfully scored. After that, the cells lose synchrony very fast because you have mothers (which grow in size very little) and daughters (which need to double in size until mitosis). Hence, the second cycle will be meaningless and impossible to interpret.

      Reviewer #2 (Significance (Required)):

      General Assessment:

      Strengths: Evidence for BCAA biosynthesis from glucose in the G1 phase of the cell cycle, and evidence obtained from analyzing the bat1 mutant that BCAA synthesis underlies activation of TORC1 early in the cell cycle in a manner required to achieve the critical cell size necessary for G1 to S transition.

      Weaknesses: Lack of evidence for Val biosynthesis in G1 despite evidence that Val limitation is more crucial than Leu limitation in the bat1 mutant; lack of confirmation that Val limitation underlies the delayed G1-S transition and reduced TORC1 signaling in the bat1 mutant; and lack of compelling evidence that TORC1 activity is periodic in WT cells.

      Advance: This would be the first evidence that TORC1 activity varies through the cell cycle in a manner controlled by synthesis of BCAAs.

      Audience: This advance would be of great interest to a wide range of workers studying how the cell cycle is regulated and the role of TORC1 in controlling cell growth and division in normal cells and in human disease.

      My expertise: Mechanisms of metabolic regulation of gene expression at the transcriptional and translational levels in budding yeast

      **Referees cross-commenting**

      Ref. #1's major comment 1 echoes my request for clarification about whether Leu, and not just Val, is limiting growth in the bat1 mutant, and also the need to determine which BCAA supplement to bat1 cells will restore TORC1 activity (which was also requested by Ref. #3).

      I agree with this reviewer's request to provide evidence that Leu levels actually increase during G1 progression (comment #2). I also think the suggested experiments in Comment #3 are reasonable for their potential to provide stronger evidence that Leu production in the G1 phase of wild-type cells activates TORC1, as currently the argument is based on the finding of low TORC1 activation in bat1 cells (that seem to be limiting for Val vs. Leu). Comment #4 echoes similar requests made by both me and Ref. #3. Ref. #3's major comments 1 and 3 mirror two of my major comments. I wasn't convinced of the need to monitor Sch9 versus Rps6 phosphorylation as a read-out of TORC1 activity-does being a direct substrate truly matter? Regarding comment 5, I wasn't convinced of the need to include Rap-sensitive or -resistant control strains for the analysis in Fig. 5A. And regarding comment 4, while it would be interesting to examine if TORC1 regulates BCAA synthesis during cell cycle progression, this seems to be outside the scope of a demonstration that BCAA synthesis stimulates TORC1.

      Thus, it seems we all agree on certain experiments that need to be carried out, and Ref. #1 has rightly proposed a few others with the potential to strengthen the evidence that Leu production during G1 phase mediates cyclic activation of TORC1.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Blank and colleagues measure the synthesis of various metabolites from glucose during cell cycle progression and observe an increased synthesis of branched-chain amino acids (BCAA) from the early G1 to late G1 phase. Interestingly, they also found a gradual increase in TORC1 activity from the early G1 to the S phase which is proposed to be dependent on BCAA synthesis.

      Major comments:

      1. The authors show that TORC1 activity increases from the early G1 to the S phase. TORC1 activity is sensitive to short-term starvations caused during changing media or centrifugations. Hence, the concern arises regarding the increased pattern of TORC1 activity during the cell cycle. Is it really a biological phenomenon or a cellular adaptation to experimental conditions? Can the authors provide more support for this observation? Can the authors monitor the cells for two cell cycles to confirm that TORC1 activity shows a wavy pattern?

      RESPONSE: The same point was also made by Reviewer #2. As we noted in our response above, “The second cycle comment is not pertinent to our elutriation setup. The two-cycle approach should be used in arrest-and-release synchronizations to minimize arrest-related artifacts when cells continue to grow in size. This is why we used elutriation in the first place, as described in the text, to avoid such artifacts. In elutriations it is the first cycle, exclusively of daughter cells, that can be meaningfully scored. After that, the cells lose synchrony very fast because you have mothers (which grow in size very little) and daughters (which need to double in size until mitosis). Hence, the second cycle will be meaningless and impossible to interpret.”

      The authors use Rps6 phosphorylation as a read-out of TORC1 activity, but Rps6 is not a direct substrate of TORC1. Analysis of the direct substrates of TORC1, such as phosphorylation of Sch9, will solidify the authors' claim.

      RESPONSE: The reviewers discussed this point (see their comments above). We agree with the opinion that Rps6 phosphorylation accurately reports on TORC1 activity (also used in the fly experiments we now cite, as requested by Reviewer 1). For all our experiments' objectives and conclusions, it doesn't matter if the phosphorylation of Rps6 lies more downstream than Sch9 phosphorylation.

      The authors show that strains lacking Bat1 have reduced TORC1 activity. Can the authors restimulate these cells with leucine, valine, and isoleucine individually or in combination to identify the critical amino acid for TORC1 activity?

      RESPONSE: Yes, that is an excellent suggestion. We show the experiment in Figure 4D (see previous response). Valine showed pronounced activation (>10-fold).

      The authors claim that increased BCAA synthesis is necessary for TORC1 activation. Since TORC1 is shown to be upstream of amino acid biosynthesis pathways, it will be interesting to check if TORC1 per se regulates BCAA synthesis during cell cycle progression. The authors could inhibit TORC1 by rapamycin treatment and monitor if the BCAA synthesis still shows cell cycle-dependent modulation.

      RESPONSE: The reviewers also discussed this point (see their comments above). We agree with the view that it is a very substantial undertaking, well beyond the scope of this work.

      In Figure 5A, the use of any rapamycin-sensitive and rapamycin-resistant strains as controls will strengthen their claim of TORC1 inhibition being epistatic to Bat1 deletion, since the rapamycin in minimal media might be less effective.

      RESPONSE: Again, the reviewers also discussed this point (see their comments above). We agree that it will not add much to the conclusions in the context of all the data we show and the existing literature.

      Minor comments:

      1. The data of metabolic labeling, especially various species M1, M2, M3, etc., of an individual metabolite is difficult to understand for the general readers. Hence, a schematic explaining various species might be helpful.

      RESPONSE: We added a new Figure (EV1) delineating the carbons from glucose to valine and leucine.

      Please describe the elutriation approach in more detail with media conditions and buffer conditions to understand the overall experimental setup.

      RESPONSE: We now added this information (see the second section of the Materials and Methods).

      Reviewer #3 (Significance (Required)):

      Significance:

      Overall, this study presents an interesting observation to the researchers working in TORC1 and cell cycle regulation.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      General Statement

      We thank the reviewers for a thorough review that will help us to improve the manuscript in the revision process. In our opinion, all three reviewers found the manuscript interesting, novel, and relevant for a broader readership. The reviewers suggested performing additional analyses of cell quantification from existing brain tissue or from newly generated tissue. All reviewers identified several shared concerns that we are happy to address by additional experiments and analyses to improve our manuscript. The reviewers suggested including the Control Diet + LiPR treatment group to further characterize the effects of LiPR on adult neurogenesis outside the context of the High Fat Diet. The reviewers also suggested building upon the analysis of tanycytes and their proliferation. Some of these analyses will require generating new experimental animals; however, most analyses can be performed from already available brain tissue or previously collected confocal microscope images. Because we had anticipated some of the possible concerns, we already placed mice in the experiment in February 2023. These mice are in the 4-month treatment group of Control Diet + LiPR. We will collect the brain tissue at the end of May 2023 and will analyze it in June and July 2023. In April and May 2023, we will work on analyses from existing tissue or images as described in detail below. We estimate that the suggested analyses are all feasible and should be manageable in 3 months. In fact, we are pleasantly surprised by the favorable nature of the reviews, especially from Reviewers 1 and 3, which allowed us to address around 50% of the comments already, as demonstrated in this revision plan (see section 3). Therefore, we are confident that we will be able to address the remaining concerns to the reviewers' full satisfaction.

      Reviewer 1


      In this manuscript, Jorgensen and colleagues describe their findings on the action of a palmitoylated form of prolactin-release peptide (LiPR) on neural stem cells (NSC) in the adult mouse hypothalamus and adult mouse hippocampus. Their main conclusion is that LiPR can counteract the effects of high-fat diet (HFD) and rescue some of the adverse effects of HFD. Specifically, the authors provide evidence that:

      - Exposure to HFD reduces the number of presumptive adult neural stem cells (NSCs) in the adult hypothalamus, whereas exposure to LiPR reverses this trend.
      - The results suggest that LiPR reduces the proliferation of alpha-tanycytes and/or their progeny in the hypothalamus in the context of HFD, with Liraglutide acting similarly. In contrast, while LiPR also suppresses proliferation in the SGZ, Liraglutide works there in the opposite direction.
      - LiPR also helps the survival of adult-born hypothalamic neurons.
      - Reduction of proliferation by LiPR suggests a model where LiPR increases the number of NSCs presumably by reducing their rate of activation.
      - The results suggest that LiPR promotes expression of PrRP receptors in the hypothalamic neurons, suggesting that PrRP may act directly on such neurons (and tanycytes?) in vivo.
      - The authors also show that HFD and LiPR alter gene expression profiles of the MBH cells, with HFD, but not LiPR, inducing myelination-related genes.
      - Finally, they show that PrRP stimulates an increase in Ca2+ in in vitro-derived human hypothalamic neurons.
      - The authors conclude that LiPR may be reducing activation and proliferation of the hypothalamic stem cells and thereby preserve their pool from exhaustion, which was stimulated by HFD.

      The manuscript presents interesting data and is clearly written. There are several comments, mainly editorial.

      RESPONSE: We thank the reviewer for the favorable and positive assessment of our manuscript and for finding our study to be interesting to a broad audience and well written, with most comments described by the reviewer as “editorial”. Below, we address the reviewer’s concerns in a detailed revision plan.

        • It is unclear why most of the experiments do not include the control+LiPR group. Even though the focus of the study was the action of LiPR in the context of HFD, questions remain regarding the action of LiPR per se. Is LiPR (or Liraglutide, for that matter) completely inactive on the normal diet background, with respect to neurogenesis in the hypothalamus and the hippocampus? Whether the answer is positive or negative, it would give a much better understanding of the action of LiPR - does it regulate neurogenesis in various physiological contexts, or does it only kick in with a particular type of diet? In fact, this was examined (see Supplementary figures), but only for the cells in culture and, when performed with animals, was limited to 7 and 21 days, rather than 4 months, which would have been much more informative.

      RESPONSE: We thank the reviewer for this suggestion. We agree that including the Control Diet + LiPR group for the 4-month HFD group would complement the results from the 7 and 21 days. We will generate this treatment group for the 4-month HFD group and analyze the effect of LiPR on aNSC and adult-generated neurons. These mice in the 4-month treatment have been in the experiment since February 2023, and we plan to analyze their brain sections in June and July 2023.

      1. The question above is also relevant when considering the conclusions on the potential depletion of the stem cell pool (again, whether in the hypothalamus or the hippocampus), particularly at the 4-month time point. The mice are ~6 months old by that time, and neurogenesis in both regions is expected to decrease by that time. Are LiPR or Liraglutide able to suppress or exacerbate this decrease? Can they be used to mitigate this decrease when mice are on a regular diet?

      RESPONSE: This concern will be addressed by analyzing the Control + LiPR mice for the 4-month HFD group (see our response to the point 1 above). We will analyze neural stem cells in the Hypothalamic Ventricular Zone and neural progenitors in the Median Eminence of these mice to address whether LiPR treatment changes the time-dependent decrease in both cell populations.

      • A somewhat related issue is that, in most cases, only the percentage or the density of cells are shown on the graphs, rather than the absolute numbers (at least for some cases). This sometimes complicates the comparisons; for instance, does the surface of the hypothalamus change between 2 and 6 months of age? The tanycytes' number stays, apparently, the same (e.g., Fig. 2) but the production of new neurons is supposed to fall dramatically.

      RESPONSE: We thank the reviewer for this comment. We agree that quantifying the absolute number of cells is the preferable approach, which we have used in our previous publications on subventricular (SVZ) or subgranular (SGZ) neurogenesis. However, hypothalamic adult neurogenesis is dispersed over a much larger volume of tissue than neurogenesis in the SVZ or SGZ, which is confined to narrow tissue compartments. As we do not have access to a confocal microscope with stereological software, absolute quantification in the entire MBH is not feasible. Nevertheless, we believe that our quantification of cell density provides an unbiased and informative approach that allowed us to compare the effects of LiPR and diet on the neurogenic process.

      • The authors write "LiPR may prevent stem cells from exhaustion, induced by HFD" - but it is not clear that HFD indeed leads to exhaustion - there is no statistically significant difference in the number of the stem cells (alpha-tanycytes) between the control and HFD or between HFD at 1, 3, or 12 weeks.

      RESPONSE: We thank the reviewers for their insights. We adjusted the interpretation to better reflect our results. On line 442, we replaced the original statement “The lower cell activation may protect the stem cell pool from exhaustion elicited by the HFD” with a new one, “The lower cell activation may protect the stem cell pool from exhaustion elicited by the HFD”.

      • Numerous papers show that the rate of production of new adult hypothalamic neurons (mainly those derived from beta-tanycytes) drops drastically within the first several weeks of mouse life. Does HFD accelerate, and LiPR mitigate, this decrease? Perhaps one can calculate the numbers from the graphs, but it would help if this is explained in the text of the manuscript. Also, it is not always clear whether specific experiments were performed with the zones of the hypothalamic wall that only contain alpha-tanycytes.

      RESPONSE: Our results show that LiPR rescues the HFD-induced reduction in adult-generated hypothalamic neurons only in the context of 4-month HFD but not in the 7- and 21-day HFD. In the methods (line 877), we specify that “the Region of Interest (ROI) quantified included the MBH parenchyma with the Arcuate (Arc), DMN and Ventromedial (VMN) Nuclei and the Medial Eminence (ME)”. In the results of the revised manuscript (lines 301-303), we highlighted the areas of the ROI. At the request of Reviewer 3 (comment 14), we included new data on quantification of BrdU+ neurons in the Arcuate Nucleus (S.Fig.5O). These data show that 21d HFD increases the number of new neurons in ArcN, which is reversed by LiPR or Liraglutide (text added to results and discussion on lines 309-313 and 468-474, respectively). Finally, in the discussion (lines 464-488), it is stated that HFD and/or LiPR had no effect on the number of new hypothalamic neurons or cells in the MBH parenchyma in the 7- and 21-day groups, and this is discussed in the context of relevant literature.

      • A sharp increase in PCNA+ cells in the hippocampus at the 21-day time point, both in the control and in the HFD and HFD/LiPR groups (Fig. S2f) is a little puzzling because neither the Dcx+ nor the Ki67+ cells show this increase.

      RESPONSE: We agree with the reviewer that this increase in the number of PCNA+ cells is puzzling. Two different people independently quantified the number of PCNA+ cells, and both obtained the same result. Given that this is a minor result in a supplementary figure, we would prefer not to analyze this again unless the reviewer insists on it.

      • The study deals with several agents and several processes; a simple scheme that summarizes authors' conclusions might help to better understand the relationships between those agents and processes.

      RESPONSE: We thank the reviewer for this useful suggestion. We included a summarizing schematic in the revised manuscript as the new Figure 6. We will update the schematic for the final revised manuscript when we incorporate the new analyses.

      **Referee cross-commenting**

      I agree, the lack of the LiPR group complicates the interpretation of the results. I also agree that the experiments with vimentin staining, calcium increase, and even with neurospheres do not add much to the main questions that this study attempts to answer, and I'd rather see a more thorough analysis of the activation and differentiation data. I also want to reiterate that the concept of LiPR/PrRP preventing the exhaustion of the hypothalamic stem cell pool is not clear, because it is not shown that this pool does actually get exhausted under normal or HFD conditions. This latter issue again requires the LiPR-alone group. Also, as a clarification - I wrote about 1 month required to complete the revision assuming that the authors actually have the data on the Control+LiPR group or at least the specimens available, mainly because the supplementary material shows results with this group, at least with the neurospheres. If this group is fully missing, then the effort will obviously take a longer time.

      Reviewer #1 (Significance (Required)):

      The provided evidence suggests, for the first time, that PrRP prevents the loss of the neural stem cell population in the adult hypothalamus that was diminished by obesity and HFD. This finding might be interesting to a broad audience.


      Reviewer 2


      The authors examine the effect of an anorexigenic drug, LiPR, in the context of treatment with high fat diet (HFD) and with a special focus on hypothalamic neural stem/progenitor cells and neurogenesis. The work is mostly based on mice, and a barrage of different techniques (confocal imaging, cell cultures with time lapse, gene expression...) are used. The results are interesting because they address the yet-poorly understood implication of hypothalamic neurogenesis in food intake and energy balance. The results point at complex effects at different levels (neural stem cells, neurons, division, survival...). While the experimental approach is sometimes thorough in its treatment of details, it also lacks consistency, and as a result the conclusions lack strength. There are a number of experiments that sometimes seem unrelated, and this hurts the comprehension of the manuscript, especially in light of the complexity of the results obtained.


      RESPONSE: We thank the reviewer for finding our results interesting and relevant. We will strive to improve the consistency of our results in the revised manuscript to satisfy the reviewer’s concerns.

      1. A major issue is the lack of a LiPR-only group, which would much facilitate the interpretation of the results. The effect of LiPR alone is however tested, but only in comparison with the Control in one of the in vitro experiments (S.Fig. 3).

      RESPONSE: We agree with the reviewer that expanding on the LiPR-only effect would facilitate the interpretation of the results (see concern 1 and 2 of reviewer 2). We want to emphasize, however, that we analyzed the HFD-independent LiPR effects not only in vitro but also in vivo by quantifying the number of BrdU+ cells and neurons in the MBH of mice exposed to 21-day HFD (S.Fig. 5 O-Q) and by including the Control Diet + LiPR in the RNAseq experiment (Fig.5C). Nevertheless, we will analyze the number of alpha tanycytes and proliferating cells for the 21-day Control Diet + LiPR treatment group. And we will generate mice treated with Control Diet + LiPR to complement the 4-month group. In this Control Diet + LiPR group, we will quantify the number of tanycytes and number of BrdU+ cells and neurons.

      2. As plotted, Fig. 1B makes it difficult to interpret the effect of HFD and LiPR; perhaps using percentages and noting the statistical differences, as in the other panels, would help. It looks like HFD has no effect compared to control on weight, and only at the end could LiPR have an effect. On the other hand, after 4 months, HFD mice are clearly above the controls, and it is then, albeit when weight gain has reached a plateau, that LiPR has an effect. The choice of these arbitrary paradigms and their drawbacks has to be better explained.

      RESPONSE: We thank the reviewer for the comment. We analyzed the effect of HFD and/or LiPR on the body weight for the 21-day group (Fig.1B) in the original manuscript (lines 111-115). The two-way, repeated-measures ANOVA revealed no effect of the treatment on the body weight in the 7-day group; however, it revealed an effect of the duration of treatment on the body weight in the 21-day group. As suggested by the reviewer, we included the Control Diet + LiPR in the 21-day group (Fig.1B). We analyzed the data with ANOVA and found that the treatment has a statistically significant effect on the body weight, but without any statistically significant difference between treatment groups (lines 112-116 in the revised manuscript). In addition, we will include the Control Diet + LiPR in the 4-month group.

      Why was the proportion of GPR10+BrdU+MAP2+ cells assessed only in control mice and not in the experimental groups if its expression in overall neurons changes? This suggests that the receptor is expressed in neurons. Interestingly, exposure to 21d HFD reduced density of GPR10, which was rescued by LiPR administration (Fig.1L). Why was this time point chosen and not the longer-term one? What is the consequence of the alterations in the potential number of GPR10, especially in relation to the administration of LiPR? This clarification is important because a 14-day treatment was chosen for the in vitro experiments, in which LiPR, but not HFD, seems to have an effect on cell proliferation. Perhaps it would have been more useful to use a paradigm in which HFD has an effect, to better compare with the in vivo work and for the rationale of the work. "Besides GPR10, we co-localized neuronal cytoskeleton structures with NPFFR2 in the MBH (Fig.1O-P)..." Why were GPR10 and NPFF2 not analyzed in a similar and consistent manner? It is confusing.

      RESPONSE: The proportion of GPR10+BrdU+Map2+ neurons was quantified to address whether new neurons express the PrRP receptor. We chose to analyze the proportion of GPR10+BrdU+Map2+ neurons at the 21d time-point because we had the most robust data for this or related time points in vitro and in vivo. We will emphasize this in the text. But we prefer not to analyze the effect of LiPR on the density or expression of GPR10 or NPFF2 for all time points. We consider this to be beyond the scope and focus of the manuscript.

      The number of GFAP+ α-tanycytes is not significantly changed by HFD therefore LiPR does not rescue, but rather increases the number of GFAP+ α-tanycytes in the 7-day setting. There are no differences among groups later, the effect is lost by 21 days, therefore there is a transient excess of GFAP+ α-tanycytes which later "disappear" in the LiPR group. The authors state that LiPR rescues the decrease in "htNSCs", but after 21 days the number of the GFAP+ α-tanycytes is the same in all groups without the need of LiPR. There is no experimental follow up (addressing proliferation and survival of these cells) and the conclusions stated in the text (results and discussion) are not really supported by the data. The in vitro experiments could be a complement, but are no substitute for the missing in vivo exploration.

      RESPONSE: We thank the reviewer for this comment. We agree that we did not correctly interpret the data. On line 158, we replaced the original statement “This suggests that short LiPR rescues HFD-induced reduction in the number of htNSCs” with a new one that reflects the data correctly: “This suggests that short LiPR increases the number of htNSCs.” In our revision plan, we will quantify the number of proliferating tanycytes to complement our in vitro results.

      • The fact that cell division is "rarely found" (Rax GFAP experiments) also pushes for further investigation. "Nevertheless, we did not observe a statistically different change in the area occupied by Rax+ tanycytes (Fig.2H)." Why did the authors use Rax only for this experiment if it is the "GFAP+ α-tanycytes which are considered the putative htNSCs"? What is the justification for not seeing changes in relation to the results reported in Fig 2D-F? "Because Vimentin is associated with nutrient transport in cells and with metabolic response to HFD 52-54, we quantified the proportion of GFAP+ tanycytes expressing Vimentin (Fig.2F)." It is difficult to see the relevance of the inclusion of the vimentin staining experiment if there is no further exploration. The effect of LiPR is only transient, in the 7-day paradigm, and as the parameter evaluated is the proportion of vimentin+ tanycytes among GFAP+ tanycytes, it could only be reflecting increased expression of the filament.

      RESPONSE: Because Vimentin is a marker of neural stem cells and alpha tanycytes, we quantified the number of GFAP+Vimentin+ tanycytes to complement the quantification of GFAP+ alpha tanycytes. We are sorry that this was not clear, and we highlighted this connection in the revised manuscript (line 165). Because Rax is expressed in alpha tanycytes, we expected that LiPR would increase Rax in the Hypothalamic Ventricular Zone (HVZ). We agree with the reviewer that further investigation may be useful, and we will quantify the number of alpha tanycytes positive for Rax instead of determining only the volume of Rax+ tissue. We will quantify Rax+GFAP+ neural stem cells in the HVZ and Rax+GFAP+ neural progenitors (so-called beta tanycytes) in the Median Eminence to improve characterization of the cell dynamics in vivo.

      • Why is there no Ki67 experiment in the 7-day paradigm if that is the timepoint at which changes in the number or proportion of GFAP+ tanycytes are observed? PCNA was then used, but only in the 21-day paradigm. What is the interpretation and relevance of these data? What are the non-htNSC proliferating cells, whose dynamics are different from the changes in the number or proportion of htNSCs that could be potentially related to changes in mitosis? Again, I think it would be much more useful for the work to explore in detail the changes in the putative htNSCs than to invest in experiments that only add confusion.

      RESPONSE: We apologize if the data presentation is confusing. We will include the quantification of the Ki67+ cells for the 7-day time point. In the MBH, many cell types undergo mitosis, including the oligodendrocyte precursor cells, microglia, astrocytes, and infiltrating macrophages. However, characterizing the identity of all these different cell types in response to the HFD and/or LiPR is beyond the scope of this study. To resolve whether HFD and/or LiPR influence proliferating aNSCs, we will quantify the proliferating cells in the HVZ, which will allow us to separate the proliferating aNSCs from all other proliferating cell types in the MBH.

      • The inclusion of Liraglutide + HFD (not Liraglutide alone) only in some of the experiments is pointless if there is no direct comparison with LiPR and a timepoint is missing. In S.Fig 3, Fig. 5 and S.Fig 7, LFD (low fat diet?) is used on several occasions, as in: "on reducing number of PCNA+ cells in 21d protocol (one-way ANOVA (OWA), F(2,12) = 16.66, p = 0.0003) when compared to both LFD and HFD groups". Is this the control diet?

      RESPONSE: We apologize for the confusion caused by labelling the conditions of the Control Diet inconsistently. In some figures (e.g., Fig.2, S.Fig.3, Fig.4), we labelled the Control Diet as “Control”, whereas in some other figures (e.g., Fig.5, S.Fig.7) we labelled the Control Diet as “LFD” (Low Fat Diet). In all experiments and figures, the used Control Diet was identical. We unified the labelling of the Control Diet in all figures and in the text of the revised manuscript. Respectfully, we do not agree that including the Liraglutide data is pointless. We included the Liraglutide in the context of the HFD as a direct comparison with the HFD + LiPR group to demonstrate that the two anti-obesity compounds exert differential effects on adult neurogenesis. Such comparison has not been done before in analyzing adult neurogenesis and is valuable for better understanding of functions of these anti-obesity compounds.

      • The final experiment shows that application of hPrRP31, a variation of LiPR, causes an immediate calcium increase in human induced pluripotent stem cell-derived hypothalamic nucleus. This finding is interesting in itself because it sheds light on the function of the receptor/s. It would have been very useful to test which of the other receptors reported to bind LiPR mediates the effect. In any case, the focus of the work is the neural stem/progenitor cells responsible for neurogenesis and the changes in their properties caused by HFD and LiPR; therefore I would trade these experiments for a more thorough and detailed dissection of these effects.

      RESPONSE: We thank the reviewer for recognizing the relevance of the experiments with the hiPSC-derived neurons. As described in the comments above, we will conduct additional experiments to address the effect of LiPR on aNSCs and proliferation more thoroughly, as suggested by the reviewer.

      Minor points:

      A. Introduce "GLP-1RA"

      RESPONSE: We thank the reviewer for identifying this omission. We introduced the term in the revised manuscript (line 50).

        • "HFD-induced inflammation and astrogliosis in the hypothalamus 45,46, whereas the long (4mo) protocol leads to DIO" Are these notions exclusive?

          RESPONSE: This statement emphasized that HFD-induced inflammation and astrogliosis precede obesity. We prefer to leave the statement as it is.
        • "LiPR displays no effects on astrocytes" "Displays" is not the correct term.

          RESPONSE: We replaced the term “display” with the word “show” in the revised manuscript (line 342).

      **Referee cross-commenting**

      I think all of us referees agree for the most part. The main concern stated by all of us is the lack of a LiPR-alone group. The rest of the concerns are also related or complementary. In my opinion, the largely shared view of the referees is reassuring.

      Reviewer #2 (Significance (Required)):

      The strengths of the work are its novelty in the field and the variety of techniques employed. The work has the potential of unveiling mechanistic insight into the regulation of neural stem/progenitor cells and neurogenesis. The main audience of this work would be the community working in this field. The lack of experiments testing whether the changes observed actually participate in food intake prevents the work from being of relevance for a broader audience (food intake, energy balance, obesity...). The limitations are the descriptive nature of the work and the lack of a consistent and systematic experimental design that would allow solid conclusions to be drawn upon which to build future research.

      Reviewer 3

      The work of Jörgensen et al. describes the effect of a lipidized analogue of the prolactin-releasing peptide (LiPR) on mouse metabolism in response to high fat diet (HFD) and on hypothalamic and subgranular zone (SGZ) neurogenesis. They conclude that LiPR reduces body weight and improves metabolic parameters affected by HFD, and that it concomitantly stimulates neurogenesis in both niches, the SGZ and the hypothalamus. The link between both effects is not demonstrated. The work is well conducted, the hypothesis is interesting and the experimental approach is adequate. The scope is wide and the results are interesting; however, a few aspects need to be further clarified. The manuscript is well written, although modifying some aspects, such as the use of undefined abbreviations, would facilitate reading.

      RESPONSE: We thank the reviewer for the positive assessment of our manuscript and for recognizing its novelty and importance for the research in neurogenesis, endocrinology, and metabolism. We will strive to clarify and facilitate our conclusions to improve the manuscript.

        • One concern in this study is the experimental groups. The authors analyze three groups: control, HFD, and HFD treated with LiPR. The authors conclude that the effects of LiPR are diet independent. However, given the results obtained by the authors on the effect of LiPR, the main question that arises here is whether LiPR would have an effect on control mice. It seems that a group is missing from the experimental design, in which control mice are treated with LiPR for 7 days, 21 days, and the last two weeks of the 4 months. The authors must include this information or at least justify the choice of experimental design.

      RESPONSE: We thank the reviewer for this insight. We agree that including the Control Diet + LiPR in some of our analyses would improve the revised manuscript, as also noted by Reviewer 2 (comments 1 and 2). In the original manuscript, we included the quantification of BrdU+ cells in the MBH for the Control Diet + LiPR in the 21-day group. To expand on these results, we will quantify the effects of LiPR on alpha tanycytes in the 21-day group. In addition, we will generate Control Diet + LiPR mice for the 4-month group to complement the HFD and HFD + LiPR data.
      1. Body weight, as well as other metabolic parameters, is found to be reduced by LiPR in mice treated with LiPR during the last two weeks of the 4 Mo HFD. However, no effects on hypothalamic or SGZ neurogenesis are observed in this experimental group. How do the authors explain these results?

      RESPONSE: The 4-month group contains animals that are over 6 months old, which display very low levels of cell proliferation and differentiation in comparison with the 7 and 21-day groups that contain mice that are 2 and 2.5 months old, respectively. It is possible that these low levels of neurogenesis did not allow us to detect any pro-neurogenic effects of LiPR. Alternatively, the low neurogenesis in older animals precludes us from detecting the adverse effects of the HFD, which are rescued by LiPR in younger animals.

      • In Figure 1I-K, the images are not clear, and better-resolution images would help.

      RESPONSE: We provided images with higher resolution for Figure 1I-K of the revised manuscript.

      • The authors conclude that LiPR increases the number of NSCs by reducing their activation. However, the authors show an induced increase in htNSCs only in mice fed HFD for 7 days and not in the 21-day or 4-month fed mice (fig 2 d-f). In addition, the authors test for the number of cells expressing Ki67 (fig 2 L); however, the number of Ki67+ alpha tanycytes is not shown.

      RESPONSE: We thank the reviewer for this insight. In the revised manuscript (line 158), we corrected the inaccurate statement: it now says that LiPR increased the number of aNSCs rather than rescued their number, a point also noted by Reviewer 1 (comment 5) and by Reviewer 2 (comment 4). In addition, we will quantify the number of Ki67+ cells in the Hypothalamic Ventricular Zone (HVZ), which will address whether LiPR affects proliferation of aNSCs. This concern parallels comment 6 of Reviewer 2.

      • In figure 2B, it seems that it is the alpha 2 tanycytes that are missing in response to HFD.

      RESPONSE: Indeed, the panel in Figure 2B shows that the HFD reduces the number of alpha tanycytes, including the alpha 2 tanycytes. This representative image supports our quantification results in Figure 2D-E.

      • Are the images in Fig 2 A-C representative of mice fed HFD for 7 days?

      RESPONSE: Yes, the representative images in panels of Fig. 2A-C are from the 7-day group. However, the legend states that these images are from the 21-day group. This is an error that we corrected in the revised manuscript in the legend of Figure 2 (line 572). We apologize for this and thank the reviewer for double-checking.

      • Looking at figure 2B, it seems that the proportion of alpha tanycytes is higher in HFD, since no or very few tanycytes are observed and almost all of them are alpha tanycytes.

      RESPONSE: Indeed, 7 days of HFD reduced the number of alpha 2 tanycytes, which occupy the ventral-lateral aspect of the 3rd ventricle. This reduction of alpha 2 tanycytes drives the lower proportion of GFAP+ alpha-tanycytes out of all GFAP+ tanycytes. We emphasized this in the text of the revised manuscript (lines 435-437).

      • In fig 2 d-f, an increase in the number of GFAP+ alpha tanycytes and their proportion, as well as in those labelled with vimentin, is observed in control mice fed a normal diet for 7 days compared with mice fed a normal diet for 21 days. How do the authors explain this difference?

      RESPONSE: There is no difference in the number of GFAP+ alpha tanycytes or proportion of GFAP+ alpha tanycytes between 7-day and 21-day Control Diet mice. We used the two-way, repeated-measures ANOVA with Bonferroni’s post-hoc test and did not observe any statistical difference between these 2 quantifications for the Control Diet mice at 7 and 21 days. There is a statistical difference between 7-day and 21-day Control Diet mice in the proportion of GFAP+Vimentin+ tanycytes. This could be due to expansion of the Vimentin+ tanycytes in relatively young adult mice. Given that this is not a major point, we prefer not to expand its discussion in the manuscript.

      • In fig 2, why are the differences in Rax, Ki67 and PCNA only present in mice fed HFD for 21 days?

      RESPONSE: We thank the reviewer for this question, which parallels comment 6 of Reviewer 2. To improve consistency of the presented data, we will quantify the proliferating cells also for the 7-day time point. In addition, we will quantify the number of proliferating cells in the HVZ, which will allow us to address whether HFD and/or LiPR alter proliferation of tanycytes.

      • The authors test for adult hippocampal neurogenesis in the three groups. Do the images in fig S2 correspond to the 21-day treatment group?

      RESPONSE: Yes, the representative images in the Supplementary Figure 2 are from the 21-day group. This is stated in the figure legend.

      • In fig S2 C, it seems that in HFD-fed mice treated with LiPR the newly generated neuroblasts are more differentiated. Have the authors looked at DCX+ cell morphology?

      RESPONSE: We thank the reviewer for this observation. We have not analyzed the morphology of DCX+ cells or DCX+ neuroblasts in the SGZ. As the manuscript focuses on the hypothalamic and not hippocampal neurogenesis, we prefer not to analyze the morphology in the revised manuscript.

      • In this same figure, it seems that the number of DCX+ neuroblasts and the number of newly generated neurons are reduced in mice of the 21-day group compared to the 7-day group. Is this statistically significant?

      RESPONSE: We used the two-way, repeated-measures ANOVA with Bonferroni’s post-hoc test to analyze the DCX+ neuroblasts and neurons. We observed a statistically very significant effect of LiPR treatment on the number of DCX+ neuroblasts and neurons (page 10 of the original manuscript). However, Bonferroni’s test did not reveal any difference between the 7-day and 21-day treatment groups.

      • There is a large reduction in the number of DCX+ cells from control 21-day treated mice to control 4-month treated mice. Is this statistically significant? How do the authors explain this dramatic reduction?

      RESPONSE: Yes, there is a statistically significant reduction in the number of DCX+ cells and DCX+ neurons in the SGZ between the 21-day and 4-month groups (S.Fig.2). This reduction is most likely a result of aging. The mice of the 21-day group were around 2.5 months of age when culled, whereas the mice of the 4-month group were over 6.5 months old. The decline in SGZ neurogenesis with age is well documented. Because this decrease in DCX+ cells in the SGZ is an obvious consequence of the animals’ age and because hippocampal neurogenesis is not the primary focus of this manuscript, we prefer not to discuss this feature in the manuscript.

      • The authors do not show the effect of HFD on BrdU+ neurons in the Arcuate. However, all data need to be shown.

      RESPONSE: We stated (on page 12 of the original manuscript) that in the Arcuate Nucleus of the 21-day group, there was “a statistically significant increase of BrdU+ neurons by HFD compared to Control (data not shown)”. To address the reviewer’s comment, we incorporated this data in the S.Fig.5 as the new panel S.Fig.5O and added the following text (lines 309-313) to the revised manuscript: “However, in the ArcN, the primary nutrient and hormone sensing neuronal nucleus of MBH 4, there was a statistically significant difference in number of BrdU+ neurons due to treatment (OWA, F(3,15) = 3.97, p = 0.0029). Exposure to 21d HFD significantly increased the number of BrdU+ neurons in the ArcN, which was reversed by co-administration of LiPR or Liraglutide (S.Fig.5O).” In addition, we adjusted the relevant discussion (lines 468-472): “Our results show that the short and intermediate exposure to HFD does not change the number of newly generated, BrdU+ cells, neurons, or astrocytes in the MBH parenchyma, however, it increases the number of BrdU+ neurons in the primary sensing ArcN, which is reversed by the concurrent administration of LiPR or Liraglutide” and (lines 474-476): “In addition, our results show that while LiPR does not change the number of new cells in the MBH parenchyma, it can rescue the increased production of new neurons in the ArcN in the context of the intermediate HFD exposure.”

      Reviewer #3 (Significance (Required)):

      In general, the manuscript includes a great amount of work to demonstrate the effect of LiPR on neurogenesis (hippocampal and hypothalamic). The scope is wide, and the hypothesis is really interesting. The authors may need to solve some issues in order to completely demonstrate their claims and conclusions, but once the work is done, it will be very valuable for understanding the effect of pharmacological agents used in the field of endocrinology to treat metabolic disorders such as type 2 diabetes. So far, no studies have been done in which the effect of these molecules on SGZ and hypothalamic neurogenesis has been described. Both the field of endocrinology and metabolism and the field of adult neurogenesis may benefit from a study of this type.

    1. One artform we didn’t look at in class but I have found myself interested in lately is that of bonsai. Did you know that any tree can be bonsai? It isn’t a specific type of tree! Bonsai again involves the idea of finding beauty in the natural world. In this case bonsai is also a practice in mindful attentiveness, as it requires one to trim and shape a tree. One must have a vision for the tree and patiently cut and shape the branches so that it conforms to that vision. In most instances a bonsai should have a wide base with large roots and taper as it goes up. The branches should form a triangular shape. By limiting the space in which the tree can grow, i.e., a small pot, the tree will stay miniature in size.

      I did not realize any tree can be Bonsai. I thought that, like Oak, Cherry, Vine, Birch, etc., Bonsai was a kind of tree, where the word Bonsai was some sort of etymology or signified what the tree was. That is interesting, as this may confirm the ideas stated earlier about Wabi and Sabi, along with holding Beauty in the moment: the fact that the Bonsai can be any tree gives it the same fluidity and identity as the Mono No Aware philosophy, I think.

    1. As an interpretive bias, technological determinism is often an inexplicit, taken-for-granted assumption which is assumed to be 'self-evident'. Persuasive writers can make it seem like 'natural' common sense: it is presented as an unproblematic 'given'. The assumptions of technological determinism can usually be easily spotted in frequent references to the 'impact' of technological 'revolutions' which 'led to' or 'brought about', 'inevitable', 'far-reaching', 'effects', or 'consequences' or assertions about what 'will be' happening 'sooner than we think' 'whether we like it or not'. This sort of language gives such writing an animated, visionary, prophetic tone which many people find inspiring and convincing.

      The statement highlights how technological determinism often operates as an implicit, unquestioned assumption presented as "self-evident" or "natural" common sense. Writers can use persuasive language to portray technological determinism as an unproblematic given, making sweeping assertions about the impact and consequences of technology. Such language can create an animated, visionary, or prophetic tone that may be appealing and convincing to many readers. However, it is important to critically examine the assumptions underlying technological determinism and recognize that the relationship between technology and society is complex, multifaceted, and shaped by human agency, values, and social dynamics. Taking a nuanced approach can help avoid deterministic thinking and promote a more thoughtful understanding of the role of technology in society.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to thank all reviewers for their time and effort invested into reviewing our manuscript.

      Please find our responses to your comments, criticisms and suggestions below in blue.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The manuscript by Vishwanatha et al. presents findings on the fission yeast transcription factor Cbf11, which is involved in regulating lipid synthesis. Changes in lipid metabolism often have detrimental effects on nuclear division (evidenced by the high percentage of cut phenotypes among strains with altered lipid content). Here the authors show that cbf11 deletion strains produce additional phenotypes such as changes to cohesion dynamics and altered chromatin modification within centromeric regions, in turn perhaps affecting microtubule attachment and proper chromosome distributions. This hypothesis is supported by the authors' finding of epistatic effects between cbf11 and cohesin loading and unloading.

      Major comments:

      While the evidence presented supports the hypothesis of altered cohesin loading as a major driver of observed mitotic defects, changes in the NE surface area are likely to also contribute to the phenotypes even in pre-anaphase stages.

      • This is an interesting notion. We are only aware of NE overproduction and nuclear “flares” observed upon the Lipin phosphatase dysregulation (PMID 23873576).

      • However, in our case we rather expect NE membrane shortage, not overproduction. Accordingly, we do see that the nuclear cross section area (thus likely also NE surface area) is smaller in cbf11KO compared to WT (see boxplots below). Is this what you are referring to? We are not sure how this would affect the pre-anaphase stages of mitosis.

      Did the authors test any double deletions with regulators involved in decreasing lipid content (e.g. spo7, nem1, ned1) to counteract the role of Cbf11? This could be useful in assessing the relative contribution of cohesion dynamics and histone modifications.

      • We previously published (PMID: 27687771) that cut6/ACC overexpression can indeed partially suppress the cut phenotype in the cbf11KO background. So lipid metabolism does play a role and does contribute to mitotic fidelity. In the current manuscript, we are showing that other factors contribute as well and that defects arise already prior to anaphase, which is not consistent with the simple notion of shortage of membrane building blocks during anaphase. We appreciate your suggestion on testing the relative contributions of these various factors to mitotic fidelity, but we have not tested any of the suggested double mutants.

      A possible role of physical constraints dictated by the NE was already mentioned by the authors in the context of spindle bending and decreased elongation rates and some preliminary experimental data on this would be appreciated. Generation of strains, acquisition of some timelapses, and quantification of spindle elongation rate/buckling frequency should be feasible in a reasonable time frame.

      • Assaying spindle parameters in Lipin-related mutants would indeed be interesting, but again, these are anaphase phenotypes. We are not sure how this is relevant for the pre-anaphase findings we report? Also, we unfortunately no longer have the personnel and capacity to carry out the suggested experiments.

      The authors report mRNA levels of the centromere flanking genes per1 and sdh1 to be increased by 1.5x and decreased by 2x in comparison to WT. Could the authors elaborate on whether this is an expected trend? Kaufmann et al., 2010 reported low transcription of per1 when the surrounding regions are predominantly acetylated. Fig. 4A suggests a slight increase of H3K9ac at per1 and a decrease of transcription would be conceivable.

      • We do not have any particular expectations regarding the expression levels of per1 and sdh1 in our system. We simply note that their expression changes in cbf11KO (in different directions) and this is accompanied by changes in H3K9 acetylation patterns.

      • The increased histone acetylation at the per1 locus that you mention (Kaufmann et al., 2010) was only shown for H4K12ac, while we measured H3K9ac (these marks are deposited by different enzymes). The authors actually report that “The levels of histone H3 at per1 did not change significantly between the two growth conditions and strains”, so we do not think that paper is relevant for our study.

      Fig. 3B indicates a catastrophic mitosis percentage of roughly 9.5% in cbf11∆ while in Fig. 1C 4% of all cells, or ~31% of all mitotic events, is noted as abnormal. Could the authors clarify this discrepancy? Since Fig. 1 utilises time course data of 333 cells (please specify the number of analysed cells also in the legend), would the authors expect this data to be more trustworthy when compared to images of fixed cells? What were the criteria to assign divisions as catastrophic in fixed cells and which features were utilised to identify the 400 cells as mitotic?

      • We typically do see higher proportions of cut cells in fixed samples than in live-cell imaging. We believe this has to do with the different fluorescence readouts for live vs fixed cells. We have added the following explanations to the methods:

      “Please note that the observed frequencies of mitotic defects are not directly comparable between live and fixed cells. Following catastrophic mitosis, the dead cells rapidly lose histone-GFP fluorescence (imaging of live cells), but their DNA can still be visualized with DAPI for a much longer period (imaging of fixed cells), resulting in higher apparent defect frequencies in fixed cells.”

      • Importantly, we always compared cbf11KO to WT grown and processed under the same conditions, and that is how we determined the significance of any defects.

      • Mitotic defects were classified based on nuclear morphology both in live cells (histone signal) and in fixed cells (DAPI): Cells having the cut phenotype, or mis-segregated nucleus = 2 nuclei of different sizes, or septated cells with only one daughter cell having a nucleus, respectively.

      • We have analyzed images of at least 400 cells *in total* from asynchronous populations (interphase + mitotic >= 400). We have modified the figure legend to make this fact more clear. In our experience, this is the standard way of reporting the frequency of mitotic defects in asynchronous yeast cell populations.

      • We have specified the number of cells analyzed in Fig. 1C.
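
The percentages quoted in this exchange use different denominators, which makes them hard to compare at a glance. The sketch below separates the two denominators used for the live-cell data (all cells versus mitotic events only). It uses only the figures quoted in the reviewer's question (333 cells, 4%, ~31%); the rounded counts it derives are back-of-the-envelope assumptions, not values reported by the authors.

```python
# Back-of-the-envelope check of the live-cell percentages quoted above.
# Inputs come from the reviewer's question; derived counts are approximations.
total_cells = 333             # cells followed in the time course (Fig. 1C)
frac_of_all_cells = 0.04      # abnormal divisions as a fraction of all cells
frac_of_mitoses = 0.31        # the same events as a fraction of mitotic events

# Approximate number of abnormal divisions implied by "4% of all cells".
abnormal_cells = round(total_cells * frac_of_all_cells)

# Approximate number of mitotic events implied by "~31% of mitotic events".
implied_mitoses = abnormal_cells / frac_of_mitoses

print(f"abnormal divisions: ~{abnormal_cells} cells")
print(f"implied mitotic events: ~{implied_mitoses:.0f}")
```

Because the fixed-cell frequency appears to be expressed per total cells (interphase plus mitotic), it is the 4% figure, not the ~31% figure, that the roughly 9.5% from Fig. 3B should be set against.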

      Minor comments:

      Previous literature is, to the best of our knowledge, sufficiently referenced. The text is largely clear (some exceptions within the methods section will be elaborated on below). The figures, however, would benefit from graph titles and some minor formatting changes.

      • Figures:

      o Fig. 1: Specify the number of cells analysed in C within the legend as well. For B, please use colourblind-friendly schemes - especially since images are shown as merges only. The example of the "cut" phenotype appears small and crowded by surrounding cells. Especially the latter might affect mitotic fidelity. Under the assumption that this did not affect quantifications (WT seem fine) a less crowded cell would present a nicer example.

      • We have changed Fig. 1 as requested.

      o Fig. 3: Images shown in A add little benefit in their current form. What is the takeaway for the reader?

      • We hope that the reader gets concrete information on cellular and nuclear morphology of the investigated strains, which would be otherwise difficult to reproduce by textual description.

      Indicating that images represent DAPI staining and pointing out cells of interest with arrows/symbols would be helpful.

      • Done.

      The example shown for cbf11 appears to be dimmer in comparison and cell morphology is hard to interpret.

      • The cbf11KO cells stain fainter with DAPI than cells of other strains. We do not know why. To increase the clarity of the image, we have now adjusted the brightness and contrast of the cbf11KO panel (and indicated this adjustment in the figure legend).

      C feels misplaced in this figure and a title could improve readability.

      • We have added a title and moved the panel to Fig. 4 (4D).

      o Fig. 4: Graph titles needed, figure might work better in portrait

      • We have added the required graph titles.

      • We have recreated all ChIP-seq related figures to incorporate new data and to (hopefully) better highlight the differences between genotypes.

      • Text:

      o Mention median duration of mitosis in cbf11∆ (Fig. 2E) in text since WT is already noted;

      • Done.

      o Discussion, third paragraph: "TBZ [REF] and are prone to chromosome loss [...]". I assume this referred to minichromosome loss or have changes in ploidy/chromosome segregation been quantified?

      • Changes in ploidy were indeed not quantified. We have changed the wording to “minichromosome loss”. But please note that the Ch16 minichromosome is derived from regular Chromosome III and is a real chromosome, albeit a small one.

      o Methods, Microscopy and image analysis:

      How were fixed cells imaged (glass bottom dishes, plated on lectin, mounted on slides)?

      Specify the CellR as widefield and provide details of the objective used (immersion and NA)

      • We have added the following information to the relevant Methods section:

      “Cells were applied on glass slides coated with soybean lectin, covered with a glass cover slip, and imaged using the 60X objective of the Olympus CellR widefield microscope with oil immersion (NA 1.4)”

      Elaborate on "manual evaluation of microscopic images"

      • We have extended the description of cell scoring:

      “The frequency of catastrophic mitosis occurrence was determined by manual evaluation of microscopic images using the counter function of ImageJ software, version 1.52p (Schneider et al., 2012). At least 400 cells from the asynchronous populations were analyzed per sample and mitotic defects were scored based on nuclear morphology and septum presence/position.”

      For live cell microscopy, what was the estimated final density of cells within the 5 µl resuspension?

      • Our estimate is 4-8 × 10^6 cells in 5 µl. We have added this information into the Methods.
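
For readers who think in concentrations rather than counts per drop, the stated estimate can be converted to cells per millilitre. This is a minimal unit-conversion sketch; the variable names are illustrative and the only inputs are the two numbers given in the response above.

```python
# Convert the stated estimate (4-8 x 10^6 cells in 5 µl) to cells per millilitre.
volume_ul = 5.0                      # resuspension volume in microlitres
cells_low, cells_high = 4e6, 8e6     # estimated cell count range in that volume

UL_PER_ML = 1000.0
density_low = cells_low / volume_ul * UL_PER_ML     # cells per ml
density_high = cells_high / volume_ul * UL_PER_ML

print(f"{density_low:.1e} to {density_high:.1e} cells/ml")
```

This places the imaged suspension at roughly 10^9 cells/ml.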

      What is meant by measuring the maximum section of plotted profiles? Is this the maximum distance of Hht1 signals within the entire time-lapse?

      • We have changed the description:

      “The nuclear distance was measured using the Hht2-GFP signals: the green channel images were converted to binary, and the maximum distance between the Hht2-GFP signals was measured using the Plot Profile function in ImageJ.”
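
The measurement described in this quote can be pictured as follows. This is a plain-Python sketch of the idea, not ImageJ code and not the authors' pipeline: the function name, the threshold, the pixel size, and the example profile are all hypothetical. It binarizes a one-dimensional intensity profile drawn through both nuclei and reports the distance between the outermost above-threshold pixels.

```python
def max_signal_distance(profile, threshold, pixel_size_um):
    """Distance between the outermost above-threshold pixels of a 1D profile.

    profile       -- intensity values sampled along a line through both nuclei
    threshold     -- intensity cut-off used to binarize the profile (assumed)
    pixel_size_um -- physical size of one pixel in micrometres (assumed)
    """
    # Binarize: keep the positions where the signal is at or above threshold.
    signal_positions = [i for i, value in enumerate(profile) if value >= threshold]
    if not signal_positions:
        return 0.0
    # Maximum extent of the signal along the line, converted to micrometres.
    return (signal_positions[-1] - signal_positions[0]) * pixel_size_um


# Hypothetical profile with two bright nuclei separated by background.
example_profile = [2, 3, 40, 55, 38, 4, 3, 2, 5, 42, 60, 41, 3]
distance = max_signal_distance(example_profile, threshold=30, pixel_size_um=0.1)
print(f"nuclear distance: {distance:.2f} µm")
```

Repeating the measurement at each timepoint and taking the largest value would give the maximum distance over a time-lapse, which is what the reviewer's question is asking about.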

      Was spindle length quantified the same way?

      • We have added the description:

      “Spindle length was quantified by drawing a line along the length of the spindle (using mCherry-Atb2 signals) at each timepoint and measuring the length of the line using ImageJ.”

      Methods, ChIP-qPCR:

      It is not clear which strains were used; this can only be guessed from the use of a GFP antibody, which suggests that GFP-tagged chromatin was precipitated. For readers with expertise outside of ChIP assays, this should be specified.

      • We have listed the used strains in the ChIP-qPCR methods section.

      Reviewer #1 (Significance (Required)):

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      This manuscript presents a novel role for a transcription factor, one typically implicated in lipid metabolism, in chromatin modification and cohesin dynamics, with the possibility of this representing a more conserved process across ascomycetes. The mechanism of cbf11 regulation remains to be determined.

      Place the work in the context of the existing literature (provide references, where appropriate).

      This work helps link two bodies of work related to cell division that are usually considered in isolation, the regulation of lipid dynamics and the control of chromatin dynamics and cohesion. Some comparisons to phenotypes in closely related species would have helped provide a broader context (such as Yam et al., 2011, where the spindle morphologies in S. japonicus and response to cerulenin treatment might be of relevance to the work presented here).

      • We now briefly discuss the semi-open mitosis of Sch. japonicus and the Yam et al. 2011 paper at the beginning of the Discussion.

      State what audience might be interested in and influenced by the reported findings.

      Molecular and cellular biologists with interests in nuclear remodelling, lipid metabolism, kinetochore assembly.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Fission yeast biology, nuclear remodelling, microscopy. We are not qualified to make in-depth comments on the soundness of ChIP-Seq and ChIP-qPCR experiments.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This manuscript describes detailed mechanisms by which the cbf11 deletion showed the phenotype. They found that the cbf11 deletion altered pericentromeric chromatin states such as the level of cohesin and hypermethylation.

      In general, their results are interesting and provide important insights into the relationship between lipid metabolism and chromosome segregation. The presented data are valuable for the community, but the authors should carefully re-assess their data.

      Major comments:

      1. Statistical analyses in some of the figures (Fig. 3B, 3C, 4B and S2) seem somewhat odd, because the p-values are too small for such a small number of experiments (three independent experiments) with large standard deviations. Please show all the data points in Fig. 2C-E, and provide raw values as a supplementary table for assessment of the data.

      • We now show individual data points for all barplots and boxplots and provide all source numerical data as supplementary tables. The details of the used statistical tests are given in the respective figure legends.

      2. Pages 5-6: As for Fig. 4, the data is difficult to interpret because the trends of the ChIP-seq pattern of H3K9me2 between replicates look different: replicate 2 shows an increase of H3K9me2 signal, while replicate 1 shows almost no difference, or only a weak one if any. In such a case, the authors should repeat ChIP-seq one more time and confirm hypermethylation at these regions or confirm it by ChIP-qPCR.

      • We do not agree with this statement. It is true that the exact histone modification patterns are not identical between the two replicates, but this is likely due to the differences in chromatin extract preparation in replicate 1 vs replicate 2 (see Methods). Importantly, both replicates show pronounced differences in H3K9me2 patterns between WT and cbf11KO. We have changed the visualization style to better highlight the differences between WT and mutant (Fig. 4A, Fig. S2B, S3).

      • Also, we have added one more biological replicate for the H3K9me2 ChIP-seq (Fig. S3) and performed the H3K9me2 ChIP-seq also in the Pcut6MUT strain with ~50% decreased expression of the cut6 gene (Cut6/ACC is the rate-limiting enzyme of fatty acid synthesis; cut6 is a target of Cbf11) as 3 biological replicates (Fig. 4A and Fig. S3). Importantly, all replicates of both mutant strains show hypermethylated regions in the centromeres compared to WT.

      Assuming that the pericentromeric regions are hypermethylated by cbf11 deletion, it is still unclear why transcription increased only from the dh, but not the dg, regions although the ChIP-seq data indicated that both dh and dg regions were hypermethylated. A similar question arises for the expression of per1 and sdh1. Both K9Ac and K9me2 modifications seem to be unchanged at both the per1 and sdh1 loci, whereas the expression levels of these loci changed in opposite directions. These results suggest that the transcription levels of the centromeric region are independent of their histone modification states.

      • We do not know why dh expression differs from dg. But note that these are multi-copy repeats and it is very difficult to study individual copies separately. Our expression data, and partly also the ChIP-seq data represent “average” values across all the dh and dg copies present in the genome.

      • Importantly, Figure 4A (and Fig. S2B, S3) show a large piece of the fission yeast chromosome (~57 kbp) and this scale does not allow making informed judgements about the state of histone modifications at a particular promoter locus.

      • When we zoom in, we do see increased and decreased H3K9ac around the TSS of per1 and sdh1, respectively (2 replicates shown).

      • A key question of this study is to understand the relationship between lipid metabolism and chromosome structures. However, the results presented are not enough to address this question. I ask the authors to distinguish whether the defects in pericentromeric regions are mediated by lipid metabolism or are a direct effect of cbf11 deletion. Cbf11 is a transcription factor and can directly bind to DNA, so it is possible that Cbf11 directly modulates the pericentromeric chromatin state without regulating lipid metabolism. This question can probably be addressed. As the authors have shown in their previous study (Prevorovsky et al., 2016), overexpression of cut6, which encodes acetyl coenzyme A carboxylase and is a target of cbf11, can bypass nuclear defects. If the overexpression of cut6 restores the alterations in pericentromeric regions, such as cohesin enrichment and hypermethylation, it would suggest that the defects are a secondary effect of the decrease in phospholipid biosynthesis.

      • We agree that any rescue effects can be direct or indirect. And distinguishing between these two alternatives is unfortunately not straightforward.

      • Our Cbf11 ChIP-seq data do not show Cbf11 binding to centromeres (PMID 19101542), suggesting that any impact of Cbf11 on centromeric chromatin is most likely indirect and mediated by some other, downstream, players.

      • Instead of assaying cut6OE, we now show data that decreased cut6/ACC (a target of Cbf11) expression also leads to changes in histone methylation, similar to cbf11KO (Fig. 4A, Fig. S3). This suggests that lipid metabolism indeed can affect chromatin state (and the chromatin defects in cbf11KO are likely also lipid-related).

      • We have recently shown (Princová et al., 2023, PMID: 36626368) that decreased fatty acid synthesis leads to changes in acetylation and expression of specific stress-response genes in S. pombe, and the whole process involves the histone acetyltransferases Gcn5 and Mst1. Therefore, instead of implicating membrane phospholipids, we rather suggest that lipid metabolism can affect chromatin acetylation/methylation and structure via HATs, potentially through acetyl-CoA, the common substrate of both FA synthesis and HATs. We now mention the Princová et al., 2023 paper in the Discussion section.

      Minor comments:

      1. Figure 3C: The legend says, "Values represent means + SD from 3 independent experiments". Did this mean "means ± SD"?

      • Corrected. Thank you for spotting this.

      2. The relationship between phospholipid synthesis and mitotic fidelity is now discussed in the bioRxiv paper (https://doi.org/10.1101/2022.06.01.494365). It would be nice to discuss this paper.

      • Thank you for pointing out this reference. We now briefly mention this paper as a note that dysregulation of membrane phospholipid synthesis leads to mitotic phenotypes similar to cbf11KO.

      Reviewer #2 (Significance (Required)):

      Faithful chromosome segregation into daughter cells is crucial for cell proliferation. The authors previously reported that the deletion of cbf11, a transcription factor that regulates lipid metabolism genes, causes "cut (cell untimely torn)" phenotype (Prevorovsky et al., 2015; Prevorovsky et al., 2016). In this report, they examined detailed mechanisms by which the cbf11 deletion showed the phenotype, and found that the cbf11 deletion altered pericentromeric chromatin states such as the level of cohesin and hypermethylation. In general, their results are interesting and provide important insights into the relationship between lipid metabolism and chromosome segregation. The presented data are valuable for the community of basic science in the fields of chromosome biology and cell biology.

      We are cell biologists working on chromosomes and the cell nucleus.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The Vishwanatha et al. manuscript examined the nature of the mitotic defect in cbf11 deletion cells. cbf11+ encodes a CSL transcription factor that regulates lipid metabolism genes in S. pombe. Loss of cbf11+ was previously shown to cause a "cut" phenotype, presumably due in part to aberrant regulation of its target gene cut6+, which encodes an acetyl-CoA/biotin carboxylase involved in fatty acid biosynthesis (Zach et al. 2018). The authors hypothesized that the mitotic defect, exhibited as chromosome missegregation in cbf11 deletion cells, may be caused by alterations in cohesin occupancy and H3K9 methylation in centromeres. Cohesin occupancy was slightly higher in centromeric dh and dg repeats in the cbf11 mutant and loss of the cohesin-loader gene wpl1+ appeared to suppress the mitotic defect. The authors also showed by ChIP-Seq that H3K9 methylation was higher in the centromeric regions, and that minichromosome loss was increased in the cbf11 mutant.

      The discovery of increased cohesin occupancy and H3K9 hypermethylation in the centromeric regions of cbf11 deletion cells is novel and interesting. However, the main deficiency of the manuscript is that this discovery is underdeveloped. For example, the evidence linking the mitotic defect phenotype to these two processes was not well supported.

      • We believe that the links have already been well established in the literature. The integrity of centromeric heterochromatin (H3K9me2) is known to be required for mitotic fidelity (eg. Clr4/HMT and Clr6/HDAC mutants with H3K9me2 deficiency have high minichromosome loss and/or show lagging chromosomes during mitosis - PMID: 19556509, PMID: 8937982, PMID: 9755190). Moreover, we stress the known interconnections and provide relevant citations in the Discussion:

      “It is also important to note that heterochromatin, kinetochore function, cohesin occupancy, and gene expression are all interconnected and actually interdependent (Bernard et al., 2001; Folco et al., 2019, 5; Grewal and Jia, 2007; Gullerova and Proudfoot, 2008; Nonaka et al., 2002; Volpe et al., 2002)”

      • We show in the manuscript altered cohesin occupancy in cbf11KO and show that mutations in cohesin loading factors do affect mitotic fidelity of cbf11KO. While we do agree that this connection can be developed further, we believe this is beyond the scope of our current project.

      Moreover, there was no investigation in whether/how Cbf11 regulates cohesin occupancy or H3K9 methylation at the centromeres.

      • This is true. But again, we believe this is beyond the scope of our current project.

      Finally, the title and abstract provided an impression that lipid metabolism may influence cohesin occupancy and histone H3 hypermethylation at the centromeres, but this was not directly studied in the manuscript.

      • We now provide H3K9me2 ChIP-seq data on the Pcut6MUT mutant deficient in fatty acid synthesis to show that lipid metabolism indeed can affect histone methylation at the centromeres (Fig. 4A, Fig. S3).

      Centromeres are regions where sister chromatid cohesion is abolished last in mitosis. The observed higher levels of cohesin occupancy in the centromeric dh and dg repeats of cbf11 deletion cells could be the cause of chromosome missegregation, presumably because there is a delay or hindrance of cohesin removal from sister chromatids in mitosis. However, cohesin occupancy was measured in asynchronous wild type and cbf11 deletion cultures, so it is unknown whether there is a delay of cohesion abolishment in mitosis. A cdc25-22 block and release experiment could better address this hypothesis.

      • We acknowledge these limitations of our findings regarding cohesin occupancy in the paper:

      “Notably, centromeres are the regions where sister chromatin cohesion is abolished last during mitosis (Peters et al., 2008). Since cbf11Δ cells show altered cell-cycle and pre-anaphase mitotic duration compared to WT (Fig. 2), the observed difference in cohesin occupancy might merely reflect these changes in the timing of cell cycle progression. Alternatively, altered cohesin dynamics could play a role in the cbf11Δ mitotic defects.”

      • We agree the issue could be addressed better using synchronous cell populations. However, the cdc25 or cdc10 block-release does not work well in cbf11KO (PMID: 27687771), and we currently do not have the capacity to perform less disruptive forms of cell cycle synchronization.

      The observation that the spindle assembly checkpoint did not influence the mitotic catastrophe phenotype of cbf11 deletion cells suggests that the chromosome missegregation may not be mediated by defects in cohesin dynamics. How does Cbf11 influence cohesin dynamics in mitosis?

      • There are clearly multiple contributors to the mitotic defects observed in the cbf11KO strain and we state this explicitly throughout the manuscript.

      • We agree that it would be interesting in future to know more details about the link between Cbf11 and cohesin, but this is beyond the scope of our current project.

      Does Cbf11 regulate transcription of cohesin genes or indirectly through defects in the centromere or condensins?

      • Expression levels of cohesin and condensin genes are not affected by deletion of cbf11 (PMID: 26366556). We now mention these findings in the Results section.

      There was no direct evidence that H3K9 hypermethylation at the centromeres contributes to the mitotic catastrophe phenotype of cbf11 deletion cells.

      • This is true. However, the importance of H3K9me2 for mitotic fidelity has already been established in the literature (as we mention above).

      It is also not clear whether Cbf11 directly or indirectly influences histone methylation at the centromeres or affects centromere function.

      • When the Cbf11 protein is missing, centromeric histone methylation differs from normal (WT), and centromere function is not normal either: the dh repeats are less expressed, and a minichromosome derived from ChrIII (which therefore has a normal centromere) is lost 9x more frequently. So Cbf11 does affect these processes. The question remains whether Cbf11 does this directly or indirectly. We favor the indirect route, as we have recently shown that H3K9 acetylation or methylation can be affected by shifting the balance between fatty acid synthesis (which is regulated by Cbf11) and histone acetyltransferase activity. We now mention these findings in the Discussion (Princová et al., 2023).

      Based on a substantial number of protein-protein interactions of Cbf11 and gene products that affect chromatin function/silencing at the centromeres from the Pancaldi et al. 2012 study (e.g. HIR complex, Hrp1-Hrp3, Cnp1, Ino80 complex), I am surprised that these candidates were not mentioned in this study or investigated.

      • Unfortunately, no DNase treatment was used during the affinity purification of Cbf11 in the study you mention. Therefore, the list of potential interactors is likely contaminated by irrelevant, DNA-mediated interactions with proteins sitting at nearby loci. This is why we have not pursued these candidates.

      Also, it would be more comprehensive to examine defects in transcriptional silencing in the centromeric regions using an ade6+ or ura4+/FOA marker system rather than measuring expression of per1+ and sdh1+.

      • We agree. We actually tried the ura4/FOA reporter system, but had problems constructing the reporter strains in the cbf11KO background. The resulting clones showed variable levels of FOA sensitivity (see figure of clones OC5-9 below), so we could not get a conclusive answer from this experiment and resorted to measuring the expression of pericentromeric genes.

      Figure 1A shows that the "cut" and nuclear displacement phenotypes are independent. However, cut mutants can also generate a nuclear displacement phenotype [Samejima et al. (1993) J. Cell Sci. 105: 135-143]. Therefore, I am not sure whether the latter phenotype can be treated as entirely independent from "cut" mutants.

      • We have made clarifications to Fig. 1A accordingly.

      Reviewer #3 (Significance (Required)):

      The discovery of increased cohesin occupancy and H3K9 hypermethylation in the centromeric regions of cbf11 deletion cells is novel and interesting. However, the main deficiency of the manuscript is that this discovery is underdeveloped.

      The results of this manuscript would be of considerable interest in the area of cell cycle research, transcription and chromatin structure and function.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary

      In this paper Vishwanatha et al. analyze the mitotic phenotypes of cells lacking a regulator of lipid metabolism Cbf11. They propose that sister chromatid cohesion abnormalities and altered chromatin marks may contribute to the increased incidence of catastrophic mitosis. Additional experiments are required to improve the study and strengthen the authors' conclusions.

      Major Comments

      Both histone and alpha-tubulin tagging are known to aggravate mitotic errors in S. pombe. Before using these markers for live imaging, the authors should quantitate mitotic phenotypes in untagged cbf11∆ cells, as compared to the wild type. Using DAPI and Calcofluor staining (and ideally, also visualizing microtubules using anti-alpha-tubulin antibodies) the authors should measure the percentage of cells in mitosis and the percentage of cells that are going, or just went, through catastrophic mitosis, in asynchronous early-mid-exponential cell populations.

      • We agree that tagging can affect protein function in numerous ways.

      • The tagged versions of tubulin (mCherry-Atb2) and H3 (Hht2-GFP) used in our paper have been obtained from Phong Tran’s lab. These tagged alleles had been published (Nature Communications, PMID: 26031557) and used successfully to monitor mitotic defects including chromosome segregation errors and the cut phenotype.

      • The analyses of mitotic and septation defects of asynchronous untagged cbf11KO cells that you suggest (except for the spindle visualization) were already done by us (PMID: 19101542, PMID: 26366556) and are in agreement with our present study. In brief, we showed that cbf11KO populations contain ~10-30% of cells with mitotic defects (eg. cut), depending on the cultivation conditions. They also show septation defects and altered cell morphology and shorter cell length.

      In analyzing the dynamic of nuclear division, the authors claim that the interval between spindle formation and anaphase onset is "longer" and "more variable" in cbf11∆ cells compared to WT cells. The authors should provide proper statistical analysis of both differences to show that these differences are significant.

      • We now show the required data and statistical testing as Fig. 2H.
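
The test the authors actually used is the one described with Fig. 2H and is not reproduced here. Purely as an illustration of how "longer" and "more variable" can each be tested without distributional assumptions, the sketch below runs an exact permutation test on two groups. The function names and the duration values are hypothetical and are not the authors' data.

```python
from itertools import combinations
from statistics import pstdev


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, statistic):
    """Exact two-sided permutation p-value for statistic(a) - statistic(b).

    Every way of splitting the pooled values into groups of the original
    sizes is enumerated, so this is only practical for small samples.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(statistic(group_a) - statistic(group_b))
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(statistic(a) - statistic(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical durations in minutes (illustrative values only).
wt = [8, 9, 9, 10, 10, 11]
ko = [9, 12, 14, 16, 20, 25]

p_location = permutation_p_value(ko, wt, mean)     # is the interval longer?
p_spread = permutation_p_value(ko, wt, pstdev)     # is it more variable?
print(f"difference in mean: p = {p_location:.4f}")
print(f"difference in spread: p = {p_spread:.4f}")
```

One design consequence worth keeping in mind: with only three values per group there are just 20 possible splits, so an exact two-sided test of this kind cannot return a p-value below 0.1.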

      The same goes for the authors' claim that mitotic duration is "more variable" in cbf11∆ cells compared to WT cells.

      • The spread of values for both WT and cbf11KO is given in Fig. 2G.

      As mentioned above, alternative estimates of possible perturbations of mitotic dynamics could be obtained by measuring the percentage of cells in different mitotic phases in asynchronous untagged cell populations, in order to avoid possible artifacts given by tagging histones and alpha-tubulin.

      • As you mention above, to estimate their cell cycle stage, untagged cells would need to be fixed and stained to visualize the nucleus and septum. However, using fixed cbf11KO cells is not optimal for this purpose. cbf11KO cells have septation and cell separation defects (PMID: 19101542, PMID: 26366556). This results in increased numbers of cells having a (persistent) septum in the asynchronous population, which obscures any estimates of cell cycle stages, and this is why we observed live cells during a timecourse.

      The fact that inactivation of SAC does not change the incidence of catastrophic mitoses shows that SAC is not involved and that there are likely no problems with kinetochore-microtubule attachments. Therefore, the authors' statement "These results suggest that SAC activity only plays a minor role (if any) in the mitotic defects observed in cbf11Δ cells" should be changed.

      • We have changed the sentence to:

      “These results suggest that SAC activity only plays a minor role (if any) in the mitotic defects observed in cbf11Δ cells, or that the defects are not caused by problems with kinetochore-microtubule attachment.”

      Also, the authors' statement in the conclusion that "This indicates that proper microtubule attachment to kinetochores might be compromised and takes longer to achieve in cbf11Δ cells, possibly triggering the SAC" should be changed accordingly or further proof should be provided.

      • This is probably a misunderstanding. We do not conclude that failed microtubule attachment to kinetochores is surely the cause of mitotic defects in cbf11KO. We merely describe our reasoning about structuring the project during its execution. We have rephrased the problematic sentence to improve clarity.

      • We already state in the Discussion that the mitotic defects of cbf11KO may be caused by something completely different from microtubule attachment.

      As pointed out by the authors, cohesin occupancy is affected by the duration of the cell cycle phases. Therefore, the authors should correct their data (Fig. 3C) for the different duration of mitosis or measure cohesin occupancy in mitotically synchronized populations. If this is not possible, I suggest removing this piece of data altogether.

      • We agree (and acknowledge in the paper) that the measurement of cohesin occupancy can be affected by the duration of mitotic phases. However, we do not see a straightforward way of normalizing for mitotic duration, as cohesin occupancy changes differentially at particular chromosomal loci.

      • The suggested experiment of measuring cohesin occupancy in synchronized mitotic cells would likely help. However, as mentioned in our response to Reviewer 3 above, the cdc25 or cdc10 block-release does not work well in cbf11KO (PMID: 27687771), and the heat shock or drugs (eg. spindle poisons) would introduce confounding issues themselves. Unfortunately, we currently do not have the capacity to perform less disruptive forms of cell cycle synchronization.

      • Since we show that mutations in cohesin loading factors can rescue mitotic fidelity of cbf11KO cells (Fig. 3B), we consider the data shown in Fig. 3C relevant. Therefore, we opt to keep Fig. 3C in the paper, and we do point out the potential limitations of these results in the Results section.

      In Fig. 3A it is not clear what the authors mean by "morphological" differences between WT and cbf11∆ cells or between cbf11∆ cells and cbf11∆wpl1∆ cells. The authors should provide clearer images and indicate for each image which cells show morphological defects as an example.

      • We now use arrows to highlight cells with nuclear defects in Fig. 3A.

      • We now state examples of the cbf11KO-associated morphological defects in the text, together with a reference to the paper describing these defects in detail.

      In Fig. 3A many cells in single or double cbf11∆ mutants show increased size typical of diploid cells. The authors should perform flow cytometry to test for possible diploidization in their mutants, as that would clearly affect any conclusions on mitotic defects rescue or enhancement.

      • We previously published that cbf11KO cells show increased tendency for spontaneous diploidization (PMID: 19101542). When constructing cbf11KO strains, we always take care (including flow cytometry tests of DNA content) to exclude purely diploid clones, but the process of spurious diploidization is continuous and there are always diploid cells present in the cbf11KO culture.

      • We mention diploidization as a possible mitotic outcome in cbf11KO cells in the first section of the Results.

      As correctly pointed out by the authors, it is not clear if the increase in mitotic defects in cbf11∆ cells is entirely due to the perturbed lipid metabolism or to other factors being affected by Cbf11. A possible approach to prove this point, as suggested by the authors too, would be to test if the mitotic defects identified in cbf11∆ are common to other mutants of lipid metabolism that also show an increase in catastrophic mitotic events.

      • We now show ChIP-seq data showing that centromeric H3K9 shows aberrant methylation patterns also in a hypomorphic cut6/ACC mutant (Pcut6MUT) (Fig. 4A, Fig. S3).

      • We previously showed that the Pcut6MUT mutation predisposes fission yeast cells to catastrophic mitosis, and the defects manifest when Cut6 function is further weakened by limiting the supply of biotin (cofactor of Cut6) (PMID: 27687771).

      Also, the authors' statement in the conclusion: "we have demonstrated several novel factors, not directly related to lipid metabolism, that affect mitotic fidelity in cells with perturbed lipid homeostasis" should be modified as it was not proven that these effects are not due to altered lipid metabolism.

      • We agree that “it was not proven that these effects are not due to altered lipid metabolism”. However, the emphasis here is on the word “directly”. H3K9me2 and cohesin dynamics are not directly related to the metabolism of lipids. We have changed the phrasing to improve clarity.

      Minor comments

      The initial distinction (Fig. 1A) between "cut" and "nuclear displacement" phenotypes is somewhat confusing, especially since the authors are not investigating the different outcomes of a catastrophic mitosis. The two outcomes should be grouped together under the definition of "catastrophic mitosis" as it is done in the rest of the paper.

      • We have changed Fig. 1A accordingly.

      I do not think I understand the statement that "SAC abolition might actually suppress the mitotic defects of the cbf11∆ cells". The lack of SAC might aggravate defects in kinetochore-microtubule attachment or other aspects of spindle assembly. If the authors know of specific examples where the deletion of mad2 or the genes encoding other SAC components rescued the mitotic defects, they should cite those papers. Either way, this point needs clarification.

      • We already provide an example in the Discussion:

      “Intriguingly, SAC inactivation has been shown to suppress the temperature sensitivity of the cut9-665 APC/C mutant, which is also prone to catastrophic mitosis (Elmore et al., 2014)”

      • We have now included this reference and explanation also at the point in the text that you are referring to.

      Brightfield images in Fig. 1 would be clearer without the overlap of the fluorescence channels. The authors could also change the contrast of the images to highlight the septum.

      • We have changed Fig. 1B as requested.

      The length of spindle (shown in Fig. S1) is a more informative measurement for mitotic dynamics and should be used instead of the "nuclear distance" presented in Fig. 2.

      • This might be true for a successful mitosis. But in case of defects (such as spindle detachment from the chromosomes, regressive merger of the daughter nuclei), these parameters become partially uncoupled and both are informative. We have therefore included the data from Fig. S1 in new Fig. 2C-D.

      Generally, the authors could improve the data visualization by including in all the plots the distribution of single data points along with the mean/median and error bars, as was done in Fig. 2C, D, E.

      • Done.

      Reviewer #4 (Significance (Required)):

      The paper expands the knowledge on Cbf11, a still poorly characterized regulator of lipid metabolism. The idea that in addition to nuclear membrane limitation, perturbations of lipid metabolism might cause mitotic chromosome dynamics defects (for instance, through changing the protein acetylation levels), is interesting, but the authors should strengthen their conclusions by performing controls and further experiments.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #4

      Evidence, reproducibility and clarity

      Summary

      In this paper Vishwanatha et al. analyze the mitotic phenotypes of cells lacking Cbf11, a regulator of lipid metabolism. They propose that sister chromatid cohesion abnormalities and altered chromatin marks may contribute to the increased incidence of catastrophic mitosis. Additional experiments are required to improve the study and strengthen the authors' conclusions.

      Major Comments

      Both histone and alpha-tubulin tagging are known to aggravate mitotic errors in S. pombe. Before using these markers for live imaging, the authors should quantitate mitotic phenotypes in untagged cbf11∆ cells, as compared to the wild type. Using DAPI and Calcofluor staining (and ideally, also visualizing microtubules using anti-alpha-tubulin antibodies), the authors should measure the percentage of cells in mitosis and the percentage of cells that are going, or just went, through catastrophic mitosis, in asynchronous early-mid-exponential cell populations.

      In analyzing the dynamics of nuclear division, the authors claim that the interval between spindle formation and anaphase onset is "longer" and "more variable" in cbf11∆ cells compared to WT cells. The authors should provide proper statistical analysis of both differences to show that they are significant. The same goes for the authors' claim that mitotic duration is "more variable" in cbf11∆ cells compared to WT cells. As mentioned above, alternative estimates of possible perturbations of mitotic dynamics could be obtained by measuring the percentage of cells in different mitotic phases in asynchronous untagged cell populations, in order to avoid possible artifacts caused by tagging histones and alpha-tubulin.

      The fact that inactivation of SAC does not change the incidence of catastrophic mitoses shows that SAC is not involved and that there are likely no problems with kinetochore-microtubule attachments. Therefore, the authors' statement "These results suggest that SAC activity only plays a minor role (if any) in the mitotic defects observed in cbf11Δ cells" should be changed. Also, the authors' statement in the conclusion that "This indicates that proper microtubule attachment to kinetochores might be compromised and takes longer to achieve in cbf11Δ cells, possibly triggering the SAC" should be changed accordingly or further proof should be provided.

      As pointed out by the authors, cohesin occupancy is affected by the duration of the cell cycle phases. Therefore, the authors should correct their data (Fig. 3C) for the different duration of mitosis or measure cohesin occupancy in mitotically synchronized populations. If this is not possible, I suggest removing this piece of data altogether.

      In Fig. 3A it is not clear what the authors mean by "morphological" differences between WT and cbf11∆ cells or between cbf11∆ cells and cbf11∆wpl1∆ cells. The authors should provide clearer images and indicate for each image which cells show morphological defects as an example.

      In Fig. 3A many cells in single or double cbf11∆ mutants show increased size typical of diploid cells. The authors should perform flow cytometry to test for possible diploidization in their mutants, as that would clearly affect any conclusions on mitotic defects rescue or enhancement.

      As correctly pointed out by the authors, it is not clear if the increase in mitotic defects in cbf11∆ cells is entirely due to the perturbed lipid metabolism or to other factors being affected by Cbf11. A possible approach to prove this point, as suggested by the authors too, would be to test if the mitotic defects identified in cbf11∆ are common to other mutants of lipid metabolism that also show an increase in catastrophic mitotic events. Also, the authors' statement in the conclusion: "we have demonstrated several novel factors, not directly related to lipid metabolism, that affect mitotic fidelity in cells with perturbed lipid homeostasis" should be modified as it was not proven that these effects are not due to altered lipid metabolism.

      Minor comments

      The initial distinction (Fig. 1A) between "cut" and "nuclear displacement" phenotypes is somewhat confusing, especially since the authors are not investigating the different outcomes of a catastrophic mitosis. The two outcomes should be grouped together under the definition of "catastrophic mitosis" as it is done in the rest of the paper.

      I do not think I understand the statement that "SAC abolition might actually suppress the mitotic defects of the cbf11∆ cells". The lack of SAC might aggravate defects in kinetochore-microtubule attachment or other aspects of spindle assembly. If the authors know of specific examples where the deletion of mad2 or the genes encoding other SAC components rescued the mitotic defects, they should cite those papers. Either way, this point needs clarification.

      Brightfield images in Fig. 1 would be clearer without the overlap of the fluorescence channels. The authors could also change the contrast of the images to highlight the septum.

      The length of spindle (shown in Fig. S1) is a more informative measurement for mitotic dynamics and should be used instead of the "nuclear distance" presented in Fig. 2.

      Generally, the authors could improve the data visualization by including in all the plots the distribution of single data points along with the mean/median and error bars, as was done in Fig. 2C, D, E.

      Significance

      The paper expands the knowledge on Cbf11, a still poorly characterized regulator of lipid metabolism. The idea that in addition to nuclear membrane limitation, perturbations of lipid metabolism might cause mitotic chromosome dynamics defects (for instance, through changing the protein acetylation levels), is interesting, but the authors should strengthen their conclusions by performing controls and further experiments.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      In this paper, the authors present convincing experimental proof of why the BH3-only protein PUMA resists displacement by BH3-mimetics, while others such as tBID do not. Using a SMAC-mCherry based MOMP assay on isolated mitochondria, FRET in the presence of liposomes with a phospholipid composition similar to that of mitochondria, as well as quantitative fast fluorescence lifetime imaging microscopy (Förster resonance energy transfer - qF3), they show that the C-terminal region of PUMA (CTS), together with its BH3-domain, effectively "double-bolt" locks its interaction with BCL-XL and BCL-2 to resist displacement by the BCL-XL-specific BH3-mimetic A-1155463 or the BCL-2/BCL-XL inhibitors ABT-263 and AZD-4320. Although a similar mechanism has previously been published for BIM, the novel C-terminal binding sequence in PUMA is unrelated to that in the CTS of BIM and functions independently of PUMA binding to membranes. First, in contrast to BIM, PUMA contains multiple prolines and charged residues, and an unusually short span of hydrophobic amino acids. Second, full-length PUMA was more resistant to BH3-mimetic displacement than a PUMA mutant lacking the CTS (PUMA-d26) even in solution, suggesting that the CTS of PUMA contributes to BH3-mimetic resistance even in the absence of membranes.

      The second, quite unexpected finding of this paper is that, in contrast to previous publications, the CTS of PUMA does not target the protein to mitochondria but to the ER. The authors show this by FLIM-FRET imaging and confocal microscopy, and they created mutants to identify the CTS residues (I175 and P180) that mediate binding to ER membranes.

      The authors did an excellent job to show the mechanism of displacement resistance of PUMA from BCL-2 survival factors from different angles (in vitro, on isolated mitochondria, liposomes and inside living cells), generating respective BH3 and CTS mutants and also domain swaps with other BH3-only proteins such as tBID. Also, the unexpected finding that PUMA primarily localizes to the ER has been extensively scrutinized and the data presented are convincing.

      Response:

      We appreciate the favourable comments and that the reviewer found the data presented convincing.

      Major comments:

      I have only three questions which I would like the authors to address before this MS can be published.

      1) How can PUMA perform its pro-apoptotic action on MOMP from its site on the ER? Does PUMA eventually localize to MAMs (mitochondrial/ER contact sites)? Is it possible to co-IP PUMA with BCL-XL or BCL-2 from ER membranes or show such an interaction inside cells with PLA?

      Response:

      The reviewer raises an important point. One of the main conclusions from this paper is that the primary localization of exogenously expressed PUMA is at the ER. Our intent was to highlight the inherent specificity of the PUMA CTS sequence. However, we agree that identifying the localization of PUMA-BCL-XL complexes would add significantly to the manuscript. We carefully considered using co-IP or a proximity ligation assay (PLA) in order to investigate the localization of PUMA-BCL-XL complexes. In our experience, co-IP results are very difficult to interpret due to the well-characterized detergent-induced artifacts previously shown for BCL-2 family protein interactions (PMID: 9553144, PMID: 33794146). Moreover, PLAs are a proximity assay with a detection range of ~>20nm, and are difficult to quantify beyond enumerating frequency (i.e., counting spots). In contrast, the detection of FRET by fluorescence lifetime imaging microscopy (FLIM) is very sensitive to distance with a maximum that is <10nm, and the results can be interpreted quantitatively as apparent dissociation constants (manuscript Figures 2-3). Therefore we elected to use FLIM-FRET to address this question. We examined PUMA-specific interactions with BCL-XL at the ER and mitochondria by differentially segmenting the FLIM-FRET image data based on the signal from a mCherry-fused landmark expressed at the ER (mCherry-Cb5) or mitochondria (mCherry-ActA). This approach has similar spatial resolution to PLA yet retains the more rigorous requirement for proximity and the quantitative interpretability of FLIM-FRET.
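      The distance argument above can be illustrated with the standard Förster relation, E = 1 / (1 + (r/R0)^6), together with the usual lifetime-based estimate, E = 1 - tau_DA / tau_D. The sketch below is only an illustration: the Förster radius of 5 nm is an assumed, typical order of magnitude for a fluorescent-protein pair, not a value measured for the constructs in the manuscript, and the function names are hypothetical.

```python
def fret_efficiency(r_nm: float, r0_nm: float = 5.0) -> float:
    """Forster relation: E = 1 / (1 + (r / R0)^6).

    r_nm  -- donor-acceptor distance in nm
    r0_nm -- Forster radius in nm (ASSUMED 5 nm, illustrative only)
    """
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def efficiency_from_lifetimes(tau_da_ns: float, tau_d_ns: float) -> float:
    """Lifetime-based estimate: E = 1 - tau_DA / tau_D.

    tau_da_ns -- donor lifetime in the presence of the acceptor (ns)
    tau_d_ns  -- donor lifetime without the acceptor (ns)
    """
    if tau_d_ns <= 0 or tau_da_ns < 0:
        raise ValueError("lifetimes must be positive")
    return 1.0 - tau_da_ns / tau_d_ns


if __name__ == "__main__":
    # Efficiency falls off with the sixth power of distance.
    for r in (2.5, 5.0, 10.0, 20.0):
        print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r):.4f}")
```

      Under this assumed R0, the efficiency is 0.5 at 5 nm and falls to about 1/65 at 10 nm, which is why a FRET signal implies a much tighter proximity than an assay that reports on separations of tens of nanometres.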

      For these experiments we used a recently described method of mitochondrial image segmentation using hyperspectral image data collected during FLIM-FRET imaging (Osterlund et al., 2023). In this approach, a watershed segmentation algorithm was used to identify mitochondrial areas from mCherry-ActA images collected simultaneously with the FLIM data. The ER was identified in separate samples using the same approach with mCherry-Cb5 image data. Simultaneous collection of the images ensures that the data are not affected by movement within the cells. Example images showing the segmentation results for each organelle have been added to the manuscript as Figure 4 - Figure Supplement 2A.

      The results of this FLIM-FRET experiment, described in the text (lines 581-598), revealed that VPUMA interacts with CBCL-XL within both ER and mitochondria-segmented ROIs (new Figure 4 - Figure Supplement 2B). These results can be explained by the fact that VPUMA is targeted to the ER, and BCL-XL is known to localize to the ER and mitochondria when bound to BH3 proteins in cells (Kale et al., 2018, PMID: 29149100). This result is similar to what we reported for BIK, another ER-localized BH3 protein that exerts its pro-apoptotic function from ER membranes (PMID: 11884414 and PMID: 15809295). Our recent data for ER-localized BIK binding to mitochondria-targeted BCL-XL suggest that, as the referee proposed, binding occurs via a membrane-spanning interaction at MAMs (ER-mitochondria contact sites) and/or via relocalization of BIK and/or BCL-XL in response to their co-expression (Osterlund et al., 2023). Consistent with these interpretations, when expression of endogenous PUMA was upregulated in response to stress (Figure 4 - figure supplement 3A-B), the amount of PUMA increased at both ER and mitochondria (Figure 4 - figure supplement 3C). We have presented these data and this interpretation on lines 599-621 and discussed the localization results and the similarity to BIK in the manuscript discussion, lines 1029-1035.

      2) Since PUMA seems to be "double-bolt" locked to BCL-2 or BCL-XL via its BH3-domain and CTS, how can it act as a pro-apoptotic inducer? Is its main function to act as an inhibitor of BCL-2 and BCL-XL rather than a direct BAX/BAK activator? And if it acts as a BAX/BAK activator, how can it be released from BCL-2/ BCL-XL, for example by another BH3-only protein which is induced by apoptosis stimulation? Or would in this case PUMA remain bound to BCL-2/ BCL-XL in order to activate BAX/BAK (which would be a kind of new activation mechanism)?

      Response:

      We appreciate the reviewer's queries and have clarified the text to indicate that our interpretation is that by binding to BCL-XL, PUMA releases active BAX that is sequestered by BCL-XL (as shown in Figure 1A for purified proteins). Double-bolt locking increases both the affinity and avidity of PUMA for BCL-XL, enabling competition to favor PUMA binding and displacement of sequestered BAX. To further address the reviewer's point we added two additional experiments, now shown in figure supplements to Figure 1. The data shown in new Figure 1 – figure supplement 1A (described on lines 182-191 of the revised manuscript) demonstrate that PUMA kills HCT116 and BMK cells but not HEK293 cells. New Figure 1 – figure supplement 1B shows that inhibition of BCL-XL and MCL-1 using BH3 mimetics is sufficient to kill HCT116 and BMK cells, while HEK293 cells are not killed by even high concentrations of these BH3 mimetics. Killing HEK293 cells requires activation of BAX (described on lines 191-201). Together, these data indicate that the primary pro-apoptotic function of PUMA is inhibiting BCL-XL and MCL-1 rather than activating BAX. These data fit very well with PUMA double-bolt locking resulting in very tight binding of PUMA to BCL-XL and likely MCL-1 as the primary mode of PUMA-mediated induction of cell death, at least in the three cell lines investigated here. The importance and role of PUMA-mediated BAX activation is an interesting area of active investigation that is beyond the purview of the current paper.

      3) Is PUMA still bound to the ER when it is transcriptionally induced by genotoxic stress? In this case, the extra amount of PUMA produced is supposed to directly activate BAX/BAK. Does it do this on the ER or on mitochondria?

      Response:

      The referee raises a very interesting point.

      Interestingly, Zheng et al., 2022 highlighted a P53-dependent death response to genotoxic stress, which results in the extension of peripheral, tubular ER and promotes the formation of ER-mitochondria contact sites (PMID: 30030520). Furthermore, PUMA is transcriptionally activated by P53 (PMID: 17360476). Therefore, we hypothesized that the induction of PUMA would increase the fraction of PUMA at ER membranes and MAMs. As the latter resemble mitochondria in micrographs of cells, we anticipated an increase in apparent mitochondrial localization. To address this question experimentally, we treated MCF-7 cells with genotoxic stress and ER stressors and tracked the expression of endogenous PUMA by immunofluorescence. The results are described in the manuscript (lines 603-613, page 28) and shown in Figure 4 - figure supplement 3.

      The immunofluorescence data confirmed that PUMA protein levels increase after genotoxic stress, as expected (References 39, 40 in the manuscript), and to a lesser but still significant extent after ER stress (Figure 4 - figure supplement 3A and B). In response to stress the amount of PUMA increased at both ER and mitochondria; however, in unstressed cells the endogenous PUMA co-localized more with the mitochondria than with the ER (Figure 4 - figure supplement 3C). This suggests that, similar to BIK, the localization of PUMA is dynamic. In particular, the abundance and localization of PUMA binding partners such as BCL-XL also affect PUMA localization (the new data are described on pages 27-28, lines 591-621). As described above, the extra PUMA induced by genotoxic stress can indirectly activate BAX by binding BCL-XL and displacing sequestered activated BAX. Our FLIM-FRET data suggest PUMA can bind BCL-XL at both the mitochondria and the ER. Moreover, given the expansion of ER-mitochondrial contact sites that occurs during stress, we cannot rule out the possibility that ER-localized PUMA can inhibit mitochondria-localized anti-apoptotic proteins (both BCL-XL and MCL-1) at the ER (for BCL-XL) and at MAMs (for both proteins).

      Reviewer #1 (Significance):

      Very significant contribution to the field. Quite novel

      Reviewer #2 (Evidence, reproducibility and clarity):

      This study by Pemberton and colleagues investigates interactions of pro-apoptotic PUMA with anti-apoptotic BCL-2 proteins, employing a variety of BH3-mimetics. The authors demonstrate that the PUMA/aa BCL-2 interactions are mediated not only via BH3-domain/groove interactions, but are also dependent on a C-terminal sequence of PUMA. This mirrors (with distinct differences) what the authors have previously reported for BIM. They then reveal that, unexpectedly, PUMA often localises to the ER (as opposed to mitochondria), though this localisation is not important for the resistance of PUMA/BCL-2 complexes to BH3-mimetic treatment; the authors speculate that ER-localised PUMA may have a day job.

      In my opinion, the study is important for several reasons. Not least, it strongly argues that BH3-mimetics are not optimal (in themselves) to promote apoptosis dependent on PUMA, and that approaches to disrupt the "double-lock" mechanisms should be sought. This has clear clinical importance, but equally important is that it adds a new layer of complexity to how BCL-2 family members "work"; how the double-lock mechanism is overcome in physiological apoptosis remains an open question, for instance. The data support the authors' conclusions; I have a few points that could be addressed.

      Response:

      The positive comments from the reviewer are greatly appreciated.

      1 - The authors' data in cells are consistent with a membrane recruitment effect of the PUMA CTS making a contribution to the resistance of PUMA/aa BCL-2 complexes to BH3-mimetics. What I found really intriguing is that the CTS also influences affinity in the absence of membranes (Figure 1). Could the authors speculate why they think the CTS may be affecting PUMA/aa BCL-2 binding in the absence of membranes?

      Response:

      We agree with the reviewer that membrane binding contributes to BH3 mimetic-resistant binding of PUMA to BCL-XL, consistent with elegant data presented previously (Pécot et al., 2016; PMID: 28009301). However, we show in Figure 5D that mutants of VPUMA-d26 with restored membrane binding (VPUMA-d26-ER1 and VPUMA-d26-ER2) remain sensitive to BH3-mimetic displacement, indicating that membrane binding alone is not sufficient to confer resistance to BH3-mimetics. Furthermore, as the reviewer pointed out, BH3 mimetic-resistant binding is observed in the absence of membranes (Figure 1).

      The data using purified proteins strongly suggest that the CTS of PUMA binds to BCL-XL and is directly involved in the protein-protein interaction. The fact that PUMA with the C-terminal fusion to the fluorescent protein Venus (PUMAV) still localizes to membranes in live cells (Figure 4 D,E) suggests that the C-terminus of PUMA does not span the membrane bilayer. Instead, we hypothesize that the C-terminus of PUMA binds peripherally to the membrane, making it available to physically contribute to a protein interaction with anti-apoptotic proteins. This interpretation is consistent with the low hydrophobicity and high proline content (6 of 28 residues) of the amino acid sequence of the PUMA CTS as shown in Figure 6 and compared to the transmembrane tail anchor sequences of other proteins, including the BH3-protein BIK, in Figure 5 supplement 1. Binding of BCL-XL by both the BH3 region and CTS of PUMA would increase both the affinity and avidity of the interaction. The presentation of these data has been revised to add clarity on page 8, lines 215-223 and in the discussion (lines 988-997 and 1044-1050).

      2 - A minor point for clarification: are the mitochondria used in Fig 1A from BAX/BAK DKO cells? I had presumed so given exogenous BAX was added, but didn't note this in the text.

      We indeed use mitochondria from BAX/BAK DKO cells and exogenous recombinant BAX in Figure 1A. This has now been added to the text on lines 166-180.

      Reviewer #2 (Significance):

      detailed in report above

      Reviewer #3 (Evidence, reproducibility and clarity):

      In this paper, Pemberton et al. show that PUMA resists BH3-mimetic mediated displacement from BCL-XL via a novel binding site within its C-terminus, termed the CTS (the last 26 aa). Interestingly, the CTS of PUMA directs the protein to the ER membrane, and residues I175 and P180 within the CTS are required for both ER localization and BH3-mimetic resistance.

      Specific comments:

      1 - BH3-mimetics kill cells by displacing sequestered pro-apoptotic proteins to initiate apoptosis. However, PUMA resists BH3-mimetic mediated displacement, and PUMA-d26 and PUMA I175A/P180A (CTS) do not. Thus, are these mutants sensitive to BH3-mimetic cell killing? In other words, do BH3-mimetics kill PUMA-/- cells that express either PUMA-d26 or PUMA I175A/P180A but not PUMA-/- cells that express wild type PUMA?

      Response:

      The reviewer raises a very interesting question that unfortunately we have been unable to address unambiguously. Answering this question requires separating the effects of PUMA on anti-apoptotic proteins and on activation of BAX and BAK, as exogenous expression of either PUMA-d26 or PUMA I175A/P180A is sufficient to kill PUMA-/- cells without the addition of a BH3 mimetic. To date we have been unable to identify mutants that inhibit anti-apoptotic proteins but that do not activate BAX and BAK, as both PUMA-d26 and PUMA I175A/P180A have impaired BAX-activation function. This is additionally complicated by PUMA-mediated inhibition of MCL-1, BCL-2 and BCL-W. Further, it isn’t possible to separate the function(s) using BAX/BAK knock-out cells because then PUMA-induced cell death is completely abrogated. Understanding the direct activation of BAX by PUMA is an area of current investigation that is out of the scope of this paper, as here we are focused on the interaction(s) of PUMA with anti-apoptotic proteins.

      2 - The authors elegantly demonstrate using microscopic analysis that overexpressed PUMA mostly localizes to the ER membrane. Since this is a major conclusion of the paper which differs from what was previously reported, the authors should confirm these findings using sub-cellular fractionation followed by Western blot analysis. They should demonstrate that endogenous and overexpressed PUMA are mainly localized to the ER membrane and that PUMA-d26 and PUMA I175A/P180A are mainly localized to the cytoplasm.

      Response:

      We appreciate that the reviewer found the microscopic analysis convincing. We also tested the idea of sub-cellular fractionation proposed by the reviewer. However, we have found it to be very difficult to separate mitochondria and MAMs. To address the question raised, we instead performed new co-localization experiments, in addition to those reported for PUMA-d26 and the point mutants in Figure 6 (images in Figure 6 - figure supplement 3). The new experiments are for endogenous PUMA at steady state and with increased expression in response to stress. These immunofluorescence experiments are reported in Figure 4 - figure supplement 3. We also added FLIM-FRET experiments in which ROIs were derived from areas of the cell enriched in either ER or mitochondria (Figure 4 - figure supplement 2). The results of these experiments indicate that PUMA localization is dynamic and are described in detail above in response to reviewer 1 question 3 and in the manuscript from line 579 to 621 and discussed on lines 1029-1036.

      Reviewer #3 (Significance):

      The advance in this paper is significant and the paper should be published once the specific comments are adequately addressed.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      RC-2022-01805

      We thank all reviewers for their careful analysis of our manuscript, constructive suggestions and support of our work.

      Reviewer 1

      The authors show that proximity of early mouse embryo blastomere chromosomes to the cell cortex activates the Polar Body Extrusion pathway to generate cell fragments. The authors use live cell imaging in control and Myo1C and dynein knockdown embryos to document accumulation of actin and myosin near chromosomes that come in close proximity to the cell cortex, which correlates with the increased fragmentation of the mutant blastomeres. The live imaging data are nicely presented and the results are well quantified. I have two major comments, and some minor comments on clarity, for the authors' consideration in revising the manuscript.

      Major comment:

      1. The authors imply that Myo1 and dynein knockdowns result in an increase in the number of cells where chromosomes come in close proximity to the cell cortex. Apparently the spindle anchoring defects are meant to indicate that such defects are responsible for the increased frequency of abnormal chromosome proximity to the cortex. But the authors never actually document whether chromosomes in fact do come into proximity to the cortex more often in the mutant than in control embryos. The authors should clarify if they think the spindle anchoring defect does result in abnormal chromosome distributions. Can the authors somehow quantify a defect in overall chromosome positioning in mutant vs control blastomeres? Presumably the movies the authors already have could be used to provide such quantification?

      We thank the reviewer for this opportunity to correct our previous assumptions. Following the reviewer’s suggestion, we tracked the distance between the cell surface and the center of the chromosome cluster throughout mitosis. We found little difference in this distance between control and Myo1cKO embryos (Fig S3a), unlike what we had initially implied. This distance seemed more variable in Myo1cKO embryos than in control ones, suggesting that chromosome movements may be more erratic, but analysis of this variation for individual cells did not show consistent differences between control and Myo1cKO embryos either (Fig S3b). Therefore, we cannot explain the increased signaling by differences in proximity of the chromosomes to the cortex during mitosis.

      Instead, as already hinted in our initial manuscript as an additional factor, we find that signaling from chromosomes to the cortex can occur for an extended time in embryos with impaired spindle anchoring.

      We had already measured that mitotic spindles persisted for a longer time in Myo1cKO embryos than in control ones (Fig 2b), as well as in ciliobrevin-treated embryos as compared to DMSO-treated ones (Fig S2b). To strengthen these data, we performed additional experiments in which we injected mRNA encoding fluorescent lamin-associated protein 2b (Lap2b-GFP) to track the breakdown and reassembly of the nuclear envelope. Consistent with the mitotic spindle persisting for a longer time in Myo1cKO embryos than in control ones, it generally takes more time for Myo1cKO embryos to reassemble their nuclear envelope than for control embryos (50 min in control vs 70 min in Myo1cKO embryos, n = 8 control and 15 Myo1cKO embryos, p = 0.0161, Fig S3c-d, Movie 5). Taken together, the nuclear envelope and spindle data indicate that, although chromosomes are not closer to the cortex in Myo1cKO embryos than in control ones, they spend more time outside of the nucleus. This should give chromosomes extended opportunities to signal to the cortex and explains how difficulties with chromosome separation can lead to the hyper-activation of the polar body extrusion pathway.

      We have revised our manuscript accordingly.

      Near the end of the paper, the authors discuss how cells with bent/un-anchored spindles are more prone to fragmentation, referring to Figure 2. But Figure 2 does not document a correlation between blastomeres with bent spindles and increased fragmentation. Rather, it shows an increase in bent spindles and in fragmentation in mutant vs control, but does not show that they occur together. The authors should more accurately describe their results or provide such a correlation with additional data.

      We thank the referee for pointing out this missing information.

      To support our conclusions, we now provide additional analyses of mitosis duration in non-fragmenting and fragmenting cells from Myo1cKO embryos. When cells fragment, their mitosis is consistently longer, as measured from the persistence of the mitotic spindle, than when not fragmenting (Fig 2c). This provides a direct correlation between spindle defects and fragmentation.

      We now present these analyses in the revised manuscript.

      Finally, in describing the data in Figure 3, the authors refer to persistence of the spindle and bending of the spindle as indicating problems with anchoring. It is not clear to me how either spindle persistence or bending relate to anchoring. The authors should explain how they are related if they are, and it would be better if the authors could document spindle displacement relative to the cell center or cortex to make their point more directly that anchoring is defective.

      We apologize for not making this clearer in our initial manuscript. As others noted before (Kotak et al, 2012; Mangon et al, 2021), poorly anchored spindles show larger displacements or rotations during mitosis. Spindle persistence and bending may not be directly related to spindle anchoring defects but could reflect broader issues with spindle assembly and function caused by spindle anchoring defects. Since a previous in vitro study had identified that Myo1c is important for spindle anchoring (Mangon et al 2021) and since ciliobrevin, known for compromising spindle anchoring, phenocopied these aspects, we had initially focused on anchoring defects in our conclusions. We still stand by our conclusion that our data suggest spindle anchoring defects. Nevertheless, we agree that our observations report more general spindle defects and that anchoring may be only one of the defective aspects. Instead of “spindle anchoring defects”, we now simply mention “spindle defects” unless specifically discussing spindle straightness and rotation.

      Minor comment.

      The authors document in Figure 3 that Myo1C KO blastomeres have an enhanced response, with more myosin accumulating at the cortex in response to chromosomes. Why does knocking out one non-muscle myosin lead to enhanced accumulation of another? The authors note this effect but provide no discussion as to how it occurs. Some clarification might be helpful.

      In our manuscript, we report that chromosome proximity to the cortex is associated with Cdc42 activation, which leads to cortical actin recruitment (Fig 4a-d). We also observe that non-muscle myosin II (Myh9) is recruited to the cortex when chromosomes come near (Fig 3d-f). Importantly, these phenomena occur in control embryos as well and not only in Myo1cKO embryos.

      We propose that this recruitment is further increased in Myo1cKO embryos (Fig 3f) because chromosomes spend more time outside of the nuclear envelope (Fig 2). This leads to fragmentation and is not specific to Myo1cKO since the same occurs after ciliobrevin treatment (Fig S2).

      The authors provide a significant advance in our understanding of why early mammalian embryos, especially early human embryos, are so prone to fragmentation. Their data strongly support their conclusion that increased proximity of chromosomes to the cortex does lead to activation of the PBE response, which is an interesting and well documented finding. However, unless the authors can address my major comments and provide more direct evidence for increased displacement of chromosomes being responsible for increased fragmentation, they should revise their manuscript to acknowledge that they have not directly quantified chromosome positioning and thus do not conclusively document that it is responsible for increased fragmentation in the mutant oocytes.

      We thank the reviewer for their thorough analysis of our data and for giving us the opportunity to correct some of the aspects of our study.

      Reviewer 2

      The manuscript "Ectopic activation of the polar body extrusion pathway triggers cell fragmentation in preimplantation embryos" by Pelzer and colleagues is focused on mechanism of cell fragmentation in early preimplantation embryos. This is an important issue, since fragmentation, with subsequent cell loss, has significant impact on early development of human embryos in vitro.

      To study the cell fragmentation within the embryo, authors used mouse model system. However, since during the mouse preimplantation development blastomere fragmentation is less frequent than in human embryos, they used knockout of unconventional myosin-Ic to induce fragmentation of embryonic blastomeres with higher frequency and a similar morphology, known from human embryos.

      Using their Myo1c KO, the authors confirmed the previous observation that reduction of myosin-Ic impairs spindle anchoring, and they further show that the defects in spindle anchoring are linked to cell fragmentation and that similar defects can be induced by chemical inhibition of dynein. Importantly, the defects in anchoring, which cause aberrant spindle movements, bring the spindle and chromosomal DNA into proximity of the cell cortex. This induces local changes in the concentration and organization of actin and myosin IIA and leads to fragmentation. The authors show that this pathway shares similarity with the mechanism of polar body extrusion (PBE) during meiosis, namely that it requires active Cdc42-mediated actin polymerization or Ect2 signaling, and also that cell surface tension plays an important role in cell fragmentation. Based on their results, the authors propose that cell fragmentation within the embryo is triggered either by hyperactivation of the PBE pathway in cells with normal surface tension, or by PBE pathway activation in cells with higher contractility.

      This manuscript brings important information about mechanism, which might contribute to the high incidence of blastomere fragmentation in human embryos. I have not identified any important issues with experimental work or conclusions and therefore I recommend this paper for publication. The results from the mouse model system however need to be verified by further studies in human or similar embryos, which naturally exhibit higher fragmentation.

      We thank the reviewer for their careful examination of our manuscript and data.

      We agree that it would be important to verify the validity of our findings in other species. We have considered performing experiments with human embryos.

      Ideally, we would need embryos in their early cleavage stages (zygote to 4-cell stages) to be able to study fragmentation without perturbing morphogenetic movements, which begin at the 8-cell stage. Such early embryos are particularly rare, which further requires careful experimental design.

      Ideally, such a carefully designed experiment would not cause additional fragmentation (as we have mostly done in the present study) but rather reduce this deleterious process. In light of our experiments shown in Fig 4c-d, inhibiting Cdc42 would be a good way to reduce polar body extrusion signaling. Injection of DNCdc42 mRNA would be embryo-consuming to set up. We tried a Cdc42 chemical inhibitor on mouse embryos with unreliable results. Therefore, we do not yet feel confident in using precious human embryos with our currently available options.

      Another complication is administrative since this project was funded by the ERC, which does not allow experimentation with human embryos.

      As for studying the phenomenon in species other than mouse or human, we currently have limited access to other mammalian species. Generally, other mammalian embryos are less well characterized and, in particular, the species-specific fragmentation behavior would need to be characterized before initiating any attempt to reduce it.

      We hope that the reviewers will agree that the current manuscript, describing and dissecting a previously unknown mechanism, makes sufficient advances to be published without the need to assess its evolutionary conservation.

      This study revealed an important mechanism, which might be responsible for inducing fragmentation of blastomeres in early preimplantation embryos. The authors use a mouse knockout model system and therefore the results should be verified in other species, in which the embryos naturally show higher fragmentation. The manuscript provides evidence that the pathway leading to PBE in oocytes remains operational in embryos and might contribute to blastomere fragmentation when the spindle loses its anchoring to the membrane. The results of this manuscript should be of interest not only to researchers in reproduction, but also to the general audience.

      Reviewer 3

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript discussed interesting and relevant topics in which the Authors addressed the effects of mouse Myo1C knock out on cell fragmentation and spindle anchoring defects. The authors found that fragmentation occurs in mitosis after ectopic activation of actomyosin contractility by signals emanating from DNA.

      Reviewer #3 (Significance (Required)):

      This is an excellent report dealing with significant technical methodologies. I find no fault in the methods, data analysis, or conclusions. I only have two comments. First, the authors should expand on the previous findings about the role of Myo1c during early preimplantation development. Second, the discussion should be expanded to compare the results of this study with those of previous/related studies (e.g., other factors involved in fragmentation and spindle anchoring). Finally, I was not able to open movie#2 and movie#8 so they may need to be re-uploaded.

      We thank the reviewer for their careful assessment of our study.

      We apologize for not discussing enough the previous research on Myo1c. To our knowledge, there is only one previous study reporting the effect of a point mutation on Myo1c on mouse ear physiology (Stauffer et al 2005). This is the first study on the role of Myo1c during mouse development. At this point, we would like to stress that our study, while partially based on the KO of Myo1c, is about cell fragmentation, which we induce experimentally in three independent ways: Myo1c KO, ciliobrevin treatment or Ect2 overexpression.

      Regarding fragmentation, to our knowledge there is simply no convincing mechanism to explain this phenomenon. One study proposed that membrane threads connecting the cell surface to the zona pellucida could pull on cells and promote fragmentation (Derick et al 2017). However, fragmentation also occurs without zona pellucida, and hence without threads pulling on cells’ surfaces (Yumoto et al 2020). Other than that, fragmentation was associated with mitosis and general cytoskeleton defects, without any clear mechanism (Alikani 1999, Fujimoto et al 2011, Daughtry et al 2019).

      We have now expanded these discussions.

    1. General comments:

      This study carefully delineates the role of magnesium in cell division versus cell elongation. The results are really important specifically for rod-shaped bacteria and also an important contribution to the broader field of understanding cell shape. Specifically, I love that they are distinguishing between labile and non-labile intracellular magnesium pools, as well as extracellular magnesium! These three pools are really challenging to separate but I commend them on engaging with this topic and using it to provide alternative explanations for their observations!

      A major contribution to prior findings on the effects of magnesium is the authors’ ability to visualize the number of septa in the elongating cells in the absence of magnesium. This is novel information and I think the field will benefit from the microscopy data shown here.

      I completely agree with the authors that we need to be more careful when using rich media such as LB. It is particularly sad that we may be missing really interesting biology because of that! It’s worth moving away from such media or at least being more careful about batch to batch variability. Batch to batch variability is not as well appreciated in microbiology as it is for growing other cell types (for example, mammalian cells and insect cells).

      For me, the most exciting finding was that a large part of the cell length changes within the first 10min after adding magnesium. The authors do speculate in the discussion that this is likely happening because of biophysical or enzymatic effects, and I hope they explore this further in the future!

      I love how the paper reads like a novel! Congratulations on a very well-written paper!

      Kudos to the authors for providing many alternative explanations for their results. It demonstrates critical thinking and an open-mind to finding the truth.

      Specific comments:

      Figure 2C → please include indication of statistical significance

      Figure 3C → please include indication of statistical significance

      Figure 6A → please include indication of statistical significance

      Figure 8B → please include indication of statistical significance

      Figure S1B → please include indication of statistical significance

      Figure S3B → please include indication of statistical significance

      For your overexpression experiments, do the overexpressed proteins have a tag? It would be helpful to have Western blot data showing that the particular proteins are actually being overexpressed. I think the phenotypes that you observe are very compelling so I don’t doubt the conclusions. Western blot data would just provide some additional confirmation that you are actually achieving overexpression of UppS, MraY, and BcrC.

      Questions:

      Based on your data, there are definitely differences in gene expression when you compare cells grown in media with and without magnesium. Because the majority of the cell length increase occurs in such a short time though (the first 10min), I was wondering if you think that some or most of it is not due to gene expression? Do you have any hypotheses about what is most likely to be affected by magnesium? Do you think the membrane may be affected?

      Why do you think less magnesium activates this program of less division and more elongation? Additionally why is abundant magnesium activating a program of increased cell division and less elongation? Do you think there is some evolutionary advantage, especially considering how important magnesium is for ATP production?

      Related to this previous question, I also wonder if this magnesium-dependent phenotype would extend to other unicellular organisms, maybe protists or algae? That would be a really exciting direction to explore!

      Regarding the zinc and manganese experiments, why do you think they lead to additional phenotypes compared to magnesium? Do you have any hypotheses?

      Regarding your results that Lipid I availability may be a major problem for cell division in the absence of magnesium, do you think that is due to effects magnesium has on the enzymes directly, or do you think magnesium affects the substrate availability/conformation by coordinating the phosphate groups? Or something else, maybe membrane conformation?

    1. Author Response

      Reviewer #1 (Public Review):

      This study demonstrates that Chinmo promotes larval development as part of the metamorphic gene network (MGN), in part by regulating Br-C expression in some tissues (exemplified in the wing disc) and in a Br-C independent manner in other tissues such as the salivary gland. I have included below the following comments on the submitted version of this manuscript:

      1) The authors have shown experimentally that Chinmo regulates Br-C expression in the wing disc but not the larval salivary gland. Based on this, they posit that Chinmo promotes larval development in a Br-C-dependent manner in imaginal tissues and a Br-C-independent manner in other larval tissues. This generalization of Chinmo's role in development would be more compelling if the relationship between Chinmo and Br-C were explored in other examples of imaginal/larval tissues.

      We agree with the referee that confirmation of our observations in other tissues might help to generalize Chinmo’s role. To this aim, we have analyzed the role of chinmo in an additional larval tissue, the larval tracheal system, and an additional imaginal tissue, the eye disc. Consistent with the results reported in the manuscript, we found that the mode of action of Chinmo is conserved, as depletion of Br-C in the eye disc is able to rescue the lack of chinmo, whereas in the tracheal system it is not. We included this new information in the main text and in new SFigures 1 and 3.

      2) Chinmo, Br-C, and E93 have all been shown to be EcR-regulated in larval tissues, including the brain and wing disc (as in Zhou et al. 2006, Dev Cell; Narbonne-Reveau and Maurange 2019, PLOS Biology; Uyeharu et al. 2017, ). It would be interesting (and I believe relevant to this study) to know whether the roles of these factors in their respective developmental stages are EcR-dependent and whether their regulation by EcR (or lack thereof) depends on whether the tissue is larval or imaginal.

      Although the relevance of EcR to the regulation of the genes that make up the metamorphic gene network has already been established, a possible difference in the EcR-mediated response of these genes between larval and imaginal tissues has still not been properly addressed. Finding such a tissue-specific output of EcR signalling would be very interesting. However, we think that this is beyond the scope of this report, as the main aim of this study was to determine the main role of the temporal genes during development and their repressive interactions.

      3) In the chinmo qPCR analysis shown in Fig1A, whether animals were sex-matched or controlled was not indicated. Since Chinmo has a published role in regulating sexual identity (Ma et al. 2014, Dev Cell; Grmai et al. 2018, PLOS Genetics), and since growth/body size is known to be a sexually dimorphic trait (Rideout et al. 2015, PLOS Genetics), it seems important to establish whether the requirement of Chinmo for larval development and/or growth differs between the sexes. I recommend either 1) controlling for sex by repeating qPCRs in Fig 1A in either males or females, or 2) reporting male/female chinmo levels at each stage side-by-side.

      As the referee pointed out, chinmo has been related to sexual identity, raising the possibility that chinmo affects the growth of males and females differently during development. However, several observations argue against this possibility. First, the role of chinmo in sexual identity has only been reported in the adult testis, specifically in cyst stem cells. In fact, specific mutations of chinmo that only affect the expression of chinmo in the testis do not affect testis formation but its maturation, suggesting a role of chinmo in sex determination specifically in the testis cyst stem cells (Ma et al. 2014, Dev Cell; Grmai et al. 2018, PLOS Genetics). Second, a sex-dependent growth rate has been described during larval development (Rideout et al. 2015, PLOS Genetics; Sawala A. and Gould AP, PLoS Biol, 2017). However, the main difference in growth rate between males and females is found in L3 larvae (Sawala A. and Gould AP, PLoS Biol, 2017), when the expression of chinmo strongly declines in both males and females, indicating that the impact of chinmo on sex dimorphism during larval development might be, at least, limited.

      Thus, considering that, based on our results, chinmo exerts its main role in larval tissue growth during L1 and L2 stages and that body growth is practically identical in male and female during these stages (Sawala A. and Gould AP, PLoS Biol, 2017), we can assume that chinmo might not contribute to sexual body size dimorphism.

      Nevertheless, we would like to clarify that we have performed the measurements of chinmo expression always in females, when sex identification was possible, namely in L3 larvae. L1 and L2 larvae qPCRs were not sex-discriminated as sex identification was not possible in our conditions.

      4) In Fig2E, the authors show that salivary gland secretion (sgs) genes are repressed in salivary glands lacking chinmo. Sgs genes are expressed during late larval stages as the animal prepares to pupate. Thus, based on the proposed model where Chinmo promotes larval development and represses the larval-to-pupal transition, one might expect that larval salivary glands lacking chinmo would express higher than normal levels of sgs genes. This expectation directly opposes the observed result - it would be helpful to speculate on this in the interpretation of results.

      This is an interesting observation. As Sgs genes are regulated by Br-C (Duan et al. Cell Reports 2020), precocious expression of this transcription factor in chinmo depleted animals might result in an early activation of those genes. Interestingly, we were not able to detect any Sgs gene expression in chinmo depleted salivary glands. We think that this is because, in the absence of chinmo, this organ does not properly develop and mature, and therefore it is unable to express Sgs genes. Supporting this, the double knockdown of Br-C and chinmo shows the same dramatically low levels of those genes. Altogether, these results strongly suggest that SGs lacking chinmo expression are unable to grow and synthesise Sgs proteins, even in the premature presence of Br-C. We discussed this point in the main text of the edited Ms. Please also see the response to referee 2.

      Reviewer #2 (Public Review):

      The evolution and control of the three-part life history of holometabolous insects have been controversial issues for over a century. While the functioning of broad as a master gene controlling the pupal stage and of E93 as a master gene for the adult stage has been known for about a decade or more, chinmo has only recently been proposed as being the master gene responsible for maintaining the larval stage (Truman & Riddiford, 2022). While the former paper focused on the embryonic and early larval function of Chinmo, this paper explores its metamorphic effects and defines the roles of Broad and E93 in the phenotypes produced by manipulations of Chinmo expression.

      Overall, the paper is well presented but in places, readers would be helped if the authors were more explicit about the logic and details of their manipulations. There are a couple of conceptual issues that the authors should address.

      The role of Broad in larval tissues:

      One intriguing issue relates to the relationship of Chinmo to Broad and E93 in larval versus imaginal tissues prior to metamorphosis. The knock-down of chinmo in imaginal discs results in severe suppression of growth and the lack of metamorphic patterning genes such as cut and wingless. Normal growth and patterning are reestablished though, if broad is also knocked-down, supporting the notion that the effects of the lack of Chinmo are mediated through the premature expression of Broad.

      In the salivary glands, by contrast, chinmo knock-down suppresses growth, and this growth suppression is not reversed by simultaneous broad knockdown. They properly conclude that the role of Chinmo in supporting the growth of larval tissues does not involve Broad, but their data on the expression of salivary gland proteins suggest that Broad still plays some role in Chinmo function in salivary glands. Fig. 5E shows the levels of various salivary glue proteins in the glands of Chinmo knock-down larvae. The levels are reduced, as expected by the lack of salivary gland growth, but a significant finding is that they are there at all! The Costantino et al. (2008) paper shows that these genes are only induced in the mid-L3. Ecdysone, acting through Broad isoforms, is necessary for their appearance and these SGS genes can be induced in the L1 and L2 stages by ectopic expression of some Broad isoforms. Their low levels in Fig 5, would be due to the small size of the gland, but the gland's premature expression of Broad likely causes their induction. In larval cells, then, Chinmo may feed into two parallel pathways, one that does not involve broad and regulates growth and the other, utilizing Broad, regulating premetamorphic changes.

      It would be useful to look at early larval salivary gland proteins such as ng-1 to -3 that are expressed in salivary glands before the critical weight. Also, it would be interesting if the appearance of the SGS proteins after chinmo knock-down (Fig 5E) is abolished by simultaneous knock-down of broad.

      This is an interesting observation. We think that the main problem derived from the way we presented the level of expression of those genes. Our results showed that depletion of chinmo in the SGs dramatically impairs the induction of Sgs gene expression, even with the premature presence of Br-C, which has been shown to be responsible for Sgs expression (Duan et al. Cell Reports 2020). In fact, the levels of Sgs in both chinmoRNAi and chinmoRNAi/Br-CRNAi SGs were virtually undetectable, suggesting that chinmo in the SG is not only required for Br-C repression but also for proper development of the gland. We base this conclusion on the fact that the very low levels of expression of Sgs genes in chinmo depleted SGs are still detected in the double knockdown chinmoRNAi/Br-CRNAi. Dramatically reduced expression of the early larval SG ng1-3 genes in chinmoRNAi and double knockdown chinmoRNAi/Br-CRNAi supports this statement. Altogether these results suggest that Br-C is necessary but not sufficient for the expression of those specific SG genes. We have changed the plots in Figure 2 and 3 to clarify this point and added the levels of expression of ng1-3.

      Role of Chinmo and Broad in Hemimetabolous insects:

      In the conclusion of their comparative studies on the cockroach (line 342), the authors state that Broad exerts no role in the development of hemimetabolous insects. However, this conclusion is not consistent with the literature. The first study of broad knockdown in a hemimetabolous insect was in the milkweed bug Oncopeltus fasciatus by Erezyilmaz et al. (2006). Surprisingly to Erezyilmaz et al., broad knock-down in early-stage nymphs did not cause premature metamorphosis. However, Broad expression was essential for tissues of the wing pads and dorsal thorax to undergo morphogenetic growth (rather than simple isomorphic growth), and for stage-specific changes in coloration through the nymphal series (but not for the nymph to adult color change). A similar function for Broad on wing growth during the later nymphal stages was later shown in Blattella (Fernandez-Nicolas et al., 2022; Huang et al., 2013). The wing- and genital pads represent "imaginal" tissues in the nymph and the need for Broad in these tissues is the same as that seen in imaginal discs as the latter shift from isomorphic growth to morphogenesis at the critical weight checkpoint in the L3. This would suggest that important roles for Broad and E93 are already established in the hemimetabolous insects with E93 controlling the shift from immature (nymphal) to adult phenotypes and Broad controlling the premetamorphic growth of imaginal tissues in early-stage nymphs. Chinmo might then be needed to keep both in check.

      We are sorry for not having dealt with these observations in the submitted manuscript. We have taken them into consideration in the new version to discuss the role of Br-C in the transition from hemimetabolous to holometabolous development.

    1. Proctor, as though a secret arrow had pained his heart: Aye. Trying to grin it away – to Hale: You see, sir, between the two of us we do know them all. Hale only looks at Proctor, deep in his attempt to define this man, Proctor grows more uneasy. I think it be a small fault.

      Proctor's defensive reaction, "I think it be a small fault" has a dramatic irony of its own since, as an audience, we cannot help thinking that although forgetting the commandment may be an excusable error, committing the sin itself is a far more grievous matter; one which has brought disaster not merely to the Proctors but to Salem as a whole.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper presents a thorough biochemical characterization of inferred ancestral versions of the Dicer helicase function. Probably the most significant finding is that the deepest ancestral protein reconstructed (AncD1D2) has significant double-stranded RNA-stimulated ATPase activity that was lost later, along the vertebrate lineage. These results strongly suggest that the previously known differences in ATPase activity between extant vertebrates and, for example, extant arthropods are due to loss of the ATPase activity over evolutionary time as opposed to gains in specific lineages. Based on their analysis, the authors also "restore" ATPase function in the vertebrate Dicer, but they did so by making many (over 40) mutations in the vertebrate protein, and it is not clear which of these many mutations are required for the restoration of the activity. Thus, it is difficult to discern how the results of this experiment relate to the evolutionary history.

      We completely agree with this reviewer's assessment of our paper. Our Michaelis-Menten analyses raised the intriguing idea that loss of ATPase activity in the helicase domain of the vertebrate ancestor may indicate loss of the ability to couple dsRNA binding to formation of the active conformation. Our rescue experiments support this idea, albeit in future studies we hope to create an active ancestor with fewer amino acid changes. While the rescue experiments validate what these analyses told us, as the reviewer suggests, they do not themselves inform on the evolutionary history.

      A criticism of the paper is the authors' tendency (probably unconscious) to ascribe a purposefulness to evolution. For example, in the introduction, "We speculate that the unique role of the RLR's in the interferon signaling pathway in vertebrates...created an incentive to jettison an active helicase in vertebrates." Although this sentence is clearly labelled as speculation and "incentive" is clearly a metaphor, the implication is that evolution somehow has forethought. (There are other instances of this notion in the paper, for example, in the last line of the abstract). The authors' statement also implies that the developing interferon system somehow caused the loss of active helicase, but it seems equally plausible that the helicase function was lost before the interferon system co-opted it.

      We agree with the stated critiques and have rephrased language that suggests that evolution is an active force. In addition to changing the last line of the abstract (page 2, line 35), and removing the quoted sentence from the Introduction, we have included a more nuanced discussion of the order of evolutionary events that may have preceded or followed the loss of helicase function in Dicer (page 18, lines 418-430).

      Reviewer #2 (Public Review):

      The manuscript by Aderounmu presents an interesting attempt to reconstruct evolution of the function of the helicase domain in ancestral Dicers, RNase III enzymes producing siRNAs from long double-stranded RNA and microRNAs from small hairpin precursors. The helicase has a role in long dsRNA recognition and processing and this function could have an antiviral role. Authors show on reconstructed ancestral Dicer variants that the helicase was losing dsRNA binding affinity and ATPase activity during evolution of the lineage leading to vertebrates while an early divergent Dicer-2 variant in Arthropods retained high activity and seemed better adapted for blunt ended long dsRNA, which would be consistent with antiviral function.

      The work is consistent with apparent adaptation of vertebrate Dicers for miRNA biogenesis and two known modes of substrate loading: "bottom-up" dsRNA threading through the helicase domain, where the helicase domain recognizes the end of dsRNA and feeds it into the enzyme, and "top-down", where the substrate is first anchored in the PAZ domain before it locks into the enzyme. Some extant Dicer variants are known to be adapted for just one of these two modes while Dicer in C. elegans exemplifies an "ambidextrous" variant. The reconstruction of the helicase domain complex enabled the authors to test how well ancestral helicases would support the "bottom-up" feeding of long dsRNA and whether the helicase would distinguish blunt-end dsRNA from dsRNA with a 2-nucleotide 3' overhang. Although the reconstruction of an ancestral protein from highly divergent extant sequences yields just a hypothetical ancestor, which cannot be validated, the work provides remarkable data for interpreting the evolutionary history of the helicase domain and of RNA silencing more generally. While it is not surprising that the ancestral helicase was a functional ATPase stimulated by dsRNA, particularly new and interesting are the data showing that the decline of the helicase function had already started at the level of the common deuterostome ancestor and that the helicase was essentially dead in the vertebrate ancestor. It was reported two decades ago that human Dicer carries a helicase, which has highly conserved critical residues in the ATPase domain but is non-functional (10.1093/emboj/cdf582). Recently published mouse mutants showed that these highly conserved residues are not important in vivo (10.1016/j.molcel.2022.10.010). Aderounmu et al. now suggest that Dicer carried this dead ATPase with conserved residues for over 500 million years of vertebrate evolution.

      I do not have any major comments on the biochemical analyses and while I think that the ancestral protein reconstruction could yield hypothetical sequences, which did not exist, I think they represent reasonable reconstructions, which yielded data worthy of interpretation. My major criticism of the work concerns clarity for the readership and interpretations of some results where I wish the authors would clarify/revise the text. The following three examples are particularly significant:

      1) It should be explained to which common ancestor during metazoan evolution belongs the ancestral helicase AncD1D2 or at least what that sequence might represent in terms of common ancestry during metazoan evolution.

      We thank the reviewer for bringing this issue to our attention, and we have now included a brief discussion of the complexity in identifying AncD1D2’s exact position in metazoan evolution (page 6, lines 124-134). Our maximum likelihood phylogeny is constructed from Dicer’s helicase and DUF283 subdomains which evidently do not contain enough phylogenetic signal to resolve the finer details of early metazoan evolutionary events surrounding the divergence of non-bilaterians: Porifera, Ctenophora, Cnidaria and Placozoa. In our tree, Cnidaria even diverges later than the Nematode bilaterian branch reflecting the fact that our reported phylogeny does not match consensus species relationships, especially in the invertebrate clades. This means we cannot pinpoint AncD1D2’s exact position with certainty. While we do not intend to overinterpret the evolutionary trends from these hypothetical ancestral constructs, we believe the functional differences in biochemical activity are meaningful and correspond to big-picture changes over evolutionary time. AncD1D2 thus corresponds to some early metazoan ancestor that existed before the divergence of bilaterians from non-bilaterians. In support of this interpretation, when the phylogeny is constrained such that the bilaterian branches match the consensus species tree (Figure 1-figure supplement 2A) we observe that AncD1D2 is ancestral to the bilaterian ancestor, AncD1BILAT (now labeled on the figure), but retains 95% identity to the version of AncD1D2 constructed from the maximum likelihood phylogeny (Figure 1-figure supplement 3B).

      2) This is linked to the first point - authors work with phylogenetic trees reconstructed from a single protein sequence, which are not well aligned with predicted early metazoan divergence (https://doi.org/10.1098/rstb.2015.0036). While their sequence-based trees show early branching of Dicer-2 as if the two Dicers existed in the common ancestor of almost all animals (except for Placozoa), I do not think there is sufficient support for such a statement, especially since antiviral RNAi-dedicated Dicers evolve faster and Dicer-2 is restricted to a few distant taxonomic groups, which might be better explained by independent duplications of ambidextrous ancestral Dicers. I would appreciate it if the authors would discuss this issue in more detail and make readers more aware of the complexity of the problem.

      We agree with the reviewer that in our initial submission we did not properly address the incongruence between our maximum likelihood phylogeny and the consensus species tree of life. We have now addressed this by revisions that discuss the difficulty in using a single gene or protein to accurately date ancient evolutionary events, especially in the case of Dicer, a protein whose evolutionary history is littered with multiple duplication events (page 6, lines 124-147, beginning with “Importantly, we observed multiple instances…”; page 16, lines 365-371, sentence beginning with “Uncertainty in the single gene or protein phylogeny…”). Our assumption that an early gene duplication produced the arthropod Dicer-2 clade is consistent with previous Dicer phylogenies that have been constructed with maximum likelihood algorithms with different parameters (https://doi.org/10.1371/journal.pone.0095350, https://doi.org/10.1093/molbev/msx187, https://doi.org/10.1093/molbev/mss263) using full length Dicer sequences with different taxon sampling depths and tree construction parameters. Removing other fast evolving taxa with long branch lengths from the sequence alignment still resulted in arthropod Dicer-2 branching out early in metazoan phylogeny (https://doi.org/10.1093/molbev/mss263).

      In analyses not included in our manuscript, we also independently constructed trees using full-length metazoan Dicers and the helicase and DUF-283 subdomains, using both RAxML-NG and MrBayes. We tried different taxon sampling depths, tried rooting the tree using either a non-bilaterian outgroup or a fungal outgroup, and also tried breaking up potential long-branch attraction with deep taxon sampling. In every iteration, the arthropod Dicer-2 clade diverged early in animal evolution at some point before or during non-bilaterian evolution. We recognize that all these efforts are still prone to long-branch attraction that may cause the rapidly evolving Dicer-2 clade to artificially cluster with distant outgroups, but so far, the only evidence to support an arthropod-specific duplication event is parsimony. This parsimony model is plausible, and one might expect a recently duplicated arthropod Dicer-2 to cluster closely with nematode Dicer-1, another antiviral Dicer that would have descended from a common ecdysozoan ancestor, but this is not the case. The nematode HEL-DUF clade does get attracted to the non-bilaterian Cnidaria clade in our ML tree, but unlike the arthropod Dicer-2 clade, this position varied depending on the parameters of phylogenetic analysis, and so we cannot conclude that arthropod Dicer-2’s position is due to long-branch attraction. More sophisticated phylogenetic and statistical tools are needed to answer this question definitively, so we decided to proceed with the highest scoring maximum-likelihood phylogeny generated by our analysis.

      While we have now included a short discussion on the nature of this uncertainty in the revised manuscript (page 6, line 124; page 16, lines 365-371), we have excluded these additional details (paragraph above) from the main text in an attempt to prioritize readability for the generalist reader, and we hope that more specialized readers will find this discussion in the public comments helpful.

      3) Authors should take existing literature and data more fully into account when hypothesizing about sequences of events. Some decline of the helicase activity is apparent in AncD1DEUT, suggesting that it was initiated between AncD1D2 and AncD1DEUT. This implies that a) the antiviral role of Dicer was becoming redundant with other cellular protein sensors by then and b) Dicer was already becoming adapted for miRNA biogenesis, which further progressed in the lineage leading to vertebrates to the unique top-down loading with the distinct pre-dicing state where the helicase forms a rigid arm. Authors even cite Qiao et al. (https://doi.org/10.1016/j.dci.2021.103997), who report a primitive interferon-like system in molluscs - this places the ancestry of the interferon response upstream of AncD1DEUT and suggests that this ancestral protein-based system was taking over the antiviral role of Dicer much earlier. In fact, the somewhat weaker performance of AncD1LOPH/DEUT combined with the aforementioned interferon-like system and massive miRNA expansion in extant molluscs (10.1126/sciadv.add9938) suggests that molluscs possibly followed a convergent path like mammals. While I am missing this kind of discussion in the manuscript, I think that the model where "interferon appears ..." in AncD1VERT (Fig. 6) is incorrect and misleading.

      This comment is similar to others, including point 3 of Essential revisions, and we have revised our model in Figure 6 accordingly. We agree with the reviewer that we did not sufficiently explore the significance of the decline in Dicer helicase function between AncD1D2 and AncD1DEUT. In addition to the changes noted in point 3 of Essential revisions, we have corrected this by adding or modifying sentences in the Results (page 9, sentence beginning on line 197 “This reduction in ATP hydrolysis efficiency prior to deuterostome divergence may have coincided with…”, and page 11, sentence beginning on line 247 “One possibility is that between AncD1D2 and the deuterostome ancestor…”).

      We did not intend to suggest that this loss of Dicer helicase function was unique to vertebrates, but we focused on the deuterostome-to-vertebrate transition for the following reasons:

      a) The mollusk clade in our analysis is incongruent with its expected species position as a protostome. In our tree it clusters with deuterostomes instead. On one hand, this is probably an artefact of incomplete lineage sorting or long branch attraction. On the other hand, it is possible that this clade’s position is an underlying signal of the convergent evolution proposed by the reviewer. In support of the latter, some extant mollusk Dicer helicases (ACCESSION: XP_014781474, ACCESSION: XP_022331683) show a loss of amino acid conservation in Dicer’s ATPase motifs implying that extant mollusks have also lost Dicer helicase function like vertebrates. However, this is in contrast to vertebrate Dicer helicase where loss of function exists, but ATPase motifs remain conserved. We do not discuss this in the paper because the evidence remains inconclusive until extant mollusk Dicers can be functionally characterized, similar to Human Dicer and Drosophila Dicer-1, to determine that they are truly specialized for miRNA processing to the detriment of helicase function.

      b) Caenorhabditis elegans Dicer is an example of an ambidextrous Dicer, that processes both miRNAs, with the top-down mechanism, and viral dsRNAs, with the bottom-up mechanism. Recently, work has been published that suggests that C. elegans also possesses a protein-based innate immune defense mechanism, but instead of competing with the RNA interference mechanism, both mechanisms seem to work in concert and even share a protein in both pathways: DRH-1, a RIG-I-Like receptor homolog (https://doi.org/10.1128/JVI.01173-19). Furthermore, a protein-based pathway has also been reported in Drosophila and in this scenario Drosophila Dicer-2 is the dsRNA sensor that is common to both pathways (https://doi.org/10.1371/journal.pntd.0002823). This collaboration observed in ecdysozoan invertebrates is different from the competition that has been well established in vertebrates. More data is needed to understand whether a model of competition or collaboration exists in lophotrochozoan invertebrates like mollusks.

    1. Author Response

      Reviewer #1 (Public Review):

      The authors provide evidence that chromatin in Drosophila muscle cells is peripherally localized in the nucleus, with the central region depleted of chromatin, and is organised such that RNA polymerase II (RNAp) surrounds dense regions of chromatin. The authors theoretically study the formation of these regions by describing chromatin as a multi-block copolymer, where the blocks correspond to active and inactive chromatin regions. These regions are assumed to phase separate and to have different solubility. The solubility of the active region is regulated by binding RNAp. The authors study the core-shell organization in a layered geometry by analyzing the various contributions to free energy. In this way, they in particular obtain the dependence of the shell-layer thickness, which is described as a polymer brush. From these results, they infer chromatin organization in spherical core-shell chromatin domains and compare these results to Brownian dynamics simulations.

      The work is well done and even though it uses standard methods for studying block copolymers and polymer brushes obtains interesting information about local chromatin organization. These findings should be of great interest to researchers in the field of chromatin organization and in general to everybody interested in understanding the physical principles of biological organization.

      The work has two main weaknesses: The experimental evidence for RNAp and chromatin microorganization is weak as only one example is shown. It remains unclear whether the observed organization pattern is common or not. Also, no data is shown concerning the dependence of the extensions of the active and inactive phases on parameters, for example, solvent properties or transcriptional activity. Second, some parts could prove difficult for biologists to assess. For example, the expression for the brush-free energy should be explained in more detail and notions like that of 'mushrooms' need to be introduced. As a second example, biologists might benefit from a better explanation of the concept of a theta solvent and its relevance.

      We thank Reviewer #1 for the positive review and critical feedback. Below we answer the points raised in the last paragraph of the review.

      In the original version of the manuscript we showed only a representative image of nuclei of muscle cells in an intact, live Drosophila larva. Notably, this organization is representative of many nuclei analyzed in muscle tissue. In the revised version we show that in a distinct tissue, e.g. the salivary gland epithelium of live Drosophila larvae, the RNA Pol II distribution similarly faces the nucleoplasm, although chromatin condensation differs due to higher DNA ploidy. The new images were added as Supplementary Information (Fig A1). Since these representative images are the main motivation behind our theoretical analysis, we think that including them will help the reader understand the relevance of our minimal model. Testing the effect of different biological perturbations, such as changes in the repressive marks, on the core-shell structure would require extensive experiments that are outside the scope of the present paper. We also note that in live organisms (not just live cells) such as those studied here, one can only reliably use genetic perturbations; solvent quality is regulated by the organism and cannot be controlled as in synthetic polymer experiments. Our main focus in the present paper is to highlight an area that has been relatively unexplored by the chromatin organization community: how changes in the concentrations of chromatin binding partners may strongly affect nuclear architecture.

      We have also improved the explanation of the physical concepts for biologists. We added a more thorough explanation of the concept of a polymer brush and explained more clearly the concept of a theta solvent in terms of the scaling properties of a polymer in solution. We quote these revisions below.

      Reviewer #2 (Public Review):

      This work formulates a detailed theoretical polymer physics model intended to explain the observed morphology of chromatin in the Drosophila cell nucleus. The model is examined in detail by both analytical calculation and computer simulation. The central premise of the suggested theory is that it is again based on equilibrium statistical mechanics. Within this paradigm, the authors explore a model that views the chromatin fiber as a block copolymer and, most importantly, describes the role of RNA polymerase as it interacts with one of the copolymer blocks and regulates its effective solvent quality. Blocks are assumed to be fixed on the time scale of interest by, e.g., different levels of acetylation or methylation. RNA polymerase is supposed to interact only with one of the chromatin blocks, called active, and the assumed interaction is quite peculiar. Namely, the RNA polymerase complex may adsorb on the chromatin fiber and, the model assumes, the fiber decorated with adsorbed RNA polymerase molecules is less sticky to itself, or more repulsive, than the fiber itself. This peculiar assumption allows the authors to make interesting predictions about how proteins can regulate the genome folding architecture.

      We thank the reviewer for the positive and critical feedback. We agree that our assumption of changes in the effective solvent stemming from protein complexes binding to chromatin is at the core of our analysis and we justify it further below.

      STRENGTH

      The work includes a rather detailed theoretical description of the model and its equilibrium statistical mechanics. As both analytical theory and accompanying simulation indicate, the assumptions put forward in formulating the model do indeed produce the desired morphology, with isolated regions ("micelles") of core inactive chromatin surrounded by the less dense shell region in which RNA polymerization may potentially take place. Having such a detailed theory is potentially beneficial for the field and opens up avenues for further exploration.

      We thank the referee for appreciating the potential benefit of our minimal theory of solvent-quality regulation by binding processes.

      WEAKNESS

      The underlying assumption about the interaction of RNA polymerase complex with the fiber, although important and organic for the model, does not seem easy to justify from a molecular standpoint, especially thinking of the charges and electrostatic interactions.

      We envision that the binding of RNA Pol II (mediated by different transcription factors) to chromatin is also associated with larger protein complexes that may contain hydrophobic and hydrophilic components, such as pre-initiation complexes. Some regions of these complexes might associate directly with chromatin due to positive charges on the surface of the Pol II complex, whereas the hydrophilic negative regions may be directed towards the solvent. Our theory is typical of the approach used in polymer physics where coarse-grained interactions are considered. While the origin of hydrophilic interactions lies in electrostatics, such interactions are highly screened in cells (typically 200 mM concentration of salts) and can be considered as short-ranged and competitive with hydrophobic interactions. Chromatin in solution is known to condense (see Gibson et al., Cell 2019 and Strickfaden et al., Cell 2020) and even phase separate from the nucleoplasm (see Amiad-Pavlov et al., Science Advances, 2021); this can arise either from hydrophobic interactions of the histone tails or from opposite charge attraction of the histones and linker DNA. In our model, this competes with the binding of protein complexes which then disrupt the self-attraction of chromatin. Previous work has shown that RNA Pol II associating with chromatin (in the absence of transcription) prevents the coarsening of dense chromatin domains (see Hilbert et al., Nat. Comm. 2021), which agrees with our modeling of protein complexes that bind to chromatin and interfere with its condensation; in addition, the binding of Pol II and all its binding partners that form the pre-initiation complex (see Hahn, Nat. Struct. & Mol. Biol. 2004, 11) will result in effective, steric repulsion between different active and Pol II bound chromatin domains.
      Another interesting observation is that most of the surface of RNA Polymerase II is negatively charged, with a few positively charged patches, some of which interact specifically with DNA while others serve as exit paths for RNA (see Cramer et al., Science, 2001). We agree that a more thorough analysis of the molecular interactions between what we name protein complexes and chromatin is interesting, but it is out of the scope of our paper, which uses a coarse-grained, polymer physics approach. This approach also allows our model to be predictive as to the physical organization and growth of the domains, independent of those molecular details that are as yet unknown.
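
      The statement above that electrostatic interactions are highly screened at about 200 mM salt can be made concrete by estimating the Debye screening length. The following is a minimal sketch, not part of the authors' model: it assumes a simple monovalent (1:1) salt, so that ionic strength equals concentration, and water at 25 °C with a relative permittivity of 78.4. The function name and default parameters are illustrative choices.

```python
import math

# Physical constants in SI units
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m
K_B = 1.380649e-23            # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19    # elementary charge, C
N_A = 6.02214076e23           # Avogadro constant, 1/mol


def debye_length_nm(ionic_strength_molar, temperature_k=298.15, rel_permittivity=78.4):
    """Debye screening length in nm for a given ionic strength in mol/L.

    Assumption: dilute-electrolyte (Debye-Hueckel) theory; for a 1:1 salt the
    ionic strength equals the salt concentration.
    """
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    numerator = rel_permittivity * EPSILON_0 * K_B * temperature_k
    denominator = 2.0 * N_A * E_CHARGE**2 * ionic_strength_si
    return math.sqrt(numerator / denominator) * 1e9  # m -> nm


if __name__ == "__main__":
    # 200 mM is the salt concentration quoted in the response above
    print(f"Debye length at 200 mM: {debye_length_nm(0.2):.2f} nm")
```

      Under these assumptions the screening length comes out below one nanometre, which is short compared with the size of a nucleosome and is consistent with treating the electrostatic contribution as a short-ranged, coarse-grained interaction.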

      Reviewer #3 (Public Review):

      This theoretical study provides a theoretical explanation for a puzzling question arising from recent experiments: How can chromosomes behave like polymers collapsed in a poor solvent but also contain "open" active chromatin sections? The authors propose that the binding of proteins (e.g. RNAP's) to the active sections can effectively change the solvent quality for these sections and thus open them. They suggest further that chromosomes show micellar structures with inactive blocks forming the cores of the micelles. Protein binding causes swelling of the micellar shells which affects the whole chromosome structure by changing the total number of micelles. This theory fits well to live imaging data of chromatin in Drosophila larvae, like the one shown in the striking Figure 1.

      The manuscript is written very clearly.

      My only suggestion is that the authors, in both the theory and simulation parts, are more explicit about how the interactions between the various components are modeled. From what I could see, in the theory part, one needs to look closely at Eq. 5 to understand how the influence of the binding of proteins affects the interaction between active monomers, and in the simulation part, one needs to go to the appendix to learn that interaction strengths between monomers within the active blocks and monomers within the inactive blocks have different values. The latter is crucial to understand the micellar structure shown at the top of Fig. 5A.

      We thank the reviewer for their positive response. We have explained Eq. 5 more carefully now and included other explanatory remarks throughout the text. We also explained more clearly the interactions considered in the simulations. Below we answer point by point and add quotes from the revised manuscript.

    1. We further evaluated the pipeline with a genome containing simulated HGT regions. Since our HGT identification pipeline has two main steps, sequence composition-based filtering step and genome comparison step. The evaluation was done for the two steps (Figure S3, Table S1). While top 1% fragments were input to the pipeline, 20.6% correct results would be identified after sequence composition-based filtering and 14.3% correct results identified after genome comparison. When the percentage of fragments input was up to 50%, 83.4% and 77.7% correct results were identified after two steps respectively. It can be seen that the precision of prediction was higher than 60% for all cases. This indicated that we may have underestimated the number of HGTs (low recall rate) but majority of the identified HGTs were highly reliable.
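
      The quoted passage contrasts a low recall rate with a precision above 60%. A minimal sketch of how the two quantities differ is given here; the counts are hypothetical and chosen only for illustration, and reading the quoted "correct results" percentages as recall is an assumption rather than something the passage states outright.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from confusion counts.

    precision = fraction of predicted HGT regions that are real
    recall    = fraction of real (simulated) HGT regions that were predicted
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study
    p, r = precision_recall(true_positives=143, false_positives=57, false_negatives=857)
    print(f"precision={p:.3f} recall={r:.3f}")
```

      With counts like these a pipeline misses most true events yet is right about most of what it does report, which is the situation the passage describes.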

      This paragraph was a bit confusing to follow but I think I got the gist of it after a few passes through! I'm curious if you thought about controlling for natural variation in 4mer frequency throughout the genome, as some other methods have found that this helps reduce off-target predictions (reviewed in https://doi.org/10.1371/journal.pcbi.1004095). It may not be necessary since you do a second step after the initial screen, but I was just curious if that was something you thought about putting in place, and if so, why you decided against it.
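
      To make the question about 4-mer variation concrete, here is a minimal sketch of one way such a screen could work: each sliding window's 4-mer profile is compared with the genome-wide profile, and the spread of the resulting distances gives a rough picture of natural variation. This is an illustrative assumption, not the pipeline from the preprint nor a method taken from the cited review; the window size, step, and L1 distance are arbitrary choices.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq, k=4):
    """Relative frequencies of all 4**k DNA k-mers, in lexicographic order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[km] for km in kmers)  # ignores k-mers containing N etc.
    return [counts[km] / total if total else 0.0 for km in kmers]


def profile_distance(a, b):
    """L1 (Manhattan) distance between two k-mer profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def window_distances(genome, window=1000, step=500, k=4):
    """Distance of each sliding window's profile from the genome-wide profile."""
    reference = kmer_profile(genome, k)
    results = []
    for start in range(0, len(genome) - window + 1, step):
        profile = kmer_profile(genome[start:start + window], k)
        results.append((start, profile_distance(profile, reference)))
    return results
```

      Windows whose distance is large relative to the typical spread across the genome would be candidates for atypical composition; how to set that threshold is left open here.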

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Reply to the reviewers

      We thank the reviewers for their constructive feedback on our manuscript. They did a very comprehensive and helpful job of laying out some key areas that could be improved. We were heartened by the fact that there was a fair amount of overlap between the two reviewers, and that comments were largely addressable without further experimentation.

      Below, we provide a summary of how we have attempted to address the comments and concerns from both reviewers. We also provide the rationale and action items for our responses. Overlapping comments from both reviewers have been consolidated and responded to together.

      Comment 1 (Reviewer #1, Minor Comment 1 & Reviewer #2, Significance)

      Both reviewers raised concerns about our choice to focus on essential genes in our CRISPRi screen, which could potentially underestimate the role of non-essential factors contributing to Tae1 sensitivity or resistance.

      Rationale: We agree with the reviewers that including non-essential genes could provide additional insights into the roles of non-essential factors in Tae1 sensitivity and resistance. We believe our focus on essential genes contributes a unique perspective to the field, as there already exists a body of work that interrogates non-essential genes in this space. Here are some citations that represent this body. We will highlight these better in the manuscript.

      Lin, H.-H.; Yu, M.; Sriramoju, M. K.; Hsu, S.-T. D.; Liu, C.-T.; Lai, E.-M. A High-Throughput Interbacterial Competition Screen Identifies ClpAP in Enhancing Recipient Susceptibility to Type VI Secretion System-Mediated Attack by Agrobacterium Tumefaciens. Front Microbiol 2020, 10, 3077. https://doi.org/10.3389/fmicb.2019.03077.

      Hersch, S. J.; Sejuty, R. T.; Manera, K.; Dong, T. G. High Throughput Identification of Genes Conferring Resistance or Sensitivity to Toxic Effectors Delivered by the Type VI Secretion System; preprint; Microbiology, 2021. https://doi.org/10.1101/2021.10.06.463450.

      Additionally, our screen was experimentally optimized for essential genes using our approach. The knockdown strategy is useful specifically for essential genes because E. coli is phenotypically very sensitive to essential gene perturbations (see more here: https://doi.org/10.1128/mBio.02561-21). While it would have been ideal to include non-essential genes too, doing so would require additional, different optimization that we believe would have diluted our bandwidth for this study. We do thank the reviewers for recognizing how much effort went into this!

      We do acknowledge this is a limitation and want to make sure the readership is aware of that. Ideally, one could do more rigorous side-by-side comparisons between studies if the approaches, set-up, and assays are the same. Unfortunately, due to differences in experimental set-up, we could not directly compare with the non-essential screens. We hope others will pick up where we left off. Here are some action items we can take to increase the odds of that:

      In the Introduction, we will mention other studies and highlight the need to investigate essential genes side-by-side with non-essential. (Lines 64-7)

      In the Discussion, we will add a sentence that acknowledges the importance of exploring non-essential genes for a more comprehensive understanding of Tae1 sensitivity and resistance. (Lines 484-5)

      Comment 2 (Reviewer #1, Minor Comment 5 & Reviewer #2, Major Comment)

      Both reviewers mentioned that the dormancy state in msbA-KD cells is not well characterized and its relationship with Tae1 resistance is not convincingly shown.

      Rationale: We agree that our manuscript does not clearly pin down whether Tae1 resistance is linked to a true dormancy state. There are some intriguing similarities between what we observe and what is classically known as “dormancy” or “persistence”, which have specific definitions. Although we don’t yet have a concrete reason to think it’s NOT those states, we also don’t have sufficient data to point to it clearly being the same at a mechanistic or cellular level. This is merely a hypothesis that our work suggests. We would love to see others follow up on this, as we suspect there are overlaps and potentially additional cellular states that have yet to be clearly defined in this field of bacterial physiology.

      Here is how we propose to address this concern:

      We simplified our language to be more descriptive and less loaded in terms of nomenclature around dormancy or persistence. Namely, we are referring to the cells in a more descriptive way with “slowed growth.” This allows us to clearly describe what we observe without attempting to ascribe mechanism or anything beyond that. It doesn’t fundamentally change the overarching interpretation of our study. (Lines 444, 490, 497-9)

      In the Discussion, we will add text emphasizing the need for follow-up studies to fully address whether there is indeed a connection between Tae1 resistance and slowed growth. (Lines 491-3)

      Comment 3 (Reviewer #2, Major Comment)

      The reviewer asks if the degradation of the sugar backbone is also required for lysis or if it is just the crosslinking step that is important.

      Rationale: This is an astute point. We acknowledge that the degradation of the sugar backbone may play a role in lysis, and it’s predicted that this may be why the Pae H1-T6SS delivers a second PG-degrading toxin (Tge1), a muramidase that targets the sugar backbone. The most parsimonious conclusion from past studies by us and others is that Tae1 is critical for lysis, but not sufficient in the absence of any backbone-targeting enzyme. Indeed, many T6SS-encoding bacterial species also encode >1 type of PG-degrading enzyme, which may speak precisely to the reviewer’s point. However, it should also be noted that there may be endogenous enzymes with activities that can be leveraged alongside these toxins for the same effect.

      Action items:

      In the Discussion, we will add a sentence addressing the potential role of sugar backbone degradation in the lysis process and the need for future research on this topic. (Lines 524-6)

      Comment 4 (Reviewer #1, Minor Comment 2)

      The reviewer asks why lptC-KD leads to sensitivity to Tae1, while msbA-KD leads to resistance, considering both genes are implicated in LPS export.

      Rationale: We appreciate the reviewer's careful attention to the underlying biology. They are absolutely correct in pointing this difference out. Our interpretation is that the different phenotypes may indicate that although the LPS biosynthesis superpathway intersects with PG synthesis, lptC and msbA may intersect with PG synthesis in distinct ways. We can address this concern through the following:

      We will add a sentence in the Discussion section providing our interpretation of the different phenotypes observed for lptC-KD and msbA-KD. (Lines 508-13)

      Comment 5 (Reviewer #1, Minor Comment 4)

      The reviewer notes that the contribution of msbA to Tae1 resistance appears minor based on the results in Figure 3d.

      Rationale: There are actually two aspects to this concern, which we note below. We found it difficult to fully capture it in the manuscript, but our thoughts are as follows.

      (1) Technical viewpoint:

      Bacterial competition experiments are inherently noisy. The quantitative read-out is easily impacted by a number of parameters, including cellular density, input ratio between competitor cell types, growth stage, and possibly other environmental factors that are difficult to predict. In general, our view is that we should avoid over-indexing on the degree of the phenotype, focusing more on the direction of the phenotype (loss of statistically-significant Tae1 sensitivity) and the fact that it is reproducible in our hands. Furthermore, our argument is bolstered by clear validation of the loss of Tae1 sensitivity through orthogonal lysis assays (Fig. 4a-c).

      (2) Biological viewpoint:

      It is challenging to isolate the specific interaction between Tae1 and individual genetic determinants, as we think it’s a complex system with multiple factors simultaneously at play. It is crucial to acknowledge that the unique contribution of Tae1 is only a part of the T6SS. There may be other compensatory actions that influence the outcomes observed, such as upregulation of non-Tae1 toxins, regulation of system activation/firing, timing and location of T6S injections, etc. We think these are exciting possibilities and that more groups should delve into the context-dependent dynamics of the system. Although outside the scope of our manuscript, we would be open to suggestions for how we can further emphasize this point.

      Comment 6 (Reviewer #2, Minor Comment)

      The reviewer recommends that we discuss whether our findings are specific to Tae1 or if they can be extrapolated to other toxins.

      Rationale: We understand the reviewer's interest in understanding the broader implications of our findings. Although our study focuses specifically on Tae1, we believe that our findings may provide insights into the mechanisms of sensitivity and resistance to other toxins that target the cell wall. However, experimentally investigating this would fall outside the scope of our current manuscript.

      Additional Minor Revisions

      Table 1: I would label MsbA and LptC as "LPS transport" and not "LPS synthesis" (Reviewer 1)

      Rationale: We agree that using “LPS transport” to describe the gene functions for lptC and msbA is more specific to their functions. Table 1 was updated to change the “pathway/process” categorizations for lptC and msbA from “LPS synthesis” to “LPS transport”. In line with this comment, we also changed the pathway/process categorization for murJ (Lipid II flippase) to “PG transport”.

      Figure 3 legend: "...deformed membranes .........are demarcated in (g) and (h)" (Reviewer 1)

      We thank the reviewer for pointing out the missing text in this figure legend. We corrected the error by adding the missing text back in Figure 3.

      Line 339-341: Supp. Fig. 9 should be Supp. Fig. 8 (Reviewer 1)

      Referenced Supp. Fig. was corrected.

      Second, (L422-425) the authors conclude that their data demonstrate a "reactive crosstalk between LPS and PG synthesis". I disagree. There is no information in the paper that this is the case. The authors can only suggest that cross talk may occur. (Reviewer 2)

      We agree. Line 421-2: replaced “demonstrate” with “suggest” to soften the argument.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This study reports the finding that lipopolysaccharide integrity modulates bacterial sensitivity to a Type-6-secreted bacterial toxin. The authors used the Tae1 amidase produced by the P. aeruginosa T6SS, with Escherichia coli as prey cells, as a model system to test the effect of knocking down essential gene expression in the prey. This was accomplished by constructing a library of knockdown (KD) genes based on CRISPR/Cas9 and selecting for those targets where E. coli prey is not killed. The screen revealed, as expected, that KDs of genes involved in cell wall synthesis and assembly (and bamA, involved in OM protein assembly) enhanced the sensitivity to Tae1. In contrast, KD targets in genes involved in lipid metabolism and lipopolysaccharide synthesis conferred resistance to the amidase toxin. The authors hypothesized that non-PG components of the cell envelope may shape Tae1 toxicity and undertook a more detailed analysis of the effects of knocking down one of these genes, msbA, using various biochemical and imaging approaches. The MsbA protein is an ATPase permease that plays an essential role in flipping newly synthesized lipid A across the bacterial inner membrane. The authors show that resistance to Tae1 in msbA-KD is independent of cell wall hydrolysis (meaning that Tae1 remains active), that PG synthesis is suppressed (although PG is still Tae1-sensitive), and that protein synthesis and growth are suppressed. This latter observation suggests that the E. coli prey enters a persistent (dormant) state that protects it from Tae1 toxicity. The authors conclude that Tae1 susceptibility in vivo is determined by cross talk between essential cell envelope pathways and the general growth state of the cell.

      Major comments:

      This is a nice study unravelling cellular off-target factors that affect in vivo killing by a T6SS toxin. In that sense the study is novel, since the interplay of T6SS effectors with the physiological state of the prey cell has not been directly investigated, so this study adds new information to the literature in the field.

      I have several comments concerning the interpretation of the results.

      First, it is interesting that Tae1, being an amidase, can be solely responsible for PG degradation. The enzyme cleaves the peptide bridges but has no effect on the PG backbone. The study was not designed to pick up autolysins (since only essential genes were targeted), but one would assume that degradation of the sugar backbone must also be required for lysis.

      Second, (L422-425) the authors conclude that their data demonstrate a "reactive crosstalk between LPS and PG synthesis". I disagree. There is no information in the paper that this is the case. The authors can only suggest that cross talk may occur.

      Third, the maximal effect of Tae1 is seen when new PG is made, which also raises the question of where this protein is located in the PG mesh. As with β-lactams and other PG-active antibiotics, the effect of Tae1 requires active cell growth. This is also consistent with the authors' finding that the msbA-KD bacterial cells enter a state of dormancy or persistence, which makes them capable of overcoming Tae1 toxicity.

      Fourth, an important outcome of protein synthesis inhibition and PG synthesis is increased oxidation and lipid peroxidation. This could also influence the results obtained in this study. It would be consistent with the other targets observed, which compromise lipid metabolism and membrane trafficking and secretion.

      Referees Cross-commenting

      Based on my own review and that of Reviewer 1, I think we both agree that there are 2 major limitations in this work: (i) the KD library only targets essential genes, and this would potentially miss non-essential genes that, when targeted for mutation, could lead to synthetic lethal phenotypes that could be more revealing than a general defect in protein synthesis, etc.; and (ii) the dormancy state is not well characterized.

      Despite these points the study is very nicely done with a huge amount of work.

      Significance

      This is an important study addressing experimentally the complexities of bacteria-bacteria interactions in the context of predator-prey interplay. The T6SS effectors affecting PG appear to have the same characteristics as known antibiotics and bacteria use similar strategies to protect themselves from PG attack. This is not only to increase growth as an escape approach but also to reduce it to a point in which the target cell cannot be effectively killed despite the presence of the toxin.

    1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      The authors would like to thank the reviewers for their valuable comments and suggestions. We have carefully considered all of the points raised and revised our manuscript accordingly. In the rebuttal letter below, we have extensively discussed all the different concerns and adjustments we made to our work. In what follows the reviewers’ comments are in blue and the authors’ responses are in black. The additions and changes to the main and supplementary text of the manuscript are highlighted in yellow.

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)): *

      *In their paper entitled "CD38 promotes hematopoietic stem cell dormancy via c-Fos", Ibneeva et al., present a set of data predominantly from mouse HSCs where they explore the cell cycle kinetics and self-renewal capacity of LT-HSCs expressing (or not) CD38. They perform a series of sophisticated in vitro and in vivo experiments, including transplantations and single cell cultures and arrive at the conclusion that CD38 can fractionate LT-HSCs that are more deeply quiescent. Overall, it is an interesting question and would be of interest to experimental hematologists. That said, I had a number of issues that concerned me throughout the manuscript with regard to the robustness of the conclusions around CD38 and I have tried to detail these below.

      Major concerns: *

      *1) Novelty - It was unclear what the relationship of this CD38+ fraction had with other "segregators" of LT-HSCs - e.g., how does it compare with the Sca1 fractionation of Wilson et al, Cell Stem Cell 2015 or Gprc5c of Cabezas-Wallscheid Cell 2017? Even if CD38 fractionated LT-HSCs, it was unclear what it would give beyond these two molecules (especially re: Sca-1 which is also a cell surface marker). *

      Response:

      We agree with the reviewer that further elaboration of this point with additional data would be helpful. We compared the expression of Sca-1 in the population of LT-HSCs (Lin- Kit+ Sca-1+ CD48- CD150+ CD34- CD201+) based on the gating strategy from the paper Wilson et al, Cell Stem Cell 2015. We found that all LT-HSCs (independent of CD38 expression) express Sca-1 at a high level and can be quantified as Sca-1hi (we have added these data in Fig. S2A). Thus, CD38 subfractionates LT-HSCs, and considering that we have shown that CD38+ are more quiescent (Fig. 3) and have higher repopulation capacity compared with CD38- LT-HSCs (Fig. 2E-G), we conclude that CD38 should be used in addition to Sca-1 to define dormant LT-HSCs.

      We found that CD38+ dormant HSCs expressed Gprc5c mRNA at higher levels than CD38- LT-HSCs (Fig. 5D). Therefore, we cannot exclude that CD38+ and Gprc5c+ identify the same population of dormant HSCs. However, Cabezas-Wallscheid Cell 2017 used the reporter Gprc5c-EGFP mouse strain, which is not widely available. In contrast, we propose to use readily available antibodies against CD38 for efficient isolation of dormant HSCs. Moreover, to define CD38+ dormant HSCs, researchers do not need to use CD38KO mice as a negative control; it is sufficient to use total bone marrow cells to identify the CD38+ population for gating dHSCs (we have added this information to Fig. S2C and in the text, lines 119-121: “We demonstrated that total bone marrow cells can be used to define the CD38+ fraction in the absence of CD38 knock-out mice (CD38KO) (Fig. S2C), providing the possibility of an internal positive control for easy identification of CD38+ cells”).

      *Claims of CD38+ superiority in transplantation - I was surprised with the claim of CD38 negative cells being a less functional HSC when they are clearly still very strong in secondary transplantation assays. Both 38+ and 38- cells strongly repopulate secondary animals and only 5 mice were shown in the Figure. The legend suggests another experiment was undertaken, but these data are not presented. Did they substantially differ in their chimerism in primary and secondary animals? Was the magnitude of difference between the two fractions similar in both experiments? Is there a reason that the data could not be plotted on the same graph?

      *

      We have added the data from the second experiment to the graphs and changed the figure legend accordingly (Fig. 2D-H); now n=8 for primary transplantation and n=6 vs 7 for secondary transplantation. These data show the same trend of higher repopulation capacity of CD38+ LT-HSCs compared to CD38- LT-HSCs, although with a larger magnitude of difference in primary transplantation. We agree with the reviewer that CD38- LT-HSCs strongly repopulate secondary animals. However, the higher percentage of chimerism in peripheral blood and bone marrow for CD38+ LT-HSC progeny indicates their superior repopulation and self-renewal capacity compared to CD38- counterparts.

      Also, the typical experiment to establish a quantitative difference in HSC production would be a limiting dilution analysis with a much larger number of recipient animals - without such data it is difficult to ascertain how different the two fractions really are.

      While we appreciate the reviewer's suggestion to include additional data on the amount of repopulating HSCs, we respectfully disagree as we believe that this information is beyond the scope of the current study, which only aims to assess the functional superiority of CD38+ LT-HSCs over CD38- LT-HSCs in side-by-side comparisons. Assessment of donor-derived cells’ frequency in peripheral blood and bone marrow relative to the frequency of competitors after transplantation of the same amount of HSCs (so-called chimerism level) is a widely accepted assay in the field to demonstrate the difference in the functionality between two HSC fractions (Sanjuan-Pla et al., Nature 2013; Gekas C and Graf T, Blood 2015; Bernitz J.M et al. Cell 2016; and others, including papers cited by the reviewer: Wilson et al., Cell Stem Cell 2015 and Cabezas-Wallscheid et al., Cell 2017). A limiting dilution experiment will provide more detailed characteristics of two HSC fractions, namely the quantitative difference (how many cells from the sorted population can repopulate). However, this experiment will not significantly change our conclusion that the CD38+ LT-HSC fraction is superior in repopulation and self-renewal capacity compared to the CD38- LT-HSC fraction, as sufficiently demonstrated in Fig. 2E-G.

      Furthermore the claim that CD38- HSCs do not ever produce CD38+ cells is a bit premature with so few mice and confusingly presented data (e.g., Fig 2I is 5 pooled mice in a single histogram plot - were these concatenated flow files? If so, how were they normalised? Did the other experiment look the same? And were all CD38+ HSCs capable of giving rise to both CD38+ and CD38- cells or was it a subfraction of mice/samples?).

      The plot provided in Fig. 2I is a FACS analysis of pooled cells from mice transplanted with CD38+ or CD38- LT-HSCs (we added a detailed explanation in figure legend 2, lines 701-703). We provided data from the second experiment in Fig. S2G. All CD38+ LT-HSCs could give rise to both CD38+ and CD38- HSC; we added data in Fig. S2H.

      Cell Cycle status differences and grades of quiescence - Ki67 and DAPI are really quite tricky for discerning G0 versus G1 and no flow cytometry plots are provided for the reader to assess how this has been done. Could another technique (e.g., Hoechst/Pyronin) be used to confirm the results? Perhaps more concerning is the variability of the assay in the authors own hands. If I am interpreting things correctly, the plots in 3G, 3H and 3I in the platelet depletion, pIpC and 5FU experiments are >10% higher in the CD38- control arm than the data in 3A which make me worried about the robustness of the cell cycle assay to distinguish G0 from G1.

      Ki67 and DAPI staining is a widely accepted technique for distinguishing G0 from G1. We provide flow cytometry plots in Fig. S2F (original figures, S3B - updated figures), which the referee may have overlooked. We added a reference to the Fig. S3B to figure legend 3 to make it more transparent for the readers. We would like to clarify the reviewer’s concern regarding the slightly different frequency of CD38- cells in the G0 phase of the cell cycle at steady state in Fig. 3A (original figures). Fig. 3A compares the cell cycle stages between CD38- and CD38+ HSCs, while Fig. 3B compares the same parameters for CD38- vs CD38+ LT-HSCs, which are enriched for quiescent HSCs by using additional surface markers. Therefore, it is correct to compare the data for LT-HSCs under stress (Fig. 3G-I, original figures) with the data for LT-HSCs at steady state in figure 3B (original figures). To make it less confusing for the reader, since the entire Figure 3 is devoted to LT-HSCs, we have moved Figure 3A to the supplementary Figures (Fig. S3A).

      All experiments for Fig. S3A&3A, 3F, 3G, and 3H (updated figures), were performed separately, and we did not compare mice from different experiments to avoid differences due to technical details. However, the groups of mice for each specific treatment (ctrl vs. treatment at different time points) were analyzed on the same day, using the same amount of cells, the same master mix of antibodies, and the same FACS machine and settings to compare ctrl vs. treated mice (we added this information in the Materials and Methods section, lines 388-391). In addition, we performed a BrdU incorporation assay and label retention assay using H2B-GFP mice, which support our finding that CD38+ LT-HSCs are more quiescent than CD38- cells in the steady state.

      Minor points: Figure 3I was really confusing - it says it is the gating strategy for GFP retaining LT-HSCs, but only shows GFP versus cKit

      We reformulated the figure legend for 3D: “Representative plot defining GFP+ cells in LT-HSCs.”

      Figure 4B suggests that only 40% of CD38+ cells divide in the first 3 days - are there survival differences or are the cells sat there as single cells? It would be important to carry these further to see if cells eventually divide.

      This is a relevant and crucial point addressed by the reviewer. We did not find any significant difference in the survival of cells. We have added this data to the supplementary data - Fig. S4Q-R.

      Reviewer #1 (Significance (Required)):

      I believe the study will be of interest to specialist readers in the HSC field, especially those working on quiescence and G0 exit. At present, I think the conclusion of a true subfractionation is a bit premature, but there are pieces of data that do look exciting and warrant further investigation. It was a little unclear how this would advance beyond Sca-1 or Gprc5c fractionation for finding more primitive HSCs, but having cleaner markers is always a useful advance for the field.

      We thank the reviewer for his/her positive evaluation of our study. In our work, we compared several functional aspects of CD38+ and CD38- LT-HSCs:

      1. We used four techniques (Ki67 and DAPI staining, BrdU incorporation assay, label retention assay, single-cell division tracing assay) and showed that CD38+ LT-HSCs are more quiescent than CD38- cells.
      2. We performed a serial transplantation assay and found that although CD38- LT-HSCs have the long-term repopulation capacity, they repopulate significantly less effectively than CD38+ LT-HSCs.
      3. We used a combination of surface markers (Lin- Kit+ Sca-1+ CD48- CD150+ CD34- CD201+) to define LT-HSCs; all of which belong to the Sca-1hi population according to Wilson et al, 2015. We further separated Sca-1hi LT-HSCs into CD38+ and CD38- cells and found that they differ in the repopulation capacity and quiescence in steady state and upon hematological stress. We conclude that CD38 surface staining should be used on top of Sca-1 to sort dormant LT-HSCs.
      4. We found that CD38+ dormant LT-HSCs differ from CD38- cells in gene expression and response to CD38 and c-Fos inhibitors. CD38+ LT-HSCs are characterized by higher cytoplasmic Ca2+ and cell cycle inhibitor p57 levels than CD38- LT-HSCs. Thus, we demonstrated that CD38 is not only a marker but also has a functional role in mediating HSC dormancy. We discovered that the CD38/cADPR/Ca2+/c-Fos/p57 axis regulates CD38+ HSC dormancy. Taken together, our findings demonstrate that CD38+ LT-HSCs have superior properties compared to CD38- LT-HSCs and can be classified as dHSCs, providing a simple approach for their isolation and further study. Moreover, we uncovered the CD38-mediated molecular mechanism regulating HSC dormancy.

      Regarding my own expertise - I have spent ~20 years in the field undertaking single cell assays of normal and malignant mouse and human HSCs, including many of the core functional assays described in this paper and consider myself very familiar with the topic area.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Although the experiments were well done and supported the hypothesis being tested, the overall novelty of the work is not that strong, because:

      -the use of CD38 to identify/select and to test mouse LT-HSCs' function in vivo (although not commonly used nowadays) was demonstrated a few times more than 20 years ago (Randall, et al., 1996: PMID: 8639761 and Tajima et al., 2001; PMID: 11313250); in fact, the authors didn't even reference/acknowledge these papers, which they should have done; hence, most of the results in Fig. 2 were already known (although the current work gives a more detailed/better analysis);

      We agree with the reviewer that the previous findings using CD38 to separate HSPCs should be appreciated; however, we would like to point out that while the studies by Randall, et al., 1996: PMID: 8639761 and Tajima et al., 2001; PMID: 11313250 employ only 3 markers to discriminate HSPC (Lin- Sca-1+ Kit+), in our study, we performed for the first time a very detailed characterization of CD38+ cells using surface markers that were not available 20 years ago. We analyzed not only the HSC compartment but also different populations of multipotent progenitors. Modern surface marker combinations for the LT-HSC isolation allow us to show that both populations: CD38- and CD38+, can be classified as LT-HSCs in contrast to the data of Randall et al, where the authors did not find any long-term repopulating activity in the CD38- KLS compartment. Moreover, we showed the hierarchical relationships between these two populations. We appreciate the previous findings and recommendations of the reviewer, and have added citations (Randall, et al., 1996: PMID: 8639761 and Tajima et al., 2001; PMID: 11313250) and comment in the discussion section, lines 267-271:

      “In contrast to previous studies reporting that only CD38+ HSPC compartment from adult mice contains LT-HSCs (42, 43), in our study we demonstrated using modern surface marker combinations for the isolation of LT-HSCs that while both populations: CD38- and CD38+, can be classified as LT-HSCs, only CD38+ LT-HSCs display characteristics of dormant HSCs (4).”

      -it is known the generic roles of CD38 in producing cADPR, ADPR, etc and these can induce Ca2+ oscillation in cells; despite that, it was nicely demonstrated here that in mouse HSCs cADPR was the main signalling mediator;

      We thank the reviewer for pointing this out; indeed, it has not been shown before how Ca2+ is regulated in HSCs.

      -the roles of cADPR in human CD34+ cells were demonstrated (Podesta et al., 2003; PMID: 12475890): when CD34+ HSPCs were primed in vitro with cADPR, it resulted in enhanced short-term while maintaining long-term (secondary transplant) engraftment in NOD/SCID mice, probably (mechanisms were not determined at that time) by inducing cycling/expansion of human CD34+CD38+ progenitors while inhibiting cycling (hence, better long-term maintenance) of CD34+CD38- HSPCs; on this note, the data presented in Fig. 4K and S5 should be eliminated, as they add little to the story and can be quite confusing when compared to the mouse data, unless the authors wish to explore the human part in more detail.

      We appreciate the reviewer’s valuable suggestion. However, we respectfully disagree with their interpretation because we do not believe that the technical aspects of the cited paper (Podesta et al., 2003; PMID: 12475890) are robust enough to support their conclusions. Podesta et al. concluded that in vivo and in vitro treatment with a high dose of cADPR (25-fold higher than the physiological dose, according to the authors' estimation) stimulates the expansion of HSC and progenitor cells. At the same time, they did not use any surface markers to define populations and studied total mononuclear cord blood cells, so no conclusions can be drawn regarding CD34+ CD38+ and CD34+ CD38- dynamics. Unfortunately, we cannot confirm the reliability of the HSC engraftment data presented by Podesta et al. This is because they did not analyze the chimerism of human cells in peripheral blood and bone marrow for sixteen weeks post-transplantation, which is considered a standard time period for assessing long-term engraftment of human HSCs in the field (Brehm M.A. et al., Blood 2012, Cosgun K.N. et al., Cell Stem Cell 2014, Takagi S. et al. Blood 2012). Instead, they counted only some CD34+ cells at three and eleven weeks after transplantation. Therefore, the role of cADPR in the regulation of human HSC quiescence remained unknown.

      In our original study, we showed that blocking the CD38 ecto-enzymatic activity stimulated both human and mouse HSCs to exit from the G0 phase of the cell cycle. The role of CD38 enzymatic activity may be conserved between mice and humans and needs to be further investigated in future studies on human HSCs. For this reason, we decided to keep Fig. 4K and S6 in the paper.

      -Ca2+ induction in cells can induce c-fos expression (as an early response gene) in many cell types; hence, it was not a surprising finding;

      We agree with the reviewer that it has been shown previously that Ca2+ induction in cells could induce c-fos expression (as an early response gene to stress). However, we have shown for the first time that Ca2+ regulates c-Fos expression in LT-HSCs under steady-state conditions.

      -c-fos was demonstrated to suppress cell cycle entry of dormant hematopoietic stem cells (Okada et al., 1999: PMID: 9920830).

      In the cited publication (Okada et al., 1999: PMID: 9920830) the authors have only analyzed the in vitro proliferation and colony formation of Lin- Sca-1+ cells in the IFNα/β inducible c-Fos overexpression model. This population mainly contains progenitor cells and only 0.004% of dormant LT-HSCs (please find below an estimation of LT-HSC frequency). Therefore, the role of c-Fos in the regulation of dormant HSC cell cycle entry remained unexplored.

      It would be useful to do ChIP-seq to confirm that c-fos regulates p57 expression.

      We have shown that inhibition of c-Fos transcriptional activity inhibits p57 expression (Fig. 6G). ChIP-seq with an antibody against c-Fos would answer whether c-Fos directly activates the expression of p57. However, we can only isolate 200-300 CD38+ LT-HSCs from all bones of one mouse. Unfortunately, ChIP-seq with so few cells is technically very difficult, which explains the absence of publications using ChIP-seq to study transcription factors in LT-HSCs. We added in the Discussion section that we couldn’t exclude indirect regulation of p57 expression by c-Fos, lines 307-308: “In contrast, although we couldn’t exclude indirect regulation of p57kip2 expression by c-Fos, our data clearly reveal that inhibiting the interaction between c-Fos and DNA in dHSCs reduced protein levels of the cell cycle inhibitor p57kip2 and stimulated cell cycle entry.”

      So overall, many of the findings were already out there and the authors gathered many of the pieces of the puzzle and put them together (and demonstrated) in a nice and well-thought manner. This work does add useful information to the scientific community but unfortunately is not ground-breaking. It may contribute to other fields beyond hematopoiesis where CD38 function may play a role.

      Thank you very much for the positive review of our work. As mentioned by the reviewer, CD38 is expressed by other normal cells (lymphocytes, Kupffer cells (Tarrago M.G. et al., Cell Metabolism 2018)) and cancer cells, e.g. hematological malignancies, lung cancer, prostate cancer (Hogan K.A. et al., Frontiers in Immunology, 2019), but has not been studied in the context of quiescence regulation. Currently, anti-CD38 monoclonal antibodies are used to treat malignancies (Daratumumab) by mediating cytotoxicity (Lokhorst H.M et al., N. Engl. J. Med, 2015). However, the inhibition of CD38 enzymatic activity has not been used broadly. Therefore, our study can be groundbreaking and open new directions in anti-cancer therapy.

      Reviewer #2 (Significance (Required)):

      In this manuscript, the authors investigated the potential roles of CD38 (mainly) in mouse HSC quiescence; the authors dissected the potential molecular mechanism by which this occurred, and it was via CD38/cADPR/Ca2+/cFos/p57Kip2. The authors used a combination of transplantation assays to test the importance of CD38 in vivo, followed by a series of simple in vitro experiments (mainly using pharmacological means) to dissect the molecular mechanisms. The manuscript is well-written/explained and the data presented is solid. There are no major issues in terms of reproducibility and clarity in this work.

      We would like to thank the reviewer again for the detailed positive feedback.

    1. Reviewer #1 (Public Review):

      The authors evaluate a number of stochastic algorithms for the generation of wiring diagrams between neurons by comparing their results to tentative connectivity measured in cell cultures derived from embryonic rodent cortices. They find the best match for algorithms that include a term of homophily, i.e. preference for connections between pairs that connect to an overlapping set of neurons. The trend becomes stronger, the older the culture is (more days in vitro).

      From there, they branch off to a set of related results: First, that connectivity states reached by the optimal algorithm along the way are similar to connectivity in younger cultures (fewer days in vitro). Second, that connectivity in a more densely packed network (higher plating density) differs only in terms of shorter-range connectivity and even higher clustering, while other topological parameters are conserved. Third, blocking inhibition results in more unstructured functional connectivity. Fourth, results can be replicated to some degree in cultures of human neurons, but it depends on the type of cell.

      The culturing and recording methods are strong and impressive. The connectivity derivation methods use established algorithms but come with one important caveat, in that they are purely based on correlation, which can lead to the addition of non-structurally present edges. While this focus on "functional connectivity" is an established method, it is important to consider how this affects the main results. One main way in which functional connectivity is likely to differ from the structural one is the presence of edges between neurons sharing common innervation, as this is likely to synchronize their spiking. As they share innervation from the same set of neurons, this type of edge is placed in accordance with a homophilic principle. In other words, this is not merely an algorithmic inaccuracy, but a potential bias directly related to the main point of the manuscript. This is not invalidating the main point, which the authors clearly state to be about the correlational, functional connectivity (and using that is established in the field). But it becomes relevant when in conclusion the functional connectivity is implicitly or explicitly equated with the structural one. Specifically, considering a long-range connection to be more costly implies an actual, structural connection to be present. Speculating that the algorithm reveals developmental principles of network formation implies that it is the actual axons and synapses forming and developing. The term "wiring" also implies structural rather than functional connectivity. One should carefully consider what the distinction means for conclusions and interpretation of results.
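
      The common-innervation concern above can be illustrated with a toy simulation. This is a minimal sketch under assumptions that are not taken from the manuscript: spike trains are binary per time bin, and the firing probabilities (`p_drive`, `p_relay`, `p_noise`) are arbitrary illustrative values. Neurons A and B have no synapse between them but share one presynaptic driver; neuron C fires at a similar rate but independently.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def simulate(n_bins=20000, p_drive=0.05, p_relay=0.8, p_noise=0.02, seed=0):
    """Binary spike trains for A and B (shared driver, no A-B synapse) and C (independent)."""
    rng = random.Random(seed)
    a, b, c = [], [], []
    for _ in range(n_bins):
        drive = rng.random() < p_drive  # the shared presynaptic neuron fires
        # A and B fire when the drive is relayed, or spontaneously.
        a.append(int((drive and rng.random() < p_relay) or rng.random() < p_noise))
        b.append(int((drive and rng.random() < p_relay) or rng.random() < p_noise))
        # C fires at a comparable rate, but independently of the driver.
        c.append(int(rng.random() < p_drive * p_relay + p_noise))
    return a, b, c


a, b, c = simulate()
print("corr(A, B), shared input, no direct edge:", round(pearson(a, b), 3))
print("corr(A, C), independent control:        ", round(pearson(a, c), 3))
```

      With these assumed parameters, a correlation threshold would place an edge between A and B even though no structural connection exists, and that spurious edge links exactly the kind of pair a homophilic rule favours.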

      The main finding is that out of 13 tested algorithms to model the measured functional connectivity, one based on homophilic attachment works best, recreating with a simple principle the distributions of various topological parameters.

      First, I want to clear up a potential misunderstanding caused by the naming the authors chose for the four groups of generative algorithms: While the ones labelled "clustering" are based on the clustering coefficient, they do not necessarily lead to a large value of that measure nor are they really based on the idea that connectivity is clustered. Instead, the "homophilic" ones are a form of maximizing the measure (but balanced by the distance term). To be clear, their naming is not wrong, nor needs to be changed, but it can lead to misunderstandings that I wanted to clear up. Also, this means that the principle of "homophilic wiring" is a confirmation of previous findings that neuronal connectivity features increased values of the clustering coefficient. What is novel is the valuable finding that the principle also leads to matching other topological network parameters.

      The main finding is based on essentially fitting a network generation algorithm by minimizing an energy function. As such, we must consider the possibility of overfitting. Here the authors provide additional validation by using measures that were not considered in the fitting (Fig 5, to a lesser degree Fig 3e), increasing the strength of the results. Also, for a given generative algorithm, only 2 wiring parameters were optimized. However, with respect to this, I was left with the impression that a different set of them was optimized for every single in-vitro network (e.g. n=6 sets for the sparse PC networks; though this was not precisely explained, I base this on the presence of distributions of wiring parameters in Fig 6c). The results would be stronger if a single set could be found for a given type of cell culture, especially if we are supposed to consider the main finding to be a universal wiring principle. At least report and discuss their variability.

      Next, the strength of the finding depends on the strengths of the alternatives considered. Here, the authors selected a reasonably high number of twelve alternatives. The "degree" family places connections between nodes that are already highly connected, implementing a form of rich-club principle, which has been repeatedly found in brain networks. However, I do not understand the motivation for the "clustering" family. As mentioned above, they do not serve to increase the measure of the clustering coefficient, as the pair is likely not part of the same cluster. As inspiration, "Collective dynamics of 'small-world' networks" is cited, but I do not see the relation to the algorithm or results presented in that study. A clearly explained motivation for the alternatives (and maybe for the individual algorithms, not just the larger families) would strengthen the result. 

      Related to the interpretation of results, as they are presented in Fig 3a, bottom left: What data points exactly go into each colored box? Specifically, into the purple box? What exactly is meant by "top performing networks across the main categories"? Compared with Supp Fig S4, it seems as if the authors do not select the best model out of a family and instead pool the various models that are part of the same family, albeit each with their optimized gamma and eta. Otherwise, the purple box at DIV14 in Fig 3 would be identical to "degree average" at DIV14 in S4. If true, I find this problematic, as visually, the performance of one family is made to look weaker by including weak-performing models in it. I am sure one could formulate a weak-performing homophily-based rule that drives the red box up. If such pooling is done for the statistical tests in Supp Tables 3-7, this is outright misleading (for some cases "degree average" seems not significantly worse than the homophily rules).

      The next finding is related to the development of connectivity over the days in vitro. Here, the authors compare the connectivity states the network model goes through as the algorithm builds it up, to connectivity in-vitro in younger cultures. They find comparable trajectories for two global topological parameters.

      Here, once again it is a strength that the authors considered additional parameters outside the ones used in fitting. However, it should be noted that the values for "global efficiency" at DIV14 (the very network that was optimized!) are clearly below the biological values plotted, weakening the generality of the previous result. This is never discussed in the text.

      The conclusion of the authors in this part derives from values of modularity decreasing over time in both model and data, and global efficiency increasing. The main impact of "time" in this context is the addition of more connections, and increasing edge density. And there is a known dependency between edge density and the bounds of global efficiency. I am not convinced the result is meaningful for the conclusion in this state. If one were to work backwards from the DIV14 model, randomly removing connections (with uniform probabilities): Would the resulting trajectory match DIV12, DIV10, and DIV7 equally well? If so, the trajectory resulting from the "matching" algorithm is not meaningful.
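
      The control suggested above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the network is treated as unweighted and undirected, and the function names (`global_efficiency`, `random_removal_trajectory`) and the list of target edge counts are hypothetical. Edges of the final model network are removed uniformly at random until the edge count of each earlier time point is reached, and global efficiency is recomputed at each step.

```python
import random
from collections import deque


def global_efficiency(n, edges):
    """Mean inverse shortest-path length over all ordered node pairs.

    Unweighted, undirected graph on nodes 0..n-1; unreachable pairs
    contribute zero.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total = 0.0
    for src in range(n):
        # Breadth-first search gives shortest path lengths from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != src)
    return total / (n * (n - 1))


def random_removal_trajectory(n, edges, edge_counts, seed=0):
    """Thin a network uniformly at random down to each target edge count.

    Keeping the first m edges of one random shuffle yields nested,
    uniformly sampled subnetworks, i.e. a "backwards" null trajectory.
    """
    rng = random.Random(seed)
    order = list(edges)
    rng.shuffle(order)
    return [(m, global_efficiency(n, order[:m])) for m in edge_counts]
```

      If the efficiency values along such a null trajectory match the younger cultures as well as the generative model does, the agreement would mostly reflect edge density rather than the wiring rule.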

      Further, the conclusion of the authors implies that connections in the cultures are formed as in the algorithm: one after another over time without pruning. This could be simply tested: How stable are individual connections in vitro over time (between DIV)? 

      The next finding is that at higher densities, the connections formed by the neurons still have very comparable structures, only differing in clustering and range; and that the same generative algorithm is optimal for modelling them. I think in its current state, the correlation analysis in Fig. 4a supports this conclusion only partially: Most of these correlations are not surprising. Shortest path lengths feature heavily in the calculation of small worldness and efficiency (in one case admittedly the inverse). Also for example network density has known relations with other measures. The analysis would be stronger if that was taken into account, for example showing how correlations deviate from the ones expected in an Erdos-Renyi-type network of equal sizes.

      Yet, overall the results are supported by the depicted data and model fits in Supp. Fig S7. With the caveat that some of the numerical values depicted seem off:

      What are the units for efficiency? Why do they take values up to 2000? Should be < 1 as in 4b. Also, what is "strength"? I assume it's supposed to be the value of STTC, but that's not supposed to be >1. Is it the sum over the edges? But at a total degree of around 40, this would imply an average STTC almost three times higher than what's reported in Fig 1i. Also, why is the degree around 40, but between 1000 and 1500 in Fig S2?

      Finally, it should be mentioned that "degree average" seems (from the boxplot) to work equally well.

      Further, the conclusion of the "matching" algorithm equally fitting both cases would be stronger if we were informed about the wiring parameters (η and γ) resulting in both cases. That way we could understand: Is it the same algorithm fitting both cases or very different variants of the same? It is especially crucial here, because the η and γ parameters determine the interplay between the distance- and topology-dependent terms, and this is the one case where a very different set of pairwise distances (due to higher density) are tested. Does it really generalize to these new conditions?

      Conversely, the results relating to GABAa blocking show a case where the distances are comparable, but the topology of functional connectivity is very different. (Here again, the contrast between structural and functional connectivity could be made a bit clearer. How is correlational detection of connections affected by "bursty" activity?) The reduction in tentative inhibition following the application of the block is convincing.

      The main finding is that despite very different connectivities, the "matching" algorithm still holds best. This is adequately supported by applying the previous analyses to this case as well.

      The authors then interpret the differences between blocked and control by inspection of the η and γ parameters, finding that the relative impact of the distance-based term is likely reduced, as a lower (less negative) exponent would lead to more equal values for different distances. This is a good example of inspecting the internals of a generative algorithm to understand the modeled system and is confirmed by longer edge lengths in Supp Fig. S12C.
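
      The effect of the distance exponent can be made concrete with a small sketch. It assumes the power-law form commonly used for this family of generative models, with wiring weight proportional to d_ij^η · K_ij^γ; that form, the function name `wiring_probabilities`, and the example numbers are assumptions for illustration and are not taken from the manuscript.

```python
def wiring_probabilities(distances, matching, eta, gamma):
    """Normalized wiring probabilities for a list of candidate pairs.

    Assumed rule: weight_ij = d_ij ** eta * K_ij ** gamma, where d_ij is
    the pairwise distance and K_ij the topological (e.g. matching) term.
    Both inputs are expected to be strictly positive.
    """
    weights = [d ** eta * k ** gamma for d, k in zip(distances, matching)]
    total = sum(weights)
    return [w / total for w in weights]


# Three candidate pairs with equal topological term but different distances.
distances = [1.0, 2.0, 4.0]
matching = [1.0, 1.0, 1.0]

steep = wiring_probabilities(distances, matching, eta=-2.0, gamma=0.5)
flat = wiring_probabilities(distances, matching, eta=-0.5, gamma=0.5)

# Ratio between the nearest and the farthest pair under each exponent.
print("eta = -2.0:", steep[0] / steep[2])
print("eta = -0.5:", flat[0] / flat[2])
```

      With the less negative exponent, the nearest pair is favoured far less strongly over the farthest one, which is the sense in which the distance term loses relative impact.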

      The authors further inspect the wiring probabilities used internally at each step of the algorithm and compare across conditions. They conclude from differences in the distribution of P_ij values that the GABAa-blocked network had a "more random" topology with "less specific" wiring. This is the opposite of the conclusion I would draw, given the depicted data. This may be partially because the authors do not clearly define their concept of "random" vs. "specific". I understand it to be the following: At each time step, one unconnected pair is randomly picked and connected, with probabilities proportional to P_ij, as in Akarca et al., 2021; "randomness" then refers to the entropy of that process. In that case, the "most random" or highest entropy case is given by uniform P_ij values, which would be depicted as a delta peak at 1 / n_pairs in the present plot. A flatter distribution would indicate more randomness if it was the distribution of P_ij over pairs of neurons (x-axis: pairs; y-axis P_ij). The conclusion should be clarified by the use of a mathematical definition and supported by data using that definition.
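
      One possible mathematical definition, following the reading given above, is the Shannon entropy of a single wiring step. The sketch below assumes that one pair is drawn per step with probability proportional to P_ij; the function names are hypothetical and the normalization by the maximum entropy is a design choice, not something stated in the manuscript.

```python
import math


def wiring_entropy(p_values):
    """Shannon entropy (in bits) of one wiring step.

    p_values are the non-negative wiring weights P_ij of all candidate
    pairs; they are normalized to sum to one before the entropy is taken.
    """
    total = sum(p_values)
    probs = [p / total for p in p_values if p > 0]
    return -sum(p * math.log2(p) for p in probs)


def normalized_wiring_entropy(p_values):
    """Entropy divided by its maximum, log2(number of candidate pairs).

    A value of 1 means uniform P_ij ("most random"); values near 0 mean
    that a few pairs dominate ("most specific").
    """
    n = len(p_values)
    if n < 2:
        return 0.0
    return wiring_entropy(p_values) / math.log2(n)
```

      Reporting such a value per condition (control versus GABAa block) would make the "more random" claim testable, since uniform P_ij values give the maximum and any concentration of probability on a few pairs lowers it.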

      Next, the methods are repeated for various cultures of human neurons. I have no specific observations there.

      In summary, while I think the most important methods are sound, and the main conclusions (reflected in the title of the paper) are supported, the analysis of more specific cases (everything from Fig 3e onwards, except for Fig 5) requires more work as in the current state their conclusions are not adequately supported.

    1. His faith in computers and quantitative data was legendary, his famous quote he said to it. It might have been Ellsberg that he said this to actually someone was saying that we're losing the war in Vietnam and he said, where is your data? Don't get me poetry, give me something I can put in the computer.

      Although computers have given nations the ability to better themselves, this has not been the case for many. Computers today are heavily used as echo chambers for people's biases, reinforcing political polarization. Vannevar Bush highlights this problem in his article "As We May Think".

    1. In 1945 he published an article called "As We May Think"[3] in the Atlantic Monthly, in which he mainly described the arrival of two devices.

      I think it is worth highlighting again how his curiosity and love for science emerged in childhood through his own experiences and explorations, a kind of fascination with science and technology that was reflected in his work building devices.

    2. In 1945 he published an article called "As We May Think"[3] in the Atlantic Monthly, in which he mainly described the arrival of two devices.

      It is interesting to see how, throughout history, the creation of devices has sought to make human actions easier, which has brought speed, time savings, and organization, contributing to the information sciences and, unfortunately, also to weapons of war. It is also complicated, though, that using devices inspired by the human brain means the brain is no longer exercised in the same way in activities such as doing arithmetic, composing text, writing by hand, and reading, which are replaced by the devices.

    1. As a rule, humans do not like to be duped. We like to know which kinds of signals to trust, and which to distrust. Being lulled into trusting a signal only to then have it revealed that the signal was untrustworthy is a shock to the system, unnerving and upsetting. People get angry when they find they have been duped. These reactions are even more heightened when we find we have been duped simply for someone else’s amusement at having done so.

      I think this has become a prevalent issue over the last few years when looking at politics. The right and the left have felt so divided in recent times, and it is difficult to watch politics without feeling like you may be getting "duped."

    1. As educators, it is important to understand that asking students to use apps or digital tools for learning activities gives companies the opportunity to collect data on them.

      This is scary to think about. I have never thought about how so many websites may have my information simply because I signed up for them. I feel like this should be especially concerning for younger students, since these websites may have access to students' address, school, name, date of birth, etc. This also made me realize that we need to teach more about internet safety, including this kind of data collection.

    2. Similarly, end-user license agreements (EULA) and terms of service (TOS) agreements feature opaque language that may cause you to give away your right to privacy without truly understanding what you are doing when you click “I agree.”

      While I know that companies do this on purpose to gain rights to your information, and for that reason any attempt to simplify these agreements will face pushback, I really think there needs to be an extension, tool, or platform that will put these forms into simpler terms. This is a prime example of the U in POUR (understandable), and it reminds me of how error signs need to tell us what's wrong in simple language so we can fix it, just as these terms of agreement should use understandable words and syntax so we know what we're signing.

    1. As we may Think.

      Here, he described a machine that would combine low-level technologies to achieve a higher level of organized knowledge (like the processes of human memory).