  1. Jan 2024
    1. eLife assessment

      This potentially important study used single-cell whole-brain imaging of the immediate early gene Fos to identify the brain areas recruited by two anesthetics, ketamine and isoflurane. The utilization of a custom software package to align and analyze brain images for c-Fos positive cells stands out as an impressive component of the approach. The results suggest these anesthetics might induce anesthesia via different brain regions and pathways, and raw Fos counts showed shared and distinct activation patterns after ketamine- vs. isoflurane-based anesthesia. However, the support for the primary conclusions is incomplete owing largely to concerns with the data transformation. The results could also be influenced by differences in route of administration between the drugs. This paper may be of interest to preclinical and clinical scientists working with anesthetic and dissociative drugs.

    2. Reviewer #2 (Public Review):

      Summary: In the revised manuscript, the authors aim to investigate brain-wide activation patterns following administration of the anesthetics ketamine and isoflurane, and conduct comparative analysis of these patterns to understand shared and distinct mechanisms of these two anesthetics. To this end, they perform Fos immunohistochemistry in perfused brain sections to label active nuclei, use a custom pipeline to register images to the ABA framework and quantify Fos+ nuclei, and perform multiple complementary analyses to compare activation patterns across groups.

      In the latest revision, the authors have made some changes in response to our previous comments on how to fix the analyses. However, the revised analyses were not changed correctly and remain flawed in several fundamental ways.

      Critical problems:

      (1) Before one can perform higher level analyses such as hierarchical clustering or network hub (or PC) analysis, it is fundamental to validate that you have significant differences of the raw Fos expression values in the first place. First of all, this means showing figures with the raw data (Fos expression levels) in some form in Figures 2 and 3 before showing the higher level analyses in Figures 4 and 5; this is currently switched around. Second and most importantly, when you have a large number of brain areas with large differences in mean values and variance, you need to account for this in a meaningful way. Changing to log values is a step in the right direction for mean values but does not account well for differences in variance. Indeed, considering the large variances in brain areas with high mean values and variance, it is a little difficult to believe that all brain regions, especially brain areas with low mean values, passed correction for multiple comparisons. We suggested Z-scores relative to control values for each brain region; this would have accounted for wide differences in mean values and variance, but this was not done. Overall, validation of anesthesia-induced differences in Fos expression levels is not yet shown.

      (2) Let's assume for a moment that the raw Fos expression analyses indicate significant differences. They used hierarchical cluster analyses as a rationale for examining 53 brain areas in all subsequent analyses of Fos expression following isoflurane versus home cage or ketamine versus saline. Instead, the authors changed to 201 brain areas with no validated rationale other than effectively saying 'we wanted to look at more brain areas'. And then later, when they examined raw Fos expression values in Figures 4 and 5, they assess 43 brain areas for ketamine and 20 brain areas for isoflurane, without any rationale for choosing these numbers of brain areas. This is a particularly big problem when they are trying to compare effects of isoflurane versus ketamine on Fos expression in these brain areas - they did not compare the same brain areas.
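
      As an illustration of the Z-score normalization suggested in point (1), a minimal sketch follows. It assumes Fos+ cell counts are arranged as animals-by-regions arrays; the function name `zscore_vs_control` and the toy numbers are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def zscore_vs_control(treated, control):
    """Z-score each region's treated counts against the control group.

    treated: array of shape (n_treated_animals, n_regions)
    control: array of shape (n_control_animals, n_regions)
    Assumes every region has non-zero control variance.
    """
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    mu = control.mean(axis=0)          # per-region control mean
    sd = control.std(axis=0, ddof=1)   # per-region control sample SD
    return (treated - mu) / sd

# Hypothetical counts: region 0 has low counts, region 1 has high counts.
control = np.array([[10.0, 1000.0], [12.0, 1200.0], [14.0, 1400.0]])
treated = np.array([[16.0, 1600.0], [18.0, 1800.0]])
z = zscore_vs_control(treated, control)
```

      In this toy case both regions end up on the same scale even though their raw means and variances differ 100-fold, which is the property the log transform alone does not provide.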

      Less critical comments:

      (3) The explanation of hierarchical levels in lines 90-95 did not make sense.

      (4) I am still perplexed by why the authors consider the prelimbic and infralimbic cortex 'neuroendocrine' brain areas in the abstract. In contrast, the prelimbic and infralimbic were described better in the introduction as "associated information processing" areas.

      (5) It looks like overall Fos levels in the control group Home (ISO) are an order of magnitude (~10-fold) lower than those in the control group Saline (KET) across all regions shown. This large difference seems unlikely to be due to a biologically driven effect and seems more likely to be due to a technical issue, such as differences in staining or imaging between experiments. The authors discuss this issue but did not answer whether the Homecage-ISO experiment, or at least its Fos labeling and imaging, was performed at the same time as the Saline-Ketamine experiment.

    3. Reviewer #3 (Public Review):

      The present study presents a comprehensive exploration of the distinct impacts of Isoflurane and Ketamine on c-Fos expression throughout the brain. To understand the varying responses across individual brain regions to each anesthetic, the researchers employ principal component analysis (PCA) and c-Fos-based functional network analysis. The methodology employed in this research is both methodical and expansive. Notably, the utilization of a custom software package to align and analyze brain images for c-Fos positive cells stands out as an impressive addition to their approach. This innovative technique enables effective quantification of neural activity and enhances our understanding of how anesthetic drugs influence brain networks as a whole.

      The primary novelty of this paper lies in the comparative analysis of two anesthetics, Ketamine and Isoflurane, and their respective impacts on brain-wide c-Fos expression. The study reveals the distinct pathways through which these anesthetics induce loss of consciousness. Ketamine primarily influences the cerebral cortex, while Isoflurane targets subcortical brain regions. This finding highlights the differing mechanisms of action employed by these two anesthetics: a top-down approach for Ketamine and a bottom-up mechanism for Isoflurane. Furthermore, this study uncovers commonly activated brain regions under both anesthetics, advancing our knowledge about the mechanisms underlying general anesthesia.

    1. eLife assessment

      In this manuscript, the authors present a wealth of fMRI data at both 3T and 7T to identify a scene-selective region of the intraparietal gyrus ("PIGS") that appears to have some responsivity to characteristics of ego-motion. In a series of experiments, they delineate the anatomical location of PIGS and functionally differentiate it from nearby V6 and OPA. Evidence for these valuable findings is solid, but further (a) consideration of whether this region overlaps with others reported previously and (b) support for, or tempering of, the ego-motion claim may be warranted.

    2. Reviewer #1 (Public Review):

      Summary: The authors ran a series of experiments with separate subject populations, different stimuli, and on two different MRI scanners (one 3T, one 7T) to establish a scene-selective region on the intraparietal gyrus that they decided to name PIGS. I think that IPA (intraparietal place area) would also have been a good choice with an allusion to a beverage rather than a domestic animal. The authors show that PIGS can be detected robustly through a series of experiments. They anatomically and functionally separate PIGS from nearby V6, which encodes optic flow. The authors determined that PIGS encodes ego-motion.

      Strengths: The robust detection of PIGS in several experiments with different sets of participants and on different scanners makes these results convincing. The functional differentiation is well executed.

      Weaknesses: The distinction of PIGS from nearby OPA, which has also been implicated in navigation and ego-motion, is not as clear as it could be.

      Impact: Overall, this is a valuable contribution to the cognitive neuroscience of the visual system. It shows that there is still room for discovering details of visual processing, given recent advances in scanning technology, statistical methods, and larger sample sizes.

    3. Reviewer #2 (Public Review):

      Summary: The authors report an extensive series of neuroimaging experiments (at both 3T and 7T) to provide evidence for a scene-selective visual area in the human posterior parietal cortex (PIGS) that is distinct from the main three (parahippocampal place area, PPA; occipital place area, OPA; medial place area, MPA) typically reported in the literature. Further, they argue that in comparison with the other three, this region may specifically be involved in representing ego-motion in natural contexts. The characterization of this scene-selective region provides a useful reference point for studies of scene processing in humans.

      Strengths: One of the major strengths of the work is the extensive series of experiments reported, showing clear reproducibility of the main finding and providing functional insight into the region studied. The results are clearly presented and for the most part, convincing.

      Weaknesses: One of the major weaknesses of the work is the failure to relate the current results to other findings in the literature, making it hard to assess whether it is a "previously undescribed scene-selective site".

      First, the scene-selective region identified appears to overlap with regions that have previously been identified in terms of their retinotopic properties. In particular, it is unclear whether this region overlaps with V7/IPS0 and/or IPS1. This is particularly important since prior work has shown that OPA often overlaps with V7/IPS0 (Silson et al, 2016, Journal of Vision). The findings would be much stronger if the authors could show how the location of PIGS relates to retinotopic areas (other than V6, which they do currently consider). I wonder if the authors have retinotopic mapping data for any of the participants included in this study. If not, the authors could always show atlas-based definitions of these areas (e.g. Wang et al, 2015, Cerebral Cortex).

      Second, recent studies have reported a region anterior to OPA that seems to be involved in scene memory (Steel et al, 2021, Nature Communications; Steel et al, 2023, The Journal of Neuroscience; Steel et al, 2023, bioRxiv). Is this region distinct from PIGS? Based on the figures in those papers, the scene memory-related region is inferior to V7/IPS0, so characterizing the location of PIGS relative to V7/IPS0 as suggested above would be very helpful here as well.

      If PIGS overlaps with either of V7/IPS0 or the scene memory-related area described by Steel and colleagues, then arguably it is not a newly defined region (although the characterization provided here still provides new information).

      Another reason that it would be helpful to relate PIGS to this scene memory area is that this scene memory area has been shown to have activity related to the amount of visuospatial context (Steel et al, 2023, The Journal of Neuroscience). The conditions used to show the sensitivity of PIGS to ego-motion also differ in the visuospatial context that can be accessed from the stimuli. Even if PIGS appears distinct from the scene memory area, the degree of visuospatial context is an alternative account of what might be represented in PIGS.

    4. Reviewer #3 (Public Review):

      Summary: The authors report a scene-selective area in the posterior intraparietal gyrus (PIGS). This area lies outside the classical three scene-selective regions (PPA/TPA, RSC/MPA, TOS/OPA), and is selective for ego-motion.

      Strengths: The authors firmly establish the location and selectivity of the new area through a series of well-crafted controlled experiments. They show that the area can be missed with too much smoothing, thus providing a case for why it has not been previously described. They show that it appears in much the same location in different subjects, with different magnetic field strengths, and with different stimulus sets. Finally, they show that it is selective for ego-motion - defined as a series of sequential photographs of an egocentric trajectory along a path. They further clarify that the area is not generically motion-selective by showing that it does not respond to biological motion without an ego-motion component to it. All statistics are standard and sound; the evidence presented is strong.

      Weaknesses: There are few weaknesses in this work. If pressed, I might say that the stimuli depicting ego-motion do not, strictly speaking, depict motion, but only apparent motion between photographs taken 2 m apart. However, this choice was made to equate frame rates and motion contrast between the 'ego-motion' and a control condition, which is a useful and valid approach to the problem. Some choices for visualization of the results might be made differently; for example, outlines of the regions might be shown in more plots for easier comparison of activation locations, but this is a minor issue.

      This is a very strong paper.

    1. eLife assessment

      This important study provides evidence supporting the idea that visual experience plays a role in shaping the patterns of functional connectivity between extrastriate visual cortex and prefrontal regions during development, by comparing neonates, blind and sighted adults. The evidence supporting the authors' claim is solid, although control analyses could strengthen the conclusions and possibly offer additional mechanistic insights. This study will be of significant interest to neuroscientists and neuroimaging researchers working on vision, plasticity, and development.

    2. Reviewer #1 (Public Review):

      Summary: The present study evaluates the role of visual experience in shaping functional correlations between extrastriate visual cortex and frontal regions. The authors used fMRI to assess "resting-state" temporal correlations in three groups: sighted adults, congenitally blind adults, and neonates. Previous research has already demonstrated differences in functional correlations between visual and frontal regions in sighted compared to early blind individuals. The novel contribution of the current study lies in the inclusion of an infant dataset, which allows for an assessment of the developmental origins of these differences.

      The main results of the study reveal that correlations between prefrontal and visual regions are more prominent in the blind and infant groups, with the blind group exhibiting greater lateralization. Conversely, correlations between visual and somato-motor cortices are more prominent in sighted adults. Based on these data, the authors conclude that visual experience plays an instructive role in shaping these cortical networks. This study provides valuable insights into the impact of visual experience on the development of functional connectivity in the brain.

      Strengths: The dissociations in functional correlations observed among the sighted adult, congenitally blind, and neonate groups provide strong support for the study's main conclusion regarding experience-driven changes in functional connectivity profiles between visual and frontal regions.

      In general, the findings in sighted adult and congenitally blind groups replicate previous studies and enhance the confidence in the reliability and robustness of the current results.

      Split-half analysis provides a good measure of robustness in the infant data.

      Weaknesses: There is some ambiguity in determining which aspects of these networks are shaped by experience.

      This uncertainty is compounded by notable differences in data acquisition and preprocessing methods, which could result in varying signal quality across groups. Variations in signal quality may, in turn, have an impact on the observed correlation patterns.

      The study's findings could benefit from being situated within a broader debate surrounding the instructive versus permissive roles of experience in the development of visual circuits.

    3. Reviewer #2 (Public Review):

      Summary: Tian et al. explore the developmental origins of cortical reorganization in blindness. Previous work has found that a set of regions in the occipital cortex show different functional responses and patterns of functional correlations in blind vs. sighted adults. In this paper, Tian et al. ask: how does this organization arise over development? Is the "starting state" more like the blind pattern, or more like the adult pattern? Their analyses reveal that the answer depends on the particular networks investigated; some functional connections in infants look more like blind than sighted adults; other functional connections look more like sighted than blind adults; and others fall somewhere in the middle, or show an altogether different pattern in infants compared with both sighted and blind adults.

      Strengths: The question raised in this paper is extremely important: what is the starting state in development for visual cortical regions, and how is this organization shaped by experience? This paper is among the first to examine this question, particularly by comparing infants not only with sighted adults but also blind adults, which sheds new light on the role of visual (and cross-modal) experience. Another clear strength lies in the unequivocal nature of many results. Many results have very large effect sizes, critical interactions between regions and groups are tested and found, and infant analyses are replicated in split halves of the data.

      Weaknesses: A central claim is that "infant secondary visual cortices functionally resemble those of blind more than sighted adults" (abstract, last paragraph of intro). I see two potential issues with this claim. First, a minor change: given the approaches used here, no claims should be made about the "function" of these regions, but rather their "functional correlations". Second (and more importantly), the claim that the secondary visual cortex in general resembles blind more than sighted adults is still not fully supported by the data. In fact, this claim is only true for one aspect of secondary visual area functional correlations (i.e., their connectivity to A1/M1/S1 vs. PFC). In other analyses, the infant secondary visual cortex looks more like sighted adults than blind adults (i.e., in within vs. across hemisphere correlations), or shows a different pattern from both sighted and blind adults (i.e., in occipito-frontal subregion functional connectivity). It is not clear from the manuscript why the comparison to PFC vs. non-visual sensory cortex is more theoretically important than hemispheric changes or within-PFC correlations (in fact, if anything, the within-PFC correlations strike me as the most important for understanding the development and reorganization of these secondary visual regions). It seems then that a more accurate conclusion is that the secondary visual cortex shows a mix of instructive effects of vision and reorganizing effects of blindness, albeit to a different extent than the primary visual cortex.

      Relatedly, group differences in overall secondary visual cortex connectivity are particularly striking as visualized in the connectivity matrices shown in Figure S1. In the results (lines 105-112), it is noted that while the infant FC matrix is strongly correlated with both adult groups, the infant group is nonetheless more strongly correlated with the blind than sighted adults. I am concerned that these results might be at least partially explained by distance (i.e., local spread of the BOLD signal), since a huge portion of the variance in these FC matrices is driven by stronger correlations between regions within the same system (e.g., secondary-secondary visual cortex, frontal-frontal cortex), which are inherently closer together, relative to those between different systems (e.g., visual to frontal cortex). How do results change if only comparisons between secondary visual regions and non-visual regions are included (i.e., just the pairs of regions within the bold black rectangle on the figure), which limits the analysis to long-range connections only? Indeed, looking at the off-diagonal comparisons, it seems that in fact there are three altogether different patterns here in the three groups. Even if the correlation between the infant pattern and blind adult pattern survives, it might be more accurate to claim that infants are different from both adult groups, suggesting both instructive effects of vision and reorganizing effects of blindness. It might help to show the correlation between each group and itself (across independent sets of subjects) to better contextualize the relative strength of correlations between the groups.
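
      The restricted comparison proposed above (visual-to-non-visual pairs only) could be sketched roughly as follows. This is an illustration under assumptions, not the authors' pipeline: the function `block_similarity`, the helper `toy_fc`, the ROI index lists, and all matrix values are hypothetical.

```python
import numpy as np

def block_similarity(fc_a, fc_b, visual_idx, nonvisual_idx):
    """Pearson r between two FC matrices, using only visual-to-non-visual ROI pairs.

    Within-system pairs (visual-visual, frontal-frontal), which tend to be
    spatially close, are left out of the comparison.
    """
    block = np.ix_(visual_idx, nonvisual_idx)
    a = np.asarray(fc_a, dtype=float)[block].ravel()
    b = np.asarray(fc_b, dtype=float)[block].ravel()
    return np.corrcoef(a, b)[0, 1]

def toy_fc(cross_block):
    """Build a hypothetical 4-ROI FC matrix: ROIs 0-1 visual, ROIs 2-3 non-visual."""
    fc = np.eye(4)
    fc[0, 1] = fc[1, 0] = 0.9  # within-visual pair (excluded from the comparison)
    fc[2, 3] = fc[3, 2] = 0.9  # within-non-visual pair (excluded from the comparison)
    fc[np.ix_([0, 1], [2, 3])] = cross_block
    fc[np.ix_([2, 3], [0, 1])] = np.asarray(cross_block).T
    return fc

infant = toy_fc([[0.1, 0.2], [0.3, 0.4]])
blind = toy_fc([[0.2, 0.4], [0.6, 0.8]])
sighted = toy_fc([[0.4, 0.3], [0.2, 0.1]])

r_infant_blind = block_similarity(infant, blind, [0, 1], [2, 3])
r_infant_sighted = block_similarity(infant, sighted, [0, 1], [2, 3])
```

      In the toy matrices the strong within-system correlations are identical across groups, so a whole-matrix comparison would make all three groups look alike; the block-restricted comparison depends only on the long-range pairs.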

      It is not clear that differences between groups should be attributed to visual experience only. For example, despite the title of the paper, the authors note elsewhere that cross-modal experience might also drive changes between groups. Another factor, which I do not see discussed, is possible ongoing experience-independent maturation. The infants scanned are extremely young, only 2 weeks old. Although no effects of age are detected, it is possible that cortex is still undergoing experience-independent maturation at this very early stage of development. For example, consider Figure 2; perhaps V1 connectivity is not established at 2 weeks, but eventually achieves the adult pattern later in infancy or childhood. Further, consider the possibility that this same developmental progression would be found in infants and children born blind. In that case, the blind adult pattern may depend on blindness-related experience only (which may or may not reflect "visual" experience per se). To deal with these issues, the authors should add a discussion of the role of maturation vs. experience and temper claims about the role of visual experience specifically (particularly in the title).

      The authors measure functional correlations in three very different groups of participants and find three different patterns of functional correlations. Although these three groups differ in critical, theoretically interesting ways (i.e., in age and visual/cross-modal experience), they also differ in many uninteresting ways, including at least the following: sampling rate (TR), scan duration, multi-band acceleration, denoising procedures (CompCor vs. ICA), head motion, ROI registration accuracy, and wakefulness (I assume the infants are asleep).

      Addressing all of these issues is beyond the scope of this paper, but I do feel the authors should acknowledge these confounds and discuss the extent to which they are likely (or not) to explain their results. The authors would strengthen their conclusions with analyses directly comparing data quality between groups (e.g., measures of head motion and split-half reliability would be particularly effective).
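
      One way the split-half reliability check mentioned above might be computed is sketched below, assuming per-subject FC matrices are stacked in a single array. The function name `split_half_reliability`, the random split, and the simulated data are illustrative assumptions only.

```python
import numpy as np

def split_half_reliability(subject_fc, seed=0):
    """Correlate the mean FC matrices of two random, non-overlapping halves of a group.

    subject_fc: array of shape (n_subjects, n_rois, n_rois).
    With an odd number of subjects, one subject is left out.
    """
    subject_fc = np.asarray(subject_fc, dtype=float)
    n_subjects, n_rois, _ = subject_fc.shape
    order = np.random.default_rng(seed).permutation(n_subjects)
    half = n_subjects // 2
    mean_a = subject_fc[order[:half]].mean(axis=0)
    mean_b = subject_fc[order[half:2 * half]].mean(axis=0)
    upper = np.triu_indices(n_rois, k=1)  # unique ROI pairs, diagonal excluded
    return np.corrcoef(mean_a[upper], mean_b[upper])[0, 1]

# Simulated group: a shared 3-ROI connectivity template plus small subject noise.
template = np.array([[1.0, 0.1, 0.5],
                     [0.1, 1.0, 0.9],
                     [0.5, 0.9, 1.0]])
noise = np.random.default_rng(1).normal(scale=0.01, size=(6, 3, 3))
group = template + noise
reliability = split_half_reliability(group)
```

      Computing this value separately for infants, blind adults, and sighted adults would give a within-group ceiling against which between-group correlations can be read.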

    4. Reviewer #3 (Public Review):

      Summary: This study aimed to investigate whether the differences observed in the organization of visual brain networks between blind and sighted adults result from a reorganization of an early functional architecture due to blindness, or whether the early architecture is immature at birth and requires visual experience to develop functional connections. This question was investigated through the comparison of 3 groups of subjects with resting-state functional MRI (rs-fMRI). Based on convincing analyses, the study suggests that: 1) secondary visual cortices showed higher connectivity to prefrontal cortical regions (PFC) than to non-visual sensory areas (S1/M1 and A1) in sighted infants like in blind adults, in contrast to sighted adults; 2) the V1 connectivity pattern of sighted infants lies between that of sighted adults (stronger functional connectivity with non-visual sensory areas than with PFC) and that of blind adults (stronger functional connectivity with PFC than with non-visual sensory areas); 3) the laterality of the connectivity patterns of sighted infants resembled those of sighted adults more than those of blind adults, but sighted infants showed a less differentiated fronto-occipital connectivity pattern than adults.

      Strengths:

      - The question investigated in this article is important for understanding the mechanisms of plasticity during typical and impaired development, and the approach considered, which compares different groups of subjects including neonates/infants and blind adults, is highly original.

      - Overall, the analyses considered are solid and well-detailed. The results are quite convincing, even if the interpretation might need to be revised downwards, as factors other than visual experience may play a role in the development of functional connections with the visual system.

      Weaknesses:

      - While it is informative to compare the "initial" state (close to birth) and the "final" states in blind and sighted adults to study the impact of post-natal and visual experience, this study does not analyze the chronology of this development and when the specialization of functional connections is completed. This would require investigating when experience-dependent mechanisms are important for the establishment of multiple functional connections within the visual system. This could be achieved by analyzing different developmental periods in the same way, using open databases such as the Baby Connectome Project. Given the early, "condensed" maturation of the visual system after birth, we might expect sighted infants to show connectivity patterns similar to those of adults a few months after birth.

      - The rationale for mixing full-term neonates and preterm infants (scanned at term-equivalent age) from the dHCP 3rd release is not understandable since preterms might have a very different development related to prematurity and to post-natal (including visual) experience. Although the authors show that the difference between the connectivity of visual regions with other sensory regions and their connectivity with PFC regions does not depend on age at birth, they do not show that each connectivity pattern is not influenced by prematurity. Simply not considering the preterm infants would have made the analysis much more robust, and the full-term group in itself is already quite large compared with the two adult groups. The current study setting and the analyses performed do not seem to be an adequate and sufficient model to ascertain that "a few weeks of vision after birth is ... insufficient to influence connectivity".

      In a similar way, excluding the few infants with detected brain anomalies (radiological scores higher or equal to 4) would strengthen the group homogeneity by focusing on infants supposed to have a rather typical neurodevelopment. The authors quote all infants as "sighted" but this is not guaranteed as no follow-up is provided.

      The post-menstrual age (PMA) at scan of the infants is also not described. The methods indicate that all were scanned at "term-equivalent age" but does this mean that there is some PMA variability between 37 and 41 weeks? Connectivity measures might be influenced by such inter-individual variability in PMA, and this could be evaluated.

      - The rationale for presenting results on the connectivity of secondary visual cortices before that of the primary visual cortex (V1) was not clear. Also, it might be relevant to better justify why only the connectivity of visual regions to non-visual sensory regions (S1-M1, A1) and prefrontal cortex (PFC) was considered in the analyses, and not the ones to other brain regions.

      - In relation to the question explored, it might be informative to reposition the study in relation to what others have shown about the developmental chronology of structural and functional long-distance and short-distance connections during pregnancy and the first postnatal months.

      - The authors acknowledge the methodological difficulties in defining regions of interest (ROIs) in infants in a similar way as adults. The reliability and the comparability of ROI positioning in infants is definitely an issue. Given that brain development is not homogeneous and synchronous across brain regions (in particular with the frontal and parietal lobes showing delayed growth), the newborn brain is not homothetic to the adult brain, which poses major problems for registration. The functional specialization of cortical regions is incomplete at birth. This raises the question of whether the findings of this study would be stable/robust if slightly larger or displaced regions had been considered, to cover with greater certainty the same areas as those considered in adults. And have other cortical parcellation approaches been considered to assess ROI robustness (e.g. MCRIB-S for full-terms)?

    1. eLife assessment

      Songbirds provide a tractable model system to study mechanisms of vocal production and sequencing, and past work showed that lesions to LMAN, the output of a basal ganglia thalamocortical loop, reduced vocal variability, consistent with a role in motor exploration. In this important work, the authors examined how lesions to an understudied neighboring region, MMAN, part of a parallel basal ganglia loop, affect singing in Bengalese finches, whose songs exhibit complex sequential transitions. They provide convincing evidence that MMAN lesions cause increased sequential variability, showing that distinct frontal systems can have distinct functions for producing and sequencing song syllables.

    2. Reviewer #1 (Public Review):

      Summary:

      Songbirds provide a tractable system to examine neural mechanisms of sequence generation and variability. In past work, the projection from LMAN to RA (output of the anterior forebrain pathway) was shown to be critical for driving vocal variability during babbling, learning, and adulthood. LMAN is immediately adjacent to MMAN, which projects to HVC. MMAN is less well understood but, anatomically, appears to resemble LMAN in that it is the cortical output of a BG-thalamocortical loop. Because it projects to HVC, a major sequence generator for both syllable phonology and sequence, a strong prediction would be that MMAN drives sequence variability in the same way that LMAN drives phonological variability. This hypothesis predicts that MMAN lesions in a Bengalese finch would reduce sequence variability. Here, the authors test this hypothesis. They provide a surprising and important result that is well motivated and well analyzed: MMAN lesions increase sequence variability - this is exactly the opposite result from what would be predicted based on the functions of LMAN.

      Strengths:

      1. A very important and surprising result shows that lesions of a frontal projection from MMAN to HVC, a sequence generator for birdsong, increase syntactical variability.

      2. The choice of Bengalese finches, which have complex transition structures, to examine the mechanisms of sequence generation, enabled this important discovery.

      3. The idea that frontal outputs of BG-cortical loops can generate vocal variability comes from lesions/inactivations of a parallel pathway from LMAN to RA. The difference between MMAN and LMAN functions is striking and important.

      Weaknesses:

      1. If more attention were paid to how syllable phonology was (or was not) affected by MMAN lesions, then the claims around the specific effects on sequence could be stronger.

    3. Reviewer #2 (Public Review):

      Summary:

      This study investigates the neural substrates of syntax variation in Bengalese finch songs. Here, the authors tested the effects of bilateral lesions of mMAN, a brain area with inputs to HVC, a premotor area required for song production. Lesions in mMAN induce variability in syntactic elements of song specifically through increased transition entropy, variability within stereotyped song elements known as chunks, and increases in the repeat number of individual syllables. These results suggest that mMAN projections to HVC contribute to multiple aspects of song syntax in the Bengalese finch. Overall the experiments are well-designed, the analysis excellent, and the results are of high interest.
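
For readers unfamiliar with the measure, transition entropy can be illustrated with a minimal sketch. It assumes one common definition (the Shannon entropy of the next-syllable distribution for each syllable, plus a frequency-weighted mean); the study's exact definition may differ, and the function name and the string encoding of songs are hypothetical.

```python
from collections import Counter, defaultdict
import math


def transition_entropy(songs):
    """Return (per-syllable transition entropy in bits, frequency-weighted mean).

    songs: iterable of syllable sequences, e.g. strings such as "abcabd"
    (hypothetical encoding, one character per syllable).
    """
    # counts[a][b] = number of times syllable a is followed by syllable b
    counts = defaultdict(Counter)
    for song in songs:
        for a, b in zip(song, song[1:]):
            counts[a][b] += 1

    total = sum(sum(c.values()) for c in counts.values())
    per_syllable = {}
    weighted = 0.0
    for a, c in counts.items():
        n = sum(c.values())
        # Shannon entropy of the distribution over syllables that follow a
        h = -sum((k / n) * math.log2(k / n) for k in c.values())
        per_syllable[a] = h
        weighted += (n / total) * h
    return per_syllable, weighted
```

Under this definition a fully stereotyped sequence has zero entropy, and a branch point with two equally likely continuations contributes one bit.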

      Strengths:

      The study identifies a novel role for mMAN, the medial magnocellular nucleus of the anterior nidopallium, in the control of syntactic variation within adult Bengalese finch song. This is of particular interest as multiple studies previously demonstrated that mMAN lesions do not affect song structure in zebra finches. The study undertakes a thorough analysis to characterise specific aspects of variability within the song of lesioned animals. The conclusions are well supported by the data.

      Weaknesses:

      The study would benefit from additional mechanistic information. A more fine-grained or reversible manipulation, such as brain cooling, might allow additional insights into how mMAN influences specific aspects of syntax structure. Are repeat number increases and transition entropy resulting from shared mechanisms within mMAN, or perhaps arising from differential output to downstream pathways (i.e. projections to HVC)? Similarly, unilateral manipulations would allow the authors to further test the hypothesis that mMAN is involved in inter-hemispheric synchronization.

    1. eLife assessment

      This study provides a valuable contribution to our understanding of causal inference in visual perception. The evidence provided through multiple well-designed psychophysical experiments is solid. However, the conclusions drawn on the implementation of causal inference in general are too broad to be properly supported by the current results given their narrow focus on visual launch events.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors investigated causal inference in the visual domain through a set of carefully designed experiments, and sound statistical analysis. They suggest the early visual system has a crucial contribution to computations supporting causal inference.

      Strengths:

      I believe the authors target an important problem (causal inference) with carefully chosen tools and methods. Their analysis rightly implies the specialization of visual routines for causal inference and the crucial contribution of early visual systems to perform this computation. I believe this is a novel contribution and their data and analysis are in the right direction.

      Weaknesses:

      In my humble opinion, a few aspects deserve more attention:

      1. Causal inference (or causal detection) in the brain should be quite fundamental and quite important for human cognition/perception. Thus, the underlying computation and neural substrate might not be limited to the visual system (I don't mean that the authors claimed this). In fact, to the best of my knowledge, multisensory integration is one of the best-studied perceptual phenomena that has been conceptualized as a causal inference problem. Assuming that the causal inference in those studies (Shams 2012; Shams and Beierholm 2022; Kording et al. 2007; Aller and Noppeney 2018; Cao et al. 2019, and many more, e.g., by Shams and colleagues) and in the current study shares some attributes, one would expect some findings in those domains to be transferable here, at least to some degree. Most importantly, the underlying neural correlates that have already been suggested based on animal studies and invasive recordings might be relevant here as well. Perhaps the most relevant is the recent work from the Harris group on mice (Coen et al. 2021). I should emphasize that I don't claim these studies are necessarily relevant, but they may be, given their common roots in the problem of causal inference in the brain. This is a critical topic that the authors may want to discuss in their manuscript.

      2. If I understood correctly, the authors are arguing for a purely bottom-up contribution of early sensory areas to causal inference (for instance, when they wrote "the specialization of visual routines for the perception of causality at the level of individual motion directions raises the possibility that this function is located surprisingly early in the visual system *as opposed to a higher-level visual computation*."). Certainly, as the authors suggested, early sensory areas have a crucial contribution; however, it may not be limited to that. Recent studies increasingly suggest that perception is an active process in which top-down cognitive contributions also weigh in strongly. For instance, even the simplest cases of perception have been conceptualized along this line (Martin, Solms, and Sterzer 2021), as have some visual illusions (Safavi and Dayan 2022) and other extensions (Kay et al. 2023). Thus, I believe it would be helpful to extend the discussion on the top-down and cognitive contributions to causal inference (of course, that can also be hinted at, based on recent developments). Even adaptation, which is central in this study, can be influenced by top-down factors (Keller et al. 2017). I believe, based on other work of Rolfs and colleagues, this is also aligned with their overall perspective on vision.

      3. The authors rightly implicate the neural substrate of causal inference in the early sensory system. Given their study is pure psychophysics, a more elaborate discussion based on other studies that used brain measurements is needed (in my opinion) to put into perspective this conclusion. In particular, as I mentioned in the first point, the authors mainly discuss the potential neural substrate of early vision, however much has been done about the role of higher-tier cortical areas in causal inference e.g., see (Cao et al. 2019; Coen et al. 2021).

      There were many areas in this manuscript that I liked: clever questions, experimental design, and statistical analysis.

      Bibliography

      Aller, Mate, and Uta Noppeney. 2018. "To Integrate or Not to Integrate: Temporal Dynamics of Bayesian Causal Inference." bioRxiv, December, 504118.

      Cao, Yinan, Christopher Summerfield, Hame Park, Bruno Lucio Giordano, and Christoph Kayser. 2019. "Causal Inference in the Multisensory Brain." Neuron 102 (5): 1076-87.e8.

      Coen, Philip, Timothy P. H. Sit, Miles J. Wells, Matteo Carandini, and Kenneth D. Harris. 2021. "The Role of Frontal Cortex in Multisensory Decisions." bioRxiv, April. Cold Spring Harbor Laboratory, 2021.04.26.441250.

      Kay, Kendrick, Kathryn Bonnen, Rachel N. Denison, Mike J. Arcaro, and David L. Barack. 2023. "Tasks and Their Role in Visual Neuroscience." Neuron 111 (11). Elsevier: 1697-1713.

      Keller, Andreas J, Rachael Houlton, Björn M Kampa, Nicholas A Lesica, Thomas D Mrsic-Flogel, Georg B Keller, and Fritjof Helmchen. 2017. "Stimulus Relevance Modulates Contrast Adaptation in Visual Cortex." eLife 6. eLife Sciences Publications, Ltd: e21589.

      Kording, K. P., U. Beierholm, W. J. Ma, S. Quartz, J. B. Tenenbaum, and L. Shams. 2007. "Causal Inference in Multisensory Perception." PLoS ONE 2: e943.

      Martin, Joshua M., Mark Solms, and Philipp Sterzer. 2021. "Useful Misrepresentation: Perception as Embodied Proactive Inference." Trends Neurosci. 44 (8): 619-28.

      Safavi, Shervin, and Peter Dayan. 2022. "Multistability, Perceptual Value, and Internal Foraging." Neuron, August.

      Shams, L. 2012. "Early Integration and Bayesian Causal Inference in Multisensory Perception." In The Neural Bases of Multisensory Processes, edited by M. M. Murray and M. T. Wallace. Frontiers in Neuroscience. Boca Raton (FL).

      Shams, Ladan, and Ulrik Beierholm. 2022. "Bayesian Causal Inference: A Unifying Neuroscience Theory." Neuroscience & Biobehavioral Reviews 137 (June): 104619.

    3. Reviewer #2 (Public Review):

      This paper seeks to determine whether the human visual system's sensitivity to causal interactions is tuned to specific parameters of a causal launching event, using visual adaptation methods. The three parameters the authors investigate in this paper are the direction of motion in the event, the speed of the objects in the event, and the surface features or identity of the objects in the event (in particular, having two objects of different colors).

      The key method, visual adaptation to causal launching, has now been demonstrated by at least three separate groups and seems to be a robust phenomenon. Adaptation is a strong indicator of a visual process that is tuned to a specific feature of the environment, in this case launching interactions. Whereas other studies have focused on retinotopically-specific adaptation (i.e., whether the adaptation effect is restricted to the same test location on the retina as the adaptation stream was presented to), this one focuses on feature-specificity.

      The first experiment replicates the adaptation effect for launching events as well as the lack of an adaptation effect for a minimally different non-causal 'slip' event. However, it also finds that the adaptation effect does not occur when the adapting launching events have a direction of motion more than 30 degrees from the direction of the test event. The interpretation is that the system that is being adapted is sensitive to the direction of this event, which is an interesting and somewhat puzzling result given the methods used in previous studies, which have used random directions of motion for both adaptation and test events.

      The obvious interpretation would be that past studies have simply adapted to launching in every direction, but that in itself says something about the nature of this direction-specificity: it is not working through opposed detectors. For example, in something like the waterfall illusion adaptation effect, where extended exposure to downward motion leads to illusory upward motion on neutral-motion stimuli, the effect simply doesn't work if motion in two opposed directions is shown (i.e., you don't see illusory motion in both directions, you just see nothing). The fact that adaptation to launching in multiple directions doesn't seem to cancel out the adaptation effect in past work raises interesting questions about how directionality is being coded in the underlying process. In addition, one limitation of the current method is that it's not clear whether the motion-direction-specificity is also itself retinotopically-specific, that is, if one retinotopic location were adapted to launching in one direction and a different retinotopic location adapted to launching in the opposite direction, would each test location show the adaptation effect only for events in the direction presented at that location?

      The second experiment tests whether the adaptation effect is similarly sensitive to differences in speed. The short answer is no; adaptation events at one speed affect test events at another. Furthermore, this is not surprising given that Kominsky & Scholl (2020) showed adaptation transfer between events with differences in speeds of the individual objects in the event (whereas all events in this experiment used symmetrical speeds). This experiment is still novel and it establishes that the speed-insensitivity of these adaptation effects is fairly general, but I would certainly have been surprised if it had turned out any other way.

      The third experiment tests color (as a marker of object identity), and pits it against motion direction. The results demonstrate that adaptation to red-launching-green generates an adaptation effect for green-launching-red, provided they are moving in roughly the same direction, which provides a nice internal replication of Experiment 1 in addition to showing that the adaptation effect is not sensitive to object identity. This result forms an interesting contrast with the infant causal perception literature. Multiple papers (starting with Leslie & Keeble, 1987) have found that 6-8-month-old infants are sensitive to reversals in causal roles exactly like the ones used in this experiment. The success of adaptation transfer suggests, very clearly, that this sensitivity is not based only on perceptual processing, or at least not on the same processing that we access with this adaptation procedure. It implies that infants may be going beyond the underlying perceptual processes and inferring genuine causal content. This is also not the first time the adaptation paradigm has diverged from infant findings: Kominsky & Scholl (2020) found a divergence with the object speed differences as well, as infants categorize these events based on whether the speed ratio (agent:patient) is physically plausible (Kominsky et al., 2017), while the adaptation effect transfers from physically implausible events to physically plausible ones. This only goes to show that these adaptation effects don't exhaustively capture the mechanisms of early-emerging causal event representation.

      One overarching point about the analyses to take into consideration: The authors use a Bayesian psychometric curve-fitting approach to estimate a point of subjective equality (PSE) in different blocks for each individual participant based on a model with strong priors about the shape of the function and its asymptotic endpoints, and this PSE is the primary DV across all of the studies. As discussed in Kominsky & Scholl (2020), this approach has certain limitations, notably that it can generate nonsensical PSEs when confronted with relatively extreme response patterns. The authors mentioned that this happened once in Experiment 3 and that a participant had to be replaced. An alternate approach is simply to measure the proportion of 'pass' reports overall to determine if there is an adaptation effect. I don't think this alternate analysis strategy would greatly change the results of this particular experiment, but it is robust against this kind of self-selection for effects that fit in the bounds specified by the model, and may therefore be worth including in a supplemental section or as part of the repository to better capture the individual variability in this effect.
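
The alternate analysis mentioned above could be sketched as follows. This is a minimal illustration, not the authors' or any prior study's code; the response coding (`"pass"` / `"launch"` strings), the function names, and the per-participant layout are all assumptions.

```python
import numpy as np


def pass_proportion(responses):
    """Fraction of trials reported as 'pass'.

    responses: iterable of "pass" / "launch" labels (hypothetical coding).
    """
    responses = np.asarray(list(responses))
    return float(np.mean(responses == "pass"))


def adaptation_shift(baseline, adapted):
    """Change in the overall 'pass' proportion after adaptation, for one participant.

    A positive value means more 'pass' reports after adaptation. No psychometric
    function is fitted, so extreme response patterns cannot produce an
    out-of-range estimate.
    """
    return pass_proportion(adapted) - pass_proportion(baseline)
```

The per-participant shifts could then be compared across conditions with the same statistics used for the PSEs.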

      In general, this paper adds further evidence for something like a 'launching' detector in the visual system, but beyond that, it specifies some interesting questions for future work about how exactly such a detector might function.

      Kominsky, J. F., & Scholl, B. J. (2020). Retinotopic adaptation reveals distinct categories of causal perception. Cognition, 203, 104339. https://doi.org/10.1016/j.cognition.2020.104339

      Kominsky, J. F., Strickland, B., Wertz, A. E., Elsner, C., Wynn, K., & Keil, F. C. (2017). Categories and Constraints in Causal Perception. Psychological Science, 28(11), 1649-1662. https://doi.org/10.1177/0956797617719930

      Leslie, A. M., & Keeble, S. (1987). Do six-month-old infants perceive causality? Cognition, 25(3), 265-288. https://doi.org/10.1016/S0010-0277(87)80006-9

    4. Reviewer #3 (Public Review):

      Summary:

      This paper presents evidence from three behavioral experiments that causal impressions of "launching events", in which one object is perceived to cause another object to move, depend on motion direction-selective processing. Specifically, the work uses an adaptation paradigm (Rolfs et al., 2013), presenting repetitive patterns of events matching certain features to a single retinal location, then measuring subsequent perceptual reports of a test display in which the degree of overlap between two discs was varied, and participants could respond "launch" or "pass". The three experiments report results of adapting to motion direction, motion speed, and "object identity", and examine how the psychometric curves for causal reports shift in these conditions depending on the similarity of the adapter and test. While causality reports in the test display were selective for motion direction (Experiment 1), they were not selective for adapter-test speed differences (Experiment 2) nor for changes in object identity induced via color swap (Experiment 3). These results support the notion that causal perception is computed (in part) at relatively early stages of sensory processing, possibly even independently of or prior to computations of object identity.

      Strengths:

      The setup of the research question and hypotheses is exceptional. The experiments are carefully performed (appropriate equipment, and careful control of eye movements). The slip adaptor is a really nice control condition and effectively mitigates the need to control motion direction with a drifting grating or similar. Participants were measured with sufficient precision, and a power curve analysis was conducted to determine the sample size. Data analysis and statistical quantification are appropriate. Data and analysis code are shared on publication, in keeping with open science principles. The paper is concise and well-written.

      Weaknesses:

      The biggest uncertainty I have in interpreting the results is the relationship between the task and the assumption that the results tell us about causality impressions. The experimental logic assumes that "pass" reports are always non-causal impressions and "launch" reports are always causal impressions. This logic is inherited from Rolfs et al (2013) and Kominsky & Scholl (2020), who assert rather than measure this. However, other evidence suggests that this assumption might not be solid (Bechlivanidis et al., 2019). Specifically, "[our experiments] reveal strong causal impressions upon first encounter with collision-like sequences that the literature typically labels "non-causal"" (Bechlivanidis et al., 2019) -- including a condition that is similar to the current "pass". It is therefore possible that participants' "pass" reports could also involve causal experiences.

      Furthermore, since the only report options are "launch" or "pass", it is also possible that "launch" reports are not indications of "I experienced a causal event" but rather "I did not experience a pass event". It seems possible to me that different adaptation transfer effects (e.g. selectivity to motion direction, speed, or color-swapping) change the way that participants interpret the task, or the uncertainty of their impression. For example, it could be that adaptation increases the likelihood of experiencing a "pass" event in a direction-selective manner, without changing causal impressions. Increases of "pass" impressions (or at least, uncertainty around what was experienced) would produce a leftward shift in the PSE as reported in Experiment 1, but this does not necessarily mean that experiences of causal events changed. Thus, changes in the PSEs between the conditions in the different experiments may not directly reflect changes in causal impressions. I would like the authors to clarify the extent to which these concerns call their conclusions into question.

      Leaving these concerns aside, I am also left wondering about the functional significance of these specialised mechanisms. Why would direction matter but speed and object identity not? Surely object identity, in particular, should be relevant to real-world interpretations and inputs of these visual routines? Is color simply too weak an identity?

      References:

      Bechlivanidis, C., Schlottmann, A., & Lagnado, D. A. (2019). Causation without realism. Journal of Experimental Psychology: General, 148(5), 785-804. https://doi.org/10.1037/xge0000602

      Kominsky, J. F., & Scholl, B. J. (2020). Retinotopic adaptation reveals distinct categories of causal perception. Cognition, 203, 104339.

      Rolfs, M., Dambacher, M., & Cavanagh, P. (2013). Visual Adaptation of the Perception of Causality. Current Biology, 23(3), 250-254. https://doi.org/10.1016/j.cub.2012.12.017

    1. eLife assessment

      This valuable study reports on the characteristics of premotor cortical population activity during the execution and observation of a moderately complex reaching and grasping task. By using new variants of well-established techniques to analyse neural population activity, the authors provide solid evidence that while the geometry of neural population activity changes between execution and observation, their dynamics are largely preserved. While these observations are novel and robust barring the need for additional controls, the authors should do additional work to define the functional implications of their findings.

    2. Reviewer #1 (Public Review):

      Summary and strengths. This paper starts with an exceptionally fair and balanced introduction to a topic, the mirror neuron literature, which is often debated and prone to controversies even in the choice of terminology. In my opinion, the authors did an excellent job in this regard, and I really appreciated it. Then, they propose a novel method to look at population dynamics to compare neural selectivity and alignment between execution and observation of actions performed with different types of grip.

      Weakness. Unfortunately, the goal and findings within this well-described framework are less clear to me. The authors aimed to investigate, using a novel analytic approach, whether and to what extent a match exists between population codes and neural dynamics when a monkey performs an action or observes it performed by an experimenter. This motivation stems from the fact that the general evidence in the literature is that the match between visual and motor selectivity of mirror neuron responses is essentially at a chance level. While the approach devised by the authors is generally well-described and understandable, the main result obtained confirms this general finding of a lack of matching between the two contexts in 2 out of the three monkeys. Nevertheless, the authors claim that the patterns associated with execution and observation can be re-aligned with canonical correlation, indicating that these distinct neural representations show dynamical similarity that may enable the nervous system to recognize particular actions. This final conclusion is hardly acceptable to me, and constitutes my major concern, at least without a more explicit explanation: how do we know that this additional operation can be performed by the brain? Is this a computational trick to artificially align something that is naturally non-aligned, or can it capture something real and useful?

      Based on the accumulated evidence on space-constrained coding of others' actions by mirror neurons (e.g., Caggiano et al. 2009; Maranesi et al. 2017), recent evidence also cited by the authors (Pomper et al. 2023), and the most recent views supported even by the first author of the original discovery (i.e., Vittorio Gallese, see Bonini et al. 2022 in TICS), it seems that one of the main functions of these cells, especially in monkeys, might be to prepare actions and motor responses during social interaction rather than recognizing the actions of others - something that visual brain areas could easily do better than motor ones in most situations. In this perspective, and given the absence of causal evidence so far, the lack of visuo-motor congruence is a potentially relevant feature of the mechanism rather than something to be computationally cracked at all costs.

      Specific comments on Results/Methods:

      I can understand, based on the authors' hypothesis, that they employed an ANOVA to preliminarily test whether and which of the recorded neurons fit their definition of "mirror neurons". However, given the emphasis on the population level, and the consolidated finding of highly different execution and observation responses, I think it could be interesting to apply the same analysis to (at least also) the whole recorded neuronal population, without any preselection based on a single-neuron statistic. Such preselection of mirror neurons could influence the results of EXE-OBS comparisons since all the neurons activated only during EXE or OBS are excluded. Related to this point, the authors could report the total number of recorded neurons per monkey/session, so that the fraction of neurons fitting their definition of mirror neuron is also explicit.

      Furthermore, the comparison of the dynamics of the classification accuracy in figures 4 and 5, and therefore the underlying assumption of subspace shifts in execution and observation, respectively, reveals substantial similarities between monkeys despite the different contexts, which are clearly greater than the similarities among neural subspace shifts across task epochs: to me, this suggests that the main result is driven by the selected neural populations in different monkeys/implants rather than by an essential property of the neuronal dynamics valid across animals. Could the authors comment on this issue? This could easily explain the "strange" result reported in figure 6 for monkey T.

    3. Reviewer #2 (Public Review):

      In this work, the authors set out to identify time-varying subspaces in the premotor cortical activity of monkeys as they executed/observed a reach-grasp-hold movement of 4 different objects. Then, they projected the neural activity to these subspaces and found evidence of shifting subspaces in the time course of a trial in both conditions, executing and observing. These shifting subspaces appear to be distinct in execution and observation trials. However, correlation analysis of neural dynamics reveals the similarity of dynamics in these distinct subspaces. Taken together, Zhao and Schieber speculate that the condition-dependent activity studied here provides a representation of movement that relies on the actor.

      This work addresses an interesting question. The authors developed a novel approach to identify instantaneous subspaces and decoded the object type from the projected neural dynamics within these subspaces. As interesting as these results might be, I have a few suggestions and questions to improve the manuscript:

      1. Repeating the analyses in the paper, e.g., in Fig5, using non-MN units only or the entire population, and demonstrating that the results are specific to MNs would make the whole study much more compelling.

      2. The method presented here is similar and perhaps related to principal angles (https://doi.org/10.2307/2005662). It would be interesting to confirm these results with principal angles. For instance, instead of using the decoding performance as a proxy for shifting subspaces, principal angles could directly quantify the 'shift' (similar to Gallego et al, Nat Comm, 2018). Relatedly, why is the decoding of the 'object type' used to establish the progressive shifting of the subspaces? I would be interested to see the authors' argument. The object type should be much more decodable during movement or hold than during instruction, which is probably why the chance-level decoding performance (horizontal lines) is twice as high for the movement segment as for the instruction segment.

      3. Why aren't execution and observation subspaces compared together directly? Especially given that there are both types of trials in the same session with the same recorded population of neurons. This could be done using instantaneous subspaces, or the principal angles between manifolds during exec trials vs obs trials.

      4. The definition of the instantaneous subspaces is a critical point in the manuscript. I think it is slightly unclear: based on the Methods section #715-722 and the main text #173-#181, I gather that the subspaces are based on trial averaged neural activity for each of the 4 objects, separately. So for each object and per timepoint, a vector of size (1, n) -n neurons- is reduced to a vector of (1, 2 or 3 -the main text says 2, methods say 3-) which would be a single point in the low-d space. Is this description accurate? This should be clarified in the manuscript.

      5. Isn't the process of projecting segments of neural dynamics and comparing the results equivalent to comparing the projection matrices in the first place? If so, that might have been a more intuitive avenue to follow.

      6. Lines #385-#389: This process seems unnecessarily complicated. Also, given the number of trials available, this sometimes doesn't make sense. E.g. Monkey R exec has only 8 trials of one of the objects, so bootstrapping 20 trials 500 times would be spurious. Why not, as per Gallego et al, Nat Neurosci 2020 and Safaie et al, Nat 2023 which are cited, concatenate the trials?

      7. Related to the CCA analysis, what behavioural epoch has been used here, the same as the previous analyses, i.e. 100ms? How many data points is that in time? Given that CCA is essentially a correlation value, too few data points make it rather meaningless. If that's the case, I encourage using, let's say, one window combining I and G until movement, and one window of movement and hold, such that they are both easier to interpret. Indeed low values of exec-exec in CC2 compared to Gallego et al, Nat Neurosci, 2020 might be a sign of a methodological error.
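
The concern about the number of time points entering CCA can be made concrete with a small sketch. It uses a standard QR-plus-SVD formulation of canonical correlations on synthetic, unrelated data; it is not the authors' pipeline, and the function name, window lengths, and dimensionality are assumptions chosen for illustration.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between two latent trajectories.

    X, Y: arrays of shape (n_timepoints, n_dims), assumed to have full
    column rank after mean-centering.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


rng = np.random.default_rng(0)
# Two unrelated 3-D trajectories with only 5 time points each: after centering,
# both column spaces lie in a 4-D space, so they must overlap and at least two
# canonical correlations equal 1 regardless of the data.
few = canonical_correlations(rng.standard_normal((5, 3)), rng.standard_normal((5, 3)))
# The same unrelated signals with many time points give correlations near 0.
many = canonical_correlations(rng.standard_normal((2000, 3)), rng.standard_normal((2000, 3)))
```

With very short windows, high canonical correlations are therefore expected even for independent signals, which is why the number of samples per window matters for interpreting the alignment.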

    4. Reviewer #3 (Public Review):

      Summary:

      In their study, Zhao et al. investigated the population activity of mirror neurons (MNs) in the premotor cortex of monkeys either executing or observing a task consisting of reaching to, grasping, and manipulating various objects. The authors proposed an innovative method for analyzing the population activity of MNs during both execution and observation trials. This method made it possible to isolate the condition-dependent variance in neural data and to study its temporal evolution over the course of single trials. The method proposed by the authors consists of building a time series of "instantaneous" subspaces with single time step resolution, rather than a single subspace spanning the entire task duration. As these subspaces are computed on an instant time basis, projecting neural activity from a given task time into them results in latent trajectories that capture condition-dependent variance while minimizing the condition-independent one. The authors then analyzed the time evolution of these instantaneous subspaces and revealed that a progressive shift is present in subspaces of both execution and observation trials, with slower shifts during the grasping and manipulating phases compared to the initial preparation phase. Finally, they compared the instantaneous subspaces between execution and observation trials and observed that neural population activity did not traverse the same subspaces in these two conditions. However, they showed that these distinct neural representations can be aligned with Canonical Correlation Analysis, indicating dynamic similarities of neural data when executing and observing the task. The authors speculated that such similarities might facilitate the nervous system's ability to recognize actions performed by oneself or another individual.

      Strengths:

      Unlike other areas of the brain, the analysis of neural population dynamics of premotor cortex MNs is not well established. Furthermore, analyzing population activity recorded during non-trivial motor actions, distinct from the commonly used reaching tasks, serves as a valuable contribution to computational neuroscience. This study holds particular significance as it bridges both domains, shedding light on the temporal evolution of the shift in neural states when executing and observing actions. The results are moderately robust, and the proposed analytical method could potentially be used in other neuroscience contexts.

      Weaknesses:<br /> While the overall clarity is satisfactory, the paper falls short in providing a clear description of the mathematical formulas for the different methods used in the study. Moreover, it was not immediately clear why the authors did not consider a (relatively) straightforward metric to quantify the progressive shift of the instantaneous subspaces, such as computing the angle between consecutive subspaces, rather than choosing a (in my opinion) more cumbersome metric based on classification of trajectory segments representing different movements.
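
      The angle-based metric suggested above can be sketched with principal angles between consecutive subspaces. The snippet below is only a minimal illustration under stated assumptions, not the authors' analysis: the function name `principal_angles`, the 50-neuron dimensionality, and the random stand-in bases are all hypothetical.

```python
import numpy as np

def principal_angles(basis_a, basis_b):
    """Principal angles (radians) between the column spaces of two matrices.

    basis_a, basis_b: arrays of shape (n_neurons, n_dims).
    """
    # Orthonormalize each basis so the singular values below are cosines.
    qa, _ = np.linalg.qr(basis_a)
    qb, _ = np.linalg.qr(basis_b)
    cosines = np.linalg.svd(qa.T @ qb, compute_uv=False)
    # Clip to guard against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# Hypothetical stand-in data: 50 neurons, 3-dimensional subspaces.
rng = np.random.default_rng(0)
subspace_t0 = rng.standard_normal((50, 3))
subspace_t1 = subspace_t0 + 0.1 * rng.standard_normal((50, 3))

angles_same = principal_angles(subspace_t0, subspace_t0)
angles_shifted = principal_angles(subspace_t0, subspace_t1)
```

      Applied to every pair of consecutive time steps, such angles would give a direct time course of subspace rotation without training a classifier.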

      Specific comments:<br /> In the methods, it is stated that instantaneous subspaces are found with 3 PCs. Why does it say 2 here? Another doubt on how instantaneous subspaces are computed: in the methods you state that you apply PCA on trial-averaged activity at each 50ms time step. From the next sentence, I gather that you apply PCA on an Nx4 data matrix (N being the number of neurons, and 4 being the trial-averaged activity of the four objects) every 50 ms. Is this right? It would help to explicitly specify the dimensions of the data matrix that goes into PCA computation.
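
      To make the question about matrix dimensions concrete, here is a hedged sketch of PCA applied to an N x 4 matrix at each time step. It reflects this reviewer's reading of the methods, not a confirmed description of the authors' code; `instantaneous_subspace`, the array sizes, and the random data are assumptions. Note that once the mean across four conditions is subtracted, at most three dimensions can carry variance.

```python
import numpy as np

def instantaneous_subspace(rates, n_components=3):
    """PCA basis for a single time step.

    rates: (n_neurons, n_conditions) trial-averaged firing rates.
    Returns an orthonormal basis of shape (n_neurons, n_components) and the
    squared singular values (proportional to the variance along each axis).
    """
    # Remove each neuron's mean across conditions (condition-independent part).
    centered = rates - rates.mean(axis=1, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components], s[:n_components] ** 2

# Hypothetical stand-in data: 10 time steps, 40 neurons, 4 objects.
rng = np.random.default_rng(1)
n_steps, n_neurons, n_conditions = 10, 40, 4
data = rng.standard_normal((n_steps, n_neurons, n_conditions))

# One basis per time step, i.e. a time series of subspaces.
bases = [instantaneous_subspace(data[t])[0] for t in range(n_steps)]

# Requesting a fourth component shows that it carries no variance.
_, variances = instantaneous_subspace(data[0], n_components=4)
```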

      It would help to include some equations in the methods section related to the LSTM decoding. Just to make sure I understood correctly: after having identified the instantaneous subspaces (every 50 ms), you projected the Instruction, Go, Movement, and Holding segments from individual trials (each containing 100 samples, since they are sampled from a 100ms window) onto each instantaneous subspace. So you have four trajectories for each subspace. In the methods, it is stated that a single LSTM classifier is trained for each subspace. Do you also have a separate classifier for each trajectory segment? What is used as input to the classifier? Each trajectory segment should be a 100x3 matrix once projected in an instantaneous subspace. Is that what (each of) the LSTMs take as input? And lastly, what is the LSTM trained to predict exactly? Just a label indicating the type of object that was manipulated in that trial? I apologize if I overlooked any detail, but I believe a clearer explanation of the LSTM, preferably with mathematical formulas, would greatly help readers understand this section.
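
      The input shape described above (a 100 x 3 matrix per segment and subspace) can be checked with a short sketch. This is an assumption about how the projection works, with hypothetical names and random stand-in data; it does not reproduce the authors' LSTM or their preprocessing.

```python
import numpy as np

def project_segment(segment, basis):
    """Project a single-trial segment into one instantaneous subspace.

    segment: (n_samples, n_neurons) activity within a time window.
    basis:   (n_neurons, n_dims) orthonormal basis of the subspace.
    Returns a latent trajectory of shape (n_samples, n_dims).
    """
    return segment @ basis

rng = np.random.default_rng(2)
n_neurons, n_samples, n_dims = 40, 100, 3

# Hypothetical orthonormal basis standing in for one instantaneous subspace.
basis, _ = np.linalg.qr(rng.standard_normal((n_neurons, n_dims)))

# Hypothetical single-trial segment (for example, the Movement window).
segment = rng.standard_normal((n_samples, n_neurons))

trajectory = project_segment(segment, basis)
```

      Under this reading, each classifier would receive sequences shaped like `trajectory`, with the object identity as the target label.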

    1. eLife assessment

      This study presents important findings about synaptic connectivity among subsets of unipolar brush cells (UBCs), a specialized interneuron primarily located in the vestibular lobules of the cerebellar cortex. The evidence supporting the claims is interesting and solid. The work will be of interest to cerebellar neuroscientists as well as those focussed on synaptic properties and mechanisms. Although several compelling pieces of data were presented, some in vivo work remains to be conducted in order to test if the hypothesis and predictions translate into the behaving animal and how it would impact the processing of feedback or feedforward activity that would be required to promote behavior.

    2. Reviewer #1 (Public Review):

      The manuscript by Hariani et al. presents experiments designed to improve our understanding of the connectivity and computational role of Unipolar Brush Cells (UBCs) within the cerebellar cortex, primarily lobes IX and X. The authors develop and cross several genetic lines of mice that express distinct fluorophores in subsets of UBCs, combined with immunocytochemistry that also distinguishes subtypes of UBCs, and they use confocal microscopy and electrophysiology to characterize the electrical and synaptic properties of subsets of so-labelled cells, and their synaptic connectivity within the cerebellar cortex. The authors then generate a computer model to test possible computational functions of such interconnected UBCs.

      Using these approaches, the authors report that:<br /> 1) GRP-driven TDtomato is expressed exclusively in a subset (20%) of ON-UBCs, defined electrophysiologically (excited by mossy fiber afferent stimulation via activation of UBC AMPA and mGluR1 receptors) and immunocytochemically by their expression of mGluR1.

      2) mCitrine, which tags UBCs in Brainbow mouse line P079, is expressed in a similar minority subset of OFF-UBCs, defined electrophysiologically (inhibited by mossy fiber afferent stimulation via activation of UBC mGluR2 receptors) and immunocytochemically by their expression of calretinin. However, such mCitrine expression was also detected in some mGluR1-positive UBCs, which may not have shown up electrophysiologically because of the weaker fluorophore expression without antibody amplification.

      3) Confocal analysis of crossed lines of mice (GRP X P079) stained with antibodies to mGluR1 and calretinin documented the existence of all possible permutations of interconnectivity between cells (ON-ON, ON-OFF, OFF-OFF, OFF-ON), but their overall abundance was low, and neither their absolute nor their relative abundance was quantified.

      4) A computational model (NEURON) indicated that the presence of an intermediary UBC (in a polysynaptic circuit from MF to UBC to UBC) could prolong bursts (MF-ON-ON), prolong pauses (MF-ON-OFF), cause a delayed burst (MF-OFF-OFF), or cause a delayed pause (MF-OFF-ON) relative to MF-to-UBC synapses alone, which would simply exhibit long bursts (MF-ON) or long pauses (MF-OFF).

      The authors thus conclude that the pattern of interconnected UBCs provides an extended and more nuanced pattern of firing within the cerebellar cortex that could mediate longer lasting sensorimotor responses.

      The cerebellum's long-known role in motor skills and reflexes, and associated disorders, combined with our nascent understanding of its role in cognitive, emotional, and appetitive processing, makes understanding its circuitry and processing functions of broad interest to the neuroscience and biomedical community. The focus on UBCs, which are largely restricted to vestibular lobes of the cerebellum, reduces the breadth of likely interest somewhat. The overall design of specific experiments is rigorous and the use of fluorophore-expressing mouse lines is creative. The data presented and the writing are clear.

    3. Reviewer #2 (Public Review):

      In this paper, the authors presented a compelling rationale for investigating the role of UBCs in prolonging and diversifying signals. Based on the two known UBC subtypes, ON and OFF, they have highlighted the existing gaps in understanding UBC connectivity and the need to investigate whether UBCs target UBCs of the same subtype, different subtypes, or both. This knowledge is important for understanding how sensory signals are extended and diversified in the granule cell layer.

      The authors designed very interesting approaches to study UBC connectivity by utilizing transgenic mice expressing GFP and RFP in UBCs, the Brainbow approach, immunohistochemical and electrophysiological analysis, and computational models to understand how the feed-forward circuits of interconnected UBCs transform their inputs.

      This study provided evidence for the existence of distinct ON and OFF UBC subtypes based on their electrophysiological properties, anatomical characteristics, and expression patterns of mGluR1 and calretinin in the cerebellum. The findings support the classification of GRP UBCs as ON UBCs and P079 UBCs as OFF UBCs and suggest the presence of synaptic connections between the ON and OFF UBC subtypes. In addition, they found that GRP and P079 UBCs form parallel and convergent pathways and have different membrane capacitance and excitability. Furthermore, they showed that UBCs of the same subtype provide input to one another and modify the input to granule cells, which could provide a circuit mechanism to diversify and extend the pattern of spiking produced by mossy fiber input. Accordingly, they suggested that these transformations could provide a circuit mechanism for maintaining a sensory representation of movement for seconds.

      Overall, the article is well written, detailed, and very interesting, with an excellent discovery and a well-motivated proposed model.

    1. eLife assessment

      This study provides direct evidence showing that Kv1.8 channels underlie several potassium currents in the two types of sensory hair cells found in the mouse vestibular system. This is an important finding because the nature of the channels underpinning the unusual potassium conductance gK,L in type I hair cells has been under scrutiny for many years. Although most of the experimental evidence is compelling and the analysis is rigorous, the evidence supporting some of the claims related to Kv1.4 channels is incomplete. The study will be of interest to cell and molecular biologists and auditory neuroscientists.

    2. Reviewer #1 (Public Review):

      Summary:<br /> In this paper, the authors provide a thorough demonstration of the role that one particular type of voltage-gated potassium channel, Kv1.8, plays in a low voltage-activated conductance found in type I vestibular hair cells. Along the way, they find that this same channel protein appears to function in type II vestibular hair cells as well, contributing to other macroscopic conductances. Overall, Kv1.8 may provide especially low input resistance and short time constants to facilitate encoding of more rapid head movements in animals that have necks. Combination with other channel proteins, in different ratios, may contribute to the diversified excitability of vestibular hair cells.

      Strengths:<br /> The experiments are comprehensive and clearly described, both in the text and in the figures. Statistical analyses are provided throughout.

      Weaknesses:<br /> None.

    3. Reviewer #2 (Public Review):

      The focus of this manuscript was to investigate whether Kv1.8 channels, which have previously been suggested to be expressed in type I hair cells of the mammalian vestibular system, are responsible for the potassium conductance gK,L. This is an important study because gK,L is known to be crucial for the function of type I hair cells, but the channel identity has been a matter of debate for the past 20 years. The authors have addressed this research topic by primarily investigating the electrophysiological properties of the vestibular hair cells from Kv1.8 knockout mice. Interestingly, gK,L was completely abolished in Kv1.8-deficient mice, in agreement with the hypothesis put forward by the authors based on the literature. The surprising observation was that in the absence of Kv1.8 potassium channels, the outward potassium current in type II hair cells was also largely reduced. Type II hair cells express the largely inactivating potassium conductance gK,A, but not gK,L. The authors concluded that heteromultimerization of non-inactivating Kv1.8 and the inactivating Kv1.4 subunits could be responsible for the inactivating gK,A. Overall, the manuscript is very well written and most of the conclusions are supported by the experimental work. The figures are well described, and the statistical analysis is robust.

      My only comment relates to the statement regarding the results providing "evidence" that Kv1.4 forms heteromultimers with Kv1.8 channels (see Discussion). The only data I can see in the results is that Kv1.4 channels are expressed in the membrane of type II hair cells, which is not sufficient evidence for the above claim. Is the distribution of Kv1.8 and Kv1.4 overlapping in type II hair cells? Have the authors attempted to perform some pharmacological studies on Kv1.4? For example, would gK,A be completely blocked by a Kv1.4 antagonist? Addressing at least some of these questions would strengthen the authors' argument.

    4. Reviewer #3 (Public Review):

      Summary:<br /> This paper by Martin et al. describes the contribution of a Kv channel subunit (Kv1.8, KCNA10) to voltage-dependent K+ conductances and membrane properties of type I and type II hair cells of the mouse utricle. Previous work has documented striking differences in K+ conductances between vestibular hair cell types. In particular amniote type I hair cells are known to express a non-typical low-voltage-activated K+ conductance (GK,L) whose molecular identity has been elusive. K+ conductances in hair cells from 3 different mouse genotypes (wildtype, Kv1.8 homozygous knockouts, and heterozygotes) are examined here and whole-cell patch-clamp recordings indicate a prominent role for Kv1.8 subunits in generating GK,L. Results also interestingly support a role for Kv1.8 subunits in type II hair cell K+ conductances; inactivating conductances in null mice are reduced in type II hair cells from striola and extrastriola regions of the utricle. Kv1.8 is therefore proposed to contribute as a pore-forming subunit for 3 different K+ conductances in vestibular hair cells. The impact of these conductances on membrane responses to current steps is studied in the current clamp. Pharmacological experiments use XE991 to block some residual Kv7-mediated current in both hair cell types, but no other pharmacological blockers are used. In addition, immunostaining data are presented and raise some questions about Kv7 and Kv1.8 channel localization. Overall, the data present compelling evidence that the removal of Kv1.8 produces profound changes in hair cell membrane conductances and sensory capabilities. These changes at hair cell level suggest vestibular function would be compromised and further assessment in terms of balance behavior in the different mice would be interesting.

      Strengths:

      This study provides strong evidence that Kv1.8 subunits are major contributors to the unusual K+ conductance in type I hair cells of the utricle. It also indicates that Kv1.8 subunits are important for type II hair cell K+ conductances because Kv1.8-/- mice lacked an inactivating A conductance and had reduced delayed rectifier conductance compared to controls. A comprehensive and careful analysis of biophysical profiles is presented of expressed K+ conductances in 3 different mouse genotypes. Voltage-dependent K+ currents are rigorously characterized at a range of different ages and their impact on membrane voltage responses to current input is studied. Some pharmacological experiments are performed in addition to immunostaining to bolster the conclusions from the biophysical studies. The paper has a significant impact in showing the role of Kv1.8 in determining utricular hair cell electrophysiological phenotypes.

      Weaknesses:

      1. From previous work it is known that GK,L in type I hair cells has unusual ion permeation and pharmacological properties that differ greatly from type II hair cell conductances. Notably, GK,L is highly permeable to Cs+ as well as K+ ions and is slightly permeable to Na+. It is blocked by 4-aminopyridine and divalent cations (Ba2+, Ca2+, Ni2+), enhanced by external K+, and modulated by cyclic GMP. The question arises: if Kv1.8 is a major player and pore-forming subunit in type I and type II cells (and cochlear inner hair cells, as shown by Dierich et al. 2020), how are subunits modified to produce channels with very different properties? A role for Kv1.4 channels (gA) is proposed in type II hair cells based on previous findings in bird hair cells and immunostaining for Kv1.4 channels in rat utricle presented here in Fig. 6. However, hair cell-specific partner interactions with Kv1.8 that result in GK,L in type I hair cells and Cs+ impermeable, inactivating currents in type II hair cells remain for the most part unexplored.

      2. Data from patch-clamp and immunocytochemistry experiments are not in close alignment. XE991 (Kv7 channel blocker) decreases remaining K+ conductance in type I and type II hair cells from null mice supporting the presence of Kv7 channels in hair cells (Fig. 7). Also, Holt et al. (2007) previously showed inhibition of GK,L in type I hair cells (but not delayed rectifier conductance in type II hair cells) using a dominant negative construct of Kv7.4 channels. However, immunolabelling indicates Kv7.4 channels on the inner face of calyx terminals adjacent to hair cells (Fig. 5). Some reconciliation of these findings is needed.

      3. Strong immunosignal appears in the cuticle plates of hair cells in addition to signal in basal regions of hair cells and supporting cells. Please provide a possible explanation for this.

      4. A previous paper reported that a vestibular evoked potential was abnormal in Kv1.8-/- mice (Lee et al. 2013) as briefly mentioned (lines 94-95). It would be very interesting to know if any vestibular-associated behaviors and/or hearing loss were observed in the mice populations. If responses are compromised at the sensory hair cell level across different zones, degradation of balance function would be anticipated and should be elucidated.

    1. eLife assessment

      This manuscript presents valuable findings on the identification of epigenetically mediated control for the recognition of dihydropyrimidine dehydrogenase (DPYD) gene expression that is linked with cancer treatment resistance using 5-fluorouracil. The evidence is compelling, supported by data from patient-derived specimens and direct assessment of 5-fluorouracil sensitivity, which provides confidence in the proposed mechanisms. The model is additionally supported by genome data from a population with high "compromised allele frequency". This work will interest those studying drug resistance in cancer therapy.

    2. Joint Public Review:

      Zhang et al. present compelling results that support the identification of epigenetically mediated control for the recognition of dihydropyrimidine dehydrogenase (DPYD) gene expression that is linked with cancer treatment resistance to 5-fluorouracil. The experimental approach was developed and pursued with in vitro and in vivo strategies. Combining molecular, cellular, and biochemical approaches, the authors identify a germline variant with compromised enhancer control. Several lines of evidence were presented that are consistent with increased CEBP recruitment to the DPYD regulatory domain with consequential modifications in promoter-enhancer interactions that are associated with compromised 5-fluorouracil resistance. Functional identification of promoter and enhancer elements was validated by CRISPRi and CRISPRa assays. ChIP and qPCR documented histone marks that can account for the control of DPYD gene expression. Consistency with data from patient-derived specimens and direct assessment of 5-fluorouracil sensitivity provides confidence in the proposed mechanisms. The model is additionally supported by genome data from a population with high "compromised allele frequency". It would be informative to directly demonstrate DPYD promoter-enhancer interactions. However, the genetic variants support the integration of regulatory activities.

    1. eLife assessment

      This important study using engineered mouse models provides a first compelling demonstration of a pathogenic phenotype associated with lack of expression of p53AS, an isoform of the p53 protein with a C-terminus different from that of canonical p53. The work also offers correlative evidence that Ackr4, differentially expressed in this mouse model, may be a male-specific prognostic factor in a specific type of B-cell lymphomas. Direct functional evidence testing the links proposed would better support the major findings of the study.

    2. Reviewer #1 (Public Review):

      Summary:<br /> The authors originally investigated the function of p53 isoforms with an alternative C-terminus encoded by the Alternatively Spliced (AS) exon in place of exon 11 encoding the canonical "α" C-terminal domain. For this purpose, the authors create a mouse model with a specific deletion of the AS exon.

      Strengths:<br /> Interestingly, wt or p53ΔAS/ΔAS mouse embryonic fibroblasts did not differ in cell cycle control, expression of well-known p53 target genes, proliferation under hyperoxic conditions, or the growth of tumor xenografts. However, p53-AS isoforms were shown to confer male-specific protection against lymphomagenesis in Eμ-Myc transgenic mice, prone to highly penetrant B-cell lymphomas. In fact, p53ΔAS/ΔAS Eμ-Myc mice were less protected from developing B-cell lymphomas compared to WT counterparts. The important difference that the authors find between WT and p53ΔAS/ΔAS Eμ-Myc males is a higher number of immature B cells in p53ΔAS/ΔAS vs WT mice. Higher expression of Ackr4 and lower expression of Mt2 was found in p53+/+ Eμ-Myc males compared to p53ΔAS/ΔAS counterparts, suggesting that these two transcripts are in part regulators of B-cell lymphomagenesis and enrichment for immature B cells.

      Weaknesses:<br /> The manuscript is interesting but the data are not so striking and are very correlative. The authors should add functional experiments to reinforce their hypotheses and to provide, beyond potential prognostic factors, a potential mechanism underlying the different rates of B-cell lymphomagenesis in male vs female individuals and in WT vs p53ΔAS/ΔAS Eμ-Myc males.

    3. Reviewer #2 (Public Review):

      Summary:<br /> This manuscript provides a detailed analysis of B-cell lymphomagenesis in mice lacking an alternative exon in the region encoding the C-terminal (regulatory) domain of the p53 protein and thus unable to produce the so-called p53AS isoform. This isoform differs from canonical p53 by the replacement of roughly 30 C-terminal residues by about 10 residues encoded by the alternative exon. There is biochemical and biological evidence that p53AS retains strong transcriptional and somewhat enhanced suppressive activities, with mouse models expressing protein constructs similar to p53AS showing signs of increased p53 activity leading to rapid and lethal anemia. However, the precise role of the alternative p53AS variant has not been addressed so far in a mouse model aimed at demonstrating whether the lack of this particular p53 isoform (trp53ΔAS/ΔAS mice) may cause a specific pathological phenotype.

      Results show that lack of AS expression does not noticeably affect p53 transcriptional activity but reveals a subtle pathogenic phenotype, with trp53ΔAS/ΔAS males, but not females, tending to develop B-cell lymphoma more frequently and earlier than WT. The authors then introduced ΔAS in transgenic Eμ-Myc mice, which show accelerated lymphomagenesis. They show that lack of AS caused increased lethality and larger tumor lymph nodes in p53ΔAS Eμ-Myc males compared to their p53WT Eμ-Myc male counterparts, but not in females. Comparative transcriptomics identified a small set of candidate, differentially expressed genes, including Ackr4 (atypical chemokine receptor 4), which was significantly less expressed in the spleens of ΔAS compared to WT controls. Ackr4 encodes a dummy receptor acting as an interceptor for multiple chemokines and thus may negatively regulate a chemokine/cytokine signalling axis involved in lymphomagenesis, which is down-regulated by estrogen signalling. Using in vitro cell models, the authors provide evidence that Ackr4 is a transcriptional target for p53 and that its p53-dependent activation is repressed by 17β-oestradiol. Finally, seeking evidence for the relevance of this gene in human lymphomagenesis, the authors analyse Burkitt lymphoma transcriptomic datasets and show that high ACKR4 expression correlated with better survival in males, but not in females.

      Strengths:<br /> A convincing demonstration of a subtle, gender-specific pathogenic phenotype associated with the lack of p53AS. The characterization of trp53ΔAS/ΔAS is well described and the data presented are convincing. This represents a significant achievement since, as mentioned, in vivo data establishing the relevance of the p53AS isoform remain scarce. Based on this initial observation, the authors provide strong correlative evidence that this particular phenotype is associated with differential expression of Ackr4.

      Weaknesses:<br /> The study does not demonstrate how p53AS may specifically and differentially contribute to the regulation of Ackr4, nor whether restoring Ackr4 expression may nullify the observed phenotype.

    1. eLife assessment

      This study offers valuable insights into the remarkable resistance of tardigrades to ionizing radiation by showing that radiation treatment induces a suite of DNA repair proteins. The authors identify a strongly induced tardigrade-specific DNA-binding protein that can reduce the number of double-strand breaks in human cancer-derived cells. The evidence of upregulation of repair proteins is compelling and the case for a role of the newly identified protein in repair can be strengthened as genetic tools for tardigrades become better developed. The results will be of interest to the fields of DNA repair and radiobiology as well as to tardigrade biologists.

    2. Reviewer #1 (Public Review):

      Summary:<br /> The manuscript "comparative transcriptomics reveal a novel tardigrade specific DNA binding protein induced in response to ionizing radiation" aims to provide insights into the mediators and mechanisms underlying tardigrade radiation tolerance. The authors start by assessing the effect of ionizing radiation (IR) on the tardigrade lab species, H. exemplaris, as well as the ability of this organism to recover from this stress - specifically, they look at DNA double and single-strand breaks. They go on to characterize the response of H. exemplaris and two other tardigrade species to IR at the transcriptomic level. Excitingly, the authors identify a novel gene/protein called TDR1 (tardigrade DNA damage response protein 1). They carefully assess the induction of expression/enrichment of this gene/protein using a combination of transcriptomics and biochemistry - even going so far as to use a translational inhibitor to confirm the de novo production of this protein. TDR1 binds DNA in vitro and co-localizes with DNA in tardigrades.

      Reverse genetics in tardigrades is difficult, thus the authors use a heterologous system (human cells) to express TDR1 in. They find that when transiently expressed TDR1 helps improve human cell resistance to IR.

      This work is a masterclass in integrative biology incorporating a holistic set of approaches spanning next-gen sequencing, organismal biology, biochemistry, and cell biology. I find very little to critique in their experimental approaches.

      Strengths:<br /> 1. Use of trans/interdisciplinary approaches ('omics, molecular biology, biochemistry, organismal biology)<br /> 2. Careful probing of TDR1 expression/enrichment<br /> 3. Identification of a completely novel protein seemingly involved in tardigrade radio-tolerance.<br /> 4. Use of multiple, diverse tardigrade species for 'omics comparison.

      Weaknesses:<br /> 1. No reverse genetics in tardigrades - all insights into TDR1 function from heterologous cell culture system.<br /> 2. Weak discussion of Dsup's role in preventing DNA damage in light of DNA damage levels measured in this manuscript.<br /> 3. Missing sequence data which is essential for making a complete review of the work.

      Overall, I find this to be one of the more compelling papers on tardigrade stress-tolerance I have read. I believe there are points still that the authors should address, but I think the editor would do well to give the authors a chance to address these points as I find this manuscript highly insightful and novel.

    3. Reviewer #3 (Public Review):

      Summary:<br /> This paper describes transcriptomes from three tardigrade species with or without treatment with ionizing radiation (IR). The authors show that IR produces numerous single-strand and double-strand breaks as expected and that these are substantially repaired within 4-8 hours. Treatment with IR induces strong upregulation of transcripts from numerous DNA repair proteins including Dsup, which is specific to the Hypsibioidea superfamily. Transcripts from the newly described protein TDR1, with homologs in both the Hypsibioidea and Macrobiotoidea superfamilies, are also strongly upregulated. They show that TDR1 transcription produces newly translated TDR1 protein, which can bind DNA and co-localizes with DNA in the nucleus. At higher concentrations, TDR1 appears to form aggregates with DNA, which might be relevant to a possible function in DNA damage repair. When introduced into human U2OS cells treated with bleomycin, TDR1 reduces the number of double-strand breaks as detected by gamma-H2A.X spots. This paper will be of interest to the DNA repair field and to radiobiologists.

      Strengths:<br /> The paper is well-written and provides solid evidence of the upregulation of DNA repair enzymes after irradiation of tardigrades, as well as upregulation of the TDR1 protein. The reduction of gamma-H2A.X spots in U2OS cells after expression of TDR1 supports a role in DNA damage.

      Weaknesses:<br /> Genetic tools are still being developed in tardigrades, so there is no mutant phenotype to support a DNA repair function for TDR1, but this may be available soon.

    4. Reviewer #4 (Public Review):

      The manuscript presents convincing results regarding genes involved in the radio-resistance of tardigrades. It is nicely written and the authors used different techniques to study these genes. There are sometimes problems with the structure of the manuscript but these could be easily solved. In my opinion, there are also some points which should be clarified in the results sections. The discussion section is clear but could be more detailed, although some results were actually discussed in the results section. I wish that the authors would go deeper in the comparison with other IR-resistant eukaryotes. Overall, this is a very nice study and of interest to researchers studying molecular mechanisms of ionizing radiation resistance.

      I have two small suggestions regarding the content of the study itself.

      1) I think the study would benefit from the analysis of a gene tree (if feasible) in order to verify if TDR1 is indeed tardigrade-specific.<br /> 2) It would be appreciated to indicate the expression level of the different genes discussed in the study, using, for example, transcripts per million (TPM).
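
      For reference, the standard TPM normalization requested in the second suggestion can be sketched as follows. The function name `tpm` and the example counts and transcript lengths are made up for illustration; they are not values from the manuscript.

```python
import numpy as np

def tpm(counts, lengths_bp):
    """Transcripts per million from read counts and transcript lengths.

    counts:     reads assigned to each transcript.
    lengths_bp: length of each transcript in base pairs.
    """
    counts = np.asarray(counts, dtype=float)
    lengths_kb = np.asarray(lengths_bp, dtype=float) / 1000.0
    # Length-normalize first, then scale so the values sum to one million.
    reads_per_kb = counts / lengths_kb
    return reads_per_kb / reads_per_kb.sum() * 1e6

# Hypothetical example: three transcripts.
example_counts = [100, 200, 700]
example_lengths = [1000, 2000, 1000]
example_tpm = tpm(example_counts, example_lengths)
```

      Because TPM values sum to the same total in every sample, they allow expression levels of the genes discussed to be compared across conditions and species more readily than raw counts.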

    1. eLife assessment

      This study provides a valuable theoretical exploration of non-enzymatic sustained replication of RNA systems, in the parabolic growth regime of the evolution of putative primordial replicators. It provides solid evidence that parabolic growth mitigates the error threshold catastrophe, thus demonstrating another way in which this regime contributes to the maintenance of genetic diversity, although the justification of modeling choices and of parameter values is sometimes incomplete. The findings shed light on relevant evolutionary regimes of primordial replicators, with potential applicability to our understanding of the origin of life.

    2. Reviewer #1 (Public Review):

      Summary: Szathmary and colleagues explore the parabolic growth regime of replicator evolution. Parabolic growth occurs when nucleic acid strand separation is the rate-limiting step of the replication process, which would have been the case for non-enzymatic replication of short oligonucleotides that could precede the emergence of ribozyme polymerases and helicases. The key result is that parabolic replication is conducive to the maintenance of genetic diversity, that is, the coexistence of numerous master sequences (the Gause principle does not apply). Another important finding is that there is no error threshold for parabolic replication except for the extreme case of zero fidelity.
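
      The coexistence property summarized above can be illustrated with a toy replicator model. This is a textbook-style sketch under stated assumptions, not the model analyzed in the manuscript: growth is taken as `k_i * x_i**p` with a constant total population, and the rate constants, step size, and function name `simulate` are hypothetical.

```python
import numpy as np

def simulate(rates, p, dt=0.01, n_steps=20000):
    """Euler integration of dx_i/dt = k_i * x_i**p - x_i * phi.

    phi = sum_j k_j * x_j**p keeps the total frequency at 1.
    p = 0.5 corresponds to parabolic growth, p = 1 to exponential growth.
    """
    k = np.asarray(rates, dtype=float)
    x = np.full(k.shape, 1.0 / k.size)  # start from equal frequencies
    for _ in range(n_steps):
        growth = k * x ** p
        x = x + dt * (growth - x * growth.sum())
        x = np.clip(x, 0.0, None)
        x = x / x.sum()  # remove accumulated rounding drift
    return x

rate_constants = np.array([1.0, 1.5, 2.0])
parabolic = simulate(rate_constants, p=0.5)    # all three types persist
exponential = simulate(rate_constants, p=1.0)  # fastest replicator takes over
```

      In this toy setting the parabolic equilibrium frequencies are proportional to the squared rate constants, so slower replicators persist, whereas exponential growth drives all but the fastest replicator toward extinction.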

      Strengths:<br /> I find both the analytic and the numerical results to be quite convincing and well-described. The results of this work are potentially important because they reveal aspects of a realistic evolutionary scenario for the origin of replicators.

      Weaknesses:

      There are no obvious technical weaknesses. It can be argued that the results represent an incremental advance because many aspects of parabolic replication have been explored previously (the relevant publications are properly cited). Obviously, the work is purely theoretical, and an experimental study of parabolic replication is due. In the opinion of this reviewer, though, these are understandable limitations that do not actually detract from the value of this work.

    3. Reviewer #2 (Public Review):

      Summary:

      A dominant hypothesis concerning the origin of life is that, before the appearance of the first enzymes, RNA replicated non-enzymatically by templating. However, this replication was probably not very efficient, due to the propensity of single strands to bind to each other, thus inhibiting template replication. This phenomenon, known as product inhibition, has been shown to lead to parabolic growth instead of exponential growth. Previous works have shown that this situation limits competition between alternative replicators and therefore promotes RNA population diversity. The present work examines this scenario in a model of RNA replication, taking into account finite population size, mutations, and differences in GC content. The main results are (1) confirmation that parabolic growth promotes diversity, but that when the population size is small enough, sequences least efficient at replicating may nevertheless go extinct; (2) the observation that fitness is not only controlled by the replicability of sequences, but also by their GC content; (3) the observation that parabolic growth attenuates the impact of mutations and, in particular, that the error threshold to which exponentially growing sequences are subject can be exceeded, enabling sequence identity to be maintained at higher mutation rates.
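
For readers less familiar with the terminology, the difference between exponential and parabolic growth can be written out for a single replicator species. This is a generic sketch and not the model used in the manuscript; the symbols x (concentration), k (rate constant), and p (reaction order) are chosen here for illustration only.

```latex
\begin{align}
  \frac{dx}{dt} &= k\,x^{p}, \qquad 0 < p \le 1 \\
  p = 1:\quad   x(t) &= x_0\, e^{k t}
      && \text{(exponential growth)} \\
  p = \tfrac12:\quad \sqrt{x(t)} &= \sqrt{x_0} + \tfrac{k}{2}\, t
      \;\Longrightarrow\; x(t) = \Bigl(\sqrt{x_0} + \tfrac{k}{2}\, t\Bigr)^{2}
      && \text{(parabolic growth)}
\end{align}
```

In the parabolic case the per-capita growth rate, k / sqrt(x), falls as a species becomes more abundant, which gives an intuition for why rarer sequences are not easily outcompeted.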

      Strengths:

      The analyses are sound and the observations are intriguing. Although it has been noted previously that parabolic growth promotes coexistence, its role in mitigating the error threshold catastrophe, which is often presented as a major obstacle to our understanding of the origin of life, had not been examined before.

      Weaknesses:

      Although all the conclusions are interesting, most are not very surprising for people familiar with the literature. As the authors point out, parabolic growth is well known to promote diversity (Szathmary-Gladkih 89) and it has also been noted previously that a form of Darwinian selection can be found at small population sizes (Davis 2000). Given that under parabolic growth, no sequence is ever excluded for infinite populations, it is also not surprising to find that mutations have a less dramatic exclusionary impact.

      A general weakness is the presentation of models and parameters, whose choices often appear arbitrary. Modeling choices that would deserve to be further discussed include the association of the monomers with the strands and the ensuing polymerization, which are combined into a single association/polymerization reaction (see also below), or the choice to restrict to oligomers of length L = 10. Other models, similar to the one employed here, have been proposed that do not make these assumptions, e.g. Rosenberger et al. Self-Assembly of Informational Polymers by Templated Ligation, PRX 2021. To understand how such assumptions affect the results, it would be helpful to present the model from the perspective of existing models.

      The values of the (many) parameters, often very specific, also frequently lack justification. For example, why is the "predefined error factor" ε = 0.2 and not lower or higher? How would that affect the results? Similarly, in equation (11), where does the factor 0.8 come from? Why is the kinetic constant for the duplex decay reaction 1.15 × 10^−8? Are those values related to experiments, or are they chosen because specific behaviors can happen only then?

      The choice of the model and parameters potentially impact the two main results, the attenuation of the error threshold and the role of GC content:

      Regarding the error threshold, it is also noted (lines 379-385) that it disappears when back mutations are taken into account. This suggests that overcoming the error threshold might not be as difficult as suggested, and can be achieved in several ways, which calls into question the importance of the particular role of parabolic growth. Besides, when the concentration of replicators is low, product inhibition may be negligible, such that a "parabolic replicator" is effectively growing exponentially and an error catastrophe may occur. Do the authors think that this consideration could affect their conclusion? Can simulations be performed?

      Regarding the role of the GC content, GC-rich oligomers are found to perform the worst but no rationale is provided. One may assume that it happens because GC-rich sequences take comparatively longer to release the product. However, it is also conceivable that higher GC content may help in the polymerization of the monomers, as the monomers stay attached longer to the template (as described in Eq. (9)). This is an instance where the choice to combine the association and polymerization reactions into a single step, independent of GC content, may be critical. It would be important to show that the result arises from the actual physics and not from this modeling choice.

      Some more specific points that would deserve to be addressed:

      - Line 53: it is said that p "reflects how easily the template-reaction product complex dissociates". This statement is not correct. A reaction order p<1 reflects product inhibition, the propensity of templates to bind to each other, not slow product release. Product release can be limiting, yet a reaction order of 1 can be achieved if substrate concentrations are sufficiently high relative to oligomer concentrations (von Kiedrowski et al., 1991).

      - Population size is a key parameter, and a comparison is made between small (10^3) and large (10^5) populations, but without explaining what determines the scale (small/large relative to what?).

      - In the same vein, we might expect size not to be the only important parameter, but also concentration.

      - Lines 543-546: if I understand correctly, the quantitative result is that the error threshold rises from 0.1 in the exponential case to 0.196 in the parabolic case. Are the authors suggesting that a factor of 2 is a significant difference?

      - Figure 3C: this figure shows no statistically significant effect?

      - line 542: "phase transition-like species extension (Figure 4B)": such a clear threshold is not apparent.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer 1 (Public Review):

      1. The name of the new method "inter-haplotype distance" is more confusing than helpful, as the haplotype information is not critical for implementing this method. First, the mutation spectrum is aggregated genome-wide regardless of the haplotypes where the mutations are found. Second, the only critical haplotype information is that at the focal site (i.e., the locus that is tested for association): individuals are aggregated together when they belong to the same "haplotype group" at the focal site. However, for the classification step, haplotype information is not really necessary: individuals can be grouped based on their genotypes at the given locus (e.g., AA vs AB). As the authors mentioned, this method can be potentially applied to other mutation datasets, where haplotype information may well be unavailable. I hope the authors can reconsider the name and remove the term "haplotype" (perhaps something like "inter-genotype distance"?) to avoid giving the wrong impression that haplotype information is critical for applying this method.

      We appreciate the reviewer's concern about the name of our method. The reviewer is correct that haplotype information is not critical for our method to work, and as a result we've decided to simply rename the approach to "aggregate mutation spectrum distance" (abbreviated AMSD). For simplicity, we refer to the method as IHD throughout our responses to reviewers, but the revised manuscript now refers to AMSD.

      1. The biggest advantage of the IHD method over QTL mapping is alleviation of the multiple testing burden, as one comparison tests for any changes in the mutation spectrum, including simultaneous, small changes in the relative abundance of multiple mutation types. Based on this, the authors claim that IHD is more powerful to detect a mutator allele that affects multiple mutation types. Although logically plausible, it is unclear under what quantitative conditions IHD can actually have greater power over QTL. It will be helpful to support this claim by providing some simulation results.

      This comment prompted us to do a more detailed comparison of IHD vs. QTL power under conditions that are more similar to those observed in the BXD cohort. While preparing the original manuscript, we assumed that IHD might have greater power than QTL mapping in a population like the BXDs because some recombinant inbred lines have accumulated many more germline mutations than others (see Figure 1 in Sasani et al. 2022, Nature). In a quantitative trait locus scan (say, for the fraction of C>A mutations in each line) each BXD's mutation data would be weighted equally, even if a variable number of mutations was used to generate the phenotype point estimate in each line.

      To address this, we performed a new series of simulations in which the average number of mutations per haplotype was allowed to vary. At the low end, some BXDs accumulated as few as 100 total germline mutations, while others have accumulated as many as 2,000. Thus, instead of simulating a mean number of mutations on each simulated haplotype, we allowed the mean number of mutations per haplotype to vary from N to 20N. By simulating a variable count of mutations on each haplotype, we could more easily test the benefits of comparing aggregate, rather than individual, mutation spectra between BXDs.

      In these updated simulations, we find that IHD routinely outperforms QTL mapping under a range of parameter choices (see Author Response image 1). Since IHD aggregates the mutation spectra of all haplotypes with either B or D alleles at each locus in the genome, the method is much less sensitive to individual haplotypes with low mutation counts. We include a mention of these updated simulations on lines 135-138 and describe the updated simulations in greater detail in the Materials and Methods (lines 705-715).

      Author response image 1.

      Power of IHD and QTL mapping on simulated haplotypes with variable counts of mutations. We simulated germline mutations on the specified number of haplotypes (as described in the manuscript) but allowed the total number of mutations per haplotype to vary by a factor of 20.
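
The aggregation step described above can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the function names, the ordering of the six 1-mer mutation types, and the "B"/"D" genotype labels are assumptions made for this example. The point it shows is that per-sample counts are summed before the comparison, so a sample with few mutations contributes proportionally little to the group spectrum.

```python
import math

# Hypothetical ordering of the six 1-mer mutation types (illustration only).
MUTATION_TYPES = ["C>A", "C>T", "C>G", "A>T", "A>C", "A>G"]


def aggregate_spectrum(spectra):
    """Sum per-sample mutation counts element-wise into one group spectrum."""
    return [sum(column) for column in zip(*spectra)]


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for an all-zero spectrum")
    return 1.0 - dot / (norm_a * norm_b)


def spectrum_distance_at_marker(genotypes, spectra):
    """Distance between aggregate spectra of samples with B vs. D genotypes."""
    group_b = [s for g, s in zip(genotypes, spectra) if g == "B"]
    group_d = [s for g, s in zip(genotypes, spectra) if g == "D"]
    if not group_b or not group_d:
        raise ValueError("both genotype groups must contain at least one sample")
    return cosine_distance(aggregate_spectrum(group_b), aggregate_spectrum(group_d))


# Toy data: the two "D" samples carry an excess of C>A mutations.
toy_genotypes = ["B", "B", "D", "D"]
toy_spectra = [
    [10, 40, 5, 5, 5, 20],
    [12, 38, 6, 4, 6, 22],
    [30, 40, 5, 5, 5, 20],
    [28, 42, 4, 6, 5, 19],
]
toy_distance = spectrum_distance_at_marker(toy_genotypes, toy_spectra)
```

In a genome-wide scan this distance would be computed at every marker and compared against a permutation-based null, as the manuscript describes.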

      1. The flip side of this advantage of IHD is that, when a significant association is detected, it is not immediately clear which mutation type is driving the signal. Related to this, it is unclear how the authors reached the point that "...the C>A mutator phenotype associated with the locus on chromosome 6", when they only detected significant IHD signal at rs46276051 (on Chr6), when conditioning on D genotypes at the rs27509845 (on Chr4) and no significant signal for any 1-mer mutation type by traditional mapping. The authors need to explain how they deduced that C>A mutation is the major source of the signal. In addition, beyond C>A mutations, can mutation types other than C>A contribute to the IHD signal at rs46276051? More generally, I hope the authors can provide some guidelines on how to narrow a significant IHD signal to specific candidate mutation type(s) affected, which will make the method more useful to other researchers.

      We thank the reviewer for pointing out this gap in our logic. We omitted specific instructions for narrowing down an IHD signal to specific mutation type(s) for a few reasons. First, this can be addressed using mutational signature analysis methods that are in widespread use. For example, upon identifying one or more candidate mutator loci, we can enter the mutation spectra of samples with each possible mutator genotype into a program (e.g., SigProfilerExtractor) to determine which combinations of mutation types occur proportionally more often in the genomes that harbor mutators (see Figure 3c in our manuscript). A second approach for narrowing down an IHD signal, highlighted in Figure 3a (and now described in the text of the Results section at lines 256-261), is to simply test which mutation type proportion(s) differ significantly between groups of samples with and without a candidate mutator (for example, with a Chi-square test of independence for each mutation type).

      Although this second approach incurs a multiple testing burden, the burden is offset somewhat by using IHD to identify mutator loci, rather than performing association tests for every possible mutation type to begin with. Although Figure 3a only shows the significant difference in C>A fraction among BXDs with different mutator locus genotypes, Figure 3-figure supplement 1 shows the complete set of 1-mer spectrum comparisons. It is possible that this second approach would not prove very useful in the case of a mutator with a “flat” signature (i.e., a mutator that slightly perturbs the rates of many different mutation types), but in our case it clearly shows which mutation type is affected.
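
The second approach (a per-type test of independence) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses an uncorrected Pearson chi-square on a 2x2 table (this mutation type vs. all others, carriers vs. non-carriers), and the helper names are hypothetical. For one degree of freedom the p-value has the closed form erfc(sqrt(chi2 / 2)), so no external library is needed.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column must have a non-zero total")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


def per_type_tests(counts_with, counts_without):
    """Test each mutation type for a difference in proportion between groups.

    Both arguments map mutation type -> aggregate count in that group.
    """
    total_with = sum(counts_with.values())
    total_without = sum(counts_without.values())
    results = {}
    for mutation_type in counts_with:
        a = counts_with[mutation_type]
        c = counts_without[mutation_type]
        results[mutation_type] = chi_square_2x2(a, total_with - a, c, total_without - c)
    return results


# Toy aggregate counts: only the C>A fraction differs appreciably.
with_mutator = {"C>A": 300, "C>T": 400, "C>G": 100, "A>T": 60, "A>C": 50, "A>G": 200}
without_mutator = {"C>A": 150, "C>T": 400, "C>G": 100, "A>T": 60, "A>C": 50, "A>G": 200}
toy_results = per_type_tests(with_mutator, without_mutator)
# Bonferroni threshold across the mutation types tested.
bonferroni_alpha = 0.05 / len(toy_results)
```

Because proportions sum to one, an excess of one mutation type lowers the fractions of all the others, so several types can appear nominally different; the type with the largest statistic is the natural first candidate.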

      1. To account for differential relatedness between the inbred lines, the authors regressed the cosine distance between the two aggregate mutation spectra on the genome-wide genetic similarity and took the residual as the adjusted test metric. What is the value of the slope from this regression? If significantly non-zero, this would support a polygenic architecture of the mutation spectrum phenotype, which could be interesting. If not, is this adjustment really necessary? In addition, is the intercept assumed to be zero for this regression, and does such an assumption matter? I would appreciate seeing a supplemental figure on this regression.

      The reviewer raises a good question. We find that the slope of the "distance vs. genetic similarity" regression is significantly non-zero, though the slope estimate itself is small. A plot of cosine distance vs. genome-wide genetic similarity (using all BXDs) is shown below in Author response image 2:

      Author response image 2.

      Relationship between cosine distance and genetic similarity in the BXDs. As described in the Materials and Methods, we computed two values at each marker in the BXDs: 1) the cosine distance between the aggregate mutation spectra of BXDs with either B or D genotypes at the marker, and 2) the correlation between genome-wide D allele frequencies in BXDs with either B or D genotypes at the marker. We then regressed these two values across all genome-wide markers.

      This result indicates that if two groups of BXDs (one with D genotypes and one with B genotypes at a given locus) are more genetically similar, their mutation spectra are also more similar. Since the regression slope estimate is significantly non-zero (p < 2.2e-16), we believe that it's still worth using residuals as opposed to raw cosine distance values. This result also suggests that there may be a polygenic effect on the mutation spectrum in the BXDs.
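
The adjustment discussed here amounts to an ordinary least-squares fit followed by taking residuals. The sketch below shows that step on toy numbers; it assumes a fitted (non-zero) intercept, which the response does not state explicitly, and the function name is hypothetical.

```python
def ols_residuals(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (slope, intercept, residuals, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, residuals, r_squared


# Toy per-marker values: genetic similarity between the two genotype groups (x)
# and cosine distance between their aggregate spectra (y).
similarity = [0.10, 0.20, 0.30, 0.40, 0.50]
distance = [0.012, 0.011, 0.009, 0.010, 0.007]
slope, intercept, adjusted_distance, r_squared = ols_residuals(similarity, distance)
```

The residuals (`adjusted_distance` here) would then replace the raw cosine distances as the test statistic at each marker.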

      We have also generated a plot showing the cosine distance between the mutation spectra of every possible pair of BXDs, regressed against the genetic similarity between each of those pairs (Author Response image 3). Here, the potential polygenic effects on mutation spectra similarity are perhaps more obvious.

      Author response image 3.

      Pairwise cosine distance between BXD mutation spectra as a function of genetic similarity. We computed two values for every possible pair of n = 117 BXDs: 1) the cosine distance between the samples' individual 1-mer mutation spectra and 2) the correlation coefficient between the samples' genome-wide counts of D alleles.

      Private Comments

      1. It will also be useful to see how the power of IHD and QTL mapping depend on the allele frequency of the mutator allele and the sample size, as mutator alleles are likely rare or semi-rare in natural populations (such as the human de novo mutation dataset that the authors mentioned).

      This is another good suggestion. In general, we'd expect the power of both IHD and QTL mapping to decrease as a function of mutator allele frequency. At the same time, we note that the power of these scans should mostly depend on the absolute number of carriers of the mutator allele and less on its frequency. In the BXD mouse study design, we observe high frequency mutators but also a relatively small sample size of just over 100 individuals. In natural human populations, mutator frequencies might be orders of magnitude smaller, but sample sizes may be orders of magnitude larger, especially as new cohorts of human genomes are routinely being sequenced. So, we expect to have similar power to detect a mutator segregating at, say, 0.5% frequency in a cohort of 20,000 individuals, as we would to detect a mutator segregating at 50% frequency in a dataset of 200 individuals.
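
The arithmetic behind this comparison is simple enough to check directly. The snippet below uses the simplification implied in the response (one genotype per individual, so carriers = cohort size × allele frequency); the function name is hypothetical.

```python
def expected_carrier_count(cohort_size, allele_frequency):
    """Expected number of carriers, assuming one genotype per individual."""
    return cohort_size * allele_frequency


# Rare mutator in a large cohort vs. common mutator in a small cohort.
rare_large = expected_carrier_count(20_000, 0.005)
common_small = expected_carrier_count(200, 0.5)
```

Both scenarios give roughly 100 expected carriers, which is why the two designs are expected to have similar power.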

      To more formally address the reviewer's concern, we performed a series of simulations in which we simulated a population of 100 haplotypes. We assigned the same average number of mutations to each haplotype but allowed the allele frequency of the mutator allele to vary between 0.1, 0.25, and 0.5. The results of these simulations are shown in Author response image 4 and reveal that AMSD tends to have greater power than QTL mapping at lower mutator allele frequencies. We now mention these simulations in the text at lines 135-138 and include the simulation results in Figure 1-figure supplement 4.

      Author response image 4.

      Power of AMSD and QTL mapping on simulated haplotypes with variable marker allele frequencies. We simulated germline mutations on the specified number of haplotypes (as described in the manuscript), but simulated genotypes at the mutator allele such that "A" alleles were at the specified allele frequency.

      1. In the Methods section of "testing for epistasis between the two mutator loci", it will be helpful to explicitly lay out the model and assumptions in mathematical formulae, in addition to the R scripts. For example, are the two loci considered independent when their effects on mutation rate are multiplicative or additive? Given the R scripts provided, it seems that the two loci are assumed to have multiplicative effects on the mutation rate, and that the mutation count follows a Poisson distribution with mean equal to the mutation rate times ADJ_AGE (i.e., the mutation opportunity times the number of generations of an inbred line). However, this is not easily understandable for readers who are not familiar with the R language. In addition, I hope the authors can be more specific when discussing the epistatic interaction between the two loci by explicitly saying "synergistic effects beyond multiplicative effects on the C>A mutation rate".

      The reviewer raises a good point about the clarity of our descriptions of tests for epistasis. We have now added a more detailed description of these tests in the section of the Materials and Methods beginning at line 875. We have also added a statement to the text at lines 289-291: “the combined effects of D genotypes at both loci exceed the sum of marginal effects of D genotypes at either locus alone.” We hope that this will help clarify the results of our tests for statistical epistasis.
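
The model that the reviewer infers from the R scripts can be written compactly. The following is a sketch of that reading, not a transcription of the authors' code: the symbols are chosen here, and the log link and the treatment of ADJ_AGE as an offset are assumptions based on the reviewer's description.

```latex
\begin{align}
  y_i &\sim \mathrm{Poisson}(\lambda_i) \\
  \log \lambda_i &= \log(\mathrm{ADJ\_AGE}_i) + \beta_0
      + \beta_1\, g_{4,i} + \beta_2\, g_{6,i} + \beta_3\, g_{4,i}\, g_{6,i}
\end{align}
% y_i      : count of C>A mutations in inbred line i
% g_{4,i}  : 1 if line i carries the D genotype at the chromosome 4 locus, else 0
% g_{6,i}  : 1 if line i carries the D genotype at the chromosome 6 locus, else 0
% beta_3 = 0 : the two loci act multiplicatively on the mutation rate
% beta_3 > 0 : synergistic effect beyond the multiplicative expectation
```

Under this formulation, a test for epistasis compares the fit with and without the interaction term.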

      Reviewer 2 (Public Review):

      1. The main limitation of the approach is that it is difficult to see how it might be applied beyond the context of mutation accumulation experiments using recombinant inbred lines. This is because the signal it detects, and hence its power, is based on the number of extra accumulated mutations linked to (i.e. on the same chromosome as) the mutator allele. In germline mutation studies of wild populations the number of generations involved (and hence the total number of mutations) is typically small, or else the mutator allele becomes unlinked from the mutations it has caused (due to recombination), or is lost from the population altogether (due to chance or perhaps selection against its deleterious consequences).

      The reviewer is correct that as it currently exists, IHD is mostly limited to applications in recombinant inbred lines (RILs) like the BXDs. This is due to the fact that IHD assumes that each diploid sample harbors one of two possible genotypes at a particular locus and ignores the possibility of heterozygous genotypes for simplicity. In natural, outbreeding populations, this assumption will obviously not hold. However, as we plan to further iterate on and improve the IHD method, we hope that it will be applicable to a wider variety of experimental systems in the future. We have added additional caveats about the applicability of our method to other systems in the text at lines 545-550.

      Private Comments

      1. On p. 8, perhaps I've misunderstood but it's not clear in what way the SVs identified were relevant to the samples used in this dataset - were the founder strains assembled? Is there any chance that additional SVs were present, e.g. de novo early in the accumulation line?

      Our description of this structural variation resource could have been clearer. The referenced SVs were identified in Ferraj et al. (2023) by generating high-quality long read assemblies of inbred laboratory mice. Both DBA/2J and C57BL/6J (the founder strains for the BXD resource) were included in the Ferraj et al. SV callset. We have clarified our description of the callset at lines 247-248.

      It is certainly possible that individual BXD lines have accumulated de novo structural variants during inbreeding. However, these "private" SVs are unlikely to produce a strong IHD association signal (via linkage to one of the ~7,000 markers) at either the chromosome 4 or chromosome 6 locus, since we only tested markers that were at approximately 50% D allele frequency among the BXDs.

      1. On p. 13, comparing the IHD and QTL approaches, regarding the advantage of the former in that it detects the combined effect of multiple k-mer mutation types, would it not be straightforward to aggregate counts for different types in a QTL setting as well?

      The mutation spectrum is a multi-dimensional phenotype (6-dimensional if using the 1-mer spectrum, 96-dimensional if using the 3-mer spectrum, etc.). Most QTL mapping methods use linear models to test for associations between genotypes and a 1-dimensional phenotype (e.g., body weight, litter size). In the past, we used QTL mapping to test for associations between genotypes and a single element of the mutation spectrum (e.g., the rate of C>A mutations), but there isn't a straightforward way to aggregate or collapse the mutation spectrum into a 1-dimensional phenotype that retains the information contained within the full 1-mer or 3-mer spectrum. For that reason, we developed the "aggregate mutation spectrum" approach, as it preserves information about the complete mutation spectrum in each group of strains.

      The reviewer is correct that we could also aggregate counts of different mutation types to, say, perform a QTL scan for the load of a specific mutational signature. For example, we could first perform standard mutational signature analysis on our dataset and then test for QTLs associated with each signature that is discovered. However, this approach would not solve the second problem that our method is designed to solve: the appropriate weighting of samples based on how many mutations they contain.
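
The dimensionalities quoted above (6 for the 1-mer spectrum, 96 for the 3-mer spectrum) follow from enumerating the strand-collapsed substitution types with their flanking bases. The sketch below performs that enumeration; the function name and the string encoding of each mutation type are choices made for this illustration.

```python
from itertools import product

BASES = "ACGT"
# Six substitution classes after collapsing reverse complements.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "A>C", "A>G", "A>T"]


def kmer_mutation_types(k=3):
    """Enumerate mutation types with (k - 1) / 2 flanking bases on each side."""
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    flank = (k - 1) // 2
    types = []
    for substitution in SUBSTITUTIONS:
        ref, alt = substitution.split(">")
        for left in product(BASES, repeat=flank):
            for right in product(BASES, repeat=flank):
                five_prime = "".join(left)
                three_prime = "".join(right)
                types.append(f"{five_prime}{ref}{three_prime}>{five_prime}{alt}{three_prime}")
    return types


one_mer_types = kmer_mutation_types(1)
three_mer_types = kmer_mutation_types(3)
```

Each sample's spectrum is then a vector of counts indexed by these types, which is the multi-dimensional phenotype the response refers to.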

      1. pp. 15-16: In the discussion of how you account for relatedness between strains, I found the second explanation (on p. 16) much clearer. It would be interesting to know how much variance was typically accounted for by this regression?

      As shown in the response to Reviewer 1, genotype similarity between genotype groups (i.e., those with either D or B genotypes at a marker) generally explains a small amount of variance in the cosine distance between those groups (R² ≈ 0.007). However, since the slope term in that regression is significantly non-zero, correcting for this relationship should still improve our power relative to using raw cosine distance values that are slightly confounded by this relationship.

      1. Similarly, in the section on Applying the IHD method to the BXDs (pp. 18-19), I think this description was very useful, and some or all of this description of the experiment (and how the DNMs in it arise) could profitably be moved to the introduction.

      We appreciate the reviewer’s feedback about the details of the BXD cohort. Overall, we feel the description of the BXDs in the Introduction (at lines 65-73) is sufficient to introduce the cohort, though we now add some additional detail about variability in BXD inbreeding duration (at lines 89-93) to the Introduction as well, since it is quite relevant to some of the new simulation results presented in the manuscript.

      1. A really minor one, not sure if this is for the journal or the authors, but it would be much better to include both page and line numbers in any version of an article for review. My pdf had neither!

      We apologize for the lack of page/line numbers in the submitted PDF. We have now added line numbers to the revised version of the manuscript.

      Reviewer 3 (Public Review):

      1. Under simulated scenarios, the authors' new IHD method is not appreciably more powerful than conventional QTL mapping methods. While this does not diminish the rigor or novelty of the authors' findings, it does temper enthusiasm for the IHD method's potential to uncover new mutators in other populations or datasets. Further, adaptation of this methodology to other datasets, including human trios or multigenerational families, will require some modification, which could present a barrier to broader community uptake. Notably, BXD mice are (mostly) inbred, justifying the authors' consideration of just two genotype states at each locus, but this decision prevents out-of-the-box application to outbred populations and human genomic datasets. Lastly, some details of the IHD method are not clearly spelled out in the paper. In particular, it is unclear whether differences in BXD strain relatedness due to the breeding epoch structure are fully accounted for in permutations. The method's name - inter-haplotype distance - is also somewhat misleading, as it seems to imply that de novo mutations are aggregated at the scale of sub-chromosomal haplotype blocks, rather than across the whole genome.

      The reviewer raises very fair concerns. As mentioned in response to a question from Reviewer 1, we performed additional simulation experiments that demonstrate the improved power of IHD (as compared to QTL mapping) in situations where mutation counts are variable across haplotypes or when mutator alleles are present at allele frequencies <50% (see Author response images 1 and 4, as well as new supplements to Figure 1 in the manuscript). However, the reviewer is correct that the IHD method is not applicable to collections of outbred individuals (that is, individuals with both heterozygous and homozygous genotypes), which will limit its current applications to datasets other than recombinant inbred lines. We have added a mention of these limitations to the Results at lines 138-141 and the Discussion at lines 545-550, but plan to iterate on the IHD method and introduce new features that enable its application to other datasets. We have also explicitly stated that we account for breeding epochs in our permutation tests in the Materials and Methods at lines 670-671. Both Reviewer 1 and Reviewer 3 raised concerns about the name of our method, and we have therefore changed “inter-haplotype distance” to “aggregate mutation spectrum distance” throughout the manuscript.

      1. Nominating candidates within the chr6 mutator locus requires an approach for defining a credible interval and excluding/including specific genes within that interval as candidates. Sasani et al. delimit their focal window to 5Mb on either side of the SNP with the most extreme P-value in their IHD scan. This strategy suffers from several weaknesses. First, no justification for using a 10 Mb window, as opposed to, e.g., a 5 Mb window or a window size delimited by a specific threshold of P-value drop, is given, rendering the approach rather ad hoc. Second, within their focal 10Mb window, the authors prioritize genes with annotated functions in DNA repair that harbor protein coding variants between the B6 and D2 founder strains. While the logic for focusing on known DNA repair genes is sensible, this locus also houses an appreciable number of genes that are not functionally annotated, but could, conceivably, perform relevant biological roles. These genes should not be excluded outright, especially if they are expressed in the germline. Further, the vast majority of functional SNPs are non-coding (including the likely causal variant at the chr4 mutator previously identified in the BXD population). Thus, the authors' decision to focus most heavily on coding variants is not well-justified. Sasani et al. dedicate considerable speculation in the manuscript to the likely identity of the causal variant, ultimately favoring the conclusion that the causal variant is a predicted deleterious missense variant in Mbd4. However, using a 5Mb window centered on the peak IHD scan SNP, rather than a 10Mb window, Mbd4 would be excluded. Further, SNP functional prediction accuracy is modest [e.g., PMID 28511696], and exclusion of the missense variant in Ogg1 due to its benign prediction is potentially premature, especially given the wealth of functional data implicating Ogg1 in C>A mutations in house mice. Finally, the DNA repair gene closest to the peak IHD SNP is Rad18, which the authors largely exclude as a candidate.

      We agree that the use of a 10 Mb window, rather than an empirically derived confidence interval, is a bit arbitrary and ad hoc. To address this concern, we have implemented a bootstrap resampling approach (Visscher et al. 1996, Genetics) to define confidence intervals surrounding IHD peaks. We have added a description of the approach to the Materials and Methods at lines 609-622, but a brief description follows. In each of N trials (here, N = 10,000), we take a bootstrap sample of the BXD phenotype and genotype data with replacement. We then perform an IHD scan on the chromosome of interest using the bootstrap sample and record the position of the marker with the largest cosine distance value (i.e., the "peak" marker). After N trials, we calculate the 90% confidence interval of bootstrapped peak marker locations; in other words, we identify the locations of two genotyped markers, between which 90% of all bootstrap trials produced an IHD peak. We note that bootstrap confidence intervals can exhibit poor "coverage" (a measure of how often the confidence intervals include the "true" QTL location) in QTL mapping studies (see Manichaikul et al. 2006, Genetics), but feel that the bootstrap is more reasonable than simply defining an ad hoc interval around an IHD peak.
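
The bootstrap procedure described above can be sketched generically. This is an illustration under stated assumptions rather than the authors' implementation: `scan` stands in for any per-chromosome scan that returns a statistic at each marker, the toy scan below uses a difference in group means instead of a cosine distance, and all names are hypothetical.

```python
import random


def bootstrap_peak_interval(samples, scan, n_trials=1000, coverage=0.90, seed=0):
    """Percentile interval for the position of the peak marker.

    samples: list of per-sample records.
    scan:    function taking a list of samples and returning a list of
             (marker_position, statistic) pairs.
    """
    rng = random.Random(seed)
    peak_positions = []
    for _ in range(n_trials):
        # Resample whole samples with replacement, keeping each sample's
        # genotypes and phenotype together.
        resampled = [rng.choice(samples) for _ in samples]
        position, _ = max(scan(resampled), key=lambda pair: pair[1])
        peak_positions.append(position)
    peak_positions.sort()
    tail = (1.0 - coverage) / 2.0
    lower = peak_positions[int(tail * n_trials)]
    upper = peak_positions[min(n_trials - 1, int((1.0 - tail) * n_trials))]
    return lower, upper


MARKER_POSITIONS = [10, 20, 30, 40, 50]  # toy positions in Mbp


def toy_scan(samples):
    """Absolute difference in mean phenotype between D and B groups per marker."""
    scores = []
    for index, position in enumerate(MARKER_POSITIONS):
        d_values = [pheno for genos, pheno in samples if genos[index] == "D"]
        b_values = [pheno for genos, pheno in samples if genos[index] == "B"]
        if not d_values or not b_values:
            scores.append((position, 0.0))
            continue
        difference = sum(d_values) / len(d_values) - sum(b_values) / len(b_values)
        scores.append((position, abs(difference)))
    return scores


def make_toy_sample(i):
    """Build one sample whose phenotype is driven by the marker at 30 Mbp."""
    genotypes = (
        "D" if i < 10 else "B",
        "D" if i % 3 == 0 else "B",
        "D" if i % 2 == 0 else "B",
        "D" if i % 4 < 2 else "B",
        "D" if i % 5 == 0 else "B",
    )
    phenotype = 1.0 if genotypes[2] == "D" else 0.0
    return genotypes, phenotype


toy_samples = [make_toy_sample(i) for i in range(20)]
toy_interval = bootstrap_peak_interval(toy_samples, toy_scan, n_trials=200)
```

As the response notes, bootstrap intervals of this kind can have imperfect coverage in QTL settings, so the resulting bounds are best read as approximate.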

      The new 90% confidence interval surrounding the IHD peak on chromosome 6 is larger than the original (ad hoc) 10 Mbp window, now extending from around 95 Mbp to 114 Mbp. Notably, the new empirical confidence interval excludes Mbd4. We have accordingly updated our Results and Discussion sections to acknowledge the fact that Mbd4 no longer resides within the confidence interval surrounding the IHD peak on chromosome 6 and have added additional descriptions of genes that are now implicated by the 90% confidence interval. Given the uncertainties associated with using bootstrap confidence intervals, we have retained a brief discussion of the evidence supporting Mbd4 in the Discussion but focus primarily on Ogg1 as the most plausible candidate.

      The reviewer raises a valid concern about our treatment of non-DNA repair genes within the interval surrounding the peak on chromosome 6. We have added more careful language to the text at lines 219-223 to acknowledge the fact that non-annotated genes in the confidence interval surrounding the chromosome 6 peak may play a role in the epistatic interaction we observed.

      The reviewer also raises a reasonable concern about our discussions of both Mbd4 and Ogg1 as candidate genes in the Discussion. Since Mbd4 does not reside within the new empirical bootstrap confidence interval on chromosome 6 and given the strong prior evidence that Ogg1 is involved in C>A mutator phenotypes (and is in the same gene network as Mutyh), we have reframed the Discussion to focus on Ogg1 as the most plausible candidate gene (see lines 357-360).

      Using the GeneNetwork resource, we also more carefully explored the potential effects of noncoding variants on the C>A mutator phenotype we observed on chromosome 6. We have updated the Results at lines 240-246 and the Discussion at lines 439-447 to provide more evidence for regulatory variants that may contribute to the C>A mutator phenotype. Specifically, we discovered several strong-effect cis-eQTLs for Ogg1 across multiple tissues, at which D genotypes are associated with decreased Ogg1 expression. Given new evidence that the original mutator locus we discovered on chromosome 4 harbors an intronic mobile element insertion that significantly affects Mutyh expression (see Ferraj et al. 2023, Cell Genomics), it is certainly possible that the mutator phenotype associated with genotypes on chromosome 6 may also be mediated by regulatory, rather than coding, variation.

      1. Additionally, some claims in the paper are not well-supported by the authors' data. For example, in the Discussion, the authors assert that "multiple mutator alleles have spontaneously arisen during the evolutionary history of inbred laboratory mice" and that "... mutational pressure can cause mutation rates to rise in just a few generations of relaxed selection in captivity". However, these statements are undercut by data in this paper and the authors' prior publication demonstrating that a number of candidate variants are segregating in natural mouse populations. These variants almost certainly did not emerge de novo in laboratory colonies, but were inherited from their wild mouse ancestors. Further, the wild mouse population genomic dataset used by the authors falls far short of comprehensively sampling wild mouse diversity; variants in laboratory populations could derive from unsampled wild populations.

      The reviewer raises a good point. In our previous publication (Sasani et al. 2022, Nature), we hypothesized that Mutyh mutator alleles had arisen in wild, outbreeding populations of Mus musculus, and later became fixed in inbred strains like DBA/2J and C57BL/6J. However, in the current manuscript, we included a statement about mutator alleles "spontaneously arising during the evolutionary history of inbred laboratory mice" to reflect new evidence (from Ferraj et al. 2023, Cell Genomics) that the mutator allele we originally identified in Mutyh may not be wild derived after all. Instead, Ferraj et al. suggest that the C>A mutator phenotype we originally identified is caused by an intronic mobile element insertion (MEI) that is present in DBA/2J and a handful of other inbred laboratory strains. Although this MEI may have originally occurred in a wild population of mice, we wanted to acknowledge the possibility that both the original Mutyh mutator allele, as well as the new mutator allele(s) we discovered in this manuscript, could have arisen during the production and inbreeding of inbred laboratory lines. We have also added language to the Discussion at lines 325-327 to acknowledge that the 67 wild mice we analyzed do not comprise a comprehensive picture of the genetic diversity present in wild-derived samples.

      We have added additional language to the Discussion at lines 349-357 in which we acknowledge that the chromosome 6 mutator allele might have originated in either laboratory or wild mice and elaborate on the possibility that mutator alleles with deleterious fitness consequences may be more likely to persist in inbred laboratory colonies.

      1. Finally, the implications of discovering a mutator whose expression is potentially conditional on the genotype at a second locus are not raised in the Discussion. While not a weakness per se, this omission is perceived to be a missed opportunity to emphasize what, to this reviewer, is one of the most exciting impacts of this work. The potential background dependence of mutator expression could partially shelter it from the action of selection, allowing the allele to persist in populations. This finding bears on theoretical models of mutation rate evolution and may have important implications for efforts to map additional mutator loci. It seems unfortunate to not elevate these points.

      We agree and have added additional discussion of the possibility that the C>A mutator phenotypes in the BXDs are a result of interactions between the expression of two DNA repair genes in the same base-excision network to the Discussion section at lines 447-449.

      Private comments

      1. The criteria used to determine or specify haplotype size are not specified in the manuscript. I mention this above but reiterate here as this was a big point of confusion for me when reading the paper. Haplotype length is an important consideration for overall power and for proper extension of this method to other systems/populations.

      We may not have been clear enough in our description of our method, and as suggested by Reviewer 1, the name "inter-haplotype distance" may also have been a source of confusion. At a given marker, we compute the aggregate mutation spectrum in BXDs with either B or D genotypes using all genome-wide de novo mutations observed in those BXDs. Since the BXDs were inbred for many generations, we expect that almost all de novo germline mutations observed in an RIL are in near-perfect linkage with the informative genotypes used for distance scans. Thus, the "haplotypes" used in the inter-haplotype distance scans are essentially the lengths of entire genomes.

      1. Results, first paragraph, final sentence. I found the language here confusing. I don't understand how one can compute the cosine distance at single markers, as stated. I'm assuming cosine distance is computed from variants residing on haplotypes delimited by some defined window surrounding the focal marker?

      As discussed above, we aggregate all genome-wide de novo mutations in each group of BXDs at a given marker, rather than only considering DNMs within a particular window surrounding the marker. The approach is discussed in greater detail in the caption of Figure 1.
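
      To make the point concrete, here is a minimal Python sketch of the distance computed at a single marker, assuming toy data. The counts, the six mutation-type columns, and the function name are hypothetical and chosen only for illustration; the key feature taken from the response is that each line contributes all of its genome-wide de novo mutations, and lines are grouped by their genotype at the marker.

```python
import numpy as np

# Hypothetical toy data: one row per line, one column per mutation type
# (C>A, C>G, C>T, A>C, A>G, A>T), counted genome-wide for each line.
counts = np.array([
    [12, 3, 20, 4, 10, 3],   # line 1
    [11, 4, 22, 3, 11, 2],   # line 2
    [30, 3, 21, 4,  9, 3],   # line 3
    [28, 4, 19, 5, 10, 4],   # line 4
])

# Genotype of each line at the focal marker ("B" or "D").
genotype_at_marker = np.array(["B", "B", "D", "D"])

def distance_at_marker(counts, genotypes):
    """Cosine distance between the aggregate spectra of the B and D groups."""
    b = counts[genotypes == "B"].sum(axis=0)   # pooled spectrum, B lines
    d = counts[genotypes == "D"].sum(axis=0)   # pooled spectrum, D lines
    cos_sim = b @ d / (np.linalg.norm(b) * np.linalg.norm(d))
    return 1.0 - cos_sim

print(distance_at_marker(counts, genotype_at_marker))
```

      No window around the marker is involved: the marker only decides which group each line's genome-wide counts are added to.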

      1. Nominating candidates for the chr6 locus, Table 1. It would be worth confirming that the three prioritized candidates (Setmar, Ogg1, and Mbd4) all show germline expression.

      Using the Mouse Genome Informatics online resource, we confirmed that all prioritized candidate genes (now including Setmar and Ogg1, but not Mbd4) are expressed in the male and female gonads, and mention this in the Results at lines 228 and 233-234.

      1. Does the chr6 peak on the C>A LOD plot (Figure 2- figure supplement 1) overlap the same peak identified in the IHD scan? And, does this peak rise to significance when using alpha = 0.05? Given that the goal of these QTL scans is to identify loci that interact with the C>A mutator on chr4, it is reasonable to hypothesize that the mutation impact of epistatic loci will also be restricted to C>A mutations. Therefore, I am not fully convinced that the conservative alpha = 0.05/7 threshold is necessary.

      The chromosome 6 peak in Figure 2-figure supplement 1 does, in fact, overlap the peak marker we identified on chromosome 6 using IHD. One reason we decided to use a more conservative alpha of (0.05 / 7) is that we wanted these results to be analogous to the ones we performed in a previous paper (Sasani et al. 2022, Nature), in which we first identified the mutator locus on chromosome 4. However, the C>A peak does not rise to genome-wide significance if we use a less conservative alpha value of 0.05 (see Author response image 5). As discussed in our response to Reviewer 1, we find that QTL mapping is not as powerful as IHD when haplotypes have accumulated variable numbers of germline mutations (as in the BXDs), which likely explains the fact that the peak on chromosome 6 is not genome-wide significant using QTL mapping.

      Author response image 5.

      QTL scan for the fraction of C>A mutations in BXDs harboring D alleles at the locus near Mutyh. The QTL scan was performed at a genome-wide significance alpha of 0.05, rather than 0.05/7.

      1. Is there significant LD between the IHD peaks on chr6 and chr4 across the BXD? If so, it could suggest that the signal is driven by cryptic population structure that is not fully accounted for in the authors' regression-based approach. If not, this point may merit an explicit mention in the text as an additional validation for the authenticity of the chr6 mutator finding.

      This is a good question. We used the scikit-allel Python package to calculate linkage disequilibrium (LD) between all pairs of genotyped markers in the BXD cohort, and found that the two peak loci (on chromosomes 4 and 6) exhibit weak LD (r2 = 4e-5). We have added a mention of this to the main text of the Results at lines 212-213. That being said, we do not think the chromosome 6 mutator association (or the apparent epistasis between the alleles on chromosomes 4 and 6) could be driven by cryptic population structure. Unlike in human GWAS and other association studies in natural populations, there is no heterogeneity in the environmental exposures experienced by different BXD subpopulations. In humans, population structure can create spurious associations (e.g., between height and variants that are in LD and are most common in Northern Europe), but this requires the existence of a phenotypic gradient caused by genetic or environmental heterogeneity that is not likely to exist in the context of inbred laboratory mice that are all the progeny of the same two founder strains.
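
      For readers who want to reproduce this kind of check without scikit-allel, a simple estimate of LD between two markers in fully inbred lines is the squared Pearson correlation of their genotype vectors. The sketch below uses NumPy only; the 0/1 genotype coding and the toy vectors are assumptions for illustration and are not the BXD data.

```python
import numpy as np

def r_squared(g1, g2):
    """Squared Pearson correlation between two genotype vectors (0 = B, 1 = D)."""
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    r = np.corrcoef(g1, g2)[0, 1]
    return r ** 2

# Hypothetical genotypes at two markers across eight inbred lines.
marker_a = [0, 0, 1, 1, 0, 1, 0, 1]
marker_b = [0, 1, 0, 1, 0, 1, 1, 0]

print(r_squared(marker_a, marker_b))  # near 0: the two markers are unlinked in this toy set
print(r_squared(marker_a, marker_a))  # near 1: a marker is in complete LD with itself
```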

      1. Discussion, last sentence of the "Possible causal alleles..." section: I don't understand how the absence of the Mariner-family domain leads the authors to this conclusion. Setmar is involved in NHEJ, which to my knowledge is not a repair process that is expected to have a specific C>A mutation bias. I think this is grounds enough for ruling out its potential contributions, in favor of focusing on other candidates, (e.g., Mbd4 and Ogg1).

      The reviewer raises a good point. Our main reason for mentioning the absence of the Mariner-family domain is that even if NHEJ were responsible for the C>A mutator phenotype, it likely wouldn't be possible for Setmar to participate in NHEJ without the domain. However, the reviewer is correct that NHEJ is not expected to cause a C>A mutation bias, and we have added a mention of this to the text as well at lines 379-382.

      1. Discussion, second to last paragraph of section "Mbd4 may buffer...": The authors speculate that reduced activity of Mbd4 could modulate rates of apoptosis in response to DNA damage. This leads to the prediction that mice with mutator alleles at both Mutyh and Mbd4 should exhibit higher overall mutation rates compared to mice with other genotypes. This possibility could be tested with the authors' data.

      The reviewer raises a good question. As mentioned above, however, we implemented a new approach to calculate confidence intervals surrounding distance peaks and found that this empirical approach (rather than the ad hoc 10-Mbp window approach we used previously) excluded Mbd4 from the confidence interval. Although we still mention Mbd4 as a possible candidate (since it still resides within the 10 Mbp window), we have restructured the Discussion section to focus primarily on the evidence for Ogg1 as a candidate gene on chromosome 6.

      In any case, we do not observe that mice with mutator alleles at both the chromosome 4 and chromosome 6 loci have higher overall mutation rates compared to mice with other genotype combinations. This may not be terribly surprising, however, since C>A mutations only comprise about 10% of all possible mutations. Thus, given the variance in other 1-mer mutation counts, even a substantial increase in the C>A mutation rate might not have a detectable effect on the overall mutation rate. Indeed, in our original paper describing the Mutyh mutator allele (Sasani et al. 2022, Nature), we did not identify any QTL for the overall mutation rate in the BXDs and found that mice with the chromosome 4 mutator allele only exhibited a 1.11X increase in their overall mutation rates relative to mice without the mutator allele.
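
      The arithmetic behind this argument can be checked with a short Python sketch. It assumes, for illustration only, that C>A mutations make up exactly 10% of the baseline spectrum and that a mutator changes the C>A rate while leaving every other mutation type untouched; the function name and the example fold changes are hypothetical.

```python
def overall_fold_change(ca_fraction, ca_fold):
    """Fold change in the total mutation rate when only the C>A rate changes.

    ca_fraction: baseline share of C>A among all mutations (assumed 0.10 here)
    ca_fold:     fold increase in the C>A mutation rate
    """
    return (1.0 - ca_fraction) + ca_fraction * ca_fold

for fold in (1.5, 2.0, 3.0):
    print(fold, overall_fold_change(0.10, fold))
```

      Under these assumptions, even a hypothetical doubling of the C>A rate raises the overall rate by only about 10%, which is the same order as the 1.11X increase reported for the chromosome 4 mutator allele.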

      1. Methods, "Accounting for BXD population structure": An "epoch-aware" permutation strategy is described here, but it is not clear when (and whether) this strategy is used to determine significance of IHD P-values.

      We have added a more explicit mention of this to the Methods section at lines 670-671, as we do, in fact, use the epoch-aware permutation strategy when calculating empirical distance thresholds.
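
      One way such a strategy could look in code is sketched below. This is an assumption-laden illustration rather than the authors' implementation: it takes "epoch-aware" to mean that mutation spectra are shuffled only among lines from the same epoch, and it takes the empirical threshold to be a quantile of the per-permutation maximum distance. The `scan` argument is a hypothetical callable that returns one distance per marker.

```python
import numpy as np

def permute_within_epochs(spectra, epochs, rng):
    """Shuffle rows of `spectra` only among lines that share an epoch label."""
    spectra = np.asarray(spectra)
    epochs = np.asarray(epochs)
    permuted = spectra.copy()
    for e in np.unique(epochs):
        idx = np.flatnonzero(epochs == e)
        # Reassign the spectra of this epoch's lines in random order.
        permuted[idx] = spectra[rng.permutation(idx)]
    return permuted

def empirical_threshold(spectra, genotypes, epochs, scan,
                        n_perm=1000, alpha=0.05, seed=0):
    """(1 - alpha) quantile of the maximum distance across permutations."""
    rng = np.random.default_rng(seed)
    maxima = [
        scan(permute_within_epochs(spectra, epochs, rng), genotypes).max()
        for _ in range(n_perm)
    ]
    return np.quantile(maxima, 1 - alpha)
```

      Shuffling within epochs breaks the link between genotype and mutation spectrum while preserving any differences between epochs, so those differences cannot inflate the null distribution of distances.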

      1. The simulation scheme employed for power calculations is highly specific to the BXD population. This is not a weakness, and perfectly appropriate to the study population used here. However, it does limit the transferability of the power analyses presented in this manuscript to other populations. This limitation may merit an explicit cautionary mention to readers who may aspire to port the IHD method over to their study system.

      This is true. Our simulation strategy is relatively simple and makes a number of assumptions about the simulated population of haplotypes (allele frequencies normally distributed around 0.5, expected rates of each mutation type, etc.). In response to concerns from Reviewer 1, we performed an updated series of simulations in which we varied some of these parameters (mutator allele frequencies, mean numbers of mutations on haplotypes, etc.). However, we have added a mention of the simulation approach's limitations and specificity to the BXDs to the text at lines 545-550.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Author response:

      Reviewer #1:

      The main objective of this study is to achieve the development of a synthetic autotroph using adaptive laboratory evolution. To accomplish this, the authors conducted chemostat cultivation of engineered E. coli strains under xylose-limiting conditions and identified autotrophic growth and the causative mutations. Additionally, the mutational mechanisms underlying these causative mutations were also explored with drill down assays. Overall, the authors demonstrated that only a small number of genetic changes were sufficient (i.e., 3) to construct an autotrophic E. coli when additional heterologous genes were added. While natural autotrophic microorganisms typically exhibit low genetic tractability, numerous studies have focused on constructing synthetic autotrophs using platform microorganisms such as E. coli. Consequently, this research will be of interest to synthetic biologists and systems biologists working on the development of synthetic autotrophic microorganisms. The conclusions of this paper are mostly well supported by appropriate experimental methods and logical reasoning. However, further experimental validation of the mutational mechanisms involving rpoB and crp would enhance readers' understanding and provide clearer insights, despite acknowledgement that these genes impact a broad set of additional genes. Additionally, a similar study, 10.1371/journal.pgen.1001186, where pgi was deleted from the E. coli genome and evolved to reveal an rpoB mutation is relevant to this work and should be placed in the context of the presented findings.

      We thank the reviewer for pointing this study out. It is very interesting that a mutation in a similar region in RpoB was observed in a related context of Pgi loss of activity. We have added a reference to this study in our text (Page 11, line 21).

      The authors addressed rpoB and crp as one unit and performed validation. They cultivated the mutant strain and wild type in a minimal xylose medium with or without formate, comparing their growth and NADH levels. The authors argued that the increased NADH level in the mutant strain might facilitate autotrophic growth. Although these phenotypes appear to be closely related, their relationship cannot be definitively concluded based on the findings presented in this paper alone. Therefore, one recommendation is to explore investigating transcriptomic changes induced by the rpoB and crp mutations. Otherwise, conducting experimental verification to determine whether the NADH level directly causes autotrophic growth would provide further support for the authors' claim.

      We appreciate the valuable comment and agree that the work was lacking such an analysis. Due to various reasons we have opted to use a proteomic approach which we feel fulfills the same purpose as the transcriptomics suggestion. We found interesting evidence in up-regulation of the fdoGH operon (comprising the native formate dehydrogenase O enzyme complex) which could indicate why there is an increase in NADH/NAD+ levels. We also hypothesize that this upregulation might be important more generally by drawing comparisons to natural chemo-autotrophs.

      Further experimental work (which we were not able to include in the current study) could help validate this link by deleting fdoGH and observing a loss of phenotype and, on the flip side, directly overexpressing the fdoGH operon and observing an increase in the NADH/NAD+ ratio. Indeed, if this overexpression were to prove sufficient for achieving an autotrophic phenotype without the mutations in the global transcription regulators, it would be a much more transparent design.

      We have added a section titled "Proteomic analysis reveals up-regulation of rPP cycle and formate-associated genes alongside down-regulation of catabolic genes" to the Results based on this analysis.

      • It would be beneficial to provide a more detailed explanation of the genetic background before the evolution stage, specifically regarding the ∆pfk and ∆zwf mutations. Furthermore, it is suggested to include a figure that provides a comprehensive depiction of the reductive pentose phosphate pathway and the bypass pathway. These will help readers grasp the concept of the "metabolic scaffold" as proposed by the authors.

      We agree with the reviewer that this could be helpful and we added a reference to the original paper Gleizer et al. 2019 that reported this design and also includes the relevant figure. We feel that the figure should not be added to the current manuscript as we continue to show that this design is not relevant in the context of the three reported mutations and such a figure could distract the attention of the reader from the main takeaways of the current study.

      • Despite the essentiality of the rpoB mutation (A1245V) to the autotrophic phenotype in the final strain, the inclusion of this mutation in step C1 does not appear to be justified. According to line 37 on page 3, the authors chose to retain the unintended mutation in rpoB based on its essentiality to the phenotype observed in other evolved strains. However, it should be noted that the mutations found in the evolved strain I, II, and III (P552T or D866E) were entirely different from the unintended mutation (A1245V) during genetic engineering. This aspect should be revised to avoid confusion among readers.

      Thank you for pointing this issue out. We have added a clarification in the text (page 4, line 7) to avoid such confusion, and we believe this point is much clearer now.

      The rpoB mutation which was shown to be essential in the study is indeed known to be common in ALE experiments in E. coli. Thus, I searched the different rpoB mutations in ALEdb in E. coli and I was able to find a similar mutation in a study where pgi was knocked out and then evolved. https://doi.org/10.1371/journal.pgen.1001186 This study seems very relevant given that pgi was a key mutation in the compact set of this work and the section "Modulation of a metabolic branch-point activity increased the concentration of rPP metabolites" informs that loss of function mutations in pgi were also found. The findings of this study should thus be put in the context of the previous related ALE study. I would recommend a similar analysis of crp mutations from studies in ALEdb to see if there are similar mutations in this gene as well or if this is a unique mutation.

      We thank the reviewer for bringing this publication to our attention. We have addressed this observation in the main text (page 11, line 21). We agree that it could have some connection to the pgi mutation, yet we would not want to overspeculate about this role, as we also found the exact same mutation (A1245V) as an adaptation to higher temperature in another E. coli study (Tenaillon et al. 2012). We would like to bring forward the fact that the two reported rpoB mutations are always accompanied by another mutation with pleiotropic effects, either in the transcription factor Crp or in another RNA polymerase subunit (e.g., RpoC). As such, many epistatic effects could occur, one of which we also report here on page 13, line 18. In conclusion, although there could be a connection between the rpoB and pgi mutations, it could be a mere coincidence and the two mutations could exhibit two distinct roles in two distinct phenotypes.

      We would also like to thank the reviewer for suggesting a similar analysis for crp. We found another mutation at a nearby residue with strong adaptive effects and have mentioned it in our main text.

      Can the typical number of mutations found in a given ALE experiment be directly compared to those found in this study? It seems like a retrospective analysis of other ALE studies to show how many mutations typically occur in an ALE study and sets which were found to be causal to reproduce the phenotype of interest (through similar reverse engineering in the starting strain) should be presented. Again, the authors cite ALEdb which should provide direct numbers of mutations found in similar ALE studies with E. coli and one could then examine them to find sets of clearly causal mutations which recreate phenotypes of interest. Such an analysis would go a long way in supporting the main finding of "small number" of mutations.

      Discussion, page 12, line 42. "This could serve as a promising strategy for achieving minimally perturbed genotypes in future metabolic engineering attempts". There is an entire body of work around growth-coupled production which can be predicted and evolved with a genome-scale metabolic model and ALE. Thus, if this statement is going to be made, relevant studies should be cited and placed in context.

      The reviewer raises an important point which could indeed yield an interesting perspective. However, it would be difficult to perform this comparison in practice since many of the studies published on ALEdb have not isolated essential mutations from other mutation incidents nor have they determined the role of each mutation in the reported phenotypes. For example, many ALE trajectories include a hypermutator that greatly increases the number of irrelevant mutations and it is nearly impossible to sieve through them to find an essential set.

      Moreover, it is hard to compare the “level of difficulty” of achieving one phenotype over another. We therefore feel that even though such an analysis would be insightful, it requires an amount of work which is outside the scope of this study.

      Finally, we would like to highlight our iterative approach: isolating the relevant consensus mutations and repeating this process until no further evolution is required. We are not aware of prior studies that used this approach.

      We have now clarified what we mean by "promising strategy" in the discussion in order to avoid any false claims about novelty (page 16, line 32): "Using metabolic growth-coupling as a temporary 'metabolic scaffold' that can be removed, could serve as a promising strategy for achieving minimally perturbed genotypes in future metabolic engineering attempts."

      Reviewer #2:

      Synthetic autotrophy of biotechnologically relevant microorganisms offers exciting chances for CO2 neutral or even CO2 negative production of goods. The authors' lab has recently published an engineered and evolved Escherichia coli strain that can grow on CO2 as its only carbon source. Lab evolution was necessary to achieve growth. Evolved strains displayed tens of mutations, of which likely not all are necessary for the desired phenotype.

      In the present paper the authors identify the mutations that are necessary and sufficient to enable autotrophic growth of engineered E. coli. Three mutations were identified, and their phenotypic role in enhancing growth via the introduced Calvin-Benson-Bassham cycle were characterized. It was demonstrated that these mutations allow autotrophic growth of E. coli with the introduced CBB cycle without any further metabolic intervention. Autotrophic growth is demonstrated by 13C labelling with 13C CO2, measured in proteinogenic amino acids. In Figures 2B and S1, the labeling data are shown, with an interval of the "predicted range under 13CO2".

      Here, the authors should describe how this interval was derived.

      The methodology is clearly described and appropriate.

      The present results will allow other labs to engineer E. coli and other microorganisms further to assimilate CO2 efficiently into biomass and metabolic products. The importance is evident in the opportunity to employ such strain in CO2 based biotech processes for the production of food and feed protein or chemicals, to reduce atmospheric CO2 levels and the consumption of fossil resources.

      Please describe in the methodology how the interval of the predicted range of 13C labeling was derived for Figures 2B and S1. Was it calculated by the dilution factor during 4 generations, or did you predict the label incorporation individually with a metabolic model?

      The text needs careful editing, some sentences are incomplete and there are frequent inconsistencies in writing metabolites and enzymes.

      P2L6: unclear sentence (incomplete?)

      P2L19: pastoris with lower case "p"

      P2L40: incomplete sentence

      P2L42: here, and at many other places, the writing of RuBisCO needs to be aligned. It is an abbreviation and should begin with a capital letter. Most commonly it is written as RuBisCO which I would suggest - please unify throughout the text.

      P3L3: formate dehydrogenase ... metabolites and enzymes with lower case letter. And, no hyphen here.

      P5L4: delete the : after unintentionally

      P6L16: carboxylation of RuBP (it is not CO2 that is carboxylated - if any, CO2 is carboxylating)

      P7L25: phosphoglucoisomerase (lower case)

      P8L5: in line

      P8L9: part of glycolysis/ ...

      P10L4: pentose phosphates (lower case, no hyphen).

      P10L4: all metabolites lower case

      P12L28: incomplete sentence

      P18L4: Escherichia coli in italics

      P18L15: Pseudomonas sp. in italics

      P18L16: ... promoter and with a strong ...

      P20, chapter Metabolomics: put the numbers of 12C and 13C in superscript

      P23L9: pentose phosphates; all metabolites in lower case (as above)

      P23: all 12C and 13C with superscript numbers.

      Response to reviewer #2:

      We thank the reviewer for their comments, and for pointing out the need to clarify how we derived the predicted range of 13C labeling. We edited the text accordingly, and added the relevant calculation to the methods section (under the “13C Isotopic labeling experiment”). We would like to also thank the reviewer for the required text improvements, which were implemented. 

      Reviewer #3:

      The authors previously showed that expressing formate dehydrogenase, rubisco, carbonic anhydrase, and phosphoribulokinase in Escherichia coli, followed by experimental evolution, led to the generation of strains that can metabolise CO2. Using two rounds of experimental evolution, the authors identify mutations in three genes - pgi, rpoB, and crp - that allow cells to metabolise CO2 in their engineered strain background. The authors make a strong case that mutations in pgi are loss-of-function mutations that prevent metabolic efflux from the reductive pentose phosphate autocatalytic cycle. The authors also argue that mutations in crp and rpoB lead to an increase in the NADH/NAD+ ratio, which would increase the concentration of the electron donor for carbon fixation. While this may explain the role of the crp and rpoB mutations, there is good reason to think that the two mutations have independent effects, and that the change in NADH/NAD+ ratio may not be the major reason for their importance in the CO2-metabolising strain.

      We thank the reviewer for their comments and constructive feedback.

      We agree that there is probably a broader effect caused by the rpoB and crp mutations, besides the change in the NADH/NAD+ ratio. Hence, we performed a proteomics analysis, comparing the rpoB and crp mutations on a WT background to an autotrophic E. coli, searching for a mutual change in both strains compared to their "ancestors". We found up-regulation of rPP cycle and formate-associated genes, and down-regulation of catabolic genes. We added a section dedicated to this matter under the title "Proteomic analysis reveals up-regulation of rPP cycle and formate-associated genes alongside down-regulation of catabolic genes".

      Specific comments:

      1. Deleting pgi rather than using a point mutation would allow the authors to more rigorously test whether loss-of-function mutants are being selected for in their experimental evolution pipeline. The same argument applies to crp.

      We appreciate this recommendation and indeed tried to delete pgi, but the genetic manipulation caused a knockout of other genes along with pgi (pepE, rluF, yjbD, lysC) so in the time available to us we cannot confidently determine whether the deletion alone is sufficient and can replace the mutation.

      Regarding crp, we do not think there is a reason to believe the mutation is a loss-of-function. In any case, the proteomics-based characterization of the crp mutation is now included in the SI.

      1. Page 10, lines 10-11, the authors state "Since Crp and RpoB are known to physically interact in the cell (26-28), we address them as one unit, as it is hard to decouple the effect of one from the other". CRP and RpoB are connected, but the authors' description of them is misleading. CRP activates transcription by interacting with RNA polymerase holoenzyme, of which the Beta subunit (encoded by rpoB) is a part. The specific interaction of CRP is with a different RNA polymerase subunit. The functions of CRP and RpoB, while both related to transcription, are otherwise very different. The mutations in crp and rpoB are unlikely to be directly functionally connected. Hence, they should be considered separately.

      Indeed, the fact that the proteins are interacting in the cell does not necessarily mean that the mutations are functionally connected. We therefore added as further justification in the new section:

      "As far as we know, the mutations in the Crp and RpoB genes affect the binding of the RNA polymerase complex to DNA and/or its transcription rates. Depending on the transcribed gene target, the effect of the two mutations might be additive, antagonistic, or synergistic. Since each one of these mutations individually (in combination with the pgi mutation) is not sufficient to achieve autotrophic growth, it is reasonable to assume that only the target genes whose levels of expression change significantly in the double-mutant are the ones relevant for the autotrophic phenotype”.

      In our proteomics analysis we considered each mutation separately. We found that in some cases the two mutations together have an additive effect, but in other cases the two mutations together affect the proteome differently from each mutation alone. Since both mutations are essential to the phenotype, we decided to address the two mutations as one unit for the physiological and metabolic experiments.

      1. A Beta-galactosidase assay would provide a very simple test of CRP H22N activity. There are also simple in vivo and in vitro assays for transcription activation (two different modes of activation) and DNA-binding. H22 is not near the DNA-binding domain, but may impact overall protein structure.

      The mutation is located in “Activating Region 2”, which interacts with RNA polymerase. We tried an in vivo assay to determine the CRP H22N activity and got inconclusive results. We believe the proteomics analysis serves as a good method for understanding the global effect of the mutation.

      1. There are many high-resolution structures of both CRP and RpoB (in the context of RNA polymerase). The authors should compare the position of the sites of mutation of these proteins to known functional regions, assuming H22N is not a loss-of-function mutation in crp.

      We added a supplementary figure regarding the structural location of the two mutations, where it is demonstrated that crp H22N is located in a region interacting with the RNA polymerase and rpoB A1245V is located in proximity to regions interacting with the DNA.

      1. RNA-seq would provide a simple assay for the effects of the crp and rpoB mutations. While the precise effect of the rpoB mutation on RNA polymerase function may be hard to discern, the overall impact on gene expression would likely be informative.

      We agree that an omics approach to infer the global effect of these mutations is beneficial. We opted for a proteomics approach and think it serves the purpose of clarifying the final, downstream effect on the cell.

      1. Page 2, lines 40-45, the authors should more clearly explain that the deletion of pfkA, pfkB and zwf was part of the experimental evolution strategy in their earlier work (Gleizer et al., 2019), and not a new strategy in the current study.

      We thank you for pointing this out, and edited the text accordingly.

      1. Page 3, line 27. Why did the authors compare the newly acquired mutants to only two mutants from the earlier work, not all 6?

      The 6 clones that were isolated in Gleizer et al., had 2 distinct mutation profiles. During the isolation process the lineage split into two groups. Three out of the 6 clones (clones 1,2,6) came from the same ancestor, and the other three (clones 3,4,5) came from another ancestor. Hence, these two groups shared almost all of their mutations (see Venn diagram). We decided to use for our comparison the representative with the highest number of mutations from each group (clones 5 and 6).

      Author response image 1.

    2. eLife assessment

      This is an important follow-up study to a previous paper in which the authors reconstituted CO2 metabolism (autotrophy) in Escherichia coli. Here, the authors define a set of just three mutations that promote autotrophy, highlighting the malleability of E. coli metabolism. The authors make a convincing case that mutations in pgi are loss-of-function mutations that prevent metabolic efflux from the reductive pentose phosphate autocatalytic cycle, and their data suggest possible roles of mutations in two other genes - crp and rpoB. This research will be particularly interesting to synthetic biologists, systems biologists, and metabolic engineers aiming to develop synthetic autotrophic microorganisms.

    3. Joint Public Review:

      The authors previously showed that expressing formate dehydrogenase, rubisco, carbonic anhydrase, and phosphoribulokinase in Escherichia coli, followed by experimental evolution, led to the generation of strains that can metabolise CO2. Using two rounds of experimental evolution, the authors identify mutations in three genes - pgi, rpoB, and crp - that allow cells to metabolise CO2 in their engineered strain background. The authors make a strong case that mutations in pgi are loss-of-function mutations that prevent metabolic efflux from the reductive pentose phosphate autocatalytic cycle. The authors also use proteomic analysis to probe the role of the mutations in crp and rpoB. While they do not reach strong conclusions about how these mutations promote autotrophic growth, they provide some clues, leading to valuable speculation.

      Comments on revised version:

      The authors have thoroughly addressed the reviewers' comments. The major addition to the paper is the proteomic analysis of single and double mutants of crp and rpoB. These new data provide clues as to the role of the crp and rpoB mutations in promoting autotrophic growth, which the authors discuss. The authors acknowledge that it will require additional experiments to determine whether the speculated mechanisms are correct. Nonetheless, the new data provide valuable new insight into the role of the crp and rpoB mutations. The authors have also expanded their description of the crp and rpoB mutations, making it clearer that the effects of these mutations are likely to be distinct, albeit with potential for overlap in function.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Continuous attractor networks endowed with some sort of adaptation in the dynamics, whether that be through synaptic depression or firing rate adaptation, are fast becoming the leading candidate models to explain many aspects of hippocampal place cell dynamics, from hippocampal replay during immobility to theta sequences during run. Here, the authors show that a continuous attractor network endowed with spike frequency adaptation and subject to feedforward external inputs is able to account for several previously unaccounted aspects of theta sequences, including (1) sequences that move both forwards and backwards, (2) sequences that alternate between two arms of a T-maze, (3) speed modulation of place cell firing frequency, and (4) the persistence of phase information across hippocampal inactivations. I think the main result of the paper (findings (1) and (2)) are likely to be of interest to the hippocampal community, as well as to the wider community interested in mechanisms of neural sequences. In addition, the manuscript is generally well written, and the analytics are impressive. However, several issues should be addressed, which I outline below.

      Major comments:

      1. In real data, population firing rate is strongly modulated by theta (i.e., cells collectively prefer a certain phase of theta - see review paper Buzsaki, 2002) and largely oscillates at theta frequency during run. With respect to this cyclical firing rate, theta sweeps resemble "Nike" check marks, with the sweep backwards preceding the sweep forwards within each cycle before the activity is quenched at the end of the cycle. I am concerned that (1) the summed population firing rate of the model does not oscillate at theta frequency, and (2) as the authors state, the oscillatory tracking state must begin with a forward sweep. With regards to (1), can the authors show theta phase spike preference plots for the population to see if they match data? With regards to (2), can the authors show what happens if the bump is made to sweep backwards first, as it appears to do within each cycle?

      Thank you for raising these two important points. As the reviewer mentioned, experimental data do show that the population activity (e.g., calculated from the multiunit activity of tetrode recordings) is strongly modulated by theta. While we mainly focused on sweeps of bump position, the population activity also shows cyclical firing at the theta frequency (we added Fig. S7 to reflect this). This is also reflected in Fig. 4d, where the bump height (representing the overall activity) oscillates within individual theta cycles. The underlying mechanism of cyclical population activity is as follows: the bump height is determined by the amount of input received by the neuron located at the center of the bump. As the activity bump sweeps away from the external input, the center neuron receives less external input, and hence the bump height is smaller. Therefore, not only does the bump position sweep around the external input, but the population activity also oscillates accordingly at the same frequency.

      For the “Nike” check marks: we first clarify that the reason we observed a forward sweep preceding a backward sweep is that we always force the artificial animal to run from left to right on the track, where we treated “right” as “forward”. At the beginning of the simulation, the external input to the network moves towards the right, and therefore the activity bump starts from a position behind the animal and sweeps towards the right (forward). In general, this means that the bump will never do a backward sweep first in our model. However, this does not mean that the forward sweeps precede the backward sweeps in each theta cycle. Experimentally, to determine the “0” phase of theta cycles, the LFP signal in CA1 was first bandpass filtered and then Hilbert transformed to get the phase at each time point. Then, a phase histogram of multiunit activity in CA1 was calculated across locomotor periods; the phase of maximal CA1 firing on the histogram was then defined to be the “0” phase. Since we didn’t model LFP oscillations in the attractor model, we cannot obtain a “0” phase reference as in the experimental procedure. Instead, we define the “0” phase using the “population activity quenched time”, where phase “0” is defined as the minimum population activity during oscillation cycles, which happens when the activity bump is farthest from the animal position. In this way, we observed a “Nike” pattern where the activity bump begins with a backward sweep towards the external input, followed by a forward sweep. This is shown in Fig. 3b in the main text.
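The model-side phase convention described in this response (phase “0” at the minimum of population activity in each oscillation cycle) can be sketched in a few lines. This is only an illustrative sketch: the function names, the synthetic 10 Hz trace, and the linear interpolation of phase between consecutive minima are assumptions made for the example, not the authors' analysis code.

```python
import math

def find_local_minima(trace):
    """Indices where the trace dips below its left neighbor and not above its right one."""
    return [i for i in range(1, len(trace) - 1)
            if trace[i] < trace[i - 1] and trace[i] <= trace[i + 1]]

def phase_from_minima(trace):
    """Assign a phase in [0, 2*pi) to every sample lying between two consecutive minima.

    Phase 0 is placed at each activity minimum (the "quenched" time);
    samples before the first or after the last minimum get None.
    """
    minima = find_local_minima(trace)
    phase = [None] * len(trace)
    for start, end in zip(minima, minima[1:]):
        for i in range(start, end):
            phase[i] = 2 * math.pi * (i - start) / (end - start)
    return minima, phase

# Hypothetical population activity oscillating at 10 Hz, sampled at 1 kHz for 0.5 s.
fs = 1000
activity = [1.0 + 0.5 * math.cos(2 * math.pi * 10.0 * k / fs) for k in range(500)]
minima, phase = phase_from_minima(activity)
```

With this convention, sweep direction can then be read off as a function of phase within each cycle, independently of which sweep happened first at simulation onset.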

      1. I could not find the width of the external input mentioned anywhere in the text or in the table of parameters. The implication is that it is unclear to me whether, during the oscillatory tracking state, the external input is large compared to the size of the bump, so that the bump lives within a window circumscribed by the external input and so bounces off the interior walls of the input during the oscillatory tracking phase, or whether the bump is continuously pulled back and forth by the external input, in which case it could be comparable to the size of the bump. My guess based on Fig 2c is that it is the latter. Please clarify and comment.

      Thank you for your comment. We added the width of the external input to the text and table (see Table 1). The bump is continuously pulled back and forth by the external input, as guessed by the reviewer. Experimentally, theta sweeps live roughly within a window of the place field size. This is also true in our model, where the theta sweep length depends on the strength of the recurrent connections, which determines the place field size. However, it also depends on the adaptation strength, where larger adaptation (more intrinsic mobility) leads to a larger sweep length. We presume that the reviewer raised the possibility that the bump may live within a window bounded by the external input because we also set the width of the external input to be comparable to the place field size (in fact, we don’t know how wide the external location input to the hippocampal circuits is in the biological brain, but it might be reasonable to set the external input width as comparable to the place field size; otherwise the location information conveyed to the hippocampus might be too dispersed). We added a plot in the SI (see Fig. S1) to show that when choosing a smaller external input width, but increasing the adaptation strength, the activity bump lives in a window exceeding the external input.

      We clarified this point by adding the following text to line 159

      “... It is noteworthy that the activity bump does not live within a window circumscribed by the external input bump (bouncing off the interior walls of the input during the oscillatory tracking state), but instead is continuously pulled back and forth by the external input (see Fig. S1)...”

      1. I would argue that the "constant cycling" of theta sweeps down the arms of a T-maze was roughly predicted by Romani & Tsodyks, 2015, Figure 7. While their cycling spans several theta cycles, it nonetheless alternates by a similar mechanism, in that adaptation (in this case synaptic depression) prevents the subsequent sweep of activity from taking the same arm as the previous sweep. I believe the authors should cite this model in this context and consider the fact that both synaptic depression and spike frequency adaptation are both possible mechanisms for this phenomenon. But I certainly give the authors credit for showing how this constant cycling can occur across individual theta cycles.

      Thank you for raising this point. We added a citation of Romani & Tsodyks’ model in this context (line 304). As the reviewer pointed out, STD can also act as a potential mechanism for this phenomenon. We also gave the Romani & Tsodyks model credit for showing how this “cycling spanning several theta cycles” can account for slow (~1 Hz) and deliberative behaviors, namely head scanning (Johnson and Redish, 2007). We commented on this in line 302:

      “... As the external input approaches the choice point, the network bump starts to sweep onto left and right arms alternatively in successive theta cycles (Fig. 5b and video 4; see also Romani and Tsodyks (2015) for a similar model of cyclical sweeps spanning several theta cycles) ...”

      1. The authors make an unsubstantiated claim in the paragraph beginning with line 413 that the Tsodyks and Romani (2015) model could not account for forwards and backwards sweeps. Both the firing rate adaptation and synaptic depression are symmetry breaking models that should in theory be able to push sweeps of activity in both directions, so it is far from obvious to me that both forward and backward sweeps are not possible in the Tsodyks and Romani model. The authors should either prove that this is the case (with theory or simulation) or excise this statement from the manuscript.

      Thank you for your comment. Our claim about the Tsodyks and Romani (2015) model's inability to account for both forward and backward sweeps was inappropriate. We made this claim based on our own implementation of the Tsodyks and Romani (2015) model, in which we did not find a parameter region where the bump oscillation shows both forward and backward sweeps. This might be due to the limited parameter range we searched. Additionally, we note some differences between the two models: the Romani & Tsodyks model has an external theta input to the attractor network, which prevents the bump from moving further. This termination may also prevent the activity bump from moving backward. We didn’t consider an external theta input in our model, and the bump oscillation is based on internal dynamics. We have deleted that claim from line 424 in the revised paper, and revised that portion of the manuscript by adding the following text to line 424:

      “…Different from these two models, our model considers firing rate adaptation to implement symmetry breaking and hence generates activity propagation. To prevent the activity bump from spreading away, their model considers an external theta input to reset the bump location at the end of each theta cycle, whereas our model generates an internal oscillatory state, where the activity bump travels back due to the attraction of external location input once it spreads too far away. Moreover, theoretical analysis of our model reveals how the adaptation strength affect the direction of theta sweeps, as well as offers a more detailed understanding of theta cycling in complex environments…”

      1. The section on the speed dependence of theta (starting with line 327) was very hard to understand. Can the authors show a more graphical explanation of the phenomenon? Perhaps a version of Fig 2f for slow and fast speeds, and point out that cells in the latter case fire with higher frequency than in the former?

      Thank you for raising this valuable point. There are two different frequencies shown in Fig. 6a, c & d. One is the bump oscillation frequency; the other is the firing frequency of a single cell. To aid understanding, we included experimental results (from Geisler et al, 2007) in Fig. 6a. They show that when the animal increases its running speed, the LFP theta frequency increases only slightly (compare the blue curve and the green curve), while the single-cell firing rate oscillation frequency increases more. In our model, we first demonstrated this result using unimodal cells, which show only significant phase precession (Fig. 6c). As the animal runs through the firing field of a place cell, the firing phase always precesses by half a cycle in total. Therefore, a faster running speed means that the half cycle is completed faster, and hence the single-cell oscillation frequency is higher. We also predicted the results for bimodal cells (Fig. 6d). To make this point clearer, we modified Fig. 6 by including experimental results, and rewrote the paragraph as follows (line 337):

      “…As we see from Fig. 3d and Fig. 4a&b, when the animal runs through the firing field of a place cell, its firing rate oscillates, since the activity bump sweeps around the firing field center of the cell. Therefore, the firing frequency of a place cell has a baseline theta frequency, which is the same as the bump oscillation frequency. Furthermore, due to phase precession, there will be a half cycle more than the baseline theta cycles as the animal runs over the firing field, and hence single cell oscillatory frequency will be higher than the baseline theta frequency (Fig. 6c). The faster the animal runs, the faster the extra half cycle is accomplished. Consequently, the firing frequency of single cells will increase more (a steeper slope in Fig. 6c red dots) than the baseline frequency.…”
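The half-cycle argument in the paragraph above can be turned into a short worked calculation. The sketch below assumes, for illustration only, that the baseline theta frequency is constant across running speeds and that a unimodal cell gains exactly half a cycle per field traversal; the function name, units, and numerical values are hypothetical.

```python
def single_cell_frequency(theta_hz, speed, field_width):
    """Oscillation frequency of a place cell gaining half a cycle per field traversal.

    theta_hz:    baseline (bump oscillation) frequency in Hz
    speed:       running speed, e.g. in cm/s
    field_width: firing field width in the same length unit, e.g. cm
    """
    traversal_time = field_width / speed    # seconds spent inside the firing field
    return theta_hz + 0.5 / traversal_time  # baseline cycles plus the extra half cycle, per second

# Hypothetical values: 8 Hz baseline theta, 40 cm field, two running speeds.
slow = single_cell_frequency(theta_hz=8.0, speed=20.0, field_width=40.0)
fast = single_cell_frequency(theta_hz=8.0, speed=40.0, field_width=40.0)
```

Under these assumptions the single-cell frequency exceeds the baseline by 0.5 × speed / field width, so it rises linearly with running speed even when the baseline itself does not change.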

      1. I had a hard time understanding how the Zugaro et al., (2005) hippocampal inactivation experiment was accounted for by the model. My intuition is that while the bump position is determined partially by the location of the external input, it is also determined by the immediate history of the bump dynamics as computed via the local dynamics within the hippocampus (recurrent dynamics and spike rate adaptation). So that if the hippocampus is inactivated for an arbitrary length of time, there is nothing to keep track of where the bump should be when the activity comes back online. Can the authors please explain more how the model accounts for this?

      Thank you for the comments. The easiest way to understand how the model accounts for the experimental result from Zugaro et al., (2005) is from Eq. 8:

      This equation says that the firing phase of a place cell is determined by the time the animal traveled through the place field, i.e., the location of the animal in the place field (with d0,c0 and vext all constant, and tf the only variable). No matter how long the hippocampus is inactivated (for an arbitrary length of time), once the external input is on, the new phase will continue from the new location of the animal in the place field. In other words, the peak firing phase keeps tracking the location of the animal. To make this point clearer, we modified Fig. 6 by including experimental results from Zugaro et al., (2005), and updated the description from line 356:

      “…Based on the theoretical analysis (Eq. 8), we see that the firing phase is determined by the location of the animal in the place field, i.e., vext tf. This means that the firing phase keeps tracking the animal's physical location. No matter how long the network is inactivated, the new firing phase will only be determined by the new location of the animal in the place field. Therefore, the firing phase in the first bump oscillation cycle after the network perturbation is more advanced than the firing phase in the last bump oscillation cycle right before the perturbation, and the amount of precession is similar to that in the case without perturbation (Fig. 6e) …”
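The logic of this argument (phase is a function of position in the field only, so the duration of the inactivation does not matter by itself) can be illustrated with a toy mapping. The linear phase-position relation below is a hypothetical stand-in and is not Eq. 8 of the manuscript; the entry phase of π and all numerical values are arbitrary assumptions, and only the half cycle of total precession is taken from the response.

```python
import math

def firing_phase(position_in_field, field_width, entry_phase=math.pi):
    """Hypothetical linear map: the phase precesses by half a cycle across the field."""
    fraction = position_in_field / field_width
    return entry_phase - math.pi * fraction

def phase_after_inactivation(pos_before, speed, off_duration, field_width):
    """Phase once the input returns: it depends only on where the animal is now."""
    pos_after = pos_before + speed * off_duration
    return firing_phase(pos_after, field_width)

before = firing_phase(10.0, 40.0)
# Two different inactivation durations that leave the animal at the same position (15 cm).
after_short = phase_after_inactivation(10.0, speed=20.0, off_duration=0.25, field_width=40.0)
after_long = phase_after_inactivation(10.0, speed=10.0, off_duration=0.5, field_width=40.0)
```

In this toy version, the phase after the perturbation is earlier than the phase before it, and it is identical for the two inactivation durations because the animal ends up at the same place.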

      1. Can the authors comment on why the sweep lengths oscillate in the bottom panel of Fig 5b during starting at time 0.5 seconds before crossing the choice point of the T-maze? Is this oscillation in sweep length another prediction of the model? If so, it should definitely be remarked upon and included in the discussion section.

      We appreciate the reviewer’s close attention to this phenomenon. We initially thought it was a simulation artifact due to the parameter setting. However, we found that this phenomenon is quite robust to different parameter settings. While we haven’t found a theoretical explanation, we provide a qualitative one: the length oscillation frequency may be coupled with the time constant of the firing rate adaptation. Specifically, after a longer sweep, the neurons at the end of the sweep are adapted (inhibited), and hence the activity bump cannot travel as far in the next round. Therefore, the sweep length is shorter compared to the previous one. In the following round, the bump will sweep longer again because those neurons have recovered from the previous adaptation effect. We think this length oscillation is quite interesting and will check for it in experimental data in future work. We added this point in the main text as a prediction in line 321:

      “…We also note that there is a cyclical effect in the sweep lengths across oscillation cycles before the animal enters the left or right arm (see Fig. 5b lower panel), which may be interesting to check in the experimental data in future work (see Discussion for more details) …”

      And line 466:

      “…Our model of the T-maze environment showed an expected phenomenon that as the animal runs towards the decision point, the theta sweep length also shows cyclical patterns (Fig. 5b lower panel). An intuitive explanation is that, due to the slow dynamics in firing rate adaptation (with a large time constant compared to neural firing), a long sweep leads to an adaptation effect on the neurons at the end of the sweep path. Consequently, the activity bump cannot travel as far due to the adaptation effect on those neurons, resulting in a shorter sweep length compared to the previous one. In the next round, the activity bump exhibits a longer sweep again because those neurons have recovered from the previous adaptation effect. We plan to test this phenomenon in future experiments...”

      1. Perhaps I missed this, but I'm curious whether the authors have considered what factors might modulate the adaptation strength. In particular, might rat speed modulate adaptation strength? If so, would have interesting predictions for theta sequences at low vs high speeds.

      Thank you for raising this important point. As we pointed out in line 279: “…the experimental data (Fernandez et al, 2017) has indicated that there is a laminar difference between unimodal cells and bimodal cells, with bimodal cells correlating more with the firing patterns of deep CA1 neurons and unimodal cells with the firing patterns of superficial CA1 neurons. Our model suggests that this difference may come from the different adaptation strengths in the two layers…”. Our guess is that the adaptation strength might reflect some physiological differences between place cells in different pyramidal layers of the hippocampus. For example, place cells in the superficial layer and the deep layer receive different amounts of input from MEC and sensory cortex, and such a difference may contribute to different adaptation effects in the two populations of place cells.

      Our intuition is that the animal’s running speed may not directly modulate the adaptation strength. Note that the effect of adaptation and the adaptation strength are different. As the animal runs rapidly across the firing field, the place cell fires densely (in time), and therefore the adaptation effect is large; as the animal runs slowly across the field, the place cell fires sparsely (in time), and hence the adaptation effect is small. In these two situations, the adaptation strength is fixed, and the difference is due to the spike intervals.
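The distinction between adaptation strength and adaptation effect can be made concrete with a toy spike-triggered adaptation variable. This is a sketch under stated assumptions, not the model in the manuscript: the variable jumps by a fixed `strength` at every spike and decays exponentially with time constant `tau` in between, and the spike count, inter-spike intervals, and time constant are hypothetical values.

```python
import math

def peak_adaptation(n_spikes, isi, strength, tau):
    """Peak of an adaptation variable driven by a regular spike train.

    The variable decays by exp(-isi / tau) between spikes and jumps by
    `strength` at each spike; `strength` is the same in every condition.
    """
    a = 0.0
    peak = 0.0
    for _ in range(n_spikes):
        a = a * math.exp(-isi / tau) + strength
        peak = max(peak, a)
    return peak

# Same adaptation strength and spike count, different spike density in time.
dense = peak_adaptation(n_spikes=10, isi=0.01, strength=1.0, tau=0.1)   # fast traversal
sparse = peak_adaptation(n_spikes=10, isi=0.05, strength=1.0, tau=0.1)  # slow traversal
```

With the strength held fixed, the densely spaced train accumulates more adaptation than the sparse one, which is the sense in which the effect, but not the strength, depends on running speed.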

      From Eq. 45-47, our theoretical analysis yields several predictions about theta sequences with respect to the network parameters, for example, how the sweep length varies when the running speed changes. We simulated the network at both low and high running speeds (while keeping all other parameters fixed), and found that the sweep length at low speed is larger than that at high speed. This differs from previous data showing that the sweep length increases as the animal runs faster (Maurer et al, 2012). However, we are not sure how other parameters change in the biological brain as the animal runs faster; e.g., the external input strength and the place field width might also vary as confounds. We will explore this more in the future and investigate how the adaptation strength is modulated in the brain.

      1. I think the paper has a number of predictions that would be especially interesting to experimentalists but are sort of scattered throughout the manuscript. It would be beneficial to have them listed more prominently in a separate section in the discussion. This should include (1) a prediction that the bump height in the forward direction should be higher than in the backward direction, (2) predictions about bimodal and unimodal cells starting with line 366, (3) prediction of another possible kind of theta cycling, this time in the form of sweep length (see comment above), etc.

      Thank you for pointing this out. We updated the manuscript by including a paragraph in the Discussion summarizing the predictions we made throughout the manuscript (from line 459):

      “…Our model has several predictions which can be tested in future experiments. For instance, the height of the activity bump in the forward sweep window is higher than that in the backward sweep window (Fig. 4c) due to the asymmetric suppression effect from the adaptation. For bimodal cells, they will have two peaks in their firing frequency as the animal runs across the firing fields, with one corresponding to phase precession and the other corresponding to phase procession. Similar to unimodal cells, both the phase precession and procession of a bimodal cell after transient intrahippocampal perturbation will continue from the new location of the animal (Fig. S5). Interestingly, our model of the T-maze environment showed an expected phenomenon that as the animal runs towards the decision point, the theta sweep length also shows cyclical patterns (Fig. 5b lower panel). An intuitive explanation is that, due to the slow dynamics in firing rate adaptation (with a large time constant compared to neural firing), a long sweep leads to an adaptation effect on the neurons at the end of the sweep path. Consequently, the activity bump cannot travel as far due to the adaptation effect on those neurons, resulting in a shorter sweep length compared to the previous one. In the next round, the activity bump exhibits a longer sweep again because those neurons have recovered from the previous adaptation effect. We plan to test this phenomenon in future experiments…”

      Reviewer #2:

      In this work, the authors elaborate on an analytically tractable, continuous-attractor model to study an idealized neural network with realistic spiking phase precession/procession. The key ingredient of this analysis is the inclusion of a mechanism for slow firing-rate adaptation in addition to the otherwise fast continuous-attractor dynamics. The latter classically arises from a combination of translation invariance and nonlinear rate normalization. For strong adaptation/weak external input, the network naturally exhibits an internally generated, travelling-wave dynamics along the attractor with some characteristic speed. For small adaptation/strong external stimulus, the network recovers the classical externally driven continuous-attractor dynamics. Crucially, when both adaptation and external input are moderate, there is a competition between the internally generated and externally generated mechanisms, leading to an oscillatory tracking regime. In this tracking regime, the population firing profile oscillates around the neural field tracking the position of the stimulus. The authors demonstrate by a combination of analytical and computational arguments that oscillatory tracking corresponds to realistic phase precession/procession. In particular, the authors can account for the emergence of unimodal and bimodal cells, as well as some other experimental observations with respect to the dependence of phase precession/procession on the animal's locomotion. The strengths of this work are at least three-fold: 1) Given its simplicity, the proposed model has a surprisingly large explanatory power of the various experimental observations. 2) The mechanism responsible for the emergence of precession/procession can be understood as a simple yet rather illuminating competition between internally driven and externally driven dynamical trends. 3) Amazingly, and under some adequate simplifying assumptions, a great deal of analysis can be treated exactly, which allows for a detailed understanding of all parametric dependencies. This exact treatment culminates with a full characterization of the phase space of the network dynamics, as well as the computation of various quantities of interest, including characteristic speeds and oscillating frequencies.
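For readers who want to see the ingredients named in this summary side by side (a translation-invariant recurrent kernel, divisive rate normalization, slow adaptation, and a moving external input), the following is a minimal Euler-integration sketch on a ring. The equations are an assumed, commonly used rate-based form and are not copied from the manuscript; every parameter value is hypothetical and untuned, so the sketch is not claimed to reproduce the oscillatory tracking regime.

```python
import math

def ring_distance(x, y, length):
    """Shortest distance between two positions on a ring of the given length."""
    d = abs(x - y) % length
    return min(d, length - d)

def simulate(n=32, steps=300, dt=0.1, tau=1.0, tau_v=50.0, m=0.2,
             alpha=0.5, v_ext=0.02, a=0.5, j0=1.0, k=0.5, g=1.0):
    """Euler integration of an assumed rate-based attractor network with adaptation.

    U: synaptic input, V: slow adaptation variable, r: firing rate.
    Sums over neurons stand in for integrals (density times spacing set to 1).
    """
    length = 2 * math.pi
    dx = length / n
    xs = [i * dx for i in range(n)]
    # Translation-invariant Gaussian kernel: depends only on the ring distance.
    J = [[j0 / (math.sqrt(2 * math.pi) * a)
          * math.exp(-ring_distance(xi, xj, length) ** 2 / (2 * a * a))
          for xj in xs] for xi in xs]
    U = [0.0] * n
    V = [0.0] * n
    r = [0.0] * n
    centers = []
    for step in range(steps):
        stim = (v_ext * step * dt) % length          # moving external input position
        I_ext = [alpha * math.exp(-ring_distance(x, stim, length) ** 2 / (4 * a * a))
                 for x in xs]
        sq = [max(u, 0.0) ** 2 for u in U]           # rectified quadratic drive
        norm = 1.0 + k * sum(sq)                     # divisive normalization
        r = [g * s / norm for s in sq]
        rec = [sum(J[i][j] * r[j] for j in range(n)) for i in range(n)]
        U = [U[i] + dt / tau * (-U[i] + rec[i] - V[i] + I_ext[i]) for i in range(n)]
        V = [V[i] + dt / tau_v * (-V[i] + m * U[i]) for i in range(n)]
        centers.append(xs[max(range(n), key=lambda i: r[i])])  # bump position by argmax
    return xs, r, centers

xs, rates, centers = simulate()
```

In a sketch like this, `m` plays the role of the adaptation strength and `alpha` that of the external input strength, the two quantities whose balance the summary describes; exploring the regimes would require scanning both, which is left out here.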

      1. As mentioned by the authors themselves, the main limitation of this work is that it deals with a very idealized model and it remains to see how the proposed dynamical behaviors would persist in more realistic models. For example, the model is based on a continuous attractor model that assumes perfect translation-invariance of the network connectivity pattern. Would the oscillating tracking behavior persist in the presence of connection heterogeneities?

      Thank you for raising this important point. Continuous attractor models have been widely used in modeling hippocampal neural circuits (see McNaughton et al, 2006 for a review), and researchers often assumed that there is a translation-invariance structure in these network models. The theta sweep state we presented in the current work is based on the property of the continuous attractor state. We do agree with the reviewer that the place cell circuit might not be a perfect continuous attractor network. For a simpler case where the connection weights are sampled from a Gaussian distribution around J_0, the theta sweep state is still exhibited by the network (see Fig. S8 for an example). We also believe that the model can be extended to more complex cases where there are over-representations of the “home” location and decision points in the real environment, i.e., the heterogeneity is not random but has stronger connections near those locations, in which case the theta sweeps will be more biased towards those locations. However, if the heterogeneity breaks the continuous attractor state, the theta sweep state may not be present in the network.

      1. Can the oscillating tracking behavior be observed in purely spiking models as opposed to rate models as considered in this work?

      Thank you for pointing this out. The short answer is yes. If the translation-invariance of the network connectivity pattern holds, i.e., the spiking network is still a continuous attractor network (see the work from Tsodyks et al, 1996; and from Yu et al. "Spiking continuous attractor neural networks with spike frequency adaptation for anticipative tracking"), then the adaptation, which has the mathematical form of spike frequency adaptation (instead of firing rate adaptation), will still generate the sweep state of the activity bump. We chose the rate-based model here because it is analytically tractable, which gives us a better understanding of the underlying dynamics. Many of the continuous attractor models related to spatially tuned cell populations are rate-based (see, for example, Zhang 1996; Burak & Fiete 2009). However, extending to a spike-based model would be straightforward.

      1. Another important limitation is that the system needs to be tuned to exhibit oscillation within the theta range and that this tuning involves a priori variable parameters such as the external input strength. Is the oscillating-tracking behavior overtly sensitive to input strength variations?

      Thank you for pointing this out. In rodent studies, theta sequences are thought to result from the integration of both external inputs conveying sensory-motor information, and intrinsic network dynamics possibly related to memory processes (see Drieu and Zugaro 2019; Drieu et al., 2018). We clarify here that, in our modeling work, the generation of theta sweeps also depends on both the external input and the intrinsic dynamics (induced by the firing rate adaptation). Therefore, we don’t think the dependence of theta sweeps on the a priori parameter – the external input strength – is a limitation here. We agree with the reviewer that the system needs to be tuned to exhibit oscillation within the theta range. However, the parameter range that induces the oscillatory state is relatively large (see Fig. 2g in the main text). It will be interesting to investigate (and find experimental evidence for) how the biological system adjusts the network configuration to implement the sweep state in network dynamics.

      1. The authors mentioned that an external pacemaker can serve to drive oscillation within the desired theta band but there is no evidence presented supporting this.

      Thank you for pointing this out. We made this argument based on our initial simulations but did not go into the details. We have deleted that argument from the discussion and rewritten that part. We will carry out more simulations in the future to verify whether this is true. See our changes from line 418 to line 431:

      “... A representative model relying on neuronal recurrent interactions is the activation spreading model. This model produces phase precession via the propagation of neural activity along the movement direction, which relies on asymmetric synaptic connections. A later version of this model considers short-term synaptic plasticity (short-term depression) to implicitly implement asymmetric connections between place cells, and reproduces many other interesting phenomena, such as phase precession in different environments. Different from these two models, our model considers firing rate adaptation to implement symmetry breaking and hence generates activity propagation. To prevent the activity bump from spreading away, their model considers an external theta input to reset the bump location at the end of each theta cycle, whereas our model generates an internal oscillatory state, where the activity bump travels back due to the attraction of external location input once it spreads too far away. Moreover, theoretical analysis of our model reveals how the adaptation strength affects the direction of theta sweeps, as well as offers a more detailed understanding of theta cycling in complex environments...”

      1. A final and perhaps secondary limitation has to do with the choice of parameter, namely the time constant of neural firing which is chosen around 3ms. This seems rather short given that the fast time scale of rate models (excluding synaptic processes) is usually given by the membrane time constant, which is typically about 15ms. I suspect this latter point can easily be addressed.

      Thank you for pointing this out. The time constant we chose is relatively short, as in other studies. We conducted an additional simulation with the time constant set to 10 ms, and the results reported in this paper remain consistent. Please refer to Fig. S9 for the results obtained with a time constant of 10 ms.

      Reviewer #3:

      With a soft-spoken, matter-of-fact attitude and almost unwittingly, this brilliant study chisels away one of the pillars of hippocampal neuroscience: the special role(s) ascribed to theta oscillations. These oscillations are salient during specific behaviors in rodents but are often taken to be part of the intimate endowment of the hippocampus across all mammalian species, and to be a fundamental ingredient of its computations. The gradual anticipation or precession of the spikes of a cell as it traverses its place field, relative to the theta phase, is seen as enabling the prediction of the future - the short-term future position of the animal at least, possibly the future in a wider cognitive sense as well, in particular with humans. The present study shows that, under suitable conditions, place cell population activity "sweeps" to encode future positions, and sometimes past ones as well, even in the absence of theta, as a result of the interplay between firing rate adaptation and precise place coding in the afferent inputs, which tracks the real position of the animal. The core strength of the paper is the clarity afforded by the simple, elegant model. It allows the derivation (in a certain limit) of an analytical formula for the frequency of the sweeps, as a function of the various model parameters, such as the time constants for neuronal integration and for firing rate adaptation. The sweep frequency turns out to be inversely proportional to their geometric average. The authors note that, if theta oscillations are added to the model, they can entrain the sweeps, which thus may superficially appear to have been generated by the oscillations.
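
      The scaling mentioned above (sweep frequency inversely proportional to the geometric average of the two time constants) can be made concrete with a small sketch. Only the proportionality is taken from the review; the prefactor is not given here, so the function returns a relative value. The name `relative_sweep_frequency`, the symbol `tau_v` for the adaptation time constant, and its value of 0.144 s are hypothetical. The 3 ms and 10 ms neuronal time constants are the ones discussed in this thread.

```python
import math


def relative_sweep_frequency(tau, tau_v):
    """Return 1 / sqrt(tau * tau_v): the scaling factor only, no prefactor."""
    if tau <= 0 or tau_v <= 0:
        raise ValueError("time constants must be positive")
    return 1.0 / math.sqrt(tau * tau_v)


# tau_v = 0.144 s is a made-up adaptation time constant for illustration.
base = relative_sweep_frequency(tau=0.003, tau_v=0.144)
slower = relative_sweep_frequency(tau=0.010, tau_v=0.144)

# The unknown prefactor cancels in the ratio, which equals sqrt(0.003 / 0.010).
ratio = slower / base
print(round(ratio, 3))
```

      Under this scaling alone, and with every other parameter held fixed, lengthening the neuronal time constant from 3 ms to 10 ms would lower the sweep frequency to a little over half its original value. Whether other parameters were retuned in the authors' 10 ms simulation is not stated in this thread.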

      1. The main weakness of the study is the other side of the simplicity coin. In its simple and neat formulation, the model envisages stereotyped single unit behavior regulated by a few parameters, like the two time constants above, or the "adaptation strength", the "width of the field" or the "input strength", which are all assumed to be constant across cells. In reality, not only does assigning homogeneous values to those parameters seem implausible, but describing e.g. adaptation with the simple equation included in the model may also be an oversimplification. Therefore, it remains important to understand to what extent the mechanism envisaged in the model is robust to variability in the parameters or to, e.g., less carefully tuned afferent inputs.

      Thank you for raising this important question. As the reviewer noted, our model is oversimplified compared to real hippocampal circuits (see also Q1 and Q3 from Reviewer #2). We also point this out in the main text at line 504:

      “…Nevertheless, it is important to note that the CANN we adopt in the current study is an idealized model for the place cell population, where many biological details are missed. For instance, we have assumed that neuronal synaptic connections are translation-invariant in the space...”

      To investigate model robustness to parameter settings, we divided all the parameters into two groups. The first group of parameters determines the bump state, i.e., the width of the field a, neuronal density ρ, global inhibition strength k, and connection strength J_0. The second group of parameters determines the bump sweep state (which is based on the existence of the bump state), i.e., the input strength α and the adaptation strength m. For the first group of parameters, we refer the reviewer to the Methods section on the stability analysis of the bump state. This analysis gives the condition under which the continuous attractor state holds in the network (see Eq. 20, which guides our parameter selection). For the second group of parameters, we refer the reviewer to Fig. 2g, which shows when the bump sweep state occurs as a function of input strength and adaptation strength. When the input strength is small, the range of adaptation strengths that yields the bump sweep state is also small. However, as the input strength increases, Fig. 2g shows that this range increases linearly. Although two other states exist in the network when the two parameters are set outside the colored area in Fig. 2g, the parameter range that yields the sweep state is large, especially when the input strength is large, which is usually the case when the animal actively runs in the environment.

      To demonstrate how the variability affects the results, we added variability to the connection weights by sampling the connection weights from a Gaussian distribution around J_0 (this introduces heterogeneity in the connection structure). We found that the bump sweep state still holds in this condition (see Fig. S8 as well as Q1 from Reviewer #2). For variability in the other parameter values, we expect similar results. Although adding variability to these parameters poses no difficulty for numerical simulation, it will make the theoretical analysis much more difficult.

      1. The weak adaptation regime, when firing rate adaptation effectively moves the position encoded by population activity slightly ahead of the animal, is not novel - I discussed it, among others, in trying to understand the significance of the CA3-CA1 differentiation (2004). What is novel here, as far as I know, is the strong adaptation regime, when the adaptation strength m is at least larger than the ratio of time constants. Then population activity literally runs away, ahead of the animal, and oscillations set in, independent of any oscillatory inputs. Can this really occur in physiological conditions? A careful comparison with available experimental measures would greatly strengthen the significance of this study.

      Thank you for raising this interesting question.

      Re: “…firing rate adaptation effectively moves the position encoded by population activity slightly ahead of the animal, is not novel…”, we have added Treves, A (2004) as a citation where we introduce the firing rate adaptation in line 116.

      To test whether the case of “…the adaptation strength m is at least larger than the ratio of time constants…” could occur in physiological conditions, one needs a measure of the adaptation strength as well as of the time constants of both neuronal firing and the adaptation effect. The most straightforward way would be in vivo patch clamp recording of hippocampal pyramidal neurons while the animal is navigating an environment. This would give a direct measure of all these values. However, we do not yet have such data to verify this hypothesis. Another possible way to measure these values is through a state-space model. Specifically, we can build a state-space model (considering the adaptation effect in spike generation) by taking the animal’s position as the latent dynamics and the recorded spikes as observations, and then infer parameters such as the adaptation strength and the time constant of the slow dynamics. State-space models (without firing rate adaptation) for analyzing theta sweeps and replay dynamics have been explored by Denovellis et al. (2021), as well as Krause and Drugowitsch (2022). We think it might be feasible to infer the adaptation strength and adaptation time constant in a similar paradigm in future work. We thank the reviewer for pointing this out and hope our replies have clarified the reviewer’s concerns.
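
      Once estimates of the three quantities are available, the comparison itself is simple. The sketch below encodes the reviewer's criterion under one explicit assumption: that "the ratio of time constants" means the neuronal time constant divided by the adaptation time constant. The thread does not spell out the orientation of the ratio, and the function name `adaptation_regime`, the symbol `tau_v`, and the values of `m` and `tau_v` are hypothetical.

```python
def adaptation_regime(m, tau, tau_v):
    """Classify adaptation as 'strong' if m exceeds tau / tau_v, else 'weak'.

    Assumes the threshold is tau / tau_v (neuronal over adaptation time
    constant); this orientation is an assumption, not stated in the thread.
    """
    if tau <= 0 or tau_v <= 0:
        raise ValueError("time constants must be positive")
    return "strong" if m > tau / tau_v else "weak"


# tau = 3 ms is the value discussed in this thread; m and tau_v are made up.
low = adaptation_regime(m=0.01, tau=0.003, tau_v=0.1)   # 0.01 vs 0.03
high = adaptation_regime(m=0.10, tau=0.003, tau_v=0.1)  # 0.10 vs 0.03
print(low, high)
```

      In practice the estimates would carry uncertainty, so a real analysis would compare posterior distributions rather than point values.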

    2. eLife assessment

      This work represents an important contribution to computational neuroscience by providing a parsimonious model for spiking-phase precession/procession in the hippocampus. The proposed model, which relies on firing-rate adaptation, is able to capture many distinct experimental observations about phase precession/procession, such as forward and backward sweeps, as well as constant cycling of sweeps across different arms of a T-maze. The convincing evidence presented in support of this work relies on classical analytical and computational techniques about continuous attractor networks.

    3. Reviewer #1 (Public Review):

      Continuous attractor networks endowed with some sort of adaptation in the dynamics, whether that be through synaptic depression or firing rate adaptation, are fast becoming the leading candidate models to explain many aspects of hippocampal place cell dynamics, from hippocampal replay during immobility to theta sequences during run. Here, the authors show that a continuous attractor network endowed with spike frequency adaptation and subject to feedforward external inputs is able to account for several previously unaccounted aspects of theta sequences, including (1) sequences that move both forwards and backwards, (2) sequences that alternate between two arms of a T-maze, (3) speed modulation of place cell firing frequency, and (4) the persistence of phase information across hippocampal inactivations.

      I think the main results of the paper (findings (1) and (2)) are likely to be of interest to the hippocampal community, as well as to the wider community interested in mechanisms of neural sequences. In addition, the manuscript is generally well written and the analytics are impressive. However, several issues should be addressed, which I outline below.

      Major comments:

      In real data, population firing rate is strongly modulated by theta (i.e., cells collectively prefer a certain phase of theta - see review paper Buzsaki, 2002) and largely oscillates at theta frequency during run. With respect to this cyclical firing rate, theta sweeps resemble "Nike" check marks, with the sweep backwards preceding the sweep forwards within each cycle before the activity is quenched at the end of the cycle. I am concerned that (1) the summed population firing rate of the model does not oscillate at theta frequency, and (2) as the authors state, the oscillatory tracking state must begin with a forward sweep. With regards to (1), can the authors show theta phase spike preference plots for the population to see if they match data? With regards to (2), can the authors show what happens if the bump is made to sweep backwards first, as it appears to do within each cycle?

      I could not find the width of the external input mentioned anywhere in the text or in the table of parameters. The implication is that it is unclear to me whether, during the oscillatory tracking state, the external input is large compared to the size of the bump, so that the bump lives within a window circumscribed by the external input and so bounces off the interior walls of the input during the oscillatory tracking phase, or whether the bump is continuously pulled back and forth by the external input, in which case it could be comparable to the size of the bump. My guess based on Fig 2c is that it is the latter. Please clarify and comment.

      I would argue that the "constant cycling" of theta sweeps down the arms of a T-maze was roughly predicted by Romani & Tsodyks, 2015, Figure 7. While their cycling spans several theta cycles, it nonetheless alternates by a similar mechanism, in that adaptation (in this case synaptic depression) prevents the subsequent sweep of activity from taking the same arm as the previous sweep. I believe the authors should cite this model in this context and consider the fact that synaptic depression and spike frequency adaptation are both possible mechanisms for this phenomenon. But I certainly give the authors credit for showing how this constant cycling can occur across individual theta cycles.

      The authors make an unsubstantiated claim in the paragraph beginning with line 413 that the Romani and Tsodyks (2015) model could not account for forwards and backwards sweeps. Both firing rate adaptation and synaptic depression are symmetry-breaking mechanisms that should in theory be able to push sweeps of activity in both directions, so it is far from obvious to me that both forward and backward sweeps are not possible in the Romani and Tsodyks model. The authors should either prove that this is the case (with theory or simulation) or excise this statement from the manuscript.

      The section on the speed dependence of theta (starting with line 327) was very hard to understand. Can the authors show a more graphical explanation of the phenomenon? Perhaps a version of Fig 2f for slow and fast speeds, and point out that cells in the latter case fire with higher frequency than in the former?

      I had a hard time understanding how the Zugaro et al., (2005) hippocampal inactivation experiment was accounted for by the model. My intuition is that while the bump position is determined partially by the location of the external input, it is also determined by the immediate history of the bump dynamics as computed via the local dynamics within the hippocampus (recurrent dynamics and spike rate adaptation). So that if the hippocampus is inactivated for an arbitrary length of time, there is nothing to keep track of where the bump should be when the activity comes back on line. Can the authors please explain more how the model accounts for this?

      Can the authors comment on why the sweep lengths oscillate in the bottom panel of Fig 5b, starting at 0.5 seconds before crossing the choice point of the T-maze? Is this oscillation in sweep length another prediction of the model? If so, it should definitely be remarked upon and included in the discussion section.

      Perhaps I missed this, but I'm curious whether the authors have considered what factors might modulate the adaptation strength. In particular, might rat speed modulate adaptation strength? If so, this would have interesting predictions for theta sequences at low vs high speeds.

      I think the paper has a number of predictions that would be especially interesting to experimentalists but are sort of scattered throughout the manuscript. It would be beneficial to have them listed more prominently in a separate section in the discussion. This should include (1) a prediction that the bump height in the forward direction should be higher than in the backward direction, (2) predictions about bimodal and unimodal cells starting with line 366, (3) prediction of another possible kind of theta cycling, this time in the form of sweep length (see comment above), etc.

    4. Reviewer #2 (Public Review):

      In this work, the authors elaborate on an analytically tractable, continuous-attractor model to study an idealized neural network with realistic spiking phase precession/procession. The key ingredient of this analysis is the inclusion of a mechanism for slow firing-rate adaptation in addition to the otherwise fast continuous-attractor dynamics. The latter continuous-attractor dynamics classically arises from a combination of translation invariance and nonlinear rate normalization.

      For strong adaptation/weak external input, the network naturally exhibits an internally generated, travelling-wave dynamics along the attractor with some characteristic speed. For small adaptation/strong external stimulus, the network recovers the classical externally driven continuous-attractor dynamics. Crucially, when both adaptation and external input are moderate, there is a competition between the internally generated and externally generated mechanisms, leading to an oscillatory tracking regime. In this tracking regime, the population firing profile oscillates around the neural field tracking the position of the stimulus. The authors demonstrate by a combination of analytical and computational arguments that oscillatory tracking corresponds to realistic phase precession/procession. In particular, the authors can account for the emergence of unimodal and bimodal cells, as well as some other experimental observations with respect to the dependence of phase precession/procession on the animal's locomotion.

      The strengths of this work are at least three-fold: 1) Given its simplicity, the proposed model has a surprisingly large explanatory power of the various experimental observations. 2) The mechanism responsible for the emergence of precession/procession can be understood as a simple yet rather illuminating competition between internally driven and externally driven dynamical trends. 3) Amazingly, and under some adequate simplifying assumptions, a great deal of analysis can be treated exactly, which allows for a detailed understanding of all parametric dependencies. This exact treatment culminates with a full characterization of the phase space of the network dynamics, as well as the computation of various quantities of interest, including characteristic speeds and oscillating frequencies.

      As mentioned by the authors themselves, the main limitation of this work is that it deals with a very idealized model and it remains to be seen how the proposed dynamical behaviors would persist in more realistic models. For example, the model is based on a continuous attractor model that assumes perfect translation-invariance of the network connectivity pattern. Would the oscillating tracking behavior persist in the presence of connection heterogeneities? Another limitation is that the system needs to be tuned to exhibit oscillation within the theta range and that this tuning involves a priori variable parameters such as the external input strength. Is the oscillating-tracking behavior overtly sensitive to input strength variations? The authors mentioned that an external pacemaker can serve to drive oscillation within the desired theta band but there is no evidence presented supporting this. A final and perhaps secondary limitation has to do with the choice of parameter, namely the time constant of neural firing which is chosen around 3ms. This seems rather short given that the fast time scale of rate models (excluding synaptic processes) is usually given by the membrane time constant, which is typically about 15ms. I suspect this latter point can easily be addressed.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors focused on genetic variability in relation to insulin resistance. They used genetically different lines of mice and exposed them to the same diet. They found that genetic predisposition impacts the overall outcome of metabolic disturbances. This work provides a fundamental novel view on the role of genetics and insulin resistance.

      Reviewer #2 (Public Review):

      Summary:

      In the present study, van Gerwen et al. perform deep phosphoproteomics on muscle from saline or insulin-injected mice from 5 distinct strains fed a chow or HF/HS diet. The authors follow these data by defining a variety of intriguing genetic, dietary, or gene-by-diet phosphosites that respond to insulin, through the application of correlation analyses, linear mixed models, and a module-based approach (WGCNA). These findings are supported by validation experiments by intersecting results with a previous profile of insulin-responsive sites (Humphrey et al, 2013) and importantly, mechanistic validation of Pfkfb3 where overexpression in L6 myotubes was sufficient to alter fatty acid-induced impairments in insulin-stimulated glucose uptake. To my knowledge, this resource provides the most comprehensive quantification of muscle phospho-proteins which occur as a result of diet in strains of mice where genetic and dietary effects can be quantifiably attributed in an accurate manner. Utilization of this resource is strongly supported by the analyses provided highlighting the complexity of insulin signaling in muscle, exemplified by contrasts to the "classically-used" C57BL6/J strain. As it stands, I view this exceptional resource as comprehensive with compelling strength of evidence behind the mechanism explored. Therefore, most of my comments stem from curiosity about pathways within this resource, many of which are likely well beyond the scope of incorporation in the current manuscript. These include the integration of previous studies investigating these strains for changes in transcriptional or proteomic profiles and intersections with available human phospho-protein data, many of which have been generated by this group.

      Strengths:

      Generation of a novel resource to explore genetic and dietary interactions influencing the phospho-proteome in muscle. This is accompanied by the elegant application of in silico tools to highlight the utility.

      Weaknesses:

      Some specific aspects of integration with other data among the same fixed strains could be strengthened and/or discussed.

      Reviewer #3 (Public Review):

      Summary:

      The authors aimed to investigate how genetic and environmental factors influence the muscle insulin signaling network and its impact on metabolism. They utilized mass spectrometry-based phosphoproteomics to quantify phosphosites in the skeletal muscle of genetically distinct mouse strains in different dietary environments, with and without insulin stimulation. The results showed that genetic background and diet both affected insulin signaling, with almost half of the insulin-regulated phosphoproteome being modified by genetic background on an ordinary diet, and high-fat high-sugar feeding affecting insulin signaling in a strain-dependent manner.

      Strengths:

      The study uses a state-of-the-art phosphoproteomics workflow allowing quantification of a large number of phosphosites in skeletal muscle, providing a comprehensive view of the muscle insulin signaling network. The study examined five genetically distinct mouse strains in two dietary environments, allowing for the investigation of the impact of genetic and environmental factors on insulin signaling. The identification of coregulated subnetworks within the insulin signaling pathway expanded our understanding of its organization and provided insights into potential regulatory mechanisms. The study associated diverse signaling responses with insulin-stimulated glucose uptake, uncovering regulators of muscle insulin responsiveness.

      Weaknesses:

      Different mouse strains have huge differences in body weight on normal and high-fat high-sugar diets, which makes comparison between the models challenging. The proteome of muscle across different strains is bound to be different, but the effect of changes in protein abundance on phosphosite changes was not assessed. The authors do get around this by calculating 'insulin response' because short insulin treatment should not affect protein abundance. The limitations acknowledged by the authors, such as the need for larger cohorts and the inclusion of female mice, suggest that further research is needed to validate and expand upon the findings.

      Reviewer #1 (Recommendations For The Authors):

      I would suggest further discussion of the potential differences between males and females of the various strains.

      In the revised manuscript we have included a more detailed discussion of the potential differences between male and female mice in the "Limitations of this study" section on lines 455-459. In particular, a landmark study of HFD-fed inbred mouse strains found that insulin sensitivity, as inferred from the proxy HOMA-IR, was affected by interactions between sex and strain despite generally being greater in female mice (10.1016/j.cmet.2015.01.002). Furthermore, a recent phosphoproteomics study of human induced pluripotent stem-cell derived myoblasts identified groups of insulin-regulated phosphosites affected by donor sex, and by interactions between sex and donor insulin sensitivity (10.1172/JCI151818). Based on these results, we anticipate that both soleus insulin sensitivity and phosphoproteomic insulin responses would differ between male and female mice through interactions with strain and diet, adding yet another layer of complexity to what we observed in this study. This will be an important avenue for future research to explore.

      Reviewer #2 (Recommendations For The Authors):

      The following are comments to authors - many, if not all are suggestions for extended discussion and beyond the scope of the current elegant study.

      In the discussion section (line 428) the authors make a key point in that the genetic, dietary, and interacting patterns of variation of phospho-sites could be due to changes in total protein and/or transcript levels across strains. For example, given that increased expression of Pfkfb3 was sufficient to impact glucose uptake, the transcript levels of the gene might also show a similar correlation with insulin responsiveness as in Fig 6b. Undoubtedly, phospho-proteomics analyses will provide unique information on top of more classical omics layers, and uncovering this would be an important future direction. Therefore, I would suggest adding to the discussion some guidance on performing similar applications to datasets from at least some of the strains used, where RNA-seq and proteomics are available.

      We thank the reviewer for this suggestion. To address this, we mined recently published total proteomics data collected from soleus muscles of seven CHOW or HFD-fed inbred mouse strains, three of which were in common with our study (C57Bl6J, BXH9, BXD34; 10.1016/j.cmet.2021.12.013). In this study ex vivo soleus glucose uptake was measured and correlation analysis was performed, so we directly extracted the resulting glucose uptake-protein associations and compared them to the glucose uptake-phosphoprotein associations identified in our study. Indeed, we found that only a minority of proteins correlated at both the phosphosite and total protein levels, highlighting the utility of phosphoproteomics to provide orthogonal information to more classical omics layers. We have included this analysis in lines 303-311.

      Relevant to this, the authors might want to consider depositing scripts to analyze some aspects of the data (ex. WGCNA on P-protein data or insulin-regulated anova) in a repository such as github so that these can be applied easily to other datasets.

      We refer the reviewer to the section "Code availability" on lines 511-513, where we deposited all code used to analyse the data on github.

      In contrast to the points above, I feel that the short time-course of insulin stimulation was one important aspect of the experimental design that was not emphasized enough as a strength. It was mentioned as a limitation in that other time points could provide more info, yes. But given that the total abundance of proteins and transcripts likely doesn't shift tremendously in this time frame, this provides an important appeal to the analysis of phospho-proteomic data. I would suggest highlighting the insulin-stimulated response analysis here as something that leverages the unique nature of phosphoproteomics.

      We are grateful for the reviewer's positivity regarding this aspect of our experimental design. We have reiterated the value of the 10min insulin stimulation - that it temporally segregates phosphoproteomic and total proteomic changes - in the "Limitations of this study" section on lines 477-481.

      While I recognize the WGCNA analysis as an instrumental way to highlight global patterns of phosphopeptide abundance co-regulation, the analysis currently seems somewhat underdeveloped. For example, Fig 5f-h shows a lot of overlap between kinase substrates and pathways among modules. Clearly, there are informative differences based on the intersection with Humphrey 2013 and the correlation with Pfkfb3. To highlight the specific membership of these modules, most people rank-order module members by correlation with eigen-gene (or P-peptide) and then perform pathway enrichments on these. Alternatively, it looks like all data was used to generate modules across conditions. One consideration would be to perform WGCNA on relevant comparison data separately (ex. chow mice only and HFHS only) and then compare modules whose membership is retained or shifts between the two. Or even look at module representation for genes that show large correlations with insulin-responsiveness. This might also be a good opportunity to suggest readers intersect module members with muscle eQTLs which colocalize to glucose or insulin to prioritize some potential key drivers.

      We thank the reviewer for their helpful suggestions, which we feel have substantially improved the WGCNA analysis. To probe specific functional differences between subnetworks, we performed rank-based enrichment using phosphopeptide module membership scores. Interestingly, this did reveal pathways that were enriched only in certain modules. However, we found that after p-value adjustment, virtually all enriched pathways lost statistical significance, hence we interpret these results as suggestive only. We have made this analysis available to readers in Fig S4b-d and lines 263-265: "To further probe functional differences we analysed phosphopeptide subnetwork membership scores, which revealed additional pathways enriched in individual subnetworks. However, these results were not significant after p-value adjustment and hence are suggestive only (Fig. S4b-d)". We also visualised module representation for glucose-uptake correlated phosphopeptides. This agreed with our existing analysis in Fig. 6f, where the eigenpeptides of modules V and I were correlated with glucose uptake. We have incorporated this new analysis in Fig. S6b-c and lines 324-325: "Examining the subnetwork membership scores for glucose-uptake correlated phosphopeptides also revealed a preference for clusters V and I, supporting this analysis (Fig. S6b-c)." Finally, in the discussion we have presented the integration of genetic data, such as muscle-specific eQTLs, as a future direction (lines 398-401): "Alternatively, one could overlap subnetworks with genetic information, such as genes associated with glucose homeostasis and other metabolic traits in human GWAS studies, or muscle-specific eQTLs or pQTLs genetically colocalised with similar traits, to further prioritise subnetwork-associated phenotypes and identify potential drivers within subnetworks."

      Have the authors considered using their heritability and GxE estimates for module eigenpeptides? To my knowledge, this has never been performed and might provide some useful information as the co-regulated P-protein structure occurs as a result of relevant contexts.

      In the revised manuscript we have now analysed eigenpeptides with the same statistical tests used to identify Strain and Diet effects in insulin-regulated phosphopeptides. We have displayed the statistical results in Fig. S4a, and have explicitly mentioned examples of StrainxDiet effects on lines 245-247: "For example, HFD-feeding attenuated the insulin response of subnetwork I in CAST and C57Bl6J strains (t-test adjusted p = 0.0256, 0.0365), while subnetwork II was affected by HFD-feeding only in CAST and NOD (Fig. 5e, Fig. S4a, t-test adjusted p = 0.00258, 0.0256)."

      The integration of modules with adipocyte phosphoproteomic data from the authors' 2013 Cell Metab paper seems like an important way to highlight the integration of this resource to define critical cellular signaling mechanisms. To assess the conservation of signaling mechanisms and relationships to additional key contexts (ex. exercise), the intersection of the insulin-stimulated P-peptides with human datasets generated by this group (ex. Cell Metab 2015, Nature Biotech 2022) seems like an obvious future direction to prioritize targets. Figure S3B shows a starting point for these types of integrations.

      To demonstrate the value of integrating our results with related phosphoproteomics data, we have incorporated the reviewer's advice of comparing insulin-regulated phosphosites to exercise-regulated phosphosites from Needham et al. Nature Biotech 2022 and Hoffman et al. Cell Metabolism 2015. We identified a small subset of commonly regulated phosphosites (8 across all three studies). Given insulin and exercise both promote GLUT4 translocation, these sites may represent conserved regulatory mechanisms. This analysis is presented in Fig. S3d, Table S2, and lines 129-135: "In addition to insulin, exercise also promotes GLUT4 translocation in skeletal muscle. We identified a small subset of phosphosites regulated by insulin in this study that were also regulated by exercise in two separate human phosphoproteomics studies (Fig. S3d, Table S2, phosphosites: Eef2 T57 and T59, Mff S129 and S131, Larp1 S498, Tbc1d4 S324, Svil S300, Gys1 S645), providing a starting point for exploring conserved signalling regulators of GLUT4 translocation."

      For the Pfkfb3 overexpression system, are there specific P-peptides that are increased/decreased upon insulin stimulation? This might be an interesting future direction to mention in order to link signaling mechanisms.

      We assessed whether canonical insulin signalling was affected by Pfkfb3 overexpression by immunoblotting. Insulin-stimulated phosphorylation of Akt S473, Akt T308, Gsk3a/b S21/S9, and PRAS40 T246 differed little across conditions, with only a weak, statistically insignificant trend towards increased pT308 Akt, pS21/S9 Gsk3a/b, and pT246 PRAS40 in palmitate-treated Pfkfb3-overexpressing cells. Hence, as the reviewer has suggested, an interesting future direction will be to perform phosphoproteomics to characterise more deeply the effects of palmitate and Pfkfb3 overexpression on insulin signalling. We have modified the manuscript to reflect these findings and suggested future directions on lines 362-365: "immunoblotting of canonical insulin-responsive phosphosites on Akt and its substrates GSK3α/β and PRAS40 revealed minimal effect of palmitate treatment and Pfkfb3 overexpression (Fig. S7e-f), hence more detailed phosphoproteomics studies are needed to clarify whether Pfkfb3 overexpression restored insulin action by modulating insulin signalling."

      Reviewer #3 (Recommendations For The Authors):

      This remarkable contribution by the esteemed research group has significantly enriched the field of metabolism. The extensive dataset, intertwined with a sophisticated research design, promises to serve as an invaluable resource for the scientific community. I offer a series of suggestions aimed at potentially elevating the manuscript to an even higher standard.

      Mouse Weight Variation and Correlation Analysis: The pronounced variances in mouse body weights pose a challenge to meaningful comparisons (Fig S1). Could the disparities in the phosphoproteome between basal and insulin-stimulated conditions be attributed to differences in body weight? Consider performing a correlation analysis. Furthermore, does the phosphoproteome of these mouse strains evolve comparably over time? Do these mice age similarly? Kindly incorporate this information.

      We thank the reviewer for the suggested analysis. We found there was a significant correlation between the phosphopeptide insulin response and mouse body weight, either in CHOW-fed mice (Strain effects) or across both diets (Diet effects), for ~ 25% of phosphopeptides that exhibited a Strain or Diet effect. Hence, while there is a clear effect of body weight on insulin signalling, this influences only a small proportion of the entire insulin-responsive phosphoproteome. Notably, insulin was dosed according to mouse lean mass to ensure equivalent dosage received by the soleus muscle, hence any insulin signalling differences associated with body weight are unlikely due to differences in dosing. As the reviewer also alludes to, different strains could have different lifespans. This may result in mice having different biological ages at the time of experimentation, and this in turn could influence insulin signalling. This possibility is challenging to assess in a quantitative manner because lifespan data is not available for most strains used. However, it is worth noting that female CAST mice live 77% as long as C57Bl6J mice (median age of 671 vs 866 (10.1073/pnas.1121113109); data is not available for male mice nor the other three strains), and substantial differences in insulin signalling were observed between these two strains. Ultimately, regardless of whether body weight and/or lifespan altered insulin signalling, such differences would still have arisen solely from the distinct genetic backgrounds and diets of the mice, hence we believe they are meaningful results that should not be dismissed. We have added this analysis to the revised manuscript in the "Limitations of this study" section on lines 471-477: "We were also unable to determine the extent to which signalling changes arose from muscle-intrinsic or extrinsic factors. 
For instance, body weight varied substantially across mice and correlated significantly with 25% of Strain and Diet-affected phosphopeptides (Fig. S8c), suggesting obesity-related systemic factors likely impact a subset of the muscle insulin signalling network. Furthermore, genetic differences in lifespan could alter the “biological age” of different strains and their phosphoproteomes, though we could not assess this possibility since lifespan data are not available for most strains used."

      Soleus Muscle Data and Bias Considerations: Were measurements taken for lean mass and soleus muscle weight? If so, please present the corresponding data.

      Measurements for lean mass and the mass of soleus muscle after grinding have been included in Supplementary Figure S1 (panels c-d).

      As outlined in the methods section, the variation in protein yield from the soleus muscle across each strain is substantial. Notably, the distinct peptide input for phospho enrichment introduces biases, given that muscles with lower input may exhibit reduced identification (Fig S2). This bias might also manifest in the PCA plot (S2C). Ideally, adopting a uniform protein/peptide input would have been advantageous. Address this concern and contemplate moving the PCA plot to the main figure. It's prudent to reconsider the sentence stating, "Samples from animals of the same strain and diet were highly correlated and generally clustered together, implying the data are highly reproducible (Fig. S2b-d)," particularly if the input and total IDs were not matched.

      The reviewer highlights an important point. As the reviewer comments, it would have been our preference to use the same amount of protein material for all samples. However, as there was a wide range in the mass of the soleus muscle across mouse strains (in particular much lower in CAST mice), it was not appropriate to use the same amount of material for all strains. The effect of input amount is indeed evident in the PCA plot (Figure S2c), whereby samples cluster in the second component (PC2) based on the amount of protein material. However, this clustering is not observed in the hierarchical clustering (Figure S2d), nor is the number of phosphopeptides quantified in each sample substantially impacted by these differences (Figure S2a) as implied by the reviewer. Indeed, the number of phosphopeptides quantified did not noticeably vary when comparing BXH9/BXD34 to C57Bl6J/NOD despite 32.3% less material used, and there were only 12.4% fewer phosphopeptides (average 13891.56 vs 15851.29) in CAST compared to C57Bl6J/NOD strains, despite 51.8% less material used. To further emphasise the minimal effect that input material had on phosphopeptide quantification, we have additionally plotted the number of phosphopeptides quantified in each sample following the filtering steps we employed prior to statistical analysis of the dataset (i.e. ANOVA). This plot (Author response image 1) shows that there is even less variation in the number of quantified phosphopeptides between strains, with only 9.12% fewer phosphopeptides quantified and filtered on average in CAST compared to C57Bl6J/NOD (average 9026.722 vs 9932.711). From a quantitative perspective, in both the PCA (Principal Component 1) and hierarchical clustering analyses, samples are additionally clustered by individual strains, and in the latter they also cluster generally by diet, implying that biological variation between samples remains the primary variation captured in our data.
We have modified the manuscript so that these observations are at the forefront (lines 103-106): "Furthermore, while different strains clustered by the amount of protein material used in the second component of the PCA (Figure S2c), samples from animals of the same strain and diet were highly correlated and generally clustered together, indicating that our data are highly reproducible". To ensure that readers are aware of our decision to alter protein starting material and its implications, we have moved the description of this from the methods to the results, and we have highlighted the impact on phosphopeptide quantification in CAST mice (lines 99-103): "Due to the range in soleus mass across strains (Fig. S1D) we altered the protein material used for EasyPhos (C57Bl6J and NOD: 755 µg, BXH9 and BXD34: 511 µg, CAST: 364 µg), though phosphopeptide quantification was minimally affected, with only 12.4% fewer phosphopeptides quantified on average in CAST compared to the C57Bl6J/NOD (average 13891.56 vs 15851.29 Fig. S2a)."
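
      The percentages quoted in this response can be rechecked from the stated inputs and averages. The sketch below is only an arithmetic check: the numbers are copied from the response, while the helper name `percent_fewer` and the dictionary layout are illustrative assumptions and not part of the authors' code.

```python
def percent_fewer(value, reference):
    """Percentage by which `value` falls below `reference`."""
    return 100.0 * (reference - value) / reference


# Protein input used for EasyPhos (micrograms), as quoted in the response.
input_ug = {"C57Bl6J/NOD": 755, "BXH9/BXD34": 511, "CAST": 364}

# Average phosphopeptides per sample, before and after filtering, as quoted.
quantified = {"C57Bl6J/NOD": 15851.29, "CAST": 13891.56}
filtered = {"C57Bl6J/NOD": 9932.711, "CAST": 9026.722}

ref = "C57Bl6J/NOD"
less_material_bx = percent_fewer(input_ug["BXH9/BXD34"], input_ug[ref])
less_material_cast = percent_fewer(input_ug["CAST"], input_ug[ref])
fewer_quantified = percent_fewer(quantified["CAST"], quantified[ref])
fewer_filtered = percent_fewer(filtered["CAST"], filtered[ref])

print(round(less_material_bx, 1))    # 32.3
print(round(less_material_cast, 1))  # 51.8
print(round(fewer_quantified, 1))    # 12.4
print(round(fewer_filtered, 2))      # 9.12
```

      The four printed values match the 32.3%, 51.8%, 12.4% and 9.12% figures given in the text.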

      Author response image 1.

      Phosphopeptide quantification following filtering. a) The number of phosphopeptides quantified in each sample after filtering prior to statistical analysis.

      Phosphosite Quantification Filtering: The quantified phosphosites have been dropped from 23,000 to 10,000. Could you elucidate the criteria employed for filtering and provide a concise explanation in the main text?

      We thank the reviewer for drawing this ambiguity to our attention. Before testing for insulin regulation, we performed a filtering step requiring phosphopeptides to be quantified well enough for comparisons across strains and diets. Specifically, phosphopeptides were retained if they were quantified well enough to assess the effect of insulin in more than eight strain-diet combinations (≥ 3 insulin-stimulated values and ≥ 3 unstimulated values in each combination). We have now included this explanation of the filtering in the main text on lines 108-114.
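
      The filtering rule described above can be sketched as follows. This is a minimal illustration that assumes each phosphopeptide is represented as a list of (strain, diet, insulin, value) records with `None` for missing values. The function name, the constants, and the data layout are hypothetical, and the authors' deposited code may implement the rule differently.

```python
from collections import defaultdict

MIN_VALUES = 3        # quantified values required per condition in a combination
MIN_COMBINATIONS = 9  # "more than eight" strain-diet combinations

STRAINS = ["C57Bl6J", "NOD", "BXH9", "BXD34", "CAST"]
DIETS = ["CHOW", "HFD"]


def passes_filter(measurements):
    """Return True if a phosphopeptide is quantified well enough to keep.

    measurements: iterable of (strain, diet, insulin, value) tuples, where
    `insulin` is a bool and `value` is None when the peptide was not quantified.
    """
    counts = defaultdict(lambda: {True: 0, False: 0})
    for strain, diet, insulin, value in measurements:
        if value is not None:
            counts[(strain, diet)][insulin] += 1
    usable = sum(
        1
        for c in counts.values()
        if c[True] >= MIN_VALUES and c[False] >= MIN_VALUES
    )
    return usable >= MIN_COMBINATIONS


# Toy example: three insulin and three unstimulated values in all ten combinations.
full = [
    (s, d, ins, 1.0)
    for s in STRAINS
    for d in DIETS
    for ins in (True, False)
    for _ in range(3)
]
# Dropping the insulin-stimulated CAST values leaves only eight usable combinations.
sparse = [m for m in full if not (m[0] == "CAST" and m[2])]

print(passes_filter(full))    # True
print(passes_filter(sparse))  # False
```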

      ANOVA Choice Clarification: In Figure 4, there's a transition from one-way ANOVA in B to two-way ANOVA in C. Could you expound on the rationale for selecting these distinct methods?

      In panel B, we first focussed on kinase regulation differences between strains in the absence of a dietary perturbation. Hence, we performed one-way ANOVAs only within the CHOW-fed mice. In panel C, we then considered the effect of perturbation with the HFD. Here we performed two-way ANOVAs, allowing us to identify effects of the HFD that are uniform across strains (Diet main effect) or variable across strains (Strain-by-diet interaction).
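
      The distinction between a Diet main effect and a Strain-by-diet interaction can be illustrated with the sums-of-squares partition of a balanced two-way design. The sketch below uses made-up numbers for two hypothetical strains ("A" and "B"). It computes only sums of squares, not F statistics or p-values, and it is not the authors' analysis code.

```python
from statistics import mean


def two_way_sums_of_squares(data):
    """Partition variation in a balanced Strain x Diet design.

    data: dict mapping (strain, diet) -> list of replicate values,
    with the same number of replicates in every cell.
    """
    strains = sorted({s for s, _ in data})
    diets = sorted({d for _, d in data})
    n = len(next(iter(data.values())))  # replicates per cell
    grand = mean(v for vals in data.values() for v in vals)
    strain_mean = {s: mean(v for d in diets for v in data[(s, d)]) for s in strains}
    diet_mean = {d: mean(v for s in strains for v in data[(s, d)]) for d in diets}
    cell_mean = {k: mean(vals) for k, vals in data.items()}

    ss_strain = n * len(diets) * sum((strain_mean[s] - grand) ** 2 for s in strains)
    ss_diet = n * len(strains) * sum((diet_mean[d] - grand) ** 2 for d in diets)
    ss_interaction = n * sum(
        (cell_mean[(s, d)] - strain_mean[s] - diet_mean[d] + grand) ** 2
        for s in strains
        for d in diets
    )
    ss_residual = sum(
        (v - cell_mean[k]) ** 2 for k, vals in data.items() for v in vals
    )
    return {
        "strain": ss_strain,
        "diet": ss_diet,
        "strain_x_diet": ss_interaction,
        "residual": ss_residual,
    }


# Hypothetical data: HFD lowers the response equally in both strains.
uniform = {
    ("A", "CHOW"): [2.0, 2.2], ("A", "HFD"): [1.0, 1.2],
    ("B", "CHOW"): [3.0, 3.2], ("B", "HFD"): [2.0, 2.2],
}
# Hypothetical data: HFD lowers the response in strain A only.
strain_specific = {
    ("A", "CHOW"): [2.0, 2.2], ("A", "HFD"): [1.0, 1.2],
    ("B", "CHOW"): [3.0, 3.2], ("B", "HFD"): [3.0, 3.2],
}

ss_uniform = two_way_sums_of_squares(uniform)
ss_specific = two_way_sums_of_squares(strain_specific)
print(ss_uniform["diet"], ss_uniform["strain_x_diet"])
print(ss_specific["diet"], ss_specific["strain_x_diet"])
```

      In the first toy dataset the diet term carries all of the HFD-related variation and the interaction term is essentially zero, while in the second a share of that variation moves into the interaction term. A one-way ANOVA within CHOW-fed mice corresponds to the same kind of partition with Strain as the only factor.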

      Cell Line Selection for Functional Experiments: Could you elucidate the rationale behind opting for L6 cells of rat origin over C2C12 mouse cells for functional experiments?

      We acknowledge that C2C12 cells have the benefit of being of mouse origin, which aligns with our mouse-derived phosphoproteomics data. However, they are unsuitable for glucose uptake experiments as they lack an insulin-responsive vesicular compartment even upon GLUT4 overexpression, and undergo spontaneous contraction when differentiated resulting in confounding non-insulin dependent glucose uptake (10.1152/ajpendo.00092.2002, 10.1007/s11626-999-0030-8). In contrast, L6 cells readily express insulin-responsive GLUT4, and cannot contract (doi.org/10.1113/JP281352, 10.1007/s11626-999-0030-8). Therefore they are a superior model for studying insulin-dependent glucose transport. We have added a justification of L6 cells over C2C12 cells in the revised manuscript, on lines 352-354: "While L6 cells are of rat origin, they are preferable to the popular C2C12 mouse cell line since the latter lack an insulin-responsive vesicular compartment and undergo spontaneous contraction, resulting in confounding non-insulin dependent glucose uptake."

      It's intriguing that while a phosphosite was modulated on Pfkfb2, functional assays were conducted on a different isoform (Pfkfb3) wherein the phosphosite was not detected.

      The correlation between Pfkfb2 S469 phosphorylation and insulin-stimulated glucose uptake suggests that F2,6BP production, and subsequent glycolytic activation, positively regulate insulin responsiveness. There are several ways of testing this: 1) Knock down endogenous Pfkfb2, and re-express either wild-type protein or a S469A phosphomutant. If S469 phosphorylation positively regulates insulin responsiveness, then knockdown should decrease insulin responsiveness and re-expression of wild-type Pfkfb2, but not S469A, should restore it. 2) Induce insulin resistance (e.g. through palmitate treatment), and overexpress phosphomimetic S469D or S469E Pfkfb2 to enhance F2,6BP production. Under our hypothesis, this should reverse insulin resistance. 3) There is some evidence that dual phosphorylation of S469 and S486, another activating phosphosite on Pfkfb2, enhances F2,6BP production through 14-3-3 binding (10.1093/emboj/cdg363). Hence, we may expect that introduction of an R18 sequence into Pfkfb2, which causes constitutive 14-3-3 binding (10.1074/jbc.M603274200), would increase Pfkfb2-driven F2,6BP production, and under our hypothesis this should reverse insulin resistance. 4) The paralog Pfkfb3 lacks Akt regulatory sites and has substantially higher basal activity than Pfkfb2. Thus, overexpression of Pfkfb3 should mimic the effect of phosphorylated Pfkfb2, and hence reverse insulin resistance under our hypothesis. While approaches 1), 2), and 3) directly target Pfkfb2, they have drawbacks. For example, 1) may not work if Pfkfb2 knockdown is compensated for by other Pfkfb isoforms, 2) may not work since D/E phosphomimetics often do not recapitulate the molecular effects of S/T phosphorylation (10.1091/mbc.E12-09-0677), and 3) may not work if S469 phosphorylation does not operate through 14-3-3 binding. Hence we performed 4) as it seemed to be the most robust and cleanest experiment to test our hypothesis. 
We have revised the manuscript to further clarify the challenges of directly targeting Pfkfb2 and the benefits of targeting Pfkfb3 on lines 342-349: "Since Pfkfb2 requires phosphorylation by Akt to produce F2,6BP substantially, increasing F2,6BP production via Pfkfb2 would require enhanced activating site phosphorylation, which is difficult to achieve in a targeted fashion, or phosphomimetic mutation of activating sites to aspartate/glutamate, which often does not recapitulate the molecular effects of serine/threonine phosphorylation. By contrast, the paralog Pfkfb3 has high basal production rates and lacks an Akt motif at the corresponding phosphosites. We therefore rationalised that overexpressing Pfkfb3 would robustly increase F2,6BP production and enhance glycolysis regardless of insulin stimulation and Akt signalling."

      Insulin-Independent Action of Pfkfb3: The functionality of Pfkfb3 unfolds in an insulin-independent manner, yet it restores insulin action (Fig 6h). Could you shed light on the mechanism underpinning this phenomenon? Consider measuring F2,6BP concentrations or assessing kinase activity upon overexpression.

      Pfkfb3 overexpression increased the glycolytic capacity of L6 myotubes in the absence of insulin stimulation, as inferred by extracellular acidification rate (Fig. S7c). This is indeed consistent with Pfkfb3 enhancing glycolysis through increased F2,6BP concentration in an insulin-independent manner. To shed light on the mechanism connecting this to insulin action, we performed immunoblotting experiments to assess the kinase activity of Akt, a master regulator of the insulin response. Indeed, this experimental direction has precedent as we previously observed that Pfkfb3 overexpression enhanced insulin-stimulated Akt signalling in HEK293 cells, while small-molecule inhibition of Pfkfb kinase activity reduced Akt signalling in 3T3-L1 adipocytes (10.1074/jbc.M115.658815). However, insulin-stimulated phosphorylation of Akt S473, Akt T308, Gsk3a/b S21/S9, and PRAS40 T246 differed little across conditions, with only a weak, statistically insignificant trend towards increased pT308 Akt, pS21/S9 Gsk3a/b, and pT246 PRAS40 in palmitate-treated Pfkfb3-overexpressing cells. Hence, a more detailed phosphoproteomics study will be needed to assess whether Pfkfb3 restores insulin action by modulating insulin signalling. We have described these immunoblotting experiments in lines 361-365 and Fig. S7e-f. We also discussed potential mechanisms through which Pfkfb3-enhanced glycolysis could connect to insulin action in the discussion (lines 427-434).

      Figure 6h Statistical Analysis: For the 2DG uptake in Figure 6h, a conventional two-way ANOVA might be more appropriate than a repeated measures ANOVA.

      On reflection, we agree that a conventional ANOVA is more appropriate. Furthermore, for simplicity and conciseness we have decided to analyse and present only insulin-stimulated/unstimulated 2DG uptake fold change values in Figure 6h. We have presented all unstimulated and insulin-stimulated values in Figure S7d.

    2. eLife assessment

      This fundamental study provides a unique tool for assessing the range of phosphorylation in insulin reactions due to genetic variation and dietary influence through the utilization of genetically distinct mouse strains. The discoveries of this study hold substantial importance, as they shed light on the interplay between genetic attributes and environmental conditions in shaping the insulin-signaling network within skeletal muscle, a crucial regulator of metabolism. The supporting evidence presented is compelling, and the work is anticipated to captivate a wide audience within the metabolism discipline due to its extensive appeal and by providing inspiration for further hypothesis-driven research.

    3. Reviewer #1 (Public Review):

      The authors focused on genetic variability in relation to insulin resistance. They used genetically different lines of mice and exposed them to the same diet. They found that genetic predisposition impacts the overall outcome of metabolic disturbances.

    4. Reviewer #2 (Public Review):

      Summary: In the present study, van Gerwen et al. perform deep phosphoproteomics on muscle from saline or insulin-injected mice from 5 distinct strains fed a chow or HF/HS diet. The authors follow these data by defining a variety of intriguing genetic, dietary or gene-by-diet phosphosites which respond to insulin, accomplished through application of correlation analyses, linear mixed models and a module-based approach (WGCNA). These findings are supported by validation experiments by intersecting results with a previous profile of insulin-responsive sites (Humphrey et al, 2013) and importantly, mechanistic validation of Pfkfb3 where overexpression in L6 myotubes was sufficient to alter fatty acid-induced impairments in insulin-stimulated glucose uptake. To my knowledge, this resource provides the most comprehensive quantification of muscle phospho-proteins which occur as a result of diet in strains of mice where genetic and dietary effects can be quantifiably attributed in an accurate manner. Utilization of this resource is strongly supported by the analyses provided highlighting the complexity of insulin signaling in muscle, exemplified by contrasts to the "classically-used" C57BL6/J strain. As it stands, I view this exceptional resource as comprehensive with compelling strength of evidence behind the mechanism explored. I raised several comments in the last round of assessment but all of them have now been thoughtfully addressed.

      Strengths: Generation of a novel resource to explore genetic and dietary interactions influencing the phospho-proteome in muscle. This is accompanied by elegant application of in silico tools to highlight the utility.

      Weaknesses: none noted

    5. Reviewer #3 (Public Review):

      Summary: The authors aimed to investigate how genetic and environmental factors influence the muscle insulin signaling network and its impact on metabolism. They utilized mass spectrometry-based phosphoproteomics to quantify phosphosites in skeletal muscle of genetically distinct mouse strains in different dietary environments, with and without insulin stimulation. The results showed that genetic background and diet both affected insulin signaling, with almost half of the insulin-regulated phosphoproteome being modified by genetic background on an ordinary diet, and high-fat high-sugar feeding affecting insulin signaling in a strain-dependent manner.

      Strengths: The study uses a state-of-the-art phosphoproteomics workflow allowing quantification of a large number of phosphosites in skeletal muscle, providing a comprehensive view of the muscle insulin signaling network. The study examined five genetically distinct mouse strains in two dietary environments, allowing for the investigation of the impact of genetic and environmental factors on insulin signaling. The identification of coregulated subnetworks within the insulin signaling pathway expanded our understanding of its organization and provided insights into potential regulatory mechanisms. The study associated diverse signaling responses with insulin-stimulated glucose uptake, uncovering regulators of muscle insulin responsiveness.

      Weaknesses: The authors acknowledge limitations such as the need for larger cohorts and the inclusion of female mice. Moreover, as they also acknowledge, they are unable to dissect the extent to which obesity and differences in lifespan between strains affect insulin signaling. This suggests that further research is needed to validate and expand upon the findings.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Thank you for overseeing the assessment of our manuscript, “Comprehensive mutagenesis maps the effect of all single codon mutations in the AAV2 rep gene on AAV production”. We would also like to thank the reviewers for their feedback. We have carried out the suggested experiments that we feel are most central to our conclusions and summarized the revisions to the manuscript below.

      We appreciate the reviewers’ suggestion regarding testing different rAAV genomes. We have measured the effect of Rep variants on the production of rAAV containing three additional genomes: a 4.4 kb single-stranded genome, a 3.9 kb single-stranded genome, and a 2.1 kb self-complementary genome (Figures 5C and 5D). The DNase-resistant particle titers, reported as a percent of wild-type Rep titers, are relatively consistent across these three constructs as well as the 5.0 kb single-stranded genome previously tested.

      We agree with the reviewers that measurement of the relative transduction efficiency of rAAV produced with different Rep variants is an important experiment to conduct. To address this, we transduced HEK293T cells with rAAVs, containing a luciferase genome, which were produced using two different Rep variants. When a constant volume of purified rAAV was used for transduction, we observed that the rAAV produced with the S110R Rep variant resulted in higher transduction than rAAV produced with wild-type Rep (as measured by luciferase signal). While we tested only a small number of variants, these results indicate that at least one of the Rep variants we identified can increase not only the viral genome titer but also the titer of transducing particles.

      To generate these transduction data, we produced additional rAAV preps using S110R and Q439T Rep variants. In the previous version of this manuscript, we used the Q439T variant to produce rAAV and noted a 10% increase in the ratio of viral genomes to capsids as determined by comparison of qPCR and capsid ELISA titers. However, a similar increase was not observed in the more recent experiment discussed above. We attribute this discrepancy to changes in the plasmid quantification methods used for transfection. Previously, we quantified plasmids using a fluorometric assay (Qubit); in our more recent experiments, we used qPCR to quantify plasmids for transfection. qPCR provides a more accurate measurement of plasmid concentration due to the specific nature of the primers and probes used, which may account for the subtle shift in quantification. While outside the scope of the current work, it will also be interesting to further investigate the proportion of full capsids using additional Rep variants and more direct methods, such as cryoEM or analytical ultracentrifugation.

      We agree with the reviewers’ observation that there are differences in the production fitness values for synonymous variants. However, the variation in production fitness values between synonymous variants is smaller than that between non-synonymous variants. We conducted the following analysis to clarify this point. We calculated two mean centered fitness values for each codon variant in the WT AAV2 library. The “positional mean centered fitness value” was determined using the production fitness values of all variants at a given amino acid position and describes how far a given fitness value diverges from the mean fitness value for that position. The “synonymous codon mean centered fitness value” was determined using the production fitness values of all synonymous variants at a given position and describes how far a given fitness value diverges from the mean fitness value for all its synonymous codon variants. We then plotted both mean centered fitness values versus amino acid position (Figure S8).
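
      The two mean-centered values described above can be sketched as follows, assuming each variant is a (position, amino acid, codon, fitness) record and that centering means subtracting the relevant group mean. The function name and the fitness numbers are hypothetical, and the authors' actual calculation may differ in detail.

```python
from collections import defaultdict
from statistics import mean


def mean_centered_fitness(variants):
    """Compute positional and synonymous-codon mean-centered fitness values.

    variants: list of (position, amino_acid, codon, fitness) tuples.
    Returns {(position, codon): (positional_centered, synonymous_centered)}.
    """
    by_position = defaultdict(list)
    by_synonymous = defaultdict(list)
    for pos, aa, _codon, fit in variants:
        by_position[pos].append(fit)          # all variants at this position
        by_synonymous[(pos, aa)].append(fit)  # variants encoding the same residue

    centered = {}
    for pos, aa, codon, fit in variants:
        centered[(pos, codon)] = (
            fit - mean(by_position[pos]),
            fit - mean(by_synonymous[(pos, aa)]),
        )
    return centered


# Hypothetical fitness values for four codon variants at one position.
toy = [
    (10, "A", "GCT", 1.0),
    (10, "A", "GCC", 1.2),
    (10, "R", "CGT", -1.0),
    (10, "R", "CGC", -1.2),
]
result = mean_centered_fitness(toy)
positional_spread = max(abs(p) for p, _ in result.values())
synonymous_spread = max(abs(s) for _, s in result.values())
print(positional_spread > synonymous_spread)  # True
```

      In this toy case the synonymous-codon centered values stay close to zero while the positional values span the difference between the two amino acids, which mirrors the kind of narrower distribution the comparison is designed to detect.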

      The distribution of mean centered selection values is narrower when calculated at the synonymous codon level as opposed to the position level. This indicates that, in general, synonymous variants have more tightly distributed production fitness values than non-synonymous variants. This observation precludes us from conducting a more thorough analysis of the effects of synonymous codons on AAV production, although there is at least one instance where clear differences between synonymous codons can be observed (Figure S9C and Figure S9D). We agree with the reviewers that synonymous variants almost certainly influence aspects of AAV production, such as genome replication, transcriptional regulation, mRNA stability, and protein expression. However, our assay measures the aggregate effect of rep variants on all steps in the AAV production process and is likely unable to detect the effects of synonymous variants on specific steps in this process if those steps are not rate-limiting. We have updated the discussion section to include an explanation of the above.

      The X-axes in Figures 5B and 5D have been updated to plot s’ instead of percent WT titer. We have also added asterisks to indicate significance in Figures 5A and 5C. Thank you for these suggestions.

      We agree with Reviewer 3 that it would be interesting to sequence barcodes from the mRNA pool. The 20 bp barcodes are located upstream of the polyA site and should be present in mRNA transcripts. Something to consider is that AAV2 transcripts expressed from all three promoters (p5, p19, and p40) are polyadenylated at the same site (Stutika et al., 2016). As such, in our WT AAV2 library, barcode representation in the mRNA pool would indicate the aggregate effect of a rep variant on the levels of all AAV2 transcripts. In the pCMV-Rep78/68 library, only two AAV2 transcripts are generated - a spliced and unspliced version of the p5 product. Sequencing of barcodes present in the mRNA pool could be informative regarding the effect of rep variants on combined Rep78/68 expression levels. However, we feel that this experiment is outside the scope of the current work.

      We were also surprised at the number of novel functional Rep variants that were identified in our library. As the reviewer pointed out, optimal rAAV production likely does not equate to optimal fitness of naturally occurring AAV in the endogenous host. Naturally occurring AAV has both a latent and a lytic cycle, and the Rep proteins play a role in both of these processes (Pereira et al., 1997; Surosky et al., 1997). rAAV production, however, is primarily analogous to the lytic cycle of naturally occurring AAV. In their endogenous hosts, AAV must balance the effect of any mutations on fitness in both the lytic and latent contexts, while we assay specifically for production fitness. We additionally attribute this finding to the relatively small number of AAV serotypes for which rep sequences are available. We have added a discussion of the above to the manuscript.

      Finally, in response to feedback from other researchers, we determined which amino acid substitutions resulted in production fitness values that were significantly different from that of wild-type (Figure S4). These results further emphasized the importance of the origin-binding domain; most statistically significant beneficial substitutions clustered here. Additionally, we noted that the majority of substitutions in the zinc-finger domain resulted in production fitness changes that were not significant. This lines up with previous work indicating that the zinc-finger domain is dispensable for rAAV production. We have added a discussion of these results to the main text.

      We again thank the reviewers for their suggestions; we feel that incorporation of their suggestions has strengthened support for our conclusions and enhanced the utility of this work for others in the field.

      References

      Pereira, D. J., McCarty, D. M., & Muzyczka, N. (1997). The adeno-associated virus (AAV) Rep protein acts as both a repressor and an activator to regulate AAV transcription during a productive infection. Journal of Virology, 71(2), 1079–1088. https://doi.org/10.1128/jvi.71.2.1079-1088.1997

      Stutika, C., Gogol-Döring, A., Botschen, L., Mietzsch, M., Weger, S., Feldkamp, M., Chen, W., & Heilbronn, R. (2016). A Comprehensive RNA Sequencing Analysis of the Adeno-Associated Virus (AAV) Type 2 Transcriptome Reveals Novel AAV Transcripts, Splice Variants, and Derived Proteins. Journal of Virology, 90(3), 1278–1289. https://doi.org/10.1128/JVI.02750-15

      Surosky, R. T., Urabe, M., Godwin, S. G., McQuiston, S. A., Kurtzman, G. J., Ozawa, K., & Natsoulis, G. (1997). Adeno-associated virus Rep proteins target DNA sequences to a unique locus in the human genome. Journal of Virology, 71(10), 7951–7959. https://doi.org/10.1128/jvi.71.10.7951-7959.1997

    2. eLife assessment

      This study presents a valuable and comprehensive mutagenesis map of the AAV2 rep gene, which will undoubtedly capture the interest of scientists working with adeno-associated viruses and those engaged in the field of gene therapy. The thorough characterization of massive rep variants across multiple AAV production systems bolsters the claims made in the study, highlighting its utility in enhancing our understanding of Rep protein function and advancing gene therapy applications. The evidence presented is convincing and establishes a strong foundation that will stimulate and inform future research in the field.

    3. Reviewer #2 (Public Review):

      The authors use a high-throughput sequencing-based enrichment assay to measure how individual amino acid substitutions in the Rep proteins of AAV change the production of AAV. The key experiment involved the creation of all possible single codon mutations of the AAV2 rep gene in a barcoded format, transfection of the library into HEK293T cells for production of AAV, and sequencing to see which rep variants were enriched in the viral particles produced from the library. As the library rep variants were flanked by inverted terminal repeats for packaging into viral particles, the authors could use high-throughput sequencing of the barcodes to determine how much each rep variant supported the production of AAV. The rep gene libraries were cleverly made through a cloning process that ensured each mutant was attached to an exactly known 20nt barcode included in each mutagenic oligo (and subsequently moved to the end of the library genes by another cloning step). This allowed the authors to confidently observe nearly all rep variants in their experiments, resulting in a comprehensive map between Rep protein variants and AAV production. The overall map should act as a useful guide for AAV engineering. Not only did certain variants improve AAV production by ~2-fold and show generality across AAV capsid serotypes, but the map might also be used to predict greater effects through combinations of mutations, especially if augmented by natural evolutionary datasets and statistical learning.

      In interpreting the results of this study, the reader should bear in mind that what has been measured and validated in high throughput is the production of intact genome-containing AAVs. The authors also successfully show transduction for selected high-production variants. This is important, as the efficiency with which an AAV preparation transduces cells is the most relevant property for gene therapy.

      Overall, this is a well-executed and well-analyzed study. The results support the conclusions and claims of the work. I see this work as a useful resource for engineering recombinant AAVs to increase their production, which should have broad impact as the use of AAVs in gene therapy grows.

    4. Reviewer #3 (Public Review):

      The study by Jain et al. on recombinant adeno-associated viruses (rAAVs) represents a valuable contribution to the fields of virus genetics and gene therapy. As non-pathogenic vectors, rAAVs have become a popular choice for delivering gene therapies. The authors have previously investigated the effects of all possible single codon substitutions, deletions, and insertions in the AAV2 cap gene on AAV production. In this study, they extend their analysis to the AAV2 rep gene and rep genes in two additional capsid serotypes, establishing a genotype-phenotype landscape that enhances our understanding of Rep protein function and offers potential strategies for improving Rep function in gene therapy applications. The experimental design is rigorous, the analyses well-executed, and the interpretations of the data are convincing. While I have a few suggestions to further refine the study, I believe it is overall an excellent piece of research.

      One aspect that may warrant further consideration is the assumption, as mentioned in Figure 2's legend, that synonymous mutations are neutral and can serve as controls for normalizing the production rate. However, Figures S5-6 and Figures S11-12 suggest that synonymous mutations are not necessarily neutral, as their distribution is similar to that of nonsynonymous mutations. Thus, it may be beneficial to more thoroughly examine the potential effects of synonymous mutations on the genotype-phenotype landscape.

      Additionally, previous research by Jeff Coller and others has reported that synonymous mutations can affect mRNA levels through mRNA degradation rate. It would be interesting to determine if the 20-bp barcodes located at the 3' end are positioned within the untranslated regions and could thus be employed to quantify the mRNA levels of individual variants. This information could offer insight into another potential mechanism by which single codon mutations impact the production rate of rAAV.

      The authors discovered several novel mutations that enhance AAV production yet are absent in natural occurrences. This intriguing finding could benefit from further elaboration, particularly with regard to the distribution of these mutations within the protein structure and the nature of the amino acid transitions involved. It would also be informative if the authors could provide a brief discussion as to why these mutations have not been observed in nature. For instance, could it be that optimal viral fitness necessitates an intermediate production rate rather than an excessively rapid one? Expanding on these points may further enrich the paper and offer valuable insights for readers.

      The authors have taken commendable steps to address the concerns I raised in my previous evaluation. They have provided comprehensive clarifications, performed necessary revisions, and expanded upon certain key points in the manuscript.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      The authors' finding that PARG hydrolase removal of polyADP-ribose (PAR) protein adducts generated in response to the presence of unligated Okazaki fragments is important for S-phase progression is potentially valuable, but the evidence is incomplete, and identification of relevant PARylated PARG substrates in S-phase is needed to understand the role of PARylation and dePARylation in S-phase progression. Their observation that human ovarian cancer cells with low levels of PARG are more sensitive to a PARG inhibitor, presumably due to the accumulation of high levels of protein PARylation, suggests that low PARG protein levels could serve as a criterion to select ovarian cancer patients for treatment with a PARG inhibitor drug.

      Thank you for the assessment and summary. Please see below for details as we have now addressed the deficiencies pointed out by the reviewers.

      We believe that PARP1 is one of the major relevant PARG substrates in S phase cells. Previous studies reported that PARP1 recognizes unligated Okazaki fragments and induces S phase PARylation, which recruits single-strand break repair proteins such as XRCC1 and LIG3 that act as a backup pathway for Okazaki fragment maturation (Hanzlikova et al., 2018; Kumamoto et al., 2021). In this study, we revealed that accumulation of PARP1/2-dependent S phase PARylation eventually led to cell death (Fig. 2). Furthermore, we found that chromatin-bound PARP1 as well as PARylated PARP1 increased in PARG KO cells (Fig. S4A and Fig. 4A), suggesting that PARP1 is one of the key substrates of PARG in S phase cells. Of course, PARG may have additional substrates besides PARP1 which are required for its roles in S phase progression, as PARG is known to be recruited to DNA damage sites through pADPr- and PCNA-dependent mechanisms (Mortusewicz et al., 2011). Precisely how PARG regulates S phase progression warrants further investigation.

      Public Reviews:

      Reviewer #1 (Public Review):

      I have a major conceptual problem with this manuscript: How can the full deletion of a gene (PARG) sensitize a cell to further inhibition by its chemical inhibitor (PARGi) since the target protein is fully absent?

      Please see below for details about this point. Briefly, we found that PARG is an essential gene (Fig. 7). There was residual PARG activity in our PARG KO cells, although the loss of full-length PARG was confirmed by Western blotting and DNA sequencing (Fig. S9). The residual PARG activity in these cells can be further inhibited by PARG inhibitor, which eventually leads to cell death.

      The authors state in the discussion section: "The residual PARG dePARylation activity observed in PARG KO cells likely supports cell growth, which can be further inhibited by PARGi". What does this statement mean? Is the authors' conclusion that their PARG KOs are not true KOs but partial hypomorphic knockdowns? Were the authors working with KO clones or CRISPR deletion in populations of cells?

      The reviewer is correct that our PARG KOs are not true KOs. We were working with CRISPR edited KO clones. As shown in this manuscript, we validated our KO clones by Western blotting, DNA sequencing and MMS-induced PARylation. Despite these efforts and our inability to detect full-length PARG in our KO clones, we suspect that our PARG KO cells may still express one or more active fragments of PARG due to alternative splicing and/or alternative ATG usage.

      As shown in Fig. 7, we believe that PARG is essential for proliferation. Our initial KO cell lines are not complete PARG KO cells and residual PARG activity in these cells could support cell proliferation. Unfortunately, due to lack of appropriate reagents we could not draw solid conclusions regarding the isoforms or the truncated PARG expressed in these cells (Please see Western blots below).

      Are there splice variants of PARG that were not knocked down? Are there PARP paralogues that can complement the biochemical activity of PARG in the PARG KOs? The authors do not discuss these critical issues nor engage with this problem.

      There are five reviewed or potential PARG isoforms identified in the Uniprot database. The two sgRNAs (#1 and #2) used to generate initial PARG KO cells in this manuscript target all three catalytically active isoforms (isoforms 1, 2 and 3), and sgRNA#2 used in HeLa cells also targets isoforms 4 and 5, but these isoforms are considered catalytically inactive according to the Uniprot database. However, it is likely that sgRNA-mediated genome editing may lead to the creation of new alternatively spliced PARG mRNAs or the use of alternative ATG, which can produce catalytically active forms of PARG. Instead of searching for these putative spliced PARG RNAs, we used two independent antibodies that recognize the C-terminus of PARG for WB as shown below. Unfortunately, besides full-length PARG, these antibodies also recognized several other bands; some of them were reduced or absent in PARG KO cells, while others were not. Thus, we could not draw a clear conclusion as to which functional isoform was expressed in our PARG KO cells. Nevertheless, we directly measured PARG activity in PARG KO cells (Fig. S9) and showed that we were still able to detect residual PARG activity in these PARG KO cells. These data clearly indicate that residual PARG activity is present in our KO cells, but the precise nature of these truncated forms of PARG remains elusive.

      Author response image 1.

      These issues have to be dealt with upfront in the manuscript for the reader to make sense of their work.

      We thank this reviewer for his/her constructive comments and suggestions. We will include the data above and additional discussion upfront in our revised manuscript to avoid any further confusion by our readers.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Nie et al. investigate the effect of PARG KO and PARG inhibition (PARGi) on pADPR, DNA damage, cell viability, and synthetic lethal interactions in HEK293A and HeLa cells. Surprisingly, the authors report that PARG KO cells are sensitive to PARGi and show higher pADPR levels than PARG KO cells, which are abrogated upon deletion or inhibition of PARP1/PARP2. The authors explain the sensitivity of PARG KO to PARGi through incomplete PARG depletion and demonstrate complete loss of PARG activity when incomplete PARG KO cells are transfected with additional gRNAs in the presence of PARPi. Furthermore, the authors show that the sensitivity of PARG KO cells to PARGi is not caused by NAD depletion but by S-phase accumulation of pADPR on chromatin coming from unligated Okazaki fragments, which are recognized and bound by PARP1. Consistently, PARG KO or PARG inhibition shows synthetic lethality with Pol beta, which is required for Okazaki fragment maturation. PARG expression levels in ovarian cancer cell lines correlate negatively with their sensitivity to PARGi.

      Thank you for your nice comments. The complete loss of PARG activity was observed in PARG complete/conditional KO (cKO) cells. These cKO clones were generated using wild-type cells transfected with sgRNAs targeting the catalytic domain of PARG in the presence of PARP inhibitor.

      Strengths:

      The authors show that PARG is essential for removing ADP-ribosylation in S-phase.

      Thanks!

      Weaknesses:

      1. This begs the question as to the relevant substrates of PARG in S-phase, which could be addressed, for example, by analysing PARylated proteins associated with replication forks in PARG-depleted cells (EdU pulldown and Af1521 enrichment followed by mass spectrometry).

      We believe that PARP1 is one of the major relevant PARG substrates in S phase cells. Previous studies reported that PARP1 recognizes unligated Okazaki fragments and induces S phase PARylation, which recruits single-strand break repair proteins such as XRCC1 and LIG3 that act as a backup pathway for Okazaki fragment maturation (Hanzlikova et al., 2018; Kumamoto et al., 2021). In this study, we revealed that accumulation of PARP1/2-dependent S phase PARylation eventually led to cell death (Fig. 2). Furthermore, we found that chromatin-bound PARP1 as well as PARylated PARP1 increased in PARG KO cells (Fig. S4A and Fig. 4A), suggesting that PARP1 is one of the key substrates of PARG in S phase cells. Of course, PARG may have additional substrates besides PARP1 which are required for its roles in S phase progression, as PARG is known to be recruited to DNA damage sites through pADPr- and PCNA-dependent mechanisms (Mortusewicz et al., 2011). Precisely how PARG regulates S phase progression warrants further investigation.

      2. The results showing the generation of a full PARG KO should be moved to the beginning of the Results section, right after the first Results chapter (PARG depletion leads to drastic sensitivity to PARGi), otherwise, the reader is left to wonder how PARG KO cells can be sensitive to PARGi when there should be presumably no PARG present.

      Thank you for your suggestion! However, we would like to keep the complete PARG KO result at the end of the Results section, since this was how this project evolved. Initially, we did not know that PARG is an essential gene. Thus, we speculated that PARGi may target not only PARG but also a second target, which only becomes essential in the absence of PARG. To test this possibility, we performed FACS-based and cell survival-based whole-genome CRISPR screens (Fig. 5). However, this putative second target was not revealed by our CRISPR screening data (Fig. 5). We then tested the possibility that these cells may have residual PARG expression or activity and only cells with very low PARG expression are sensitive to PARGi, which turned out to be the case for ovarian cancer cells. Equipped with PARP inhibitor and sgRNAs targeting the catalytic domain of PARG, we finally generated cells with complete loss of PARG activity to prove that PARG is an essential gene (Fig. 7). This series of experiments underscores the challenge of validating any KO cell lines, i.e. the identification of frame-shift mutations, absence of full-length proteins, and phenotypic changes may still not be sufficient to validate KO clones. This is an important lesson we learned and we would like to share it with the scientific community.

      To avoid further misunderstanding, we will include additional statements/comments at the end of the “PARG depletion leads to drastic sensitivity to PARGi” section and at the beginning of “CRISPR screens reveal genes responsible for regulating pADPr signaling and/or cell lethality in WT and PARG KO cells”. We hope that our revised manuscript will make this clear.

      3. Please indicate in the first figure which isoforms were targeted with gRNAs, given that there are 5 PARG isoforms. You should also highlight that the PARG antibody only recognizes the largest isoform, which is clearly absent in your PARG KO, but other isoforms may still be produced, depending on where the cleavage sites were located.

      The two sgRNAs (#1 and #2) used to generate initial PARG KO cells in this manuscript target all three catalytically active isoforms (isoforms 1, 2 and 3), and sgRNA#2 used in HeLa cells also targets isoforms 4 and 5, but these isoforms are considered catalytically inactive according to the Uniprot database. As suggested, we will modify Fig. S1D and the figure legends.

      Although the manufacturer's instructions state that the Anti-PARG antibody (66564S) can only recognize isoform 1, this antibody could also recognize isoforms 2 and 3, albeit weakly, based on Western blot results with lysates prepared from PARG cKO cells reconstituted with different PARG isoforms, as shown below. As suggested, we will add a statement in the revised manuscript and provide the Western blotting data below.

      Author response image 2.

      To test whether other isoforms were expressed in 293A and/or HeLa cells, we used two independent antibodies that recognize the C-terminus of PARG for WB as shown below. Unfortunately, besides full-length PARG, these antibodies also recognized several other bands; some of them were reduced or absent in PARG KO cells, while others were not. Thus, we could not draw a clear conclusion as to which functional isoforms or truncated forms were expressed in our PARG KO cells.

      Author response image 3.

      4. FACS data need to be quantified. Scatter plots can be moved to Supplementary while quantification histograms with statistical analysis should be placed in the main figures.

      We agree with this reviewer that quantification of FACS data may provide straightforward results for some of our data. However, it is challenging to quantify positive S phase pADPr signaling in some panels, for example in Fig. 3A and Fig. 4C. In both panels, pADPr signaling was detected throughout the cell cycle and therefore it is difficult to know the percentage of S phase pADPr signaling in these samples. Thus, we decided to keep the scatter plots to demonstrate the dramatic and S phase-specific pADPr signaling in PARG KO cells treated with PARGi. We hope that these data are clear and convincing even without any quantification.

      5. All colony formation assays should be quantified and sensitivity plots should be shown next to example plates.

      As suggested, we will include the sensitivity plot next to Fig. 3D. However, other colony formation assays in this study were performed with a single concentration of inhibitor and therefore we will not provide sensitivity plots for these experiments. Nevertheless, the results of these experiments are straightforward and easy to interpret.

      6. Please indicate how many times each experiment was performed independently and include statistical analysis.

      As suggested, we will add this information in the revised manuscript.

      Reviewer #3 (Public Review):

      Here the authors carried out a CRISPR/sgRNA screen with a DDR gene-targeted mini-library in HEK293A cells looking for genes whose loss increased sensitivity to treatment with the PARG inhibitor, PDD00017273 (PARGi). Surprisingly they found that PARG itself, which encodes the cellular poly(ADP-ribose) glycohydrolase (dePARylation) enzyme, was a major hit. Targeted PARG KO in 293A and HeLa cells also caused high sensitivity to PARGi. When PARG KO cells were reconstituted with catalytically-dead PARG, MMS treatment caused an increase in PARylation, not observed when cells were reconstituted with WT PARG or when the PARG KO was combined with PARP1/2 DKO, suggesting that loss of PARG leads to a strong PARP1/2-dependent increase in protein PARylation. The decrease in intracellular NAD+, the substrate for PARP-driven PARylation, observed in PARG KO cells was reversed by treatment with NMN or NAM, and this treatment partially rescued the PARG KO cell lethality. However, since NAD+ depletion with the FK866 nicotinamide phosphoribosyltransferase (NAMPT) inhibitor did not induce a similar lethality, the authors concluded that NAD+ depletion/reduction was only partially responsible for the PARGi toxicity. Interestingly, PARylation was also observed in untreated PARG KO cells, specifically in S phase, without a significant rise in γH2AX signals. Using cells synchronized at G1/S by double thymidine blockade and release, they showed that entry into S phase was necessary for PARGi to induce PARylation in PARG KO cells. They found an increased association of PARP1 with a chromatin fraction in PARG KO cells independent of PARGi treatment, and suggested that PARP1 trapping on chromatin might account in part for the increased PARGi sensitivity. They also showed that prolonged PARGi treatment of PARG KO cells caused S phase accumulation of pADPr eventually leading to DNA damage, as evidenced by increased anti-γH2AX antibody signals and alkaline comet assays. Based on the use of emetine, they deduced that this response could be caused by unligated Okazaki fragments. Next, they carried out FACS-based CRISPR screens to identify genes that might be involved in cell lethality in WT and PARG KO cells, finding that loss of base excision repair (BER) and DNA repair genes led to increased PARylation and PARGi sensitivity, whereas loss of PARP1 had the opposite effects. They also found that BER pathway disruption exhibited synthetic lethality with PARGi treatment in both PARG KO cells and WT cells, and that loss of genes involved in Okazaki fragment ligation induced S phase pADPr signaling. In a panel of human ovarian cancer cell lines, PARGi sensitivity was found to correlate with low levels of PARG mRNA, and they showed that the PARGi sensitivity of cells could be reduced by PARPi treatment. Finally, they addressed the conundrum of why PARG KO cells should be sensitive to a specific PARG inhibitor if there is no PARG to inhibit and found that the PARG KO cells had significant residual PARG activity when measured in a lysate activity assay, which could be inhibited by PARGi, although the inhibited PARG activity levels remained higher than those of PARG cKO cells (see below). This led them to generate new, more complete PARG KO cells they called complete/conditional KO (cKO), whose survival required the inclusion of the olaparib PARPi in the growth medium. These PARG cKO cells exhibited extremely low levels of PARG activity in vitro, consistent with a true PARG KO phenotype.

      We thank this reviewer for his/her constructive comments and suggestions.

      The finding that human ovarian cancer cells with low levels of PARG are more sensitive to inhibition with a small molecule PARG inhibitor, presumably due to the accumulation of high levels of protein PARylation (pADPr) that are toxic to cells is quite interesting, and this could be useful in the future as a diagnostic marker for preselection of ovarian cancer patients for treatment with a PARG inhibitor drug. The finding that loss of base excision repair (BER) and DNA repair genes led to increased PARylation and PARGi sensitivity is in keeping with the conclusion that PARG activity is essential for cell fitness, because it prevents excessive protein PARylation. The observation that increased PARylation can be detected in an unperturbed S phase in PARG KO cells is also of interest. However, the functional importance of protein PARylation at the replication fork in the normal cell cycle was not fully investigated, and none of the key PARylation targets for PARG required for S phase progression were identified. Overall, there are some interesting findings in the paper, but their impact is significantly lessened by the confusing way in which the paper has been organized and written, and this needs to be rectified.

      We believe that PARP1 is one of the major relevant PARG substrates in S phase cells. Previous studies reported that PARP1 recognizes unligated Okazaki fragments and induces S phase PARylation, which recruits single-strand break repair proteins such as XRCC1 and LIG3 that act as a backup pathway for Okazaki fragment maturation (Hanzlikova et al., 2018; Kumamoto et al., 2021). In this study, we revealed that accumulation of PARP1/2-dependent S phase PARylation eventually led to cell death (Fig. 2). Furthermore, we found that chromatin-bound PARP1 as well as PARylated PARP1 increased in PARG KO cells (Fig. S4A and Fig. 4A), suggesting that PARP1 is one of the key substrates of PARG in S phase cells. Of course, PARG may have additional substrates besides PARP1 which are required for its roles in S phase progression, as PARG is known to be recruited to DNA damage sites through pADPr- and PCNA-dependent mechanisms (Mortusewicz et al., 2011). Precisely how PARG regulates S phase progression warrants further investigation.

      As suggested, we will revise our manuscript accordingly and provide additional explanation/statement upfront to avoid any misunderstandings.  

      Reviewer #1 (Recommendations For The Authors):

      1. Figure 1c. Why does the viability of PARG KO cells improve at higher doses of PARGi? How do the authors explain this paradox?

      This phenomenon was observed in 293A PARG KO cells in the CellTiter-Glo assay, especially with the top three PARGi concentrations (100 µM, 33.33 µM and 11.11 µM). This may be due to the low solubility of this PARGi in the medium, since we sometimes observed precipitation at high concentrations when PARGi stock was diluted in medium.

      2. Figure 2d. The authors show that PARGi reduced NAD+ level by 20%. This reduction in NAD+ probably does not explain the cell death phenotype observed by parthanatos cell death. What pathway is activated by PARGi to induce cell death?

      Since PARGi treatment of PARG KO cells led to uncontrolled pADPr accumulation, it is possible that some of these cells may die due to parthanatos. However, we did not observe a dramatic reduction in NAD+ level. A previous study showed that Parg(-/-) mouse ES cells predominantly underwent caspase-dependent apoptosis (Shirai et al., 2013). Indeed, PARP1 cleavage was detected in PARG KO cells with prolonged PARGi treatment, indicating that at least some of these cells die due to apoptosis (Fig. 2A). Cytotoxicity of PARGi in PARG KO cells may be due to several mechanisms including apoptosis, parthanatos and NAD+ reduction.

      3. The authors refer to FK866 in the text without explaining what this agent is. FK866 is a noncompetitive inhibitor of nicotinamide phosphoribosyltransferase (NAMPT), a key enzyme in the regulation of NAD+ biosynthesis from the natural precursor nicotinamide. The authors should explain experimental tools in the text as they use them for clarity to the reader.

      Thanks for the suggestion! We will include additional citations and discuss how FK866 works in our revised manuscript.

      4. In addition to these issues, there are significant formatting and textual problems, such that there are multiple gaps in the body of the text that make coherent reading of the manuscript impossible. Examples are: Page 3 line 10. Page 6 line 5 and line 15, Page 7 line 2, 3, and line 8. Page 8, line 1, and line 3 from bottom. Page 9 line 1, line 7 from bottom and line 9 from the bottom, Page 18 of the results in several places, etc. etc. etc. These formatting errors convey the impression that the submitting authors did not adequately review the manuscript for technical problems prior to submission. The authors need to correct these errors.

      Sorry, we will edit the text and remove these gaps as suggested.

      Reviewer #3 (Recommendations For The Authors):

      1. The major problem with this paper is conceptual - namely, how could PARG knockout cells be hypersensitive to a selective PARG small-molecule inhibitor? The evidence in Figure 7 that there is measurable residual PARG activity in the so-called PARG KO 293A and HeLa cells provides a partial explanation for why PARG inhibitor treatment might be deleterious to the PARG KO cells, i.e., because PARGi blocks this residual PARG activity. However, although the authors characterized the PARG alleles in the 293A PARG KO cells by sequencing, the molecular origin of the significant level of residual PARG activity remains unclear (see points 7-9).

      Yes, in our study we showed that PARGi treatment inhibited the residual PARG activity in PARG KO cells, which mimics complete loss of PARG as PARG is an essential gene. These data agree with a previous study using Parg(-/-) mouse cells (Koh et al., 2004). We attempted to define the molecular origin of the residual PARG activity; unfortunately, this was challenging (please see below for additional discussions). Nevertheless, we showed that residual PARG activity could be detected in PARG KO cells and, more importantly, that cells with reduced PARG expression or activity are sensitive to PARGi. These results indicate that PARG expression and/or activity may be used as a biomarker for PARGi-based therapy.

      1. Although the most obvious explanation for the PARGi sensitivity data presented in Figures 1-4 is that the PARG KO cells have residual PARG activity, the authors wait until the discussion on page 26 to raise the possibility that the PARG KO cells might have residual PARG activity that renders them sensitive to PARGi. It would be more logical to move the PARG activity data in Figure 7 earlier in the paper as a supplementary figure, so that the reader is not left wondering how a PARG KO cell remains sensitive to a PARG inhibitor. For this reason, it is recommended that the whole paper be reorganized and rewritten to provide a more logical flow that allows the reader to understand what was done, and why it is hard to generate complete PARG KO cells because the accumulation of pADPR adducts is toxic to the cell.

      Thank you for your suggestion! However, we would like to keep the complete PARG KO result at the end of the Results section, since this was how this project evolved. Initially, we did not know that PARG is an essential gene. Thus, we speculated that PARGi may target not only PARG but also a second target, which only becomes essential in the absence of PARG. To test this possibility, we performed FACS-based and cell survival-based whole-genome CRISPR screens (Fig. 5). However, this putative second target was not revealed by our CRISPR screening data (Fig. 5). We then tested the possibility that these cells may have residual PARG expression or activity and only cells with very low PARG expression are sensitive to PARGi, which turned out to be the case for ovarian cancer cells. Equipped with PARP inhibitor and sgRNAs targeting the catalytic domain of PARG, we finally generated cells with complete loss of PARG activity to prove that PARG is an essential gene (Fig. 7). This series of experiments underscores the challenge of validating any KO cell lines, i.e. the identification of frame-shift mutations, absence of full-length proteins, and phenotypic changes may still not be sufficient to validate KO clones. This is an important lesson we learned and we would like to share it with the scientific community.

      To avoid further misunderstanding, we will include additional statements/comments at the end of the “PARG depletion leads to drastic sensitivity to PARGi” section and at the beginning of the “CRISPR screens reveal genes responsible for regulating pADPr signaling and/or cell lethality in WT and PARG KO cells” section. We hope that our revised manuscript will make this clear.

      1. Exactly how PARG activity would be coordinated with PARP1/2 activity during normal S phase to ensure that PARylation can serve its required function, whatever that may be, and is then removed by PARG is unclear - how would this be orchestrated at the level of a replication fork?

      PARG is known to be recruited to sites of DNA damage through pADPr- and PCNA-dependent mechanisms (Mortusewicz et al., 2011). Our current hypothesis is that PARP1 is one of the major PARG substrates in S phase cells. Previous studies reported that PARP1 recognizes unligated Okazaki fragments and induces S phase PARylation, which recruits single-strand break repair proteins such as XRCC1 and LIG3 to act as a backup pathway for Okazaki fragment maturation (Hanzlikova et al., 2018; Kumamoto et al., 2021). In this study, we revealed that accumulation of PARP1/2-dependent S phase PARylation eventually led to cell death (Fig. 2). Furthermore, we found that chromatin-bound PARP1 as well as PARylated PARP1 increased in PARG KO cells (Fig. S4A and Fig. 4A), suggesting that PARP1 is one of the key substrates of PARG in S phase cells. Of course, PARG may have additional substrates besides PARP1 which are required for its roles in S phase progression. Precisely how PARG regulates S phase progression warrants further investigation.

      1. Figure 2B: What gRNAs were used to generate the 293A and HeLa PARG knockout clones, i.e., where are they located in the PARG gene? If they are not in the catalytic domain it might be possible to generate PARG proteins with N-terminal deletions that are still active (see points 8-10 below).

      The two sgRNAs (#1 and #2) used to generate initial PARG KO cells in this manuscript target all three catalytically active isoforms (isoforms 1, 2 and 3), and sgRNA#2 used in HeLa cells also targets isoforms 4 and 5, but these isoforms are considered catalytically inactive according to the UniProt database. As suggested, we will modify Fig. S1D and the figure legends to show the localization of gRNAs.

      We agree with this reviewer that truncated but active forms of PARG exist in these KO cells. We attempted to identify these truncated forms of PARG by using two independent antibodies that recognize the C-terminus of PARG for WB as shown below. Unfortunately, besides full-length PARG, these antibodies also recognized several other bands, some of which were reduced or absent in PARG KO cells, while others were not. Thus, we could not draw a clear conclusion as to which functional isoform/truncated form was expressed in our PARG KO cells. Nevertheless, we directly measured PARG activity in PARG KO cells (Fig. S9) and showed that we were still able to detect residual PARG activity in these PARG KO cells. Based on these results, we stated that the residual PARG activity was detected in our KO cells, but we were not able to specify the truncated variants of PARG in these cells.

      Author response image 4.

      1. Figure 3B/page 19: The authors state that "emetine, which diminishes Okazaki fragments, greatly inhibited S phase pADPr signaling in PARG KO cells", and from this deduced that Okazaki fragments on the lagging strand activate PARylation. However, emetine is not a specific lagging strand synthesis inhibitor, as implied here, but rather a protein synthesis inhibitor, which inhibits Okazaki fragment formation indirectly (see PMID: 36260751). The authors need to rewrite this section to explain how emetine works in this context.

      As suggested, we will cite this reference and discuss how emetine inhibits Okazaki fragment maturation in our revised manuscript. Additionally, we used three different POLA1 inhibitors to diminish Okazaki fragments. As shown in Fig. S3B, all three POLA1 inhibitors significantly abolished S-phase pADPr induced by PARGi in PARG KO cells. Furthermore, POLA1 inhibitors, adarotene and CD437, were able to rescue cell lethality caused by PARGi in PARG KO cells (Fig. 3E).

      1. Figure 7: It is not clear why these cells are called PARG complete/conditional KO cells (cKO). Generally, "conditional knockout" refers to a cell or animal in which a gene can be conditionally knocked out by inducible expression of Cre. Here, it appears that "conditional" refers to the fact that the PARG KO cells only grow in the presence of olaparib - is this the case?

      Yes, we used the name to separate these cells from our initial PARG KO cells. Moreover, we were only able to obtain and maintain these PARG cKO clones with complete loss of PARG activity in the presence of PARP inhibitor. Therefore, we called them PARG complete/conditional KO (cKO) cells.

      1. Figure 7B and D: The level of full-length PARG protein was much lower in the 293A and HeLa cKO cells compared to WT cells consistent with cKO cells representing a more complete PARG KO. The level of PARG protein in the 293A PARG cKO cells was apparently also lower than in the original PARG KO cells, but the KO and cKO samples should be run side by side to demonstrate this conclusively, and the bands need to be quantified. In panel B, it is not clear from the legend what cKO_3 and cKO_4 are, but presumably, they are different clones, and this should be stated.

      Full-length PARG was not detected in either PARG KO or PARG cKO cells by WB. The apparent lower level of endogenous PARG in Fig. 7D was due to the fact that reconstituted cells had high exogenous PARG expression and therefore we had to reduce exposure time for WB.

      As for cKO_3 and cKO_4 in Fig. 7, they are different clones created by different sgRNAs. As suggested, we will include additional information in figure legends to clearly state which sgRNA was used to generate the respective KO and cKO clones.

      1. Figure S8: There is not enough information here or in the text to allow the reader to interpret these PARG allele sequences obtained from the PARG KO cells. From the Methods section, it appears that the PARG KO cells were clonal, with sequence data from one clone of each of the 293A and HeLa cell PARG KO cells being shown. If this is right, then in both cell types one out of four PARG alleles is wild type, and therefore one would expect the PARG protein signal to be ~25% of that in WT cells. However, based on the 293A PARG KO cells PARG immunoblot in Figure 2B the PARG protein signal is clearly much lower than 25% (these bands need to be quantified), and this discrepancy needs to be explained. What is the level of PARG protein in the PARG KO HeLa cells? If different PARG KO cell clones are analyzed by sequencing, do they all have an apparently intact PARG allele? Four different gRNA target sites in the PARG gene are shown in panel A in Figure 7, but the description in the text regarding how the four gRNAs were used is totally inadequate - were all four used simultaneously or only the two in the catalytic domain? Were pairs of gRNAs used in an attempt to generate a large intervening deletion - some Southern blots of the PARG gene region in the PARG cKO cells are needed to figure this out. The gRNAs are given numbers in Figure 7A, but it is unclear from the sequences shown in Figures S8 and S9 which gRNA sites are shown. All of this has to be clarified, so that the reader can understand the nature of the KO/cKO cells knockout alleles, and what PARG-related products, if any, they can express.

      Yes, all KO and cKO cells used in this study are single clones. As suggested, we will revise the figure legends of Fig. 7, S8 and S9 to include detailed information. To avoid any further misunderstanding, we will relabel the allele “WT” as “WT (reference)” in Fig. S8 and S9. We did not detect intact/wild-type PARG sequence in any single KO/cKO clone by DNA sequencing. Sequencing of single KO/cKO clones was performed by using the TOPO TA Cloning kit. Briefly, genomic DNA was extracted from each single KO/cKO clone. Approximately 300 bp surrounding the sgRNA targeting sequence was amplified by PCR. The PCR product was cloned into the vector, and approximately 10-15 bacterial clones were extracted and sent for sequencing. If any intact/wild-type PARG sequence was detected in these 10-15 bacterial clones, this KO/cKO clone was considered a heterozygous clone and discarded.

      HEK293A and HeLa cells are not diploid cells and have complex karyotypes. The PARG gene is located on chromosome 10. Karyotyping by M-FISH shows that HeLa cells have 3 copies of chromosome 10 (Landry et al., 2013). HEK293 cells predominantly have 3 copies of chromosome 10 and sometimes 4 copies can be detected by G-banding (Binz et al., 2019). Therefore, it is anticipated that 1 to 4 mutant alleles would be detected in each KO/cKO clone by sequencing.
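
      As a rough illustration of the screening rule described above (discard a clone if any sequenced bacterial colony carries the intact sequence), the following minimal sketch may help. It is not the authors' analysis code. The function names, the exact-match comparison, and the assumption that each colony samples one allele independently and with equal probability are all hypothetical simplifications introduced here.

```python
# Hypothetical sketch of the clone-screening rule; not the authors' pipeline.

def classify_clone(colony_reads, reference):
    """Return 'discard' if any colony read equals the reference amplicon.

    Assumption: an intact allele yields a read identical to the reference
    sequence (case-insensitive); any difference counts as a mutant allele.
    """
    normalized_ref = reference.upper()
    for read in colony_reads:
        if read.upper() == normalized_ref:
            return "discard"  # at least one intact allele was observed
    return "keep"


def miss_probability(intact_fraction, n_colonies):
    """Chance that n_colonies independent colonies all miss an intact allele.

    Assumption: each colony carries the intact allele with probability
    intact_fraction, independently of the others.
    """
    return (1.0 - intact_fraction) ** n_colonies


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    reference = "ACGTACGT"
    print(classify_clone(["ACGTAGT", "ACGTTACGT"], reference))
    print(classify_clone(["ACGTACGT", "ACGTAGT"], reference))
    # One intact copy out of three, ten colonies sequenced.
    print(miss_probability(1 / 3, 10))
```

      Under these simplifying assumptions, a clone with one intact copy among three would escape detection in ten colonies with probability (2/3)^10, which is below 2%.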

      Only one sgRNA was transfected into cells for the selection of single clones. We did not use paired or multiple sgRNAs in any of these experiments. As shown in Fig. S1D and Fig. 7A, HEK293A derived and HeLa derived PARG KO single clones were generated with the use of different sgRNAs. In addition, the two PARG cKO single clones from HEK293A and HeLa cells were also generated by the use of two different sgRNAs, as shown in Fig. 7A-B. We will include all the information above in the revised manuscript, i.e. in Methods section as well as in figure legends.

      1. Figure S9A: The sequences of the 293A PARG alleles in the cKO cells suggest that these cells also have one intact PARG allele, which again does not fit with the very low level of intact PARG protein shown in Figure 7B. How do the authors explain this?

      Sorry, this is a misunderstanding. The allele “WT” in Fig. S8 and S9 is the reference sequence. We will change it to “Reference sequence” to avoid further confusion. As mentioned above, we did not detect any intact/wild-type PARG sequence in any of our single KO/cKO clones by sequencing.

      1. Figure S9B: These critical lysate activity data show that the PARG KO cells have ~50% of the PARG activity detected in WT cells. However, this is not consistent with the PARG protein level detected in PARG immunoblot in Figure 1B, which appears to be less than 5% of the PARG protein level in WT cells (with one intact PARG allele in these cells one would theoretically expect ~25%, although this depends on whether all four alleles are expressed equally). One possibility is that active PARG fragments are generated from one or more of the PARG KO alleles in the PARG KO cells. Targeted sequencing of PARG mRNAs might reveal whether there are shorter RNAs that could encode a protein containing the C-terminal catalytic domain (aa 570-910). In addition, the authors need to show the entire immunoblot to determine if there are smaller proteins recognized by the anti-PARG antibodies that might represent shorter PARG gene products (for this we need to know where the epitope against which the PARG antibodies are directed is located within the PARG protein - ideally the authors need to use an antibody directed against an epitope near the C-terminus).

      As stated in the Methods section, we incubated cell lysates with substrates overnight to evaluate the maximum level of pADPr hydrolysis, i.e. PARG activity, we were able to detect in this assay. It is very likely that the PARG activity in PARG KO cells was much lower than 50%, due to saturation of signals for lysates isolated from wild-type cells. Thus, the data presented in our manuscript probably underestimate the reduction of PARG activity in PARG KO cells. Nevertheless, these data indicate that residual PARG activity was detected in PARG KO cells; however, this activity was absent in PARG cKO cells.

      As mentioned above, we used two independent antibodies that recognize the C-terminus of PARG for WB. Unfortunately, we could not draw a clear conclusion as to which functional isoforms or truncated proteins were expressed in our PARG KO cells. The dePARylation assay used here may be the best way to test the residual PARG activity in our KO and cKO cells.

      1. Figure 7D: In this experiment, the level of re-expressed WT PARG protein was much higher than that of the endogenous PARG protein (quantification is needed) - how might this affect the interpretation of these experiments (N.B., WT and catalytically-dead PARG were also re-expressed for the experiments shown in Figure 1, but there are no PARG immunoblots to demonstrate how much the exogenous proteins were overexpressed, or activity measurements). If regulated pADPr signaling is important for a normal S phase, then one would have thought that expressing a very high level of active PARG would create problems.

      In Fig. S1E, we blotted endogenous PARG level in control cells and exogenous PARG level in reconstituted cells. The reviewer is correct that exogenous PARG expression was much higher (~10-fold) than that of endogenous PARG in WT control cells. Nevertheless, we did not observe any obvious phenotypes in PARG KO/cKO cells reconstituted with a high level of exogenous PARG, which may reflect excess PARG level/activity in wild-type control cells.

      References:

      Binz, R. L., Tian, E., Sadhukhan, R., Zhou, D., Hauer-Jensen, M., and Pathak, R. (2019). Identification of novel breakpoints for locus- and region-specific translocations in 293 cells by molecular cytogenetics before and after irradiation. Sci Rep 9, 10554.

      Hanzlikova, H., Kalasova, I., Demin, A. A., Pennicott, L. E., Cihlarova, Z., and Caldecott, K. W. (2018). The Importance of Poly(ADP-Ribose) Polymerase as a Sensor of Unligated Okazaki Fragments during DNA Replication. Mol Cell 71, 319-331 e313.

      Koh, D. W., Lawler, A. M., Poitras, M. F., Sasaki, M., Wattler, S., Nehls, M. C., Stoger, T., Poirier, G. G., Dawson, V. L., and Dawson, T. M. (2004). Failure to degrade poly(ADP-ribose) causes increased sensitivity to cytotoxicity and early embryonic lethality. Proc Natl Acad Sci U S A 101, 17699-17704.

      Kumamoto, S., Nishiyama, A., Chiba, Y., Miyashita, R., Konishi, C., Azuma, Y., and Nakanishi, M. (2021). HPF1-dependent PARP activation promotes LIG3-XRCC1-mediated backup pathway of Okazaki fragment ligation. Nucleic Acids Res 49, 5003-5016.

      Landry, J. J., Pyl, P. T., Rausch, T., Zichner, T., Tekkedil, M. M., Stutz, A. M., Jauch, A., Aiyar, R. S., Pau, G., Delhomme, N., et al. (2013). The genomic and transcriptomic landscape of a HeLa cell line. G3 (Bethesda) 3, 1213-1224.

      Mortusewicz, O., Fouquerel, E., Ame, J. C., Leonhardt, H., and Schreiber, V. (2011). PARG is recruited to DNA damage sites through poly(ADP-ribose)- and PCNA-dependent mechanisms. Nucleic Acids Res 39, 5045-5056.

      Shirai, H., Fujimori, H., Gunji, A., Maeda, D., Hirai, T., Poetsch, A. R., Harada, H., Yoshida, T., Sasai, K., Okayasu, R., and Masutani, M. (2013). Parg deficiency confers radio-sensitization through enhanced cell death in mouse ES cells exposed to various forms of ionizing radiation. Biochem Biophys Res Commun 435, 100-106.

    2. eLife assessment

      The demonstration that the PARG dePARylation enzyme is required in S phase to remove polyADP-ribose (PAR) protein adducts that are generated in response to the presence of unligated Okazaki fragments is potentially valuable, but the evidence is incomplete, and identification of relevant PARylated PARG substrates in S-phase is needed to understand the role of PARP1-mediated PARylation and PARG-catalyzed dePARylation in S-phase progression.

    3. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Nie et al. investigate the effect of PARG KO and PARG inhibition (PARGi) on pADPR, DNA damage, cell viability and synthetic lethal interactions in HEK293A and HeLa cells. Surprisingly, the authors report that PARG KO cells are sensitive to PARGi and show higher pADPR levels than PARG KO cells, which is abrogated upon deletion or inhibition of PARP1/PARP2. The authors explain the sensitivity of PARG KO to PARGi through incomplete PARG depletion and demonstrate complete loss of PARG activity when incomplete PARG KO cells are transfected with additional gRNAs in the presence of PARPi. Furthermore, the authors show that the sensitivity of PARG KO cells to PARGi is not caused by NAD depletion but by S-phase accumulation of pADPR on chromatin coming from unligated Okazaki fragments, which are recognized and bound by PARP1. Consistently, PARG KO or PARG inhibition show synthetic lethality with Pol beta, which is required for Okazaki fragment maturation. PARG expression levels in ovarian cancer cell lines correlate negatively with their sensitivity to PARGi.

      Strengths:

      The authors show that PARG is essential for removing ADP-ribosylation in S-phase.

      Weaknesses:

      1) This begs the question as to the relevant substrates of PARG in S-phase, which could be addressed, for example, by analysing PARylated proteins associated with replication forks in PARG-depleted cells (EdU pulldown and Af1521 enrichment followed by mass spectrometry).
      2) The results showing the generation of a full PARG KO should be moved to the beginning of the Results section, right after the first Results chapter (PARG depletion leads to drastic sensitivity to PARGi), otherwise the reader is left to wonder how PARG KO cells can be sensitive to PARGi when there should be presumably no PARG present.
      3) Please indicate in the first figure which isoforms were targeted with gRNAs, given that there are 5 PARG isoforms. You should also highlight that the PARG antibody only recognizes the largest isoform, which is clearly absent in your PARG KO, but other isoforms may still be produced, depending on where the cleavage sites were located.
      4) FACS data need to be quantified. Scatter plots can be moved to Supplementary while quantification histograms with statistical analysis should be placed in the main figures.
      5) All colony formation assays should be quantified and sensitivity plots should be shown next to example plates.
      6) Please indicate how many times each experiment was performed independently and include statistical analysis.

    4. Reviewer #3 (Public Review):

      In the revised version the authors have addressed some of the reviewers' concerns, but, despite the new explanatory paragraph on page 16, the paper remains confusing because as shown in Figure 7 at the end of the Results the PARG KO 293A cells that were analyzed at the beginning of the Results are not true PARG knockouts. The authors stated that they did not rewrite the Results because they wanted to describe the experiments in the order in which they were carried out, but there is no imperative for the experiments to be described in the order in which they were done, and it would be much easier for the uninitiated reader to appreciate the significance of these studies if the true PARG KO cell data were presented at the beginning, as all three of the original reviewers proposed.

      While the authors have to some extent clarified the nature of the PARG KO alleles, they have not been able to identify the source of the residual PARG activity in the PARG KO cells, in part because different commercial PARG antibodies give different and conflicting immunoblotting results. Additional sequence characterization of PARG mRNAs expressed in the PARG cKO cells, and also in-depth proteomic analysis of the different PARG bands could provide further insight into the origins and molecular identities of the various PARG proteins expressed from the different KO PARG alleles, and determine which of them might retain catalytic activity.

      The authors have made no progress in identifying which are the key PARG substrates required for S phase progression, although they suggest that PARP1 itself may be an important target.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors investigated whether viral IBs are involved in antagonizing IFN-I production during EBOV trVLPs infection. They found that IRF3 is hijacked and sequestered into EBOV IBs after viral infection, thereby leading to the spatial isolation of IRF3 from TBK1 and IKKε. In this process, the activity of IRF3 is suppressed and downstream IFN-I induction is inhibited. The authors designed many experiments, such as the PLA that examined the colocalization, to support their conclusions. However, necessary negative controls were missing in several assays, and additional key readouts need to be examined in several assays.

      The paper is well organized and most data in this paper could support the conclusions, while there are several issues that need to be addressed.

      1. In Figure 2-4, authors should examine the expression of downstream IFNs as well as the phosphorylation and nuclear localization of IRF3 to further prove the suppression of IRF3 activity by infecting with trVLPs.

      Response: The inhibitory effect of trVLPs infection on the phosphorylation of IRF3 S396 and SeV-induced IRF3 nuclear localization was determined by immunoprecipitation (Figure 3D) and immunofluorescence (Figure 4A and 4B), respectively. In addition, we demonstrated that IFN-β transcription was inhibited more potently by EBOV viral inclusion bodies compared with VP35 alone (Figure 7B and 7C).

      Moreover, EBOV viral inclusion bodies were demonstrated to inhibit the transcription of IFN downstream genes (e.g., CXCL10, ISG15 and ISG56) more potently than VP35 alone (new Figure 7D-F).

      1. In Figure 5, to better prove the conclusion that EBOV NP and VP35 play an important role in sequestering IRF3 in IBs, authors should add the "NP+VP35+VP30" and "NP+VP35+VP24" groups and repeat the assay.

      Response: According to the reviewer’s suggestion, VP24 or VP30 was added to the “VP35+NP” group, and the results showed that the “NP+VP35+VP24” and “NP+VP35+VP30” groups exhibited little, if any, effect on the distribution of IRF3 compared with the “NP+VP35” group (new Figure 5 - figure supplement 2A-B).

      1. In Figure 6f, the expression of STING should be examined by immunostaining to show the knockdown efficiency in trVLPs-infected cells.

      Response: As suggested by the reviewer, immunostaining was performed to visually detect the effect of STING knockdown on the IRF3 distribution during trVLPs infection (new Figure 6F).

      Reviewer #2 (Public Review):

      The manuscript by Zhu et al. explored molecular mechanisms by which Ebola virus (EBOV) evades host innate immune response. EBOV has a number of means to shut down the type I interferon induction (by viral VP35 protein) and block type I interferon action (by viral VP24 protein). This study reported a new mechanism in which the inclusion bodies (IBs) used for viral replication sequester IRF3, a key transcription factor involved in the interferon signaling, resulting in blockade of downstream type I interferon gene transcription. This finding is potentially interesting and may provide a new insight into EBOV's evasion of innate immunity. However, there are some flaws in the experiments and analyses that need to be addressed.

      1. Most of the experiments were performed by transfection of trVLP plasmids, which is very different from virus infection. The conclusions should be examined and verified in the context of virus infection.

      Response: As suggested by the reviewer, the effects of IRF3 depletion on live Ebola virus replication were examined as described in the revised manuscript. Consistent with the results obtained after trVLPs infection, IRF3 depletion exerted little, if any, effect on viral replication (new Figure 7H), which supports the notion that, upon EBOV infection and the formation of inclusion bodies, IRF3 has little, if any, transcription activation activity after sequestration by inclusion bodies.

      1. Fig 1 - VP35 displayed a classical IB staining only in Panel A, while much less so in Panel C and not in panel B. It seemed that the VP35 staining images were chosen in a way that favors the authors' conclusions. The statistical analysis of co-localization of VP35 and IRF3, TBK1 or IKKe should be performed to draw the conclusion. Another concern is that IKKe is normally expressed at low levels under resting conditions and becomes induced only when the interferon signaling is activated. It seemed to be expressed at a high level even when the interferon signaling is blocked in Panel C. The authors should comment on this discrepancy.

      Response: Ebola virus inclusion bodies show variations in both shape and size. According to the reviewer’s suggestion, the colocalization of TBK1 or IKKε and VP35 is shown in new figures (new Figure 1C and 1E), and quantitatively analyzed by the fluorescence intensity using ImageJ software (new Figure 1B, 1D and 1F).

      1. Fig 2 - Was this experiment done by transfection or infection? The description of the result is not consistent with the figure legend. The labeling was also not consistent between panel A and B. I would suggest performing Western blot to analyze the expression level of IRF3.

      Response: We apologize for the incorrect description of the data. Ebola virus trVLPs were initially produced based on transfection but also involved the viral infection process. The use of “transfection” in the figure and figure legends has been changed to “infection” in the revised manuscript. As suggested by the reviewer, Western blotting was performed to analyze the IRF3 expression levels at different time points after trVLPs infection (new Figure 2D).

      1. Fig 3 and 4 - As VP35 is well known for its highly efficient blockade of type I interferon activation, how would the authors differentiate the effect of VP35 alone from the sequestration of IRF3 in IBs in these experiments?

      Response: Previous studies have found that VP35, rather than NP, inhibits the expression of interferon, and the “VP35+NP” treatment, which induces IRF3 sequestration, inhibited IFN-β luciferase activity much more potently than VP35 expression alone (Figure 7B).

      1. Fig 3 - Poly(I:C) can activate both RLR and TLR signaling pathways. Can the authors comment on which pathway it activates in this experiment?

      Response: In this study, the effect of poly(I:C) was consistent with the results observed with SeV, which indicated that poly(I:C) may mainly activate the RLR signaling pathway. A discussion was added to the revised manuscript.

      1. The authors demonstrated that VP35 interacts with STING and recruits the latter to IBs. How would this affect the function of STING given that STING plays essential roles in the cGAS/cGAMP pathway?

      Response: This study unexpectedly showed that VP35 can recruit IRF3 into viral inclusion bodies through STING, but whether it regulates the cGAS-STING pathway remains to be further investigated. Related discussion was added to the revised manuscript.

      1. It is difficult to follow the logic of Fig 7. The expression level of each viral protein should be determined. Ideally, a mutation in VP35 that disrupts its ability to antagonize the interferon signaling but still allows for the IB formation can be used to assess the relative contribution of IB sequestering IRF3.

      Response: As suggested by the reviewer, a series of VP35 mutants were constructed, but we failed to obtain a VP35 mutant that contains a mutation that disrupts the ability of the protein to antagonize interferon signaling but still allows IB formation. Instead, coexpression of “NP+VP35+VP30+L”, which induces IB formation, inhibited IFN-I more potently than the expression of VP35 alone (Figure 7B). IRF3 knockout inhibited poly(I:C)-induced IFN-I production but had little, if any, effect on poly(I:C)-induced IFN-I production in the “NP+VP35+VP30+L” group (Figure 7C). IRF3 knockout in the cells did not significantly affect viral replication, but overexpression of activated IRF3 (IRF3/5D), instead of wild-type IRF3, inhibited viral replication (new Figure 7G-H). These results collectively suggested that almost all IRF3 in cells was hijacked and sequestered into IBs in the Ebola virus-infected cells.

    2. Reviewer #3 (Public Review):

      Summary:

      In the manuscript "Ebola Virus Sequesters IRF3 in Viral Inclusion Bodies to Evade Host Antiviral Immunity" by Lin Zhu et al., the authors elucidated a mechanism by which EBOV evades host innate immunity.

      Strengths:

      Using data from immunofluorescence analysis, TEM and Western Blot, the authors conclude that Ebola virus VP35 protein evades host antiviral immunity by interacting with STING to sequester IRF3 into IBs and inhibit type-I interferon production.

      Weaknesses:

      Similar mechanisms have already been found in other viruses, such as SFTSV and RSV. In addition, the presented results are relatively rough, and the mechanism explained is not deep enough, so this story is not innovative.

    3. Reviewer #4 (Public Review):

      The manuscript entitled "Ebola Virus Sequesters IRF3 in Viral Inclusion Bodies to Evade Host Antiviral Immunity" mainly describes the function of IBs formed by the viral proteins VP35 and NP in evading host antiviral immunity. The authors showed that Ebola virus VP35 protein can interact with STING, but not IRF3, to sequester IRF3 into inclusion bodies and thereby inhibit type-I interferon production. This work will be of some interest to readers in the Ebola virus field; however, the current data do not clearly explain the relationship between VP35 protein and IRF3.

    1. Author Response

      The following is the authors’ response to the original reviews.

      RESPONSE TO REVIEWERS:

      Reviewer #1 (Recommendations For The Authors):

      I think the manuscript of this excellent work can be improved, especially in writing (including a suggestion in the title) and presentation (Figure 6). Also, some additional specific experiments and analyses could be important, as I suggest below.

      1. For the title, perhaps a shorter "The acetylase activity of Cdu1 protects Chlamydia effectors from degradation" would be better to convey the major significance of this work. Of course, Cdu1 must regulate the function of InaC, IpaM and CTL0480. But perhaps it is speculative to think that egress is the major function of these effectors as their activity on other host cell processes during the cycle could eventually impact the extrusion process indirectly.

      Although we concur with the insights provided by reviewer 1, we wish to underscore that a significant breakthrough presented in our study revolves around the regulation of Chlamydia exit by Cdu1. Consequently, we believe that this noteworthy discovery should be incorporated into the title.

      1. For the writing:

      a. The description of ubiquitination and DUBs could be synthesized to the essential, so that space is gained to explain things that then come a bit out of the blue in the results (what are Incs, the specific functions of InaC, IpaM, and CTL0480 - at least place the citations in lines 110-112 next to the corresponding Incs -, Cdu2, etc - see specifics below)

      In lines 182-196 of the revised manuscript, we have incorporated additional contextual information concerning the roles of Incs, along with descriptions of the functions of InaC, IpaM, and CTL0480.

      b. In the Results, there is a lot of Chlamydia- and maybe lab-specific jargon that could be significantly simplified for the more general reader. I detail some suggestions below in the specific issues.

      We have improved the readability of our manuscript for a general audience by removing Chlamydia-specific terminology from the entire text and figures.

      1. For the figures:

      a. Figure 6, this figure could be reorganized: why two graphs in panel D? If detailed quantifications were done, perhaps in panel B just zoom on the examples of Golgi distributed/compacted? And again the labelling Rif-R L2, L2 pBOMB, M407 p2TK2, etc, simplify?

      Figure 6 has undergone restructuring. The representative images have been relocated to Supplemental Figures 5 and 6, while we have introduced sample images demonstrating F-actin assembly and Golgi repositioning. Furthermore, the quantification of Golgi dispersal has been streamlined into a single panel. Additionally, we have simplified the labeling of the strains utilized in the study.

      b. Figure 3, in the labelling, WT, inaC null, cdu1::GII wouldn't be enough? Leave the details to the legend and/or M&M.

      We have simplified the labeling of Ct strains in Figure 3.

      c. Figure 3C, these arrowheads should not be so symmetric (small arrows instead?) and it is unclear that the indicated cells do not show CTL0480.

      We have substituted arrowheads with small arrow symbols and have also revised the Figure to incorporate a new representative image that prominently illustrates the absence of CTL0480 at the inclusion membrane of some cdu1::GII inclusions within infected Hela cells at 36 hpi.

      1. Experiments:

      a. In Figure 7, at least extrusion should be analysed also with the Cdu1-deficient strain expressing Ac-deficient Cdu1 and the inaC and ipaM phenotypes should be complemented.

      We have conducted additional experiments to analyze extrusion production in Hela cells infected with a cdu1 null strain expressing the acetylase-deficient Cdu1 variant. We have incorporated the relevant data into revised Figure 7, where the impact of this strain on extrusion production and size is presented. Additionally, we updated Supplemental Figure 8 to include data illustrating the number of inclusions produced by this strain. We have also addressed these new results in the revised manuscript (lines 424-432). We are currently complementing inaC and ipaM mutant strains with various InaC and IpaM constructs that will be used in a follow up manuscript.

      b. Does overexpression of InaC, IpaM, or CTL0480 in a cdu1-null background prevent the degradation of these Incs and suppress the defects of cells infected by the cdu1 mutant (F-actin, Golgi, MYPT1)? This would show that the multiple phenotypes displayed by cells infected by the cdu1 null mutant are indeed related to the decreased levels of InaC, IpaM and CTL0480.

      We opted not to include data from the overexpression of these effectors in a cdu1-null background due to an unexpected decrease in shuttle plasmid load during overexpression. This development prompted concerns regarding the potential detrimental effects of overexpressing these effectors in the absence of Cdu1. Data supporting this observation are not included in this report.

      c. Figures 3A and 3B should be quantified (it says it is from 3 independent experiments). It would be important to have a relative perspective of how much Cdu1 protects these Incs over time (for InaC, it would also be nice to have the 36 and 48 hpi time-point). This is in contrast with the microscopy data in Figure 5, which illustrates very clear effects, and the quantification is a bit redundant.

      In Figure 3, we have incorporated a new Western Blot image showing endogenous InaC protein levels in Hela cells following infection with both WT Ct and cdu1::GII strains at 24, 36, and 48 hours post-infection (hpi). Additionally, we have quantified the Western Blot signals for both InaC and IpaM, and these results are also presented in Figure 3. The quantification of MYPT1 recruitment has been relocated to a supplementary figure. We have also included details regarding the methodology employed for the quantification of Western Blot signals in the Materials and Methods section.

      d. What is the subcellular localization of InaC, IpaM, CTL0480 and Cdu1 when analysed by transfection? Does Cdu1 bind to InaC, IpaM, CTL0480 in infected cells? If this was attempted and unsuccessful it should be mentioned.

      In transfected HEK cells, InaC, IpaM, CTL0480, and Cdu1 all exhibit cytoplasmic localization with a diffuse pattern (data not shown). Despite our efforts, we encountered challenges in observing co-immunoprecipitation of Cdu1 with all three Incs in infected HeLa cells at 24 hpi. We have duly acknowledged this limitation in our findings, as reflected in lines 221-226 of the revised manuscript.

      1. Specific issues:

      • Line 87, is "propagule" really needed to describe the EB?

      The EB is the infectious form of Chlamydia species that spreads within the host to renew its life cycle; thus, "propagule" is a suitable term to characterize the EB.

      • Exocytosis implies fusion with the plasma membrane so "inclusion is exocytosed" (line 91) is not entirely correct.

      In line 91 of the revised manuscript, we referred to extrusion as the exit of an intact inclusion from the host cell and omitted the use of "exocytosed" to describe this process.

      • Line 126, "a Ct L2 (LGV L2 434 Bu) background". Maybe "a Ct cdu1-null strain" would be enough and leave the detail for Materials and Methods.

      In line 128 of the revised manuscript, we omitted "(LGV L2 434 Bu)" to avoid using jargon that may be unfamiliar to readers not well-versed in Chlamydia terminology.

      • Line 138, in the previous Pruneda et al, Nature Microbiol 2018, the title of figure 4 is "ChlaDUB deubiquitinase activity is required for C. trachomatis Golgi fragmentation", so why raise this hypothesis? And why in the end is it the acetylation activity of Cdu1 that promotes Golgi distribution? I think this is related to infection vs transfection experiments, but it deserves to be briefly explained/discussed.

      In lines 140-142 of the revised manuscript, we provide clarification that the DUB activity of Cdu1 is required for Golgi fragmentation in transfected cells. This observation supports our initial hypothesis suggesting that the DUB activity of Cdu1 is also required for Golgi distribution in infected cells, and our rationale for identifying targets of its DUB activity.

      • Lines 147-155, what is the relevance of these non-ubiquitinated proteins that come along? Couldn't this be synthesized?

      We have included a discussion on non-ubiquitinated proteins, as they could potentially encompass proteins that interact with those protected by Cdu1. This perspective provides supplementary insights into the roles of proteins targeted for ubiquitination in the absence of Cdu1. The results of this analysis have been succinctly summarized in a single paragraph within the initial manuscript (lines 151-159 of the revised manuscript).

      • Line 170, I think it is the first time that "Type 3 secretion" is mentioned; perhaps explain in the introduction.

      Type 3 secretion systems have been extensively characterized and discussed in the literature, and we anticipate that the majority of our readers are well-acquainted with this secretory mechanism.

      • Line 184, I think it is the first time "microdomains" are mentioned; perhaps mention in the introduction.

      The definition of "microdomains" has been provided in line 191 of the revised manuscript.

      • Figure 2, as it stands the analysis with truncated Cdu1 proteins adds little to the work. Binding to the Incs seems to be affected when the TM domain is not present, but it still binds. And this is in a transfection context.

      The results depicted in Figure 2, involving truncated Cdu1 proteins, illustrate that Cdu1 is capable of interacting with InaC, IpaM, and CTL0480 even in the absence of infection. This finding suggests that all three Incs could serve as direct targets for Cdu1 activity. As a result, we prefer to keep these findings in the manuscript.

      • Line 219, "late stages of infection", this is shown (albeit not completely quantified) for IpaM and CTL0480, but not for InaC.

      In the revised Figure 3, we show InaC protein levels at 24, 36, and 48 hours post-infection, and we have incorporated quantitative data for both InaC and IpaM protein levels in the context of Hela cells infected with both WT L2 and cdu1::GII strains. This updated figure serves to emphasize the pivotal role of Cdu1 in safeguarding all three Incs during the late stages of infection.

      • Line 233, "pBOMB-MCI backbone" - is this needed in the Results section? And this refers to Figure 4 while pBOMB appears already in Fig. 3.

      We have removed “pBOMB-MCI backbone” in the revised manuscript.

      • Line 236, should be cdu1 endogenous promoter.

      In line 265 of the revised manuscript we have replaced Cdu1 with cdu1 (italicized).

      • Line 263, WT.

      In line 293 of the revised manuscript we replaced “wild type” with “WT”.

      • Line 277, IncA instead of "the Inc protein IncA".

      In the manuscript we wanted to emphasize that IncA is also an inclusion membrane protein, therefore we have included “the Inc protein IncA” in the revised manuscript to avoid any confusion.

      • How does the data in Figure 5 relate to the relatively few proteins ubiquitinated in cells infected with cdu1-mutant Ct? Does this Ub-labelling correspond to ubiquitinated InaC, IpaM and CTL0480?

      The findings presented in Figure 5 demonstrate that the acetylase activity of Cdu1 plays a crucial role in enabling Ct to block all ubiquitination events taking place on or in proximity to the periphery of the inclusion membrane. This encompasses Cdu1 targets that might not have been identified through our proteomic analysis.

      • Lines 299-301, "M923 inclusions", there is certainly a clear way to write this.

      In lines 326-327 and 332-332 of the revised manuscript, we have clarified that “M923” is an incA null strain.

      • Line 309, is "peripheries" correct?

      We have replaced “peripheries” with “periphery” in the revised manuscript (line 360).

      • Line 312, "Rif-R L2" and "M407" - can this be simplified?

      In the revised manuscript, "Rif-R L2" was substituted with "WT L2" in lines 363 and 382, while "M407" was exchanged with "an inaC null strain" in lines 311, 367, and 368. These same replacements were applied to the Figures and their corresponding legends for consistency.

      • Lines 308-321, and 326-335, these % are all approximate figures and this should be made clear.

      In lines 364-395 of the revised manuscript we have stated that all percentages are approximate values.

      • Fig. S1, kb and not k.b; what's the "+ control"; and is it not really possible to have a PCR that works for the *? 3 kb is not that long.

      In the updated Figure S1, we have corrected "k.b" to "kb". In the legend of Figure S1, we have clarified that the + control corresponds to the cdu2 locus. Moreover, we could not cleanly amplify a 3 kb PCR product from bacteria in whole cell lysates of infected mammalian cells (Vero cells).

      • Fig. S2, kb and not k.b, bp and not b.p

      In the updated Figure S2, we have corrected “k.b” to “kb” and “b.p” to “bp”.

      Reviewer #2 (Recommendations For The Authors):

      Figure 1 describes an affinity-based purification and mass spectrometric identification of differentially ubiquitinated proteins (host and chlamydial). Through different permutations of combinations of infection (mock, wild type, and Cdu1 mutant), three effectors, IpaM, InaC, and CTL0480, were identified as putative targets of Cdu1. The authors used a high-stringency cutoff, which could explain identification of only three targets. Having said this, the localization of Cdu1 to the inclusion membrane would be expected to also narrow down the number of targets. Interestingly, Cdu2, another deubiquitinase, remained active in these experiments, which could have affected identification of Cdu1 targets. The authors addressed this issue by referring to previously reported structural studies. A somewhat glaring omission is the lack of reference to NF-kB as a substrate of ChlaDub1/Cdu1. In experiments by Le Negrate et al., ChlaDub1 ectopic overexpression in cells led to the deubiquitination of IkB-alpha, thus inhibiting the nuclear translocation of NF-kB. Based on the inclusion membrane localization of Cdu1 during infection, is the identification of IkB an artifact of overexpression of Cdu1, or is it still a bona fide Cdu1 target?

      We conducted experiments using our cdu1 null strain to investigate whether IκBα could be a target of Cdu1 activity. While our findings are intriguing and relevant, it is not feasible to determine, at this stage, whether our findings result from a direct or indirect consequence of Cdu1 localizing to the inclusion membrane. Consequently, these findings extend beyond the scope of the current manuscript. We plan to explore the implications of our observations more deeply in a subsequent manuscript, where we intend to provide a more comprehensive and mechanistic analysis based on these preliminary findings. Additionally, we have referenced the potential targeting of IκBα by Cdu1 in lines 100-101 and 166-171 of the revised manuscript.

      Figure 2 demonstrates the individual interaction of the identified effectors with Cdu1. Interaction at the inclusion membrane is inferred from colocalization studies, while protein-protein interaction is monitored using ectopic overexpression of tagged versions of Cdu1 and the individual effectors. This is somewhat of a weakness of the manuscript because the mechanism of action of Cdu1 towards its target hinges on protein-protein interaction.

      Despite our efforts, we encountered challenges in co-immunoprecipitating endogenous Cdu1 with all three Incs in infected Hela cells at 24 hpi. There are multiple technical reasons as to why these interactions, which are predicted to be transient, will not be captured by bulk affinity approaches such as immunoprecipitations, especially when the starting materials are present in very low abundance. We acknowledged these limitations in our findings, as reflected in lines 221-226 of the revised manuscript.

      Figure 3 provides the first evidence in this paper of the importance of the inferred interaction of Cdu1 with the three effectors. The authors show that the loss of cdu1 has stability consequences on the three effectors. This figure would benefit from quantifying InaC- or IpaM-positive inclusions in the same manner done with CTL0480. The timepoint-dependent effect of Cdu1 loss of function is intriguing. Do InaC and IpaM retention at the inclusion show the same timepoint-dependent characteristic?

      In the revised Figure 3, we have incorporated InaC protein levels at 24, 36, and 48 hours post-infection. Additionally, we have included quantitative data representing both InaC and IpaM protein levels in HeLa cells infected with both WT L2 and cdu1::GII strains. The quantification of CTL0480 localization to cdu1::GII inclusions has been moved to a supplementary figure.

      This updated figure illustrates that the absence of Cdu1 has a time-dependent impact on both InaC and IpaM. However, it is noteworthy that the kinetics of degradation for these two proteins diverge significantly.

      For Figure 7, the authors should consider monitoring timing of inclusion extrusion to gain additional insight into the functional interactions between the effectors. For example, the loss of CTL0480 leads to increased extrusion, implying a role in delaying or suppressing extrusion. In a time-course experiment, a CTL0480 mutant could exhibit an earlier occurrence of inclusion extrusion.

      One of the principal discoveries of this study is that Cdu1, InaC, IpaM, and CTL0480 collaborate to facilitate optimal extrusion of Ct from host cells. These findings represent a significant contribution to our understanding of how Chlamydia controls its exit from infected cells. We are currently in the process of expanding on these results. A forthcoming follow-up manuscript will provide a more detailed and comprehensive exploration of these findings.

      Reviewer #3 (Recommendations For The Authors):

      Specific comments.

      a. I have some concerns related to the time point chosen for mass spec analysis and potential caveats and alternative interpretations. This work was done relatively early (24 hours) compared to the most convincing Cdu1 functions that occur later, thus this may limit the authors' global understanding of protein changes. For example, the known substrate of Cdu1, Mcl-1, was not identified, but this is altered relatively late during infection. Thus, the surprise that minimal host proteins are altered in ubiquitination may be partially driven by the timing of the assay. This should be more clearly discussed as a caveat.

      In the revised manuscript (lines 166-171), we have acknowledged that there might be additional targets of Cdu1 that remain unidentified, primarily due to the specific time point we utilized in our study.

      b. Another caveat to these studies is that while the loss of Cdu1 alters the stability and function of different effectors and extrusion size, these changes do not modulate bacterial growth in cells. The authors speculate that regulating extrusion size may alter interactions with innate cells to drive dissemination. However, a previous study using a Cdu1 transposon mutant in an animal model found decreased bacterial load in the genital tract. It is also possible that redundancy of effectors may mask the importance of Cdu1 in growth, but the authors strongly argue against redundancy of Cdu1 and Cdu2, so this weakens the authors' argument here. These concepts and published data should be more directly discussed in the context of the authors' proposed extrusion model and the role in driving Chlamydia growth and pathogenesis.

      In our revised manuscript (lines 460-466) we propose that while we do not observe any growth impairments during Ct growth in the absence of Cdu1 in HeLa cells, the reduction in bacterial loads observed in murine models of infection with an independent cdu1 mutant strain (cdu1::Tn) may potentially be linked to defects in extrusion production or alterations in Cdu1-dependent regulation of extrusion size.

      c. Recent studies have found that IFNg activation can result in dramatic changes in ubiquitination of pathogen-containing vacuoles. While some of these are blocked by the newly found GarD, it seems possible that Cdu1 may also play a role (and perhaps use its deubiquitinating activity) to further protect the inclusion. In light of published results showing that Cdu1 mutants have lower IFU burst size only in IFNg-activated cells, this may be an important caveat in the current studies. This should be more directly addressed in the current manuscript.

      We have incorporated two experimental findings indicating that the presence of Cdu1 is not required for Ct to defend itself against IFN cellular immunity in human cells. These recent discoveries are now presented in the updated Figure 5 and detailed in lines 338-355 of the revised manuscript.

      d. On lines 433-434 the authors claim that Cdu1 is atypical since it is not encoded with the metaeffector/target pairs. However, this is an oversimplification of what is known about metaeffectors. For example, there are meta-effector/effector pairs that are not encoded together in Legionella (see table 1 DOI: https://doi.org/10.3390/pathogens10020108). Thus, the discussion should be adjusted. It seems Cdu1 is the first meta-effector found in Chlamydia, and maybe this should be highlighted more strongly rather than its uniqueness in this aspect of meta-effector/effector functions.

      In lines 488-489 of the revised manuscript, we have removed the assertion that Cdu1 functions as an atypical metaeffector and emphasized that it represents the initial discovery of a metaeffector within Ct.

    2. Reviewer #1 (Public Review):

      The objective of this study was to investigate the influence of the C. trachomatis effector Cdu1 on the ubiquitination of proteins in infected host cells and its correlation with the previously identified role of Cdu1 in facilitating Golgi distribution around the Chlamydia inclusion.

      To achieve this, the authors created a cdu1-null mutant in C. trachomatis and employed proteomics to analyze ubiquitinated proteins in cells infected with Cdu1-producing and Cdu1-deficient chlamydiae, comparing them to mock-infected cells. The results revealed that, among the proteins specifically ubiquitinated after infection with Cdu1-deficient chlamydiae, three were other C. trachomatis effectors (InaC, IpaM, and CTL0480), members of a large family of Chlamydia effectors (Incs) that insert in the inclusion membrane.

      Subsequently, the authors focused on understanding how Cdu1 shields InaC, IpaM, and CTL0480 from ubiquitination and the implications of this protection for the protein levels and functions of these Incs during infection. Data is presented showing that Cdu1 can bind to InaC, IpaM, and CTL0480, and protects these Incs and itself from ubiquitination and proteasomal degradation. This protective role of Cdu1 is dependent on its acetylation, but not on its deubiquitinating activity. Host cells infected by the cdu1 null mutant displayed defects resembling those observed in cells infected by inaC, ipaM, or ctl0480 null mutants.

      Additionally, it was previously established that CTL0480 inhibits a chlamydial egress pathway involving the extrusion of the inclusion. This study now revealed that InaC and IpaM also play a role in promoting the extrusion of C. trachomatis inclusion, and the cdu1 null mutant exhibited a defect in this process. This leads to the title's conclusion that Cdu1 regulates chlamydial exit from host cells by safeguarding specific C. trachomatis effectors from degradation.

      In summary, this work is excellent and impressive, both technically and conceptually, providing mechanistic insights into the action of Cdu1. The data provides convincing support for the proposed model, illustrating how the acetylation activity of Cdu1 protects itself and three Incs (InaC, IpaM, and CTL0480) from degradation. While the study indicates that the observed phenotypes in cells infected by the cdu1 null mutant are linked to reduced levels of InaC, IpaM, and CTL0480, these Incs are still detectable in cells infected by the cdu1 null mutant. Even if very unlikely, this leaves room for the possibility that Cdu1 directly promotes assembly of F-actin and Golgi repositioning around the inclusion, MYPT1 recruitment to the inclusion, and extrusion of the inclusion. Nevertheless, the major significance of this work lies in the integration of proteomics and chlamydial genetics to unveil a unique mechanism in which one effector controls the levels of other effectors, emphasizing the intricate relationships among bacterial effectors injected into host cells.

    3. Reviewer #2 (Public Review):

      Based on the corresponding author's response, the questions I raised were not addressed for various reasons. This is not necessarily a negative. The authors indicated that most of the points raised will be addressed in a separate manuscript, specifically the Cdu1 targeting of IkBa. In their response to reviewers, they mentioned intriguing findings regarding IkBa in cells infected with a cdu1-null strain of C. trachomatis. Similarly, there appears to be a planned manuscript that will address the question of the timing of CTL0480's function in inclusion extrusion.

      The lack of more direct infection-related evidence of Cdu1 interaction with various type III effectors was raised; and the authors attributed this to technical difficulties and low abundance of starting materials. It was not clear if they tried other approaches to demonstrate interaction.

      Another suggestion was the quantitation of the three target effectors of Cdu1 in wild type and cdu1-null background. The authors provided western blot data and immunofluorescence images that revealed potential differences in stability/turnover kinetics. The authors might want to discuss the implications of the different kinetics of stability/turnover. For example, if all three proteins are necessary for optimal extrusion of inclusions, and concertedly act to mediate this process, all three would need to be present at the required levels. Could this be a temporal regulation strategy? Does acetylation also regulate function, interactions, etc.?

      In short, the response to some of the questions is forthcoming in the form of follow-up manuscripts. New observations on the different stability profiles could be elaborated in the Discussion section, with a brief discussion on functional and/or regulatory implications.

    4. Reviewer #3 (Public Review):

      In this article by Bastidas et al. the authors examine the functions of the Chlamydia deubiquitinating enzyme 1 (Cdu1) during infections of human cells. First, a mutant lacking Cdu1 but not Cdu2 was constructed using targetron and quantitative proteomics was used to identify differences in ubiquitinated proteins (both host and bacterial) during infection. While they found minimal changes in host protein ubiquitination, they found that three Chlamydia effector proteins, IpaM, InaC and CTL0480, were all ubiquitinated in the absence of Cdu1. Microscopy and immunoprecipitations found Cdu1 directly interacts with these Chlamydia effectors and confirmed that Cdu1 mediates the stabilization of these effectors at the inclusion membrane during late infection time points. Surprisingly, rather than deubiquitination driving this stabilization, the acetylation function of Cdu1 was required, and acetylation on lysine residues prevented degradative ubiquitination of Cdu1, IpaM, InaC and CTL0480. In line with this observation the authors show that loss of Cdu1 phenocopies the loss of single effector mutants of InaC, IpaM and CTL0480, including Golgi stack formation and the recruitment of MYPT1 to the inclusion. The aggregation of changes to the Chlamydia inclusion does not alter growth but controls extrusion of Chlamydia from cells, with reduced extrusion in Cdu1 mutant Chlamydia infections. The strengths of the manuscript are the range of assays used to convincingly examine the biochemical and cellular biology underlying Cdu1 functions. The finding that acetylation of lysine residues is a mechanism for bacterial effectors to block degradative ubiquitination is impactful and will open new investigations into this mechanism for many intracellular pathogens. The authors' revisions to the manuscript have addressed my primary concerns and the authors present compelling arguments for remaining questions that are outside the scope of this study. Altogether this is an important series of findings that help to understand the mechanisms underpinning Chlamydia pathogenesis using orthogonal methods and is an impactful study.

    1. Author Response

      eLife assessment

      This important work describes the first high-resolution structure of HGSNAT, a lysosomal membrane protein required for the degradation of heparan sulfate (HS). Through careful structural analysis, this work proposes potential reasons why certain mutations in HGSNAT lead to lysosomal storage disorders and outlines the enzyme's catalytic mechanism. The experimental evidence presented provides incomplete support for the proposed molecular mechanism of the HS acetylation reaction and the impact of disease-causing mutations.

      We thank the editors and reviewers for taking the time to provide a critical assessment of our manuscript. We appreciate the input and suggestions to improve the analysis. Included here are only our provisional responses. We will address the concerns raised in more detail and incorporate them in the revised version of the manuscript.

      Reviewer #1 (Public Review):

      This article by Navratna et al. reports the first structure of human HGSNAT in an acetyl-CoA-bound state. Through careful structural analysis, the authors propose potential reasons why certain human mutations lead to lysosomal storage disorders and outline a catalytic mechanism. The structural data are of good quality, and the manuscript is clearly written. This study represents an important step toward understanding the mechanism of HGSNAT and is valuable to the field. I have the following suggestions:

      We thank the reviewer for their encouraging and positive overall assessment of our work.

      1. The authors should characterize whether the purified protein is active. Otherwise, how does one know if the detergent used maintains the protein in a biologically relevant state? The authors should at least attempt to do so. If these prove to be challenging, at the very least, the authors should try a cell-based assay to demonstrate that the GFP tag does not interfere with the function.

      Thank you for highlighting this concern. The cryo-EM sample was prepared without the exogenous addition of ligand, as noted in the manuscript; the acetyl-CoA that we see in the structure was intrinsically bound to the protein, indicating the ability of GFP-tagged HGSNAT protein to bind the ligand. We purified the protein at a pH optimal for acetyl-CoA binding, as suggested by Bame, K. J. and Rome, L. H. (1985) and Meikle, P. J. et al., (1995). Because we see acetyl-CoA in a structure obtained using a GFP fusion, we argue that GFP does not interfere with protein stability and ability to bind to the co-substrate. As demonstrated by existing literature, the HGSNAT-catalyzed reaction is compartmentalized spatially and conditionally. The binding of acetyl-CoA happens towards the cytosol and is optimal at pH 7.0-8.0, while the transfer of the acetyl group to heparan sulfate occurs towards the luminal side and is optimal at pH 5.0-6.0. We are working on establishing a robust assay to study this complicated and compartmentalized acetyl transfer reaction.

      2. In Figure 5, the authors present a detailed schematic of the catalytic cycle, which I find to be too speculative. There is no evidence to suggest that this enzyme undergoes isomerization, like a transporter, between open-to-lumen and open-to-cytosol states. Could it not simply involve some movements of side chains to complete the acetyl transfer?

      The acetyl-CoA bound structure presented in the paper does not conclusively support a potential for isomerization and conformational dynamics. We agree with the reviewer that the reaction schematic presented in Figure 5 is speculative. We acknowledge in the discussion that our structure represents only a single step of the reaction, and defining the precise mechanism of acetyl transfer needs additional work. However, we will reword the discussion and change Figure 5 to address this concern raised by multiple reviewers.

      Reviewer #2 (Public Review):

      Summary:

      This work describes the structure of Heparan-alpha-glucosaminide N-acetyltransferase (HGSNAT), a lysosomal membrane protein that catalyzes the acetylation reaction of the terminal alpha-D-glucosamine group required for the degradation of heparan sulfate (HS). HS degradation takes place during the degradation of the extracellular matrix, a process required for restructuring tissue architecture, regulation of cellular function, and differentiation. During this process, HS is degraded into monosaccharides and free sulfate in lysosomes.

      HGSNAT catalyzes the transfer of the acetyl group from acetyl-CoA to the terminal non-reducing amino group of alpha-D-glucosamine. The molecular mechanism by which this process occurs has not been described so far. One of the main reasons to study the mechanism of HGSNAT is that multiple mutations spanning the entire sequence of the protein, such as nonsense mutations, splice-site variants, and missense mutations lead to dysfunction that causes abnormal accumulation of HS within the lysosomes. This accumulation is a cause of mucopolysaccharidosis IIIC (MPS IIIC), an autosomal recessive neurodegenerative lysosomal storage disorder, for which there are no approved drugs or treatment strategies.

      This paper provides a 3.26 Å structure of HGSNAT, determined by single-particle cryo-EM. The structure reveals that HGSNAT is a dimer in detergent micelles and shows a density assigned to acetyl-CoA. The authors speculate about the molecular mechanism of the acetylation reaction, map the mutations known to cause MPS IIIC on the structure and speculate about the nature of the HGSNAT dysfunction caused by such mutations.

      Strengths:

      The description of the architecture of HGSNAT is the highlight of the paper since this corresponds to the first description of the structure of a member of the transmembrane acyl transferase (TmAT) superfamily. The high resolution of an HGSNAT bound to acetyl-CoA is an important leap in our understanding of the HGSNAT mechanism. The density map is of high quality, except for the luminal domain. The location of the acetyl-CoA allows speculation about the mechanistic role of multiple residues surrounding this molecule. The authors thoroughly describe the architecture of HGSNAT and map the mutations leading to MPS IIIC. The description of the dimeric interface is a novel result, and future studies are left to confirm the importance of oligomerization for function.

      We thank the reviewer for their time and for highlighting both the quality and novelty of the structure presented in this work.

      Weaknesses:

      Apart from the cryo-EM structure, the article does not provide any other experimental evidence to support or explain a molecular mechanism. Due to the complete absence of functional assays, mutagenesis analysis, or other structures such as a ternary complex or an acetylated enzyme intermediate, the mechanistic model depicted in Figure 5 should be taken with caution.

      Thank you for pointing out this concern. The proposed mechanistic model in Figure 5 is a hypothesis based on previously reported biochemical characterization of HGSNAT by Rome & Crain (1981), Rome et al., (1983), Meikle et al., (1995) and Fan et al., (2011). However, we agree with the reviewer that this schematic is not experimentally proven and is speculative at best, especially because our structure presents only a single step of the reaction, which does not conclusively support either ping-pong or random-order bi-substrate reactions. We will rephrase this section of our discussion and edit Figure 5 to address this concern.

      The authors discuss that H269 is an essential residue that participates in the acetylation reaction, possibly becoming acetylated during the process. However, there is no solid experimental evidence, e.g. mutagenesis analysis or structural analysis, in this or previous articles, that demonstrates this to be the case.

      H269, as a crucial catalytic residue, was suggested by monitoring the effect of chemical modifications of amino acids on acetylation of HGSNAT membranes by Bame, K. J. and Rome, L. H. (1986). We agree that mutagenesis, catalysis, and structural evidence for the same are not currently available. We are pursuing a more thorough exploration of the role of both H269 (previous studies) and N258 (from this study) on the stability and function of HGSNAT.

      In the discussion part, the authors mention previous studies in which it was postulated that the catalytic reaction can be described by a random order mechanistic model or a Ping Pong Bi Bi model. However, the authors leave open the question of which of these mechanisms best describes the acetylation reaction. The structure presented here does not provide evidence that could support one mechanism or the other.

      We agree with the reviewer’s observation that the structure does not support one reaction mechanism over the other. We are pursuing the structural and kinetic characterization of HGSNAT in the presence of other co-substrates and at multiple pHs, which is required to address this concern thoroughly.

      Although the authors map the mutations leading to MPS IIIC on the structure and use FoldX software to predict the impact of these mutations on folding and fold stability, there is no experimental evidence to support FoldX's predictions.

      We are working on assessing the impact of specific mutations on the stability of HGSNAT and will add them to the revised version of the manuscript. We thank the reviewer for this suggestion.

      Reviewer #3 (Public Review):

      Summary:

      Navratna et al. have solved the first structure of a transmembrane N-acetyltransferase (TNAT), resolving the architecture of human heparan-alpha-glucosaminide N-acetyltransferase (HGSNAT) in the acetyl-CoA bound state using single particle cryo-electron microscopy (cryoEM). They show that the protein is a dimer and define the architecture of the alpha- and beta- GSNAT fragments, as well as convincingly characterizing the binding site of acetyl-CoA.

      Strengths:

      This is the first structure of any member of the transmembrane acyl transferase superfamily, and as such it provides important insights into the architecture and acetyl-CoA binding site of this class of enzymes.

      The structural data is of a high quality, with an isotropic cryoEM density map at 3.3Å facilitating the building of a high-confidence atomic model. Importantly, the density of the acetyl-CoA ligand is particularly well-defined, as are the contacting residues within the transmembrane domain.

      The open-to-lumen structure of HGSNAT presented here will undoubtedly lay the groundwork for future structural and functional characterization of the reaction cycle of this class of enzymes.

      We thank the reviewer for their positive assessment of the data presented in this work. We really appreciate and agree with the reviewer's comment that the “structure of HGSNAT presented here will undoubtedly lay the groundwork for future structural and functional” characterization of this class of enzymes.

      Weaknesses:

      While the structural data for the open-to-lumen state presented in this work is very convincing, and clearly defines the binding site of acetyl-CoA, to get a complete picture of the enzymatic mechanism of this family, additional structures of other states will be required.

      We agree with the reviewer’s assessment and are heavily invested in pursuing the structures of all the steps of acetyl transfer by HGSNAT.

      A potentially significant weakness of the study is the lack of functional validation. The enzymatic activity of the enzyme characterized was not measured, and the enzyme lacks native proteolytic processing, so it is a little unclear whether the structure represents an active enzyme.

      We thank the reviewer for this comment. While the proteolytic cleavage of the protein remains debated, we find no evidence of such an event in our purification (SDS-PAGE and SEC). Studies like Durand et al., (2010) and Fan et al., (2011) suggest that even the ER-retained monomeric HGSNAT is active. Because we see acetyl-CoA (co-substrate) bound to the protein in our structure, we surmise that proteolysis is not necessary for function, at least not for substrate binding. However, we are working towards the structural and kinetic characterization of recombinant α- and β-HGSNAT constructs to explore the role of proteolysis on HGSNAT stability and function.

    2. Reviewer #2 (Public Review):

      Summary:

      This work describes the structure of Heparan-alpha-glucosaminide N-acetyltransferase (HGSNAT), a lysosomal membrane protein that catalyzes the acetylation reaction of the terminal alpha-D-glucosamine group required for the degradation of heparan sulfate (HS). HS degradation takes place during the degradation of the extracellular matrix, a process required for restructuring tissue architecture, regulation of cellular function, and differentiation. During this process, HS is degraded into monosaccharides and free sulfate in lysosomes.

      HGSNAT catalyzes the transfer of the acetyl group from acetyl-CoA to the terminal non-reducing amino group of alpha-D-glucosamine. The molecular mechanism by which this process occurs has not been described so far. One of the main reasons to study the mechanism of HGSNAT is that multiple mutations spanning the entire sequence of the protein, such as nonsense mutations, splice-site variants, and missense mutations lead to dysfunction that causes abnormal accumulation of HS within the lysosomes. This accumulation is a cause of mucopolysaccharidosis IIIC (MPS IIIC), an autosomal recessive neurodegenerative lysosomal storage disorder, for which there are no approved drugs or treatment strategies.

      This paper provides a 3.26 Å structure of HGSNAT, determined by single-particle cryo-EM. The structure reveals that HGSNAT is a dimer in detergent micelles and shows a density assigned to acetyl-CoA. The authors speculate about the molecular mechanism of the acetylation reaction, map the mutations known to cause MPS IIIC on the structure and speculate about the nature of the HGSNAT dysfunction caused by such mutations.

      Strengths:

      The description of the architecture of HGSNAT is the highlight of the paper since this corresponds to the first description of the structure of a member of the transmembrane acyl transferase (TmAT) superfamily. The high resolution of an HGSNAT bound to acetyl-CoA is an important leap in our understanding of the HGSNAT mechanism. The density map is of high quality, except for the luminal domain. The location of the acetyl-CoA allows speculation about the mechanistic role of multiple residues surrounding this molecule. The authors thoroughly describe the architecture of HGSNAT and map the mutations leading to MPS IIIC. The description of the dimeric interface is a novel result, and future studies are left to confirm the importance of oligomerization for function.

      Weaknesses:

      Apart from the cryo-EM structure, the article does not provide any other experimental evidence to support or explain a molecular mechanism. Due to the complete absence of functional assays, mutagenesis analysis, or other structures such as a ternary complex or an acetylated enzyme intermediate, the mechanistic model depicted in Figure 5 should be taken with caution.

      The authors discuss that H269 is an essential residue that participates in the acetylation reaction, possibly becoming acetylated during the process. However, there is no solid experimental evidence, e.g. mutagenesis analysis or structural analysis, in this or previous articles, that demonstrates this to be the case.

      In the discussion part, the authors mention previous studies in which it was postulated that the catalytic reaction can be described by a random order mechanistic model or a Ping Pong Bi Bi model. However, the authors leave open the question of which of these mechanisms best describes the acetylation reaction. The structure presented here does not provide evidence that could support one mechanism or the other.

      Although the authors map the mutations leading to MPS IIIC on the structure and use FoldX software to predict the impact of these mutations on folding and fold stability, there is no experimental evidence to support FoldX's predictions.

    3. eLife assessment

      This important work describes the first high-resolution structure of HGSNAT, a lysosomal membrane protein required for the degradation of heparan sulfate (HS). Through careful structural analysis, this work proposes potential reasons why certain mutations in HGSNAT lead to lysosomal storage disorders and outlines the enzyme's catalytic mechanism. The experimental evidence presented provides incomplete support for the proposed molecular mechanism of the HS acetylation reaction and the impact of disease-causing mutations.

    4. Reviewer #1 (Public Review):

      This article by Navratna et al. reports the first structure of human HGSNAT in an acetyl-CoA-bound state. Through careful structural analysis, the authors propose potential reasons why certain human mutations lead to lysosomal storage disorders and outline a catalytic mechanism. The structural data are of good quality, and the manuscript is clearly written. This study represents an important step toward understanding the mechanism of HGSNAT and is valuable to the field. I have the following suggestions:

      1. The authors should characterize whether the purified protein is active. Otherwise, how does one know if the detergent used maintains the protein in a biologically relevant state? The authors should at least attempt to do so. If these prove to be challenging, at the very least, the authors should try a cell-based assay to demonstrate that the GFP tag does not interfere with the function.

      2. In Figure 5, the authors present a detailed schematic of the catalytic cycle, which I find to be too speculative. There is no evidence to suggest that this enzyme undergoes isomerization, similar to a transporter, between open-to-lumen and open-to-cytosol states. Could it not simply involve some movements of side chains to complete the acetyl transfer?

    5. Reviewer #3 (Public Review):

      Summary:

      Navratna et al. have solved the first structure of a transmembrane N-acetyltransferase (TNAT), resolving the architecture of human heparan-alpha-glucosaminide N-acetyltransferase (HGSNAT) in the acetyl-CoA bound state using single particle cryo-electron microscopy (cryoEM). They show that the protein is a dimer, and define the architecture of the alpha- and beta- GSNAT fragments, as well as convincingly characterizing the binding site of acetyl-CoA.

      Strengths:

      This is the first structure of any member of the transmembrane acyl transferase superfamily, and as such it provides important insights into the architecture and acetyl-CoA binding site of this class of enzymes.

      The structural data is of a high quality, with an isotropic cryoEM density map at 3.3Å facilitating the building of a high-confidence atomic model. Importantly, the density of the acetyl-CoA ligand is particularly well-defined, as are the contacting residues within the transmembrane domain.

      The open-to-lumen structure of HGSNAT presented here will undoubtedly lay the groundwork for future structural and functional characterization of the reaction cycle of this class of enzymes.

      Weaknesses:

      While the structural data for the open-to-lumen state presented in this work is very convincing, and clearly defines the binding site of acetyl-CoA, to get a complete picture of the enzymatic mechanism of this family, additional structures of other states will be required.

      A potentially significant weakness of the study is the lack of functional validation. The enzymatic activity of the enzyme characterized was not measured, and the enzyme lacks native proteolytic processing, so it is a little unclear whether the structure represents an active enzyme.

    1. eLife assessment

      Based on a technological advance which couples onboard calcium imaging with in vivo electrophysiology in freely behaving mice, this important work presents data about the modulation of some long range brain activity correlations during social interactions. Solid evidence shows that neural activity across cerebellum and cingulate cortex is more correlated during social behaviors than during non-social epochs. This study is of interest for a broad range of neurophysiologists.

    1. eLife assessment

      This important manuscript focuses on the mechanisms by which food signals and food ingestion modulate animal foraging. The authors provide convincing support for the interesting idea that chemosensory and interoceptive signals converge on transcriptional regulation of the TGF-beta ligand DAF-7 in a single pair of C. elegans chemosensory neurons (ASJ) to regulate behavior. Their studies implicate a conserved signaling molecule, ALK, in this regulation, suggesting a conserved link between food cues and the neuroendocrine control of foraging behavior.

    2. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Here, Boor et al focus on the regulation of daf-7 transcription in the ASJ chemosensory neurons, which has previously been shown to be sensitive to a variety of external and internal signals. Interestingly, they find that soluble (but not volatile) signals released by food activate daf-7 expression in ASJ, but that this is counteracted by signals from the ASIC channels del-3 and del-7, previously shown to detect the ingestion of food in the pharynx. Importantly, the authors find that ASJ-derived daf-7 can promote exploration, suggesting a feedback loop that influences locomotor states to promote feeding behavior. They also implicate signals known to regulate exploratory behavior (the neuropeptide receptor PDFR-1 and the neuromodulator serotonin) in the regulation of daf-7 expression in ASJ. Additionally, they identify a novel role for a pathway previously implicated in C. elegans sensory behavior, HEN1/SCD-2, in the regulation of daf-7 in ASJ, suggesting that the SCD-2 homolog ALK may have a conserved role in feeding and metabolism.

      Strengths:

      The studies reported here, particularly the quantitation of gene expression and the careful behavioral analysis, are rigorously done and interpreted appropriately. The results suggest that, with respect to food, DAF-7 expression encodes a state of "unmet need" - the availability of nearby food to animals that are not currently eating. This is an interesting finding that reinforces and extends our understanding of the neurobiological significance of this important signaling pathway. The identification of a role for ASJ-derived daf-7 in motor behavior is a valuable advance, as is the finding that SCD-2 acts in the AIA interneurons to influence daf-7 expression in ASJ.

      We appreciate Reviewer 1’s thoughtful assessment of our work and inference that the expression of daf-7 encodes an internal state corresponding to “unmet need.” Based on comments of Reviewer 1 and other reviewers, we have revised the title, abstract, and parts of the discussion to highlight not only the functional contribution of daf-7 expression in the ASJ neurons to behavioral state, but also the remarkable correlation between gene expression and internal state driving foraging behavior.

      Weaknesses:

      A limitation of the work is that some mechanistic relationships between the identified signaling pathways are not carefully examined, but this provides interesting opportunities for future work.

      To enable the reader to begin to infer the relative contributions of the identified signaling pathways to the circuitry coupling distinct bacterial cues to foraging behavior, we have added data for the analysis of DAF-7 expression in the ASJ neurons in the tph-1 and pdfr-1 mutants in the complete absence of food. Our current leaning is that multiple pathways, including those we have begun to characterize here, may function in parallel to influence DAF-7 expression and internal state driving foraging behavior. Future work to explore this further is certainly of interest.

      A minor weakness concerns the experiment in which daf-7 is conditionally deleted from ASJ. This is an ideal approach for probing the function of daf-7, but these experiments seem to be carried out in the well-fed, on-food condition in which control animals should express little or no daf-7 in ASJ. Thus, the experimental design does not allow an assessment of the role of daf-7 under conditions in which its expression is activated (e.g., in animals exposed to un-ingestible food).

      The interpretation of genetic analysis in the complete absence of food is complicated by what we think are multiple parallel pathways that function to strongly promote roaming, as indicated in the prior work of Ben Arous et al. Our observation that the conditional deletion of daf-7 from the ASJ pair of neurons confers altered roaming behavior on a lawn of bacterial food supports an ongoing physiological role for dynamic daf-7 expression from the ASJ neurons, even in the presence of bacterial food, that may contribute to the control of transitions between foraging states and the persistence of roaming and dwelling states.

      To demonstrate the functional contribution of DAF-7 expression from the ASJ neuron pair during constitutive expression favoring roaming, we examined the roaming behavior of scd-2(syb2455) animals, which carry a gain-of-function mutation in scd-2 that promotes roaming, and asked how the selective deletion of daf-7 from the ASJ neurons in the scd-2(syb2455) genetic background influences roaming behavior. This new experiment supports a model in which DAF-7 expression from the ASJ neurons contributes to the increased roaming behavior exhibited by scd-2(syb2455) animals. The new experiment is added as Figure 4I.

      An additional minor issue concerns the interpretation of the scd-2 experiments. The authors' findings do support a role for scd-2 signaling in the activation of daf-7 expression by un-ingestible food, but the data also suggest that scd-2 signaling is not essential for this effect, as there is still an effect in scd-2 mutants (Figure 4B).

      Considering that most of previous Figure 4B is redundant with previous Figure 4D, we removed previous Figure 4B. Our current Figure 4 has redesignated previous Figure 4D as 4B. We have also added qualification to the text to indicate that other pathways may modulate the daf-7 expression response to ingested food in parallel to SCD-2 signaling.

      Reviewer #2 (Public Review):

      Summary:

      In this work, Boor and colleagues explored the role of microbial food cues in the regulation of neuroendocrine-controlled foraging behavior. Consistent with previous reports, the authors find that C. elegans foraging behavior is regulated by the neuroendocrine TGFβ ligand encoded by daf-7. In addition to its known role in the neuroendocrine/sensory ASI neurons, Boor and colleagues show that daf-7 expression is dynamically regulated in the ASJ sensory neurons by microbial food cues - and that this regulation is important for exploration/exploitation balance during foraging. They identify at least two independent pathways by which microbial cues regulate daf-7 expression in ASJ: a likely gustatory pathway that promotes daf-7 expression and an opposing interoceptive pathway, also likely chemosensory in nature but which requires microbial ingestion to inhibit daf-7 expression. Two neuroendocrine pathways known to regulate foraging (serotonin and PDF-1) appear to act at least in part via daf-7 induction. They further identify a novel role for the C. elegans ALK orthologue encoded by scd-2, which acts in interneurons to regulate daf-7 expression and foraging behavior. These results together imply that distinct cues from microbial food are used to regulate the balance between exploration and exploitation via conserved signaling pathways.

      Strengths:

      The findings that gustatory and interoceptive inputs into foraging behavior are separable and opposing are novel and interesting, which they have shown clearly in Figure 1. It is also clear from their results that removal of the interoceptive cue (via transfer to non-digestible food) results in rapid induction of daf-7::gfp in ASJ, and that ASJ plays an important role in the regulation of foraging behavior.

      We thank Reviewer 2 for underscoring the modulation of neuroendocrine gene expression in the ASJ neuron pair by distinct gustatory and interoceptive inputs derived from bacterial food that we show in Figure 1.

      The role of the hen-1/scd-2 pathway in mediating the effects of ingested food is also compelling and well-interpreted. The use of precise gain-of-function alleles further supports their conclusions. This implies that important elements of this food-sensing pathway may be conserved in mammals.

      We thank Reviewer 2 for emphasizing the implications of our study on SCD-2/ALK as well as the generation and use of gain-of-function scd-2 alleles based on oncogenic mutations in ALK.

      Weaknesses:

      What is less clear to me from the work at this stage is how the gustatory input fits into this picture and to what extent it can be strongly concluded that the daf-7-regulating pathways that they have identified (del-3/7, 5-HT, PDFR-1, scd-2) act via the interoceptive pathway as opposed to the gustatory pathway.

      It follows from the work of the Flavell lab that del-3/7 likely acts via the interoceptive pathway in this context as well but this isn't shown directly - e.g. comparing the effects of aztreonam-treated bacteria and complete food removal to controls. The roles of 5-HT and PDFR-1 are even a bit less clear. Are the authors proposing that these are entirely parallel pathways? This could be explained in better detail.

      We have added additional data regarding daf-7 expression from the ASJ neurons in the complete absence of food in the different mutant backgrounds noted by Reviewer 2. Data regarding daf-7 expression in the ASJ neurons under three distinct conditions—ingestible bacterial food, non-ingestible bacterial food, and the complete absence of food—enable the pairwise comparison of mutant data that allows for inference regarding the relative contributions of the genes to the interoceptive vs. gustatory pathways. In particular, effects on the interoceptive pathway can be inferred from the comparison of daf-7 expression on ingestible vs. non-ingestible food, whereas effects on the gustatory pathway can be inferred from the comparison of daf-7 expression on non-ingestible food vs. the absence of food (newly added).

      These additional data are most informative for del-3; del-7 (Figure 1H), where the added data corroborate a role for these genes in the interoceptive pathway, consistent with the findings of the Flavell lab. Specifically, the observation that daf-7 expression levels are equivalent between wild-type and del-3; del-7 animals when there is no ingestible food (either no food or non-ingestible food conditions) suggests that DEL-3 and DEL-7 are functioning specifically to sense ingested food.

      For pdfr-1, the analysis of the gain-of-function allele suggests that this pathway may have a greater relative effect on the gustatory pathway compared with the interoceptive pathway (Figure 3D). The robust upregulation seen in pdfr-1(syb3826) animals between ingestible and non-ingestible food suggests that the interoceptive regulation is functional in these mutants, while the lack of upregulation between no-food and non-ingestible-food conditions suggests that the gustatory pathway is affected.

      The observations with the 5-HT biosynthesis mutant are most consistent with serotonin signaling affecting daf-7 expression in the ASJ neurons through a mechanism that is parallel to the gustatory and interoceptive inputs into daf-7 expression in the ASJ neurons, as tph-1(n4622) animals appear to have an elevated baseline expression of daf-7 in the ASJ neurons while retaining sensitivity to both gustatory and interoceptive food cues (Figure 3B).

      The data with scd-2 are consistent with a role in the epistatic interoceptive pathway, considering the roughly equivalent levels of daf-7 expression in the ASJ neurons under all food conditions in scd-2(syb2455) animals (Figure 4B). However, it is difficult to exclude the possibility that SCD-2 functions in both pathways or parallel to the gustatory and interoceptive inputs.

      While we agree that our genetic analysis alone cannot distinguish between genes acting in parallel or directly in serial with the gustatory or interoceptive inputs, our data do establish that signaling through SCD-2, 5-HT or PDFR-1-dependent pathways can act on the same gene expression and signaling node (i.e. daf-7 expression in the ASJ neurons) to modulate the effects of bacterial food inputs on foraging behavior, with the effects on daf-7 expression in the ASJ neurons in scd-2, tph-1 and pdfr-1 mutants correlating with their effects on roaming and dwelling behaviors.
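
      The pairwise-comparison logic described in this response (ingestible vs. non-ingestible food for the interoceptive pathway; non-ingestible food vs. no food for the gustatory pathway) can be illustrated with a minimal sketch. The function name, argument names, and all numbers below are hypothetical placeholders introduced only for illustration; they are not taken from the manuscript and do not represent measured daf-7 expression values.

```python
def pathway_contrasts(ingestible, non_ingestible, no_food):
    """Return the two pairwise contrasts for one genotype.

    Arguments are mean daf-7 reporter levels in ASJ (arbitrary units)
    under the three food conditions. All values passed in the example
    below are made-up placeholders, not data.
    """
    return {
        # Removing the ingestion (interoceptive) signal while keeping
        # the external cue: non-ingestible food vs. ingestible food.
        "interoceptive": non_ingestible - ingestible,
        # Removing the external (gustatory) cue when nothing is being
        # ingested in either case: non-ingestible food vs. no food.
        "gustatory": non_ingestible - no_food,
    }


if __name__ == "__main__":
    # Hypothetical genotype with both inputs intact.
    print(pathway_contrasts(ingestible=1.0, non_ingestible=5.0, no_food=2.0))
```

      Under these assumptions, a contrast near zero in one entry but not the other would point to the corresponding pathway being affected in that genotype, which is the style of inference applied to the mutants discussed above.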

      It would also be helpful to elaborate more on why the identified transcriptional positive feedback loop is predicted to extend roaming state duration - as opposed to some other mechanism of increasing roaming such as increased probability of roaming state initiation. This doesn't seem self-evident to me.

      Given that animals can exist in only two states, the increased probability of roaming state initiation would present as shorter dwelling states, which we do not see for daf-7 mutants. As described in Flavell, et al., 2013, a decreased fraction of time roaming can be attributed to longer dwelling states, shorter roaming states, or both. Our positive feedback loop is predicted to extend roaming states because of the predicted effect of DAF-7 on stabilizing the roaming state.
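
The two-state argument above can be illustrated with a short calculation. This is only a sketch under a simplifying assumption (an animal that strictly alternates between roaming and dwelling, so that the long-run fraction of time roaming is the mean roaming duration divided by the sum of the two mean durations). The function name and all durations are hypothetical and are not taken from the data.

```python
def fraction_roaming(mean_roam_s: float, mean_dwell_s: float) -> float:
    """Long-run fraction of time spent roaming for an animal that
    alternates between roaming and dwelling states (hypothetical model)."""
    return mean_roam_s / (mean_roam_s + mean_dwell_s)


# Hypothetical mean state durations in seconds, for illustration only.
baseline = fraction_roaming(mean_roam_s=60.0, mean_dwell_s=240.0)

# Two different ways to lower the fraction of time roaming:
shorter_roams = fraction_roaming(mean_roam_s=30.0, mean_dwell_s=240.0)
longer_dwells = fraction_roaming(mean_roam_s=60.0, mean_dwell_s=480.0)

print(baseline, shorter_roams, longer_dwells)
```

In this toy case, halving the roaming duration and doubling the dwelling duration give the same reduced fraction of time roaming, which is why the state durations themselves, and not only the fraction, are needed to tell the two mechanisms apart.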

      Related to this point is the somewhat confusing conclusion that the effects of tph-1 and pdfr-1 mutations on daf-7 expression are due to changes in ingestion during roaming/dwelling. From my understanding (e.g. Cermak et al., 2020), pharyngeal pumping rate does not reliably decrease during roaming - so is it clear that there are in fact lower rates of ingestion during roaming in their experiments?

      This is an interesting point. Despite consistent pumping rates, we still believe that roaming animals ingest less food than dwelling animals. For instance, dwelling animals are localized to areas with bacterial food, while roaming animals might traverse patches with no food where pumping does not result in food ingestion.

      If so, why does increased roaming (via tph-1 mutation) result in further increases in daf-7 expression in animals fed aztreonam-treated food (Fig 3B)?

      This is possibly because although roaming animals are eating less, when animals are on non-ingestible food, they’re not eating at all, resulting in further daf-7 upregulation.

      Alternatively, there could be a direct signaling connection between the 5-HT/PDFR-1 pathways and daf-7 expression which could be acknowledged or explained.

      Yes, this is certainly possible. We do not propose that all of the difference in daf-7 expression is due to changes in foraging behavior, but rather we are highlighting further instances of the correlation between daf-7 expression in the ASJ neurons and roaming. For instance, in the case of our tph-1 mutants, we see a relatively modest effect on daf-7 expression in the ASJ neurons but a large difference in the fraction of time roaming. This suggests that the magnitude of change in one (daf-7 expression in ASJ or roaming) does not predict the magnitude of the change in the other, but rather that they trend in the same direction.

      Reviewer #3 (Public Review):

      Summary:

      In this interesting study, the authors examine the function of a C. elegans neuroendocrine TGF-beta ligand DAF-7 in regulating foraging movement in response to signals of food and ingestion. Building on their previous findings that demonstrate the critical role of daf-7 in a sensory neuron ASJ in behavioral response to pathogenic P. aeruginosa PA14 bacteria and different foraging behavior between hermaphrodite and male worms, the authors show, here, that ingestion of E. coli OP50, a common food for the worms, suppresses ASJ expression of daf-7 and secreted water-soluble cues of OP50 increase it. They further showed that the level of daf-7 expression in ASJ is positively associated with a higher level of roaming/exploration movement. Furthermore, the authors identify that a C. elegans ortholog of Anaplastic Lymphoma Kinase, scd-2, functions in an interneuron AIA to regulate ASJ expression of daf-7 in response to food ingestion and related cues. These findings place the DAF-7 TGF-beta ligand in the intersection of environmental food conditions, food intake, and food-searching behavior to provide insights into how orchestrated neural functions and behaviors are generated under various internal and external conditions.

      Strengths:

      The study addresses an important question that appeals to a wide readership. The findings are demonstrated by generally strong results from carefully designed experiments.

      We thank Reviewer 3 for the comments and interest in the work.

      Weaknesses:

      However, a few questions remain to provide a complete picture of the regulatory pathways and some analyses need to be strengthened. Specifically,

      1. The authors show that diffusible cues of bacteria OP50 increase daf-7 expression in ASJ which is suppressed by ingestible food. Their results on del-3 and del-7 suggest that NSM neuron suppresses daf-7 ASJ expression. What sensory neurons respond to bacterial diffusible cues to increase daf-7 expression of ASJ? Since ASJ is able to respond to some bacterial metabolites, does it directly regulate daf-7 expression in response to diffusible cues of OP50 or does it depend on neurotransmission for the regulation? Some level of exploration in this question would provide more insights into the regulatory network of daf-7.

      The focus of our study has been on the modulation of daf-7 expression in the ASJ neurons by distinct bacterial food cues and the downstream neuroendocrine circuitry that is influenced. The question of whether bacterial cues are directly sensed by the ASJ neurons remains unresolved by our study. However, we have previously demonstrated that the daf-7 expression in the ASJ neurons induced by P. aeruginosa metabolites is likely the result of direct detection by the ASJ neurons. We would also note (and have added to the manuscript) the observation of Zaslaver et al. (2015), in which increased calcium transients were observed in the ASJ neurons in response to the withdrawal of E. coli OP50 supernatant, which is consistent with our observations of the effect of a soluble bacterial food signal on daf-7 expression in the ASJ neurons.

      1. The results including those in Figure 2 strongly support that daf-7 in ASJ is required for roaming. Meanwhile, authors also observe increased daf-7 expression in ASJ under several conditions, such as non-ingestible food. Does non-ingestible food induce more roaming?

      Yes, this has been published by Ben Arous, et al., 2009. Figure 3C shows increased roaming on aztreonam-treated food. We have added specific mention of this in the text.

      It would complete the regulatory loop by testing whether a higher (than wild type) level of daf-7 in ASJ could further increase roaming. The results in pdf-1 and scd-2 gain-of-function alleles support more ASJ leads to more roaming, but the effect of these gain-of-function alleles may not be ASJ-specific and it would be interesting to know whether ASJ-specific increase of daf-7 leads to a higher level of roaming. In my opinion, either outcome would be informative and strengthen our understanding of the critical function of daf-7 in ASJ demonstrated here.

      We looked at roaming in animals with a ptrx-1::daf-7 cDNA transgene in a wild-type background and did not see changes in the fraction of time animals roam. However, multiple experimental factors could contribute to our inability to detect an effect, including relative promoter strength and context of other variables that alter daf-7 expression. Nevertheless, our data confirmed that ASJ neuron-specific expression of daf-7 cDNA can increase roaming in a daf-7 mutant background (Figure 2B).

      We have also included an experiment (Figure 4I) looking at roaming in scd-2(syb2455) gain-of-function animals with daf-7 deleted from the ASJ neurons. These results suggest that part of the increased roaming seen in these scd-2(syb2455) animals is specifically due to increased daf-7 expression in the ASJ neurons.

      1. The analyses in Figure 4 cannot fully support "We further observed that the magnitude of upregulation of daf-7 expression in the ASJ neurons when animals were moved from ingestible food to non-ingestible food was reduced in scd-2(syb2455) to levels only about one-fourth of those seen in wild-type animals (Figure 4D)...", because the authors tested and found the difference in daf-7 expression between ingestible and non-ingestible food conditions in both wild type and the mutant worms. The authors did not analyze whether the induction was different between wild type and mutant. Under the ingestible food condition, ASJ expression of daf-7 already looks different in scd-2(syb2455).

      We appreciate the reviewer pointing out our lack of clarity in discussing our analysis of the data. The 4x difference represents the difference in fold change from ingested to non-ingested food in wild type and scd-2(syb2455) backgrounds. For wild-type animals, daf-7 expression in the ASJ neurons is 8.1 times higher on non-ingestible food than on ingestible food. In scd-2(syb2455) animals, this difference is 1.7 times. We have clarified this in the text.

      1. The authors used unpaired two-tailed t-tests for all the statistical analyses, including when there are multiple groups of data and more than one treatment. In their previous study Meisel et al 2014, the authors used one-way ANOVA, followed by Dunnett's or Tukey's multiple comparison test when they analyzed daf-7 expression or lawn leaving in different mutants or under different bacterial conditions. It is not clear why a two-tailed t-test was used in similar analyses in this study.

      We have performed one-way ANOVAs for all comparisons included, and the results were largely consistent with what we found for t-tests. Ultimately, for our analysis we were most interested in pairwise comparisons and decided that t-tests would be most appropriate.

      Reviewer #1 (Recommendations For The Authors):

      Line 170: For clarity, I suggest editing this to: "When animals are removed from edible food but are still exposed to soluble food signals, upregulation of daf-7..."

      We have edited this in the text and appreciate the suggestion.

      The authors report that pdfr-1(syb3826) was retrieved from "a screen done in parallel to this work." syb3826 is a Suny Biotech allele, suggesting that this screen may not have been done in the authors' lab but rather outsourced. Some additional details might be useful.

      This S325F allele was originally recovered as qd385 in an EMS screen performed in our lab. syb3826 is an independently generated Suny Biotech allele we ordered to confirm that the S325F substitution in PDFR-1 was responsible for our phenotypes. This has been clarified in the text.

      Line 210: Please provide a citation for the screen that identified hen-1(qd259).

      This is the first time the allele is being published. The screen is included in two theses from our lab, Meisel 2016 and Park 2019.

      Line 214: It would be useful here to also mention the previously identified role of scd-2 in sensory integration.

      Yes, we have added this to the text. Additionally, we have included a couple of sentences in the discussion about how previous studies that have found a role for SCD-2 in sensory integration may instead be detecting the role for SCD-2 in food sensing, as many of the assays used for sensory integration are also sensitive to nutritional status of the animals.

      Line 271: Please provide a citation for the sex differences in food-leaving behavior (Lipton 2004 PMID 15329389 is the first careful characterization of this).

      We have added this to the text.

    1. eLife assessment

      This work advances on two Aso et al 2014 eLife papers to describe further resources that are valuable for the field. This paper identified and contributes additional MBON split-Gal4s, convincingly describing their anatomy, connectivity and function.

    1. eLife assessment

      This valuable study shows that auxin exposure perturbs feeding behavior, survival rates, lipid metabolism, and gene expression patterns in adult Drosophila flies. The results are solid with proper methods and data analyses, and the evidence broadly supports the conclusions with only minor weaknesses. This work is relevant for fly geneticists who are interested in using the auxin-inducible gene expression system for inducing target protein degradation acutely.

    2. Author Response

      The following is the authors’ response to the original reviews.

      REVIEWER 1:

      Reviewer 1 stated: “The authors have provided strong evidence that high levels of auxin exposure perturb feeding behavior, survival rates, lipid metabolism, and gene expression patterns, providing a cautionary note for the field in using this technology.” They also concluded that “overall, the experiments were suitably designed with appropriate sample size and data analysis methods.”

      Reviewer 1 provided the following recommendations for improvement, which are addressed below:

      Point 1: “Although authors showed that auxin causes gene expression changes including the possible alteration of Gal4 expression levels, no cell-type-specific data is provided. It would be informative to the Drosophila field if the authors could examine major Gal4 drivers in their expression levels, such as the ones used in studying metabolism and oogenesis.”

      We agree with the reviewer that cell-type-specific Gal4 expression should be thoroughly analyzed by scientists in the community wishing to use the current auxin-inducible gene expression system (AGES) in their studies; however, those analyses are beyond the scope of our manuscript. There are many tissues and cell types that are used to study metabolism and oogenesis (e.g., muscle, adipocytes, oenocytes, multiple cell types in the gut, multiple cell types in the ovary), and Gal4 expression patterns could be different depending on age, sex, and diet. It is therefore impossible for us to pinpoint one or two key tissues important for regulating lipid levels, and attempting to do so would require a significant investment of time. We believe that each researcher should thoroughly check the Gal4 expression pattern for their specific tissue of interest under their normal standard or altered food conditions. As this reviewer pointed out, our current study provides a cautionary note for the field in using this technology. Nevertheless, we have provided a reference to a recent micropub (Hawley et al; PMID: 37396791) which describes neuronal Gal4 expression patterns comparing the AGES and temporal and regional gene expression targeting (TARGET) systems and updated the text in lines 539-544 of the revised manuscript.

      Point 2: “Although the authors briefly mentioned aging research, feeding behavior, and lipid metabolism, RNA-seq data are provided only for short-term treatment (2 days). The ovary phenotype was examined with long-term treatment (15 days). It would be informative if the authors could also show other long-term treatment data.”

      We respectfully point out to the reviewer that a 5-day auxin feeding assay was provided in Figure S4H, which reproduces the data provided for the 2-day auxin treatment. In addition, the original AGES paper (McClure et al, PMID: 35363137) provided adult survival data that extended to 80 days. In our updated manuscript, we have provided data for a 10-day auxin treatment that also addresses Point #4 below regarding whether the decrease in lipid levels upon auxin feeding is reversible.

      Point 3: “The auxin used in this work is a more water-soluble version and at a high concentration (10 mM). In the C. elegans system, researchers are using a much lower concentration of auxin typically at 1 mM. Therefore, the discussion of their results in terms of potential impacts on other experimental systems should be done carefully. It would be helpful to know what impacts might be observed at a lower concentration of auxin. The recommendation would be that the authors add the 1 mM auxin data point to key elements of their analysis.”

      The concentration of 10 mM auxin used in our study is the recommended dose to use in Drosophila (see McClure et al) and has been used in at least one additional study (Hawley et al). We also would like to point out that other systems (e.g., C. elegans and mice) have many differences in physiology and therefore the concentrations of auxin used to elicit a response are likely to be different (e.g., 71.4 mM final concentration is the recommended concentration used in mice; Macdonald et al; PMID: 35736539). We have merely suggested that researchers using auxin for protein degradation should carefully check whether lipid levels (or other physiological processes of interest) are altered upon auxin feeding (or soaking) alone compared to a 0 mM auxin control. The text in lines 467-470 has been altered to reflect this. In addition, the specific recommended dose for Drosophila is highlighted and referenced in multiple places (i.e., methods and results and discussion) throughout the updated text.

      Point 4: “Another related question is whether these detected changes are reversible or not after exposure to auxin at different concentrations. This would be informative for researchers to better design their temporally controlled experiments.”

      We thank the reviewer for this suggestion and have provided the data in Figure S4I. Briefly, we found that after a 5-day treatment of auxin, removal of auxin for an additional 5 days does not recover lipid levels to those of control animals never exposed to auxin.

      Point 5: “It would also be helpful to know whether spermatogenesis is affected or not.”

      Although it would be interesting to determine whether this developmental process is affected by auxin exposure, we believe that these analyses are beyond the scope of the current manuscript.

      Point 6: “A few other points include changing the nomenclature and validating some of the key genes shown in Figure 3 using quantitative RT-PCR experiments with the tissues where the affected genes are known to be expressed and functional.”

      We thank the reviewer for this suggestion. We have provided qRT-PCR analysis using whole-body samples, and these data are now provided in the new Figure S8. We used whole-body samples for the qRT-PCR analysis because it would be impossible to pinpoint the specific tissue in which the differentially regulated genes are required to elicit the response to auxin exposure. For example, according to FlyBase (flybase.org), GstE3 transcripts are moderately to highly expressed in 15 of the 23 cell types annotated by the Fly Cell Atlas project (Li et al; PMID: 35239393).

      REVIEWER 2:

      Reviewer 2 stated: “The authors provide evidence of several Auxin effects. Experiments are suitably designed with appropriate sample size and data analysis methods.”

      This reviewer expressed the following concerns, which are addressed below:

      Point 1: “The provided information is limited and not very helpful for many applications. For example, although authors briefly mentioned aging research, feeding behavior, and lipid data, RNA seq data are provided only for short-term (48 hours) treatment. Especially, since ovary phenotype was examined with long-term treatment (15 days), authors should also show other data for long-term treatment as well.”

      Please see our response to Point #2 of Reviewer 1 regarding long-term treatment experiments. Furthermore, although the ending timepoint for the ovarian analyses is 15 days, we also provide analysis at shorter time points (e.g., daily analysis for egg counts, 5 and 10 day timepoints for fixed sample analyses).

      Point 2: “Although the authors show that Auxin causes a change in gene expression patterns and suggests the possible alteration of Gal4 expression levels, no cell-type-specific data is provided. It would be informative if the authors could examine the expression level of major Gal4 drivers. Authors should discuss how severe these changes are by comparing them with other treatments or conditions, such as starvation or mutant data (ideally, comparing with reported data or their own data if any?).”

      Please see our response to Point #1 from Reviewer 1.

      REVIEWER 3:

      Reviewer 3 stated that they “found the study to be carefully done” and “this study will be of interest to researchers using the Drosophila system, especially those focusing on fatty acid metabolism or physiology.”

      Reviewer 3 also had the following minor points, which are addressed below:

      Point 1: “Auxin, actually 1-naphthaleneacetic acid here, which is a more water-soluble version of auxin (indole-3-acetic acid) is used at what I consider to be a high concentration, 10 mM. The problem I have is that the authors are discussing their results in terms of potential impacts on other experimental systems. At least for C. elegans, I think this is not a reasonable extension of the current dataset. In the C. elegans system, researchers are using 1 mM auxin. The authors note that their RNA-seq results suggest a xenobiotic response. Could this apparent xenobiotic response be due to a metabolic byproduct following auxin administration at high concentrations? Figure S1A shows that there is quite a robust transcriptional response at 1 mM auxin. It would be helpful to know what impacts might be observed at this lower concentration in which the transcriptional induction could be used in the context of biologically meaningful experiments. The recommendation would be that the authors add the 1 mM auxin data point to key elements of their analysis.”

      Regarding the comparisons to other model organisms, we refer to our response to Point #3 from Reviewer 1. We also point out that although there is a robust response to 1 mM auxin using the 3.1Lsp2-Gal4 driver, 1 mM is not sufficient for a robust response using additional driver lines in Drosophila (see Hawley et al). It is possible that the xenobiotic response is due to using the recommended dose of auxin (McClure et al).

      However, given the fact that researchers are currently using the 10 mM dose for experiments in Drosophila, we believe that the 10 mM transcription dataset is the most relevant. Nevertheless, we do agree that researchers who choose to use lower concentrations of auxin in the future should carefully look at whether any transcriptional induction alters physiological processes of interest.

      Point 2: “This reviewer was confused by the genetic nomenclature the authors use. The authors have chosen to use the designation 3.1Lsp2-Gal4 (3.1Lsp2-Gal4AID). I think this is potentially confusing because a reader might think that it is the Gal4 transcription factor that is the direct target of auxin- and TIR1-mediated protein degradation, as I initially did. Rather, it is the Gal80 repressor protein that is the direct target. The authors might consider a nomenclature that is more reflective of how this system works. It would also be helpful if the full genotypes of strains were included in each figure legend.”

      We apologize for the nomenclature confusion in our original submission. We have changed our “AID” nomenclature throughout the manuscript to “AGES,” which is the nomenclature used in McClure et al. We respectfully note that the traditional nomenclature for using the temperature-sensitive Gal80 system is Gal80ts or adding the “ts” superscript to the Gal4 line used (e.g., 3.1Lsp2ts).

      Point 3: “The RNA-seq dataset does not appear to be validated by RT-PCR experiments. The authors should consider validating some of the key genes shown in Figure 3 using quantitative RT-PCR experiments, potentially adding a 1 mM auxin data point.”

      Please see our response to Point #6 to Reviewer 1.

      REVIEWER 4:

      Reviewer 4 stated: “Overall, the experiments were well-designed and carefully executed. The results were quantified with appropriate statistical analyses. The paper was also well-written and the results were presented logically.”

      RECOMMENDATIONS FOR THE AUTHORS:

      We have further addressed reviewer recommendations below. Thank you again for your critique of our manuscript.

      REVIEWER 2:

      As I mentioned in my public review, long-term treatment data would be especially helpful. Examining changes in the expression level of major Gal4 lines is also informative.

      Please see our responses to Points #1 and #2 to Reviewer 1 in the “Public Reviews” section. Although examination of Gal4 expression patterns is extremely important, we believe that these analyses should be carefully performed on a case-by-case basis in the future for labs who wish to continue to use this methodology.

      REVIEWER 4:

      I feel addressing #2 would be a great addition to the current version, while #1 and #3 could be addressed in future studies or by researchers who are interested in these processes.

      Recommendation 1: “Both the metabolomics and transcriptome analyses were done using the whole animals, would it be more informative if these were done using specific tissue/organs such as the adult adipose tissue?”

      Please see our response to Points #1 and #6 to Reviewer 1 in the “Public Reviews” section.

      Recommendation 2: “Another related question is whether these detected changes are reversible or not after exposure to auxin? This would be informative for researchers to better design their temporally controlled experiments.”

      We thank the reviewer for this suggestion and the analysis for this experiment is now provided in Figure S4I.

      Recommendation 3: “Is spermatogenesis affected at all?”

      We respectfully point out that many processes in spermatogenesis (as well as other biological processes) are affected by feeding (e.g., starvation), and that performing these analyses with the rigor required would be extremely time consuming. We agree with Reviewer 4 and believe that this would best be examined on a case-by-case basis in the future.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable paper examines the Bithorax complex in several butterfly species, in which the complex is contiguous and not split, as it is in the well-studied fruit fly Drosophila. Based on genetic screens and genetic manipulations of a boundary element involved in segment-specific regulation of Ubx, the authors provide solid evidence for their conclusions, which could be further strengthened by additional data and analyses. The data presented are relevant for those interested in the evolution and function of Hox genes and of gene regulation in general.

      We are deeply grateful to the eLife editorial team and the two reviewers for their thoughtful and constructive feedback. We have used this feedback to improve our manuscript and have provided a point-by-point response below.

      Public Reviews:

      Reviewer #1 (Public Review):

      In their article, "Cis-regulatory modes of Ultrabithorax inactivation in butterfly forewings," Tendolkar and colleagues explore Ubx regulation in butterflies. The authors investigated how Ubx expression is restricted to the hindwing in butterflies through a series of genomic analyses and genetic perturbations. The authors provide evidence that a Topologically Associated Domain (TAD) maintains a hindwing-enriched profile of chromatin around Ubx, largely through an apparent boundary element. CRISPR mutations of this boundary element led to ectopic Ubx expression in forewings, resulting in homeotic transformation in the wings. The authors also explore the results of the mutation in two non-coding RNA regions as well as a possible enhancer module. Each of these induces homeotic phenotypes. Finally, the authors describe a number of homeotic phenotypes in butterflies, which they relate to their work.

      Together, this was an interesting paper with compelling initial data. That said, I have several items that I feel would warrant further discussion, presentation, or data.

      First, I would not state, "Little is known about how Hox genes are regulated outside of flies." They should add "in insects" since so much is known in vertebrates.

      Corrected

      For Figure 1, it would aid the readers if the authors could show the number of RNAseq reads across the locus. This would allow the readership to evaluate the frequency of the lncRNAs, splice variants, etc.

      We have found it useful in the past to feature “Sashimi Plots”, as they provide a good overview of transcript splicing junctions and read support. Here we could not accommodate this in our Fig. 1A as this would require compiling the RNAseq reads from many tissues and stages to be meaningful, and we would lose the resolution on forewing vs hindwing tissues that is important in this article (only the Kallima inachus dataset allows this comparison, and was used in Fig 1B). More specifically, the wing transcriptomes available for J. coenia and V. cardui are not deep enough to provide a good visualization of Antp alternative promoter usage or on AS5’ transcription.

      How common are boundary elements within introns? Typically, boundary elements are outside gene bodies, so this could be explored further. This seems like an interesting bit of biology which, following from the above point, it would be interesting to, at a minimum, discuss, but also relate to how transcription occurs through a possible boundary element (are there splice variants, for example?).

      We do not see evidence of alternative splicing, and prefer to avoid speculating on transcriptional effects, but we agree that the intragenicity of the TAD boundary is interesting. We briefly highlighted this point in the revised Discussion:

      "Lastly, it is worth noting that the Antp/Ubx TAD boundary we identified is intragenic, within the last intron of Ubx. It is unclear if this feature affects Ubx transcription, but this configuration might be analogous to the Notch locus in Drosophila, which includes a functional TAD boundary in an intronic position (Arzate-Mejía et al. 2020)."

      The CRISPR experiments led to compelling phenotypes. However, as a Drosophila biologist, I found it hard to interpret the data from mosaic experiments. For example, in control experiments, how often do butterflies die? Are there offsite effects? It's striking that single-guide RNAs led to such strong effects. Is this common outside of this system? Is it possible to explore the function effects at the boundary element - are these generating large deletions (for example, like Mazo-Vargas et al., 2022)? For the mosaic experiments, how frequent are these effects in nature or captive stocks? Would it be possible to resequence these types of effects? At the moment, this data, while compelling, was hard to put into the context of the experiments above without understanding how common the effects are. Ideally, there would be resequencing of these tissues, which could be targeted, but it was not clear to me the general rates of these variants.

      We agree with this assessment completely: mosaics complicate the proper interpretation of CRISPR-based perturbation assays in regulatory regions. Here, unlike in Mazo-Vargas et al. (2022), we were unable to breed homeotic effects to a G1 generation, possibly because the phenotypes are dominant and lethal at the embryonic stage (see also our reply to Reviewer 2). This means that mosaic mutants are often survivors with clones of restricted size in the wing, and they are probably rare, but we are unable to meaningfully measure a mutation spectrum frequency (e.g. how often large deletions are generated). As mentioned in the first paragraph of our Discussion, we think that many of the phenotypes we observed (besides the Ubx GOF effects from the BE targeting) were confounded by alleles that could include large SVs. We aim to address these questions in an upcoming manuscript, at a locus where regulatory perturbation does not impact survival, including using germline mutants and unbiased genotyping (whole genome resequencing).

      We elaborated on this issue in our Discussion:

      "It is crucial here to highlight the limitations of the method, in order to derive proper insights about the functionality of the regulatory regions we tested. In essence, butterfly CRISPR experiments generate random mutations by non-homologous end joining repair, that are usually deletions (Connahs et al. 2019; Mazo-Vargas et al. 2022; Van Belleghem et al. 2023). Ideally, regulatory CRISPR-induced alleles require genotyping in a second (G1) generation to be properly matched to a phenotype (Mazo-Vargas et al. 2022). Possibly because of lethal effects, we failed to pass G0 mutations to a G1 generation for genotyping, and were thus limited here to mosaic analysis. As adult wings have lost scale building cells that may underlie a given phenotype, we circumvented this issue by genotyping a pupal forewing displaying an homeotic phenotype in the more efficient Antp-Ubx_BE perturbation experiment (Fig. S4). In this case, PCR amplification of a 600 bp fragment followed by Sanger sequencing recovered signatures of indel variants, with mixed chromatograms starting at the targeted sites. But in all other experiments (CRM11, IT1, and AS5’ targets), we did not genotype mutant tissues, as they were only detected in adult stages and generally with small clone sizes. Some of these clones may have been the results of large structural variants, as data from other organisms suggests that Cas9 nuclease targeting can generate larger than expected mutations that evade common genotyping techniques (Shin et al. 2017; Adikusuma et al. 2018; Kosicki et al. 2018; Cullot et al. 2019; Owens et al. 2019). Even under the assumption that such mutations are relatively rare in butterfly embryos, the fact we injected >100 embryos in each experiment makes their occurrence likely (Fig. 9), and we are unable to assign a specific genotype to the homeotic effects we obtained in CRM11, IT1 and AS5’ perturbation assays."

      Our revision also includes a new Fig. S4 that features the mosaic genotyping of a G0 Antp-Ubx_BE mutant tissue. While this does not fully address the reviewer questions, it provides reasonable validation that the frequent GOF effects we observed upon perturbation at this target site are generated by on-target indels from DNA repair.

      Author response image 1.

      Validation of CRISPR-induced DNA Lesions in an Antp-Ubx_BE crispant pupal forewing. (A-A') Pupal forewing cuticle phenotype of an Antp-Ubx_BE J. coenia crispant, as in Fig. S3. (B-B") Aspect of the same forewing under trans-illumination following dissection out of the pupal case. Regions from mutant clones have a more transparent appearance. (C). Sanger sequencing of an amplicon targeting the Antp-Ubx_BE region in the mutant tissue shown in panel B", compared to a control wing tissue, showing mixed chromatogram around the expected CRISPR cutting site due to indel mutations from non-homologous end-joining.

      In sum, I enjoyed the extensive mosaic perturbations. However, I feel that more molecular descriptions would elevate the work and make a larger impact on the field.

      Reviewer #2 (Public Review):

      Summary:

      The existence of hox gene complexes conserved in animals with bilateral symmetry and in which the genes are arranged along the chromosome in the same order as the structures they specify along the anteroposterior axis of organisms is one of the most spectacular discoveries of recent developmental biology. In brief, homeotic mutations lead to the transformation of a given body segment of the fly into a copy of the next adjacent segment. For the sake of understanding the main observation of this work, it is important to know that in loss-of-function (LOF) alleles, a given segment develops like a copy of the segment immediately anterior to it, and in gain-of-function mutations (GOF), the affected segment develops like a copy of the immediately posterior segment. Over the last 30 years the molecular lesions associated with GOF alleles led to a model where the sequential activation of the hox genes along the chromosome result from the sequential opening of chromosomal domains. Most of these GOF alleles turned out to be deletions of boundary elements (BE) that define the extent of the segment-specific regulatory domains. The fruit fly Drosophila is a highly specialized insect with a very rapid mode of segmentation. Furthermore, the hox clusters in this lineage have split. Given these specificities it is legitimate to question whether the regulatory landscape of the BX-C we know of in D.melanogaster is the result of very high specialization in this lineage, or whether it reflects a more ancestral organization. In this article, the authors address this question by analyzing the continuous hox cluster in butterflies. They focus on the intergenic region between the Antennapedia and the Ubx gene, where the split occurred in D.melanogaster. Hi-C and ATAC-seq data suggest the existence of a boundary element between 2 Topologically-Associated-Domain (TAD) which is also characterized by the presence of CTCF binding sites. 
Butterflies have 2 pairs of wings originating from T2 (forewing) specified by Antp and T3 specified by Ubx (hindwing). Remarkably, CRISPR mutational perturbation of this boundary leads to the hatching of butterflies with homeotic clones of cells with hindwing identities in the forewing (a posteriorly oriented homeotic transformation). In agreement with this phenotype, the authors observe ectopic expression of Ubx in these clones of cells. In other words, CRISPR mutagenesis of this BE region identified by molecular tools gives rise to homeotic transformations directed towards a more posterior segment, like the boundary mutations that were first identified on the basis of their posteriorly oriented homeotic transformations in Drosophila. None of the mutant clones they observed affect the hindwing, indicating that their scheme did not affect the nearby Ubx transcription unit. This is reassuring and important first evidence that some of the regulatory paradigms that have been proposed in fruit flies are also at work in the common ancestor of Drosophila and Lepidoptera.

      Given the large size of the Ubx transcription unit and its associated regulatory regions it is not surprising that the authors have identified ncRNAs that are conserved in 4 species of Nymphalinae butterflies, some of which are also present in D.melanogaster. Attempts to target the promoters by CRISPR give rise to clones of cells in both forewings and hindwings, suggesting the generation of regulatory mutations associated with both LOF and GOF transformations. The presence of clones with dual homeosis suggests the targeting of Ubx activator and repression CRMs. Unfortunately, these experiments do not allow us to make further conclusions on the role of these ncRNAs or in the identification of specific regulatory elements. In the opinion of this reviewer, some recent papers addressing the role that these ncRNAs may play in boundary function should be taken with caution, and evidence that ncRNA(s) regulate boundaries in the BX-C in a WT context is still lacking.

      Strengths:

      The convincing GOF phenotype resulting from the targeting of the Antp-Ubx_BE.

      Weaknesses:

      The lack of comparisons with the equivalent phenotypes obtained in D.melanogaster with for example the Fub mutation.

      We are grateful for this excellent contextualization of our findings and have incorporated some of the historical elements into our revision, as detailed below.

      Reviewer #2 (Recommendations For The Authors):

      In the whole paper, the authors bring the notion of boundaries through the angle of the existence of TADs and almost entirely neglect to explain the characteristics of boundary mutations in the BX-C. To my knowledge, examples where targeted boundary deletions between TADs result in misregulation of the neighboring genes, and/or a phenotype, are extremely sparse (especially in the context of the mouse hox genes). Given the extensive literature describing the boundary mutations and their associated GOF phenotypes, the paper would certainly gain strength if the authors justified their approach through this wealth of information. I must admit that this referee is surprised by the absence of any references to the founding work of the Karch and Bender laboratories on this topic. As a matter of fact, one of the founding members of the boundary class of regulatory elements was already brought in 1993 with the Fab-7 and Mcp elements of the BX-C. Based on gain-of-function homeotic phenotypes, additional Fab boundaries were added to the list. Finally, in 2013, Bender and Lucas (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3606092/) identified the Fub boundary element that delimits the Ubx and abd-A domains in the BX-C. Fub fulfills the criterion of lying at the border of 2 neighboring TADs. Significantly, a deletion of Fub leads to a very penetrant and strong homeotic gain-of-function phenotype in which the flies hatch with a 1st abdominal segment transformed into the 2nd. In agreement with this, abd-A is expressed one parasegment too anterior in embryos. This is exactly the observation gathered from the targeted mutations in the Antp-Ubx_BE; a dominant transformation of anterior to posterior wing accompanied by an ectopic expression of Ubx in the forming primordia of the forewing where it is normally silenced. I believe the paper would gain credibility if the results were reported with the knowledge of the similarities with Fub.

      Line 53, I am not aware of the existence of TADs for each of the 9 regulatory domains. The insulators delimit the extent of the regulatory domains but certainly not of TADs.

      We thank the reviewer for these suggestions, as well as for the correction – we agree our previous text suggested that all BX-C boundaries are TAD boundaries, which was incorrect. We added a new introduction paragraph that combines classic literature on GOF mutations at boundary elements with recent evidence these are TAD insulators, including Fub (as suggested), and adding Fab-7 for breadth of scope.

      "For instance, the deletion of a small region situated between Ubx and abd-A produces the Front-ultraabdominal phenotype (Fub) where the first abdominal segment (A1) is transformed into a copy of the second abdominal segment A2, due to a gain-of-expression of abd-A in A1 where it is normally repressed (Bender and Lucas 2013). At the molecular level, the Fub boundary is enforced by insulating factors that separate Topologically Associating Domains (TADs) of open-chromatin, while also allowing interactions of Ubx and abd-A enhancers with their target promoters (Postika et al. 2018; Srinivasan and Mishra 2020). Likewise, the Fab-7 deletion, which removes a TAD boundary insulating abd-A and Abd–B (Moniot-Perron et al. 2023), transforms parasegment 11 into parasegment 12 due to an anterior gain-of-expression of Abd-B (Gyurkovics et al. 1990). By extrapolation, one may expect that if the Drosophila Hox locus was not dislocated into two complexes, Antp and Ubx 3D contact domains would be separated by a Boundary Element (BE), and that deletions similar with Fub and Fab-7 mutations would result in gain-of-function mutations of Ubx that could effectively transform T2 regions into T3 identities."

      A reference to the 1978 Nature article of Lewis should be added after line 42 of introduction.

      Added

      Line 56-57; the BX-C encoded miRNAs are known to regulate Ubx and abd-A, but not Abd-B.

      Corrected

      From lines 57 to 61, the authors mention reports aimed at demonstrating a role of ncRNA into Ubx regulation. To my eyes, these gathered evidences are rather weak. A reference to the work of Pease et al in Genetics in 2013 should be mentioned (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3832271/).

      Added. Our paragraph includes qualifier language about the functionality of the Ubx-related ncRNAs (“are thought to”, “appears to”), and updated references regarding bxd (Petruk et al. 2006; Ibragimov et al. 2023).

      Line 62 authors, should write "Little is known about how Hox genes are regulated outside of Drosophila" and not flies.

      Corrected

      Lines 110-112: could lncRNA:Ubx-IT1 correspond to PS4 antisense reported by Pease et al in 2013 (see URL above)? Lines 115-117: could lncRNA:UbxAS5' correspond to bxd antisense of Pease et al in 2013 (see above)?

      As we could not detect sequence similarities, we preferred to avoid drawing homology, and we intentionally avoided reference to the fly transcripts when we named IT1 and AS5’. This said, we agree it is important to clarify that further studies are needed to clarify this relationship. We elaborated on this point in our discussion:

      "Of note, a systematic in-situ survey (Pease et al. 2013) showed that Drosophila embryos express an antisense transcripts in its 5’ region (lncRNA:bxd), as well as within its first intron (lncRNA:PS4). It is thought that Drosophila bxd regulates Ubx, possibly by transcriptional interference or by facilitation of the Fub-1 boundary effect (Petruk et al. 2006; Ibragimov et al. 2023), while the possible regulatory roles of PS4 remain debated (Hermann et al. 2022). While these dipteran non-coding transcripts lack detectable sequence similarity with the lepidopteran IT1 and AS5’ transcripts, further comparative genomics analyses of the Ubx region across the holometabolan insect phylogeny should clarify the extent to which Hox cluster lncRNAs have been conserved or independently evolved."

      Lines 154-155: "This concordance between Hi-C profiling and CTCF motif prediction thus indicates that Antp-Ubx_BE region functions as an insulator between regulatory domains of Antp and Ubx". This is only correlative, I would write "suggests" instead of "indicates" and add a "might function".

      Corrected as suggested.

      Line 254, I assume the authors wish to write Ubx-IT1 in V. cardui instead of Ubx-T1.

      Typo corrected

      Line 255 : Fig.5 is absent from the pdf file and replaced by table 1. I did not find a legend for Table 1.

      Corrected, with our sincere apologies for the loss of this image in our first submission.

      Line 293 "Individual with hindwing clones 2.75 times more common than...." "are" is missing?

      Corrected

      Lines 303-313, it is not entirely clear how many guide RNAs were injected. Would be useful to indicate the sites targeted in Fig.S8.

      We specify in the revised text : using a single guide RNA (Ubx11b9)

      Lines 323-337: it is not entirely clear to this referee (a drosophilist) if those spontaneous mutations can be inbred or whether these individuals are occasional mosaics. In general, did anyone try to derive lines from those mosaic animals? Is it possible to hit the germline at the syncytial stages at which the guides are injected? Are the individuals with wing phenotype fertile? Given the fact that the Antp-Ubx_BE mutations should be dominant, I wonder if this characteristic would not help in identifying germline transmission. Similar remark for the discussion where the authors explain at line 360, that genotyping can only be done in the progeny of the G0. I do not have the impression that the authors have performed this genotyping and if I am right, I do not understand why.

      We improved our discussion section on this topic (new text in orange):

      "It is crucial here to highlight the limitations of the method, in order to derive proper insights about the functionality of the regulatory regions we tested. In essence, butterfly CRISPR experiments generate random mutations by non-homologous end joining repair, that are usually deletions (Connahs et al. 2019; Mazo-Vargas et al. 2022; Van Belleghem et al. 2023). Ideally, regulatory CRISPR-induced alleles require genotyping in a second (G1) generation to be properly matched to a phenotype (Mazo-Vargas et al. 2022). Possibly because of lethal effects, we failed to pass G0 mutations to a G1 generation for genotyping, and were thus limited here to mosaic analysis. As adult wings have lost scale building cells that may underlie a given phenotype, we circumvented this issue by genotyping a pupal forewing displaying an homeotic phenotype in the more efficient Antp-Ubx_BE perturbation experiment (Fig. S4). In this case, PCR amplification of a 600 bp fragment followed by Sanger sequencing recovered signatures of indel variants, with mixed chromatograms starting at the targeted sites. But in all other experiments (CRM11, IT1, and AS5’ targets), we did not genotype mutant tissues, as they were only detected in adult stages and generally with small clone sizes. Some of these clones may have been the results of large structural variants, as data from other organisms suggests that Cas9 nuclease targeting can generate larger than expected mutations that evade common genotyping techniques (Shin et al. 2017; Adikusuma et al. 2018; Kosicki et al. 2018; Cullot et al. 2019; Owens et al. 2019). Even under the assumption that such mutations are relatively rare in butterfly embryos, the fact we injected >100 embryos in each experiment makes their occurrence likely (Fig. 9), and we are unable to assign a specific genotype to the homeotic effects we obtained in CRM11, IT1 and AS5’ perturbation assays."

      We agree that the work we conducted with mosaics has important caveats. So far, our attempts at breeding homeotic G0 mutants have not been fruitful at this locus, while less deleterious loci can yield viable alleles into further generations, such as WntA (published) and cortex (in prep.). We prefer to stay vague about negative data here, as it is difficult to disentangle whether they were due to real mutational effects (e.g. the alleles can be dominant and lethal in the G1 generation), to a failure to obtain germline carriers of mutations as founders, or to health issues that are often amplified by inbreeding depression (including a possible iflavirus in our V. cardui cultures).

      We concur with the prediction that Antp-Ubx_BE mutations are probably dominant, and intend to follow up with similar GOF experiments in the Plodia pantry moth, a laboratory model for lepidopteran functional genomics that is more amenable than butterflies to inbreeding and long-term studies in mutant lines. In our experience (https://www.frontiersin.org/articles/10.3389/fevo.2021.643661/full), Ubx coding knock-out can be more extensive in Plodia than in butterflies, so we think these animals will also be more resilient to the deleterious effects of the GOF phenotype.

      Line 423, 425, I am not a fan of the term "de-insulating"!!!!!

      We replaced this neologism with "Similar deletion alleles resulting in a TAD fusion and misexpression effect" (see below).

      Line 425, why bring the work on Notch while there are so many examples in the BX-C itself....

      Our revised sentence makes it clearer that we are referring here to documented examples of deletion-mediated TAD fusion (i.e. featuring a conformation capture assay such as HiC/micro-C):

      This suggests a possible loss of the TAD boundary in the crispant clones, resulting in a TAD fusion or in a long-range interaction between a T2-specific enhancer and Ubx promoter. Similar deletion alleles resulting in a TAD fusion and misexpression effect have been described at the Notch locus in Drosophila (Arzate-Mejía et al. 2020), in digit-patterning mutants in mice and humans (Lupiáñez et al. 2015; Anania et al. 2022), or at murine and fly Hox loci depleted of CTCF-mediated regulatory blocking (Narendra et al. 2015; Gambetta and Furlong 2018; Kyrchanova et al. 2020).

      Our revision also includes more emphasis on the Drosophila BX-C boundary elements Fub and Fab-7 (see above).

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      The manuscript is very well written, the data are clearly presented and the methodology is robust. I only have suggestions to improve the manuscript, to make the study more appealing or to discuss in more detail some questions raised by the work.

      1. In the study as it stands, PFG seems to come out of the blue. The authors apparently selected this protein based on sequence conservation between species but this is unlikely to be sufficient to identify novel TFs. Explaining in more detail the reasoning that led to PFG would make the story more appealing. Perhaps PFG was identified through a large reverse genetics screening?

      Response: Thank you for your suggestion. We identified this gene solely by the strategy we described in the manuscript. We decided on this strategy based on the findings of our previous study on AP2-family TFs, whose DNA binding domains are highly conserved among Plasmodium orthologues. Using this screening strategy, we identified a novel AP2-family TF, AP2-Z. The results of the present study demonstrated that this strategy is applicable to TFs other than those belonging to the AP2 family. We are aware that this strategy is not all-encompassing. In fact, we failed to identify HDP1 as a candidate TF even though it was also in the target list of AP2-G. However, at present, this is our primary strategy for identifying novel TFs in the targetome.

      2. The authors propose that PFG and AP2-FG form a complex, but this is actually not shown. Did they try to document a physical interaction between the two proteins, for example using co-IP?

      Response: Even when two molecules are found at the same position by ChIP-seq, it cannot be concluded that they form a physical complex, because it is possible that they competitively occupy the region. However, in this study, we performed ChIP-seq in the absence of PFG and demonstrated that the cAP2-FG peaks disappeared while those of sAP2-FG remained. This result can only be explained by the two proteins forming a complex at this region, which excludes the possibility that AP2-FG binds the region independently.

      3. It is unclear how PFG can bind to DNA in the absence of DNA-binding domain. Did the authors search for unconventional domains in the protein? This should be at least discussed in the manuscript.

      Response: We speculate that the two highly conserved regions, region 1 and region 2, function as DNA-binding domains in PFG. However, this domain is not similar to any DNA binding domains reported thus far. A straightforward way to demonstrate this would be to perform in vitro binding assays using a recombinant protein. However, thus far, we have not succeeded in obtaining soluble recombinant proteins for these regions. We have added the following sentences to the results section.

      “At present, we speculate that PFG directly interacts with genomic DNA through two highly conserved regions; region 1 and region 2. However, these regions are not similar to any DNA binding domains reported thus far. In other apicomplexan orthologues, these two domains are located adjacent to one another in the protein (Fig. 1A). Therefore, these two regions may be separated by a long interval region but constitute a DNA binding domain of PFG as a result of protein folding.”

      4. How do the authors explain that PFG is still expressed in the absence of AP2-FG? Is AP2-G alone sufficient to express sufficient levels of the protein? Is PFG down-regulated in the absence of AP2-FG?

      Response: Our previous ChIP-seq data indicate that PFG is a target of AP2-G. According to the study by Kent et al. (2018), this gene is up-regulated in the early period following conditional AP2-G induction. The results of the present study showed that PFG is capable of autoactivation through a transcriptional positive feedback loop. These results suggest that PFG can maintain its expression to a certain level once activated by AP2-G, even in the absence of AP2-FG. In our previous microarray analysis, significant decreases in PFG expression were not observed in AP2-FG-disrupted parasites.

      5. How do AP2-FG regulated genes (based on RNAseq) compare with the predicted cAP2-FG/sAP2-FG predicted genes (based on ChIPseq)? Are the two subsets included in the genes that are actually down-regulated in AP2-FG(-)?

      Response: Disruption of the AP2-FG gene impairs gametocyte development. We considered that the direct effect of this disruption would be difficult to analyze in gametocyte-enriched blood, in which gametocytes are pooled during sulfadiazine treatment to deplete asexual stages. Therefore, in our previous paper, we performed microarray analysis between WT and KO parasites to detect the direct effect of AP2-FG disruption on target gene expression, using mice which were synchronously infected with parasites. According to our results, 206 genes were down-regulated in AP2-FG-disrupted parasites. Of these genes, 40 and 117 were targets of sAP2-FG and cAP2-FG, respectively. However, it is still possible that a significant proportion of genes were indirectly down-regulated by AP2-FG disruption, which may impair gametocyte development. Moreover, based on the results of the present study, expression of a significant proportion of AP2-FG target genes could be complemented by PFG transcription. We believe that it would be difficult to compare the direct effects of these TFs on gene expression via transcriptome analysis (therefore, targetome analysis is important). In this study, we compared the expression of target genes of sAP2-FG and cAP2-FG between PFG(-) and WT parasites. We expected that down-regulation of PFG (cAP2-FG) targets would be complemented with transcription by sAP2-FG.

      6. Minor points

      -Page 5 Line 10, remove "as"

      Response: We have corrected this.

      -Page 7 Lines 4-13: is it possible to perform the assay in PFG(-) parasites?

      Response: Thank you for your question. Even if marker gene expression were decreased in PFG(-) parasites, we could not conclude that the decrease is a direct effect of the mutation. To determine the function of the motif, it is necessary to perform the assay using wild-type parasites.

      -Page 7 Line 45: Fig6C instead of 5C

      Response: Thank you for pointing this out. We have corrected this.

      -Page 8 Line 27: "decreases"

      Response: Thank you for pointing this out. We have corrected this.

      -Page 8 Line 36: PFG instead of PGP

      Response: We have corrected this.

      -Page 8 Line 39: remove "the fact"

      Response: We have removed this word.

      -Page 8 Line 42: Fig6G instead of 5G

      Response: We have corrected this.

      -Page 8 Line 43: PFG instead of PGP

      Response: We have corrected this.

      -Page 9 Line 23: "electroporation"

      Response: We have corrected this.

      -Page 9 Line 32: "BamHI"

      Response: We have corrected this.

      -Fig 2E: in the crosses did the authors check oocyst formation in the mosquito?

      Response: We did not check oocyst formation because abnormalities in males may not affect oocyst formation.

      -Page 17, legend Fig3, Line 14, there is probably an inversion between left and right for PFG versus AP2-FG (either in the legend or in the figure)

      Response: Thank you for pointing this out. PFG peaks are located in the center in both heat maps. The description “AP2-FG peaks” over the arrowhead in the left map was incorrect. We have corrected this to “PFG peaks”. The peaks in the left heat map must be located in the center; thus, this figure might be redundant.

      Reviewer #2 (Recommendations for the Authors):

      • Could the authors please state in the results section that PFG stands for partner of AP2-FG.

      Response: Thank you for the comment. We have added the following to the results section:

      “Through this screening, a gene encoding a 2709 amino acid protein with two regions highly conserved among Plasmodium was identified (PBANKA0902300), designated as partner of AP2-FG (PFG; Fig. 1A).”

      • Given that the transcriptional program is so dynamic, the timing of the ChIP-seq experiments is crucial. Could the authors clarify the timings of the different ChIP-seq experiments (AP2-FG, PFG, PFG in AP2-FG-, AP2-FG in PFG-, ...)

      Response: Thank you for the comment. To deplete any parasites in the asexual stages, all ChIP-seq experiments in this study were performed using blood from mice treated with sulfadiazine, namely, gametocyte-enriched blood. As the reviewer points out, timing is important, and samples from the period when TFs are maximally expressed are optimal for ChIP-seq. However, when parasites in the asexual stages are present, the background becomes higher. Thus we usually use gametocyte-enriched blood for ChIP-seq when expression of the TF is observed in mature gametocytes. The exception was our ChIP-seq analysis of AP2-G, because it is not present in mature gametocytes.

      • Fig 4c is an example of great overlap of peaks, but it would be helpful if the authors could quantify the overlaps between experiments (and describe the overlap parameters used).

      Response: In response to the comment, we have created a Venn diagram of overlapping peaks (attached below). However, the peaks used for this Venn diagram were selected after peak calling via fold-enrichment values. Thus, even if the counterpart of a peak is absent in these selected peaks (non-overlapping peaks in the Venn diagram), it does not indicate that it is absent in the original read map. We believe the overlap of peaks would be estimated more correctly in the heat maps.

      Author response image 1.

      Legend: The Venn diagram shows the number of common peaks between these ChIP-seq experiments (distance of peak summits < 150
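
      The overlap criterion named in the legend (two peaks counted as common when their summits lie closer than a threshold) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the input layout of (chromosome, summit position) pairs, and the example coordinates are hypothetical, and the threshold of 150 is assumed here to be in base pairs because the legend is truncated before its unit.

```python
from bisect import bisect_left
from collections import defaultdict


def count_shared_summits(summits_a, summits_b, max_dist=150):
    """Count summits in A with at least one summit in B on the same
    chromosome at a distance strictly below max_dist (assumed to be bp)."""
    # Index the summits of experiment B by chromosome, sorted by position.
    by_chrom = defaultdict(list)
    for chrom, pos in summits_b:
        by_chrom[chrom].append(pos)
    for positions in by_chrom.values():
        positions.sort()

    shared = 0
    for chrom, pos in summits_a:
        positions = by_chrom.get(chrom, [])
        i = bisect_left(positions, pos)
        # The nearest B summit is either just before or at the insertion point.
        neighbours = positions[max(i - 1, 0):i + 1]
        if any(abs(pos - q) < max_dist for q in neighbours):
            shared += 1
    return shared


# Hypothetical summit lists for two ChIP-seq experiments.
example_a = [("chr1", 1000), ("chr1", 5000), ("chr2", 300)]
example_b = [("chr1", 1100), ("chr1", 5150), ("chr3", 300)]
n_shared = count_shared_summits(example_a, example_b)
```

      Note that the count is taken from the point of view of the first list, so swapping the two experiments can give a different number when several summits cluster together. As the response points out, a peak missing from a thresholded list may still be present in the underlying read map, so counts of this kind tend to underestimate the true overlap.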

      • Additionally, how were the promoter coordinates used for each gene when they associate ChIP peaks to a gene target. Did the authors choose 1-2kb? Or use a TSS/5utr dataset such as Adjalley 2016 or Chappell 2020?

      Response: We selected a 1.2 kbp region for target prediction based on our previous studies. As the reviewer pointed out, target prediction using TSS information may be more accurate. However, reliable TSS information is not available for P. berghei to the best of our knowledge.

      The two papers are studies on P. falciparum.
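
      A fixed-window target assignment of the kind described in this response might be sketched as follows. The response states only that a 1.2 kbp region was used; the assumption that this region lies immediately upstream of the annotated gene start (strand-aware, 1-based inclusive coordinates), as well as all function names and example coordinates, are hypothetical and only serve to illustrate the idea.

```python
def upstream_window(start, end, strand, length=1200):
    """Return (lo, hi) for the window `length` bp upstream of a gene.

    Coordinates are 1-based and inclusive. Placing the window directly
    upstream of the gene start is an assumption of this sketch.
    """
    if strand == "+":
        return max(start - length, 1), start - 1
    return end + 1, end + length


def assign_targets(genes, summits, length=1200):
    """Return the IDs of genes with at least one peak summit in their window."""
    targets = set()
    for gene_id, chrom, start, end, strand in genes:
        lo, hi = upstream_window(start, end, strand, length)
        if any(c == chrom and lo <= p <= hi for c, p in summits):
            targets.add(gene_id)
    return targets


# Hypothetical gene annotations and peak summits.
example_genes = [
    ("geneA", "chr1", 5000, 7000, "+"),
    ("geneB", "chr1", 9000, 10000, "-"),
    ("geneC", "chr2", 5000, 7000, "+"),
]
example_summits = [("chr1", 4200), ("chr1", 11300)]
predicted = assign_targets(example_genes, example_summits)
```

      With a reliable TSS annotation, the window could instead be anchored on the TSS, which is the alternative the reviewer raises.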

      • In the absence of evidence of physical interaction, it remains unclear if AP2-FG and PFG actually interact directly or as part of the same complex. A more detailed characterisation with IPs/co-IPs followed by mass spectrometry of the GFP-tagged version of PFG in the presence and absence of AP2-FG would be highly informative.

      Response: Thank you for the comment. Even when these two TFs occupy the same genomic region, it cannot be conclusively said that they exist at the same time in the region: they might competitively occupy the region. However, we showed that the cAP2-FG peaks disappear from the region when PFG was disrupted, while sAP2-FG peaks remain. We believe that this is evidence that the two TFs physically interact with each other.

      • It was not clear if the assessment of motif binding using cytometry was performed using all the required controls and compensation. This section should be clarified.

      Response: Thank you for the comment. Compensation was performed using parasites expressing a single fluorescent protein. The results are attached below. The histogram of mCherry using control parasites expressing GFP under the control of the HSP70 promoter is also attached.

      Author response image 2.

      However, we found that descriptions of the filters for detecting red signals were not correct. This assay was performed using parasites which expressed GFP constitutively and mCherry under the control of the p28 promoter. These two fluorescent proteins were excited by independent lasers (488 and 561, respectively), and the emission spectra were detected using independent detectors (through 530/30 and 610/20 filters, respectively). We have revised the description regarding our FACS protocols as follows:

      “Flow cytometric analysis was performed using an LSR-II flow cytometer (BD Biosciences). In experiments using 820 parasites, the tail blood from infected mice was selected via gating with forward scatter and staining with Hoechst 33342 (excitation = 355 nm, emission = 450/50). The gated population was then analyzed for GFP fluorescence (excitation = 488 nm, emission = 530/30) and RFP fluorescence (excitation = 561 nm, emission = 610/20). In the promoter assay (using parasites transfected with a centromere plasmid), the tail blood from infected mice was selected via gating with forward scatter and staining with Hoechst 33342 (excitation = 355 nm, emission = 450/50), followed by GFP fluorescence (excitation = 488 nm, emission = 530/30). The gated population was analyzed for mCherry fluorescence (excitation = 561 nm, emission = 610/20). Analysis was performed using the DIVER program (BD Biosciences).”

      Minor points:

      • Page 4, line 37: The authors should specify the timing of expression of AP2-FG on the text.

      Response: We have added the following description to the text.

      “The timing of the expression was approximately four hours later than that of AP2-FG, which started at 16 hpi (9).”

      • Ref 9 and 17 are repeated

      Response: Thank you for pointing this out. We have corrected this.

      • Fig 1D and 1F do not have scale bars

      Response: We have added scale bars to Fig. 1D.

      We have not changed Fig. 1F, because we believe that the scales can be estimated from the size of the erythrocyte.

      • Page 5, line 29-30. Could the authors specify how many and which of the de-regulated genes have a PFG in their promoter.

      Response: Thank you for the comment. As described in a later section (page 7; Impact of PFG disruption on the expression of AP2-FG target genes), among the 279 genes significantly downregulated in PFG(-) parasites, 165 genes were targets for PFG (unique for PFG or common for sAP2-FG and PFG). In contrast, only four genes were targets unique to sAP2-FG. Therefore, 165 genes harbor the upstream peaks of PFG. These genes are shown in Table S1.

      • Fig 5F. in the methods associated with this figure there seems to be a mixup with the description of the lasers. In addition, given the spillover of the red and green signal between detectors this experiment needs compensation parameters. The authors should provide the gating strategy before and after compensation as this is critical for the correct calculation of the number of red parasites. Indeed, the lowest red cloud on the gate shown could be green signal spill over.

      Response: Thank you for the comment. As described above, there were some incorrect descriptions about the conditions of our FACS protocols in the methods section. We have revised them.
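      The single-colour compensation discussed in this exchange can be illustrated with a minimal sketch. It is not the authors' actual procedure or the cytometer software's algorithm: the function names, the two-detector linear spillover model, and all fluorescence values below are assumptions made for illustration only.

```python
from statistics import mean

def spillover_fraction(primary, secondary):
    """Fraction of one fluorophore's signal that leaks into the other detector,
    estimated from a single-colour control (background assumed already subtracted)."""
    return mean(secondary) / mean(primary)

def compensate(obs_green, obs_red, gfp_into_red, mcherry_into_green):
    """Solve the assumed 2x2 linear spillover system for the true GFP and mCherry signals.

    Model: obs_green = G + mcherry_into_green * R
           obs_red   = R + gfp_into_red * G
    """
    det = 1.0 - gfp_into_red * mcherry_into_green
    true_green = (obs_green - mcherry_into_green * obs_red) / det
    true_red = (obs_red - gfp_into_red * obs_green) / det
    return true_green, true_red

# Hypothetical single-colour controls (arbitrary fluorescence units).
gfp_only_green = [1000.0, 1200.0, 800.0]
gfp_only_red = [50.0, 60.0, 40.0]
mcherry_only_red = [900.0, 1100.0, 1000.0]
mcherry_only_green = [9.0, 11.0, 10.0]

gfp_into_red = spillover_fraction(gfp_only_green, gfp_only_red)
mcherry_into_green = spillover_fraction(mcherry_only_red, mcherry_only_green)

# An event that looks weakly red before compensation but is explained
# entirely by GFP spillover under the assumed model.
green, red = compensate(1000.0, 50.0, gfp_into_red, mcherry_into_green)
print(round(green, 1), abs(round(red, 1)))
```

      Under this toy model, an event whose red signal is fully accounted for by GFP spillover falls to roughly zero red after compensation, which is the scenario the reviewer raised for the lowest red cloud in the gate.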

      • Page 7, line 19. Could the authors explicitly say in the text that the 810 genes are those with 1 (or more?) PFG peaks in their promoter (out of a total of 1029) to best guide the reader. Additionally, it is important to define the maximum distance allowed between a peak and CDS for it to be associated with said CDS.

      Response: We have revised Table S2 by adding the nearest genes. The revised table shows the relationship between a PFG peak and its nearest genes, together with their distances.

      • Page 7, line 45: fig 6c, not 5c

      Response: Thank you for the comment. We have corrected this.

      • Page 7 last paragraph: This section is very hard to follow. For instance, on line 50 do the authors mean that the sAP2-FG unique targets are LESS de-regulated? On line 51: do the authors mean unique targets of cAP2-FG or unique targets of PFG? Line 53: do the authors mean that genes expressed in the "common" category are LESS de-regulated than the PFG unique targets?

      Response: We are sorry for the lack of clarity; after reviewing the manuscript, it appears to be unclear what the fold change means in this section. Here, fold change means the ratio of PFG(-)/wild type. Thus “High log2(fold change) value” means that the genes were less downregulated. We have revised the description as follows:

      “The log2 distribution (fold change = PFG(-)/wild type) in the three groups of target genes showed that the average value was significantly higher (i.e., less down-regulated) in targets unique to sAP2-FG than in the other two groups (targets unique to cAP2-FG or common targets for both), with p-values of 1.3 × 10⁻¹⁰ and 1.4 × 10⁻⁵, respectively, by two-tailed Student’s t-test (Fig. 6F). In addition, the average log2 (fold change) value of the common target genes was relatively higher (i.e., less down-regulated) than that of targets unique to PFG, suggesting that transcriptional activation by sAP2-FG partly complements the impact of PFG disruption on these common targets.”
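      To make the direction of the fold change concrete, the following is a minimal sketch of how a log2(PFG(-)/wild type) comparison between two target groups could be computed. The expression values and group names are hypothetical, and only the pooled-variance t statistic is computed here; the p-values quoted above come from the authors' own analysis.

```python
import math
from statistics import mean, variance

def log2_fold_change(knockout, wild_type):
    """log2 of knockout / wild-type expression; values near 0 mean little down-regulation."""
    return math.log2(knockout / wild_type)

def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical (knockout, wild-type) expression pairs for two groups of target genes.
group_single = [(8.0, 8.0), (4.0, 8.0), (16.0, 16.0)]   # mildly affected targets
group_complex = [(2.0, 8.0), (1.0, 8.0), (2.0, 16.0)]   # strongly down-regulated targets

lfc_single = [log2_fold_change(ko, wt) for ko, wt in group_single]
lfc_complex = [log2_fold_change(ko, wt) for ko, wt in group_complex]

t_stat, dof = student_t(lfc_single, lfc_complex)
print(mean(lfc_single) > mean(lfc_complex), dof)
```

      A higher (less negative) mean log2 fold change in the first group gives a positive t statistic, which corresponds to "less down-regulated" in the wording above.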

      • Page 8, line 42: Fig 6G, not 5G

      Response: Thank you for pointing this out. We have corrected this.

      Reviewer #3 (Recommendations For The Authors):

      1. The gene at the center of this study (PBANKA_0902300) was identified in an earlier genetic screen by Russell et al. as being a female specific gene with essential role in transmission and named Fd2 (for female-defective 2). Since this name entered the literature first and is equally descriptive, the Fd2 name should be used instead of PFG to maintain clarity and avoid unnecessary confusion. Surprisingly, this study is neither cited nor acknowledged despite a preprint having been available since August of 2021. This should be remedied.

      Response: Thank you for the comment. We have added the paper by Russell et al. accordingly and mentioned the name FD2 in the revised manuscript. However, we have retained the use of PFG throughout the paper. We believe that this usage of PFG shouldn’t be confusing, as FD2 has only been used in one previous paper. We have added the following:

      “Through this screening, a gene encoding a 2709 amino acid protein with two regions highly conserved among Plasmodium was identified (PBANKA_0902300, designated as a partner of AP2-FG (PFG); Fig. 1A). This gene is one of the P. berghei genes that were previously identified as genes involved in female gametocyte development (named FD2), based on mass screening combined with single cell RNA-seq (ref).”

      2. While it isn't really important how the authors came to arrive at studying the function of Fd2, the rationale/approach given in the first paragraph of the result section seems far too broad to lead to Fd2, given that it lacks identifiable domains and many other ortholog sets exist across these species.

      Response: We selected this gene from the list of AP2-G targets as a candidate for a sequence-specific TF based on the hypothesis that the amino acid sequences of DNA-binding domains are highly conserved. We successfully identified two TFs (including PFG) using this method. However, there may be TFs that do not fit this hypothesis which are also targets of AP2-G. In fact, we were unable to identify HDP1 as a TF candidate, despite it being an AP2-G target.

      3. Fig. 1A-C: Gene IDs for the orthologs should be provided, as well as the methodology for generating the alignments.

      Response: We have added the gene IDs and method for alignment in the legend as follows:

      (A) Schematic diagram of PFG from P. berghei and its homologs in apicomplexan parasites. Regions homologous to Regions 1 and 2, which are highly conserved among Plasmodium species, are shown as yellow and blue rectangles, respectively. Nuclear localization signals were predicted using the cNLS mapper (http://nls-mapper.iab.keio.ac.jp/cgi-bin/NLS_Mapper_form.cgi). The gene IDs of P. berghei PFG, P. falciparum PFG, and their homologs in Toxoplasma gondii, Eimeria tenella and Vitrella brassicaformis are PBANKA_0902300, PF3D7_1146800, TGGT1_239670, ETH2_1252400, and Vbra_10234, respectively.

      (C) The amino acid sequences of Regions 1 and 2 from P. berghei PFG and its homologs from other apicomplexan parasites in (A) were aligned using the ClustalW program in MEGA X. The positions at which all these sequences have identical amino acids are indicated by two asterisks, and positions with amino acid residues possessing the same properties are indicated by one asterisk.

      4. Figure 2: The phenotype of the Fd2 knockout should be characterized more comprehensively.

      It remains unclear whether ∆Fd2 parasites generate the same number of females but these are defective upon fertilization, or whether there is also a decrease in the number of female gametocytes. Is the defect just post-fertilization and zygotes lyse, or are there fewer fertilization events? If so, is activation of female GCs affected?

      The number of male and female gametocytes should be quantified using sex-specific markers not affected by Fd2 knockout rather than providing a single image of each. The ability of ∆Fd2 GCs should also be evaluated.

      This is also important for the interpretation of Fig 2G. Is the down-regulation of the genes due to fewer female GCs, or are the down-regulated genes only a subset of female-specific genes?

      Response: In PFG(-) parasites, the rate of conversion into zygotes of female gametocytes decreased, and zygotes had lost capacity for developing into ookinetes. This indicates that gametocyte development (i.e., the ability to egress the erythrocyte and to fertilize) and zygote development were both impaired. This phenotype is consistent with the observation that genes expressed in female gametocytes are broadly downregulated. PFG is a TF, and its disruption led to decreased expression of hundreds of female genes. Thus, the observed phenotype may be derived from combined decreased expression of these genes. We believe further detailed phenotypic analyses will not generate much novel information on this TF. Instead, RNA-seq data in PFG(-) parasites and the targetome have promise in helping to characterize the functions of this TF.

      5. Figure 3: what fraction of down-regulated genes have the Fd2 10mer motif?

      Response: Thank you for the question. We investigated the upstream binding motifs of these genes. Of the 279 significantly down-regulated genes (containing 165 targets), 161 genes harbor the motif (including nine-base motifs that lack one lateral base which is likely not essential for binding) in their upstream regions (within 1,200 bp from the first methionine codon). However, this result has not been described in the revised manuscript because it is more important whether these regions harbor PFG peaks (upstream motifs can exist without being involved in the binding of PFG).

      6. sAP2-FG (single) vs cAP2-FG (complex) nomenclature is confusing and possibly misleading since few TFs function in isolation and sAP2-FG likely functions in a complex that doesn't contain Fd2, possibly with another DNA binding protein that binds the TGCACA hexamer. The name for the distinct peaks should refer to the presence or absence of Fd2 in the complex, or maybe simply refer to them as complex A & B.

      Response: As shown in the DIP-seq analysis results, AP2-FG can bind the motif by itself. In contrast, AP2-FG must form a complex with PFG to bind to the ten-base motif. The complex and single forms are named according to this difference (the presence or absence of PFG) and used solely in its relation with PFG. We wrote “In the following, we refer to the form with PFG as cAP2-FG or the complex form, and the form without PFG as sAP2-FG or the single form.” We believe that the nomenclature has sufficient clarity. However, we have partially (underlined) revised certain sentences in the discussion section as follows.

      “As the expression of PFG increases via this mechanism, AP2-FG recruited by PFG (cAP2-FG) increases and eventually becomes predominant in the transcriptional regulation of female gametocytes.”

      “This suggests that the promoter of the CCP2 gene, which is a target of PFG only, is still active in AP2-FG(-)820 parasites.”

      We recently reported that the TGCACA motif is a cis-activation motif in early gametocytes and important for both male and female gametocyte development. Thus we speculate that sAP2-FG is not involved in cis-activation by the TGCACA motif. The p-value of the six-base motif is indeed comparable to that of the five-base motif. However, p-values (calculated by Fisher’s exact test) for six-base motifs tend to be lower than those calculated for five-base motifs, because the population is much larger. We speculate that there is a sequence-specific TF that may be expressed in early gametocytes and bind this motif, independently of AP2-FG.

      7. I compared the overlap of peaks in the 4 ChIP-seq data sets:

      90% of the Fd2 peaks are shared with AP2-FG (binding at 24% of shared peaks is lost in ∆AP2-FG)

      10% are bound by Fd2 alone (binding at 35% of Fd2 is lost in ∆AP2-FG)

      75% of Fd2 peaks are bound independently of AP2-FG

      47% of AP2-FG peaks are shared with Fd2 (binding at 71% of shared peaks is lost in ∆Fd2)

      53% of AP2-FG peaks are bound only by AP2-FG (but binding at 82% of AP2-FG only peaks is still lost in the ∆Fd2)

      Binding at 78% of all AP2-FG peaks is lost in ∆Fd2

      This indicates that much of AP2-FG binding, even in regions devoid of Fd2, still depends on Fd2. What are possible explanations for this?

      https://elife-rp.msubmit.net/eliferp_files/2023/04/03/00117573/00/117573_0_attach_10_17936_convrt.pdf

      Response: In the ChIP-seq of AP2-FG in the absence of PFG, 441 peaks are still called. This means that at least 441 binding sites for AP2-FG independent of PFG exist. This is a straightforward conclusion from our ChIP-seq data. On the other hand, simple subtraction of peaks between two ChIP-seq experiments (AP2-FG peaks minus PFG peaks) is not a precise method for determining sAP2-FG peaks. Peak calling is performed independently in each ChIP-seq experiment. Thus, the peaks remaining after subtraction can still include peaks that are actually common but were called in only one of the two experiments. Even when using data obtained from the same ChIP-seq experiment, markedly different numbers of peaks are called depending on the peak-calling conditions (in contrast, peaks common to two independent experiments increase the reliability of the data). To identify sAP2-FG peaks by comparing AP2-FG peaks with PFG peaks, one would have to increase the number of PFG peaks by lowering the peak-calling threshold until the number of overlapping peaks between AP2-FG and PFG is saturated, and then subtract the overlapping peaks from the AP2-FG peaks. However, as described above, for the purpose of estimating the number of sAP2-FG peaks, it is better to perform ChIP-seq of AP2-FG in the absence of PFG.
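      The threshold dependence described in this response can be illustrated with a small sketch. The peak coordinates, scores, thresholds, and function names below are hypothetical and are not taken from the authors' data or from any peak-calling tool; the point is only that the apparent "AP2-FG only" set shrinks as the PFG peak-calling threshold is lowered.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def split_by_overlap(query_peaks, reference_peaks):
    """Split query peaks into those overlapping any reference peak and those that do not."""
    shared = [p for p in query_peaks if any(overlaps(p, q) for q in reference_peaks)]
    unique = [p for p in query_peaks if p not in shared]
    return shared, unique

def call_peaks(scored_peaks, min_score):
    """Keep candidate peaks whose score passes the (hypothetical) calling threshold."""
    return [peak for peak, score in scored_peaks if score >= min_score]

# Hypothetical AP2-FG peaks and scored PFG candidate peaks.
ap2fg_peaks = [("chr1", 100, 400), ("chr1", 1000, 1300), ("chr2", 500, 800)]
pfg_candidates = [(("chr1", 150, 350), 50.0), (("chr1", 1050, 1250), 8.0)]

# Strict threshold: the weak PFG peak is not called, so its AP2-FG partner looks "unique".
strict_shared, strict_unique = split_by_overlap(ap2fg_peaks, call_peaks(pfg_candidates, 20.0))

# Lenient threshold: the same AP2-FG peak is now classified as shared.
lenient_shared, lenient_unique = split_by_overlap(ap2fg_peaks, call_peaks(pfg_candidates, 5.0))

print(len(strict_unique), len(lenient_unique))
```

      In this toy data set, two AP2-FG peaks appear unique at the strict threshold but only one remains unique at the lenient threshold, even though the underlying binding has not changed.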

      8. Possible explanations of why recombinant Fd2 doesn't bind the TGCACA hexamer. It would also be good to note that the GCTCA AP2-FG motif found in Fig 4G is now a perfect match for the motif identified by protein binding microarray in Campbell et al.

      Response: It is not known what sequence recombinant PFG binds. The TGCACA motif is not enriched in PFG peaks. If the reviewer is referring to AP2-FG, our finding that the recombinant AP2 domain binds the five-base motif strongly suggests that other TFs recognize this motif. As described in our response to comment 9, we recently reported that TGCACA is a cis-activating sequence important for the normal development of both male and female gametocytes. Therefore, we currently speculate that this motif is a binding motif of other TFs and is independent of AP2-FG.

      We have mentioned the protein binding microarray data in the Results section as follows.

      “The most enriched motif matched well with the binding sequence of the AP2 domain of P. falciparum AP2-FG, which was reported by Campbell et al.”

      9. What might explain the strong enrichment for TGCACA in ChIP-seq but not when pulled down by the AP2-FG DBD: another binding partner? requires more of AP2-FG than just the DBD?

      Response: As described above in our response to comment 6, we have recently submitted a preprint studying the roles of the remodeler subunit PbARID in gametocyte development. We reported that the remodeler subunit is recruited to the six-base motif and that the motif is a novel cis-activation element for early gametocyte development. We speculate that a proportion of AP2-FG targets are also targets of a TF that recognizes this motif and recruits the remodeler subunit. These two TFs may be involved in the regulation of early gametocyte genes but function independently.

      10. Calling DNA pulldown with recombinant AP2-FG DNA-binding domain DNA Immunoprecipitation sequencing (DIP-seq) is confusing since there are no antibodies involved. Describing it directly as a pulldown of fragmented DNA will be clearer to the reader.

      Response: Thank you for the comment. We have also recognized this discrepancy. However, we called the method DIP-seq because the original paper reporting this method used this name, even though it did not use antibodies to capture the MBP-fusion recombinant protein. Our experiment was performed using essentially the same methods, and thus we retained the name.

      11. The legends and methods are very sparse and should include substantially more detail.

      Response: Thank you for the comment. We have revised the description of the FACS experimental method for clarity.

      12. BigWig files for all ChIPseq enrichment used for analysis in this study need to be provided.

      (two replicates each of: Fd2 in WT, Fd2 in ∆AP2-FG, AP2-FG in WT, AP2-FG in ∆Fd2)

      Response: We have deposited the BigWig files to GEO (GSE226028 and GSE114096).

      13. Tables of ChIP data need to have both summits and peaks and need to list nearest gene. Also the ChIPseq peaks for Fd2 are surprisingly broad (ChIP peaks are very large, e.g. 68% of Fd2 peaks (dataset2) are greater than 1000 bp) given its specificity for a long motif. Why is this?

      Response: We have revised Table S2 to include the nearest genes. We are unsure why peak regions longer than 1000 bp make up such a high proportion. However, this proportion was also high in our previous ChIP-seq data. Therefore, we speculate that this is a tendency of peak calling by MACS2. We did not use these peak widths in this paper. For example, targets were predicted using peak summits, and binding motifs were calculated using the 100-base regions around peak summits.
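      A minimal sketch of summit-based analysis, as opposed to using the full peak width, is given below. It assumes a 100-base window centred on the summit and a 1,200-bp upstream region measured from the start codon, as mentioned elsewhere in this response; the function names, strand handling, and coordinates are illustrative assumptions rather than the authors' pipeline.

```python
def summit_window(summit, width=100):
    """Return (start, end) of a window of the given width centred on a peak summit."""
    half = width // 2
    return max(0, summit - half), summit + half

def is_upstream_target(summit, start_codon, strand, upstream=1200):
    """True if the summit lies within `upstream` bp before the start codon.

    Assumes the summit and gene are on the same chromosome (checked by the caller).
    """
    if strand == "+":
        return start_codon - upstream <= summit < start_codon
    if strand == "-":
        return start_codon < summit <= start_codon + upstream
    raise ValueError("strand must be '+' or '-'")

# Hypothetical summit and gene coordinates.
window = summit_window(5000)
hits = [
    is_upstream_target(4500, 5000, "+"),   # 500 bp upstream of a forward-strand gene
    is_upstream_target(3700, 5000, "+"),   # 1300 bp upstream: outside the region
    is_upstream_target(5500, 5000, "-"),   # 500 bp upstream of a reverse-strand gene
]
print(window, hits)
```

      Because only the summit position enters these calculations, a broad called region does not by itself change which gene is assigned as a target or which sequence is searched for motifs.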

      14. Figure 5E: The positions of the 10mer and 5mer motifs in the promoter should be indicated as well as the length of the promoter. Moreover, mutation of just the 5bp motifs would be valuable to understand if 10mer is sufficient for expression of the reporter.

      Response: Thank you for the comment. We have revised the figure accordingly. The majority of female-specific promoters only harbor ten-base motifs. Thus the ten-base motif is sufficient for evaluating reporter activity (i.e., it would function without five-base motifs).

      15. How is AP2-FG expression affected in ∆Fd2 and vice versa?

      Response: According to our previous microarray data, PFG expression was not significantly downregulated by disruption of AP2-FG. This may be because PFG transcriptionally activates itself through a positive feedback loop after being induced by AP2-G. Similarly, according to our present study, AP2-FG expression was not downregulated by PFG disruption. This may be because AP2-FG is transcriptionally activated by AP2-G.

      16. The single cell data in Russell et al. could easily be used to indicate the order of expression.

      Response: Determining the expression order of gametocyte TFs via the single cell RNA-seq data from Russell et al. is difficult, because only a small number of parasite cells were considered to be in the early gametocyte stage in this study. This is because the parasites were cultured for 24 h before the analysis. The analysis suggested by the reviewer may be possible via single cell RNA-seq, but the experiments must be performed with more focus on the early gametocyte stage.

      17. A discussion of the implication of P. falciparum transmission would be appreciated.

      Response: Thank you for the comment. We have added the following to the Discussion section:

      “P. falciparum gametocytes require 9-12 days to mature, which is much longer than that of P. berghei. Meanwhile, it has been reported that the ten-base motif is highly enriched in the upstream regions of female-specific genes also in P. falciparum. Thus, despite the difference in maturation periods, PFG is likely to play an important role in the transcriptional regulation of female P. falciparum gametocyte development."

      18. The lack of identifiable DNA binding domains in Fd2 is intriguing given the strong sequence-specificity. Do the authors think they have identified a new DNA-binding fold?

      Alphafold of the orthologs with contiguous regions 1&2 might offer insight.

      Response: We speculate that these regions function as DNA binding domains. We performed analysis using AlphaFold2 according to the comment. However, the predicted structure of the region was not similar to any canonical DNA-binding domain. Thus, it may be a novel DNA-binding fold as the reviewer mentioned. Further studies such as binding assays using recombinant proteins would be necessary to confirm this, but thus far we have not successfully obtained the soluble proteins of these regions.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Thank you and the reviewers for further providing constructive comments and suggestions on our manuscript. On behalf of all the co-authors, I have enclosed a revised version of the above referenced paper. Below, I have merged similar public reviews and recommendations (if applicable) from each reviewer and provided point-by-point responses.

      Reviewer #1:

      People can perform a wide variety of different tasks, and a long-standing question in cognitive neuroscience is how the properties of different tasks are represented in the brain. The authors develop an interesting task that mixes two different sources of difficulty, and find that the brain appears to represent this mixture on a continuum, in the prefrontal areas involved in resolving task difficulty. While these results are interesting and in several ways compelling, they overlap with previous findings and rely on novel statistical analyses that may require further validation.

      Strengths

      1. The authors present an interesting and novel task for combining the contributions of stimulus-stimulus and stimulus-response conflict. While this mixture has been measured in the multi-source interference task (MSIT), this task provides a more graded mixture between these two sources of difficulty.

      2. The authors do a good job triangulating regions that encode conflict similarity, looking for the conjunction across several different measures of conflict encoding. These conflict measures use several best-practice approaches towards estimating representational similarity.

      3. The authors quantify several salient alternative hypotheses and systematically distinguish their core results from these alternatives.

      4. The question that the authors tackle is important to cognitive control, and they make a solid contribution.

      The authors have addressed several of my concerns. I appreciate the authors implementing best practices in their neuroimaging stats.

      I think that the concerns that remain in my public review reflect the inherent limitations of the current work. The authors have done a good job working with the dataset they've collected.

      Response: We would like to thank the reviewer for the positive evaluation of our manuscript and the constructive comments and suggestions. In response to your suggestions and concerns, we have removed the Stroop/Simon-only and the Stroop+Simon models, revised our conclusion and modified the misleading phrases.

      We have provided detailed responses to your comments below.

      1. The evidence from this previous work for mixtures between different conflict sources makes the framing of 'infinite possible types of conflict' feel like a strawman. The authors cite classic work (e.g., Kornblum et al., 1990) that develops a typology for conflict which is far from infinite. I think few people would argue that every possible source and level of difficulty will have to be learned separately. This work provides confirmatory evidence that task difficulty is represented parametrically (e.g., consistent with the n-back, MOT, and random dot motion literature).

      notes for my public concerns.

      In their response, the authors say:

      'If each combination of the Stroop-Simon combination is regarded as a conflict condition, there would be infinite combinations, and it is our major goal to investigate how these infinite conflict conditions are represented effectively in a space with finite dimensions.'

      I do think that this is a strawman. The paper doesn't make a strong case that this position ('infinite combinations') is widely held in the field. There is previous work (e.g., n-back, multiple object tracking, MSIT, dot motion) that has already shown parametric encoding of task difficulty. This paper provides confirmatory evidence, using an interesting new task, that demands are parametric, but does not provide a major theoretical advance.

      Response: We agree that the previous expression may have seemed somewhat exaggerated. While it is not “infinite”, recent research indeed suggests that cognitive control shows domain-specificity across various “domains”, including conflict types (Egner, 2008), sensory modalities (Yang et al., 2017), task-irrelevant stimuli (Spapé & Hommel, 2008), and task sets (Hazeltine et al., 2011), to name a few.

      These findings collectively support the notion that cognitive control is context-specific (Braem et al., 2014). That is, cognitive control can be tuned and associated with different (and potentially large numbers of) contexts. Recently, Kikumoto and Mayr (2020) demonstrated that combinations of stimulus, rule and response in the same task formed separable, conjunctive representations. They further showed that these conjunctive representations facilitate performance. This is in line with the idea that each stimulus-location combination in the present task may be represented separately in a domain-specific manner. Moreover, domain-general task representation can also become domain-specific with learning, which further increases the number of domain-specific conjunctive representations (Mill & Cole, 2023). In line with the domain-specific account of cognitive control, we referred to the “infinite combinations” in our previous response to emphasize the extreme case of domain-specificity. However, recognizing that the term “infinite” may lead to ambiguity, we have replaced it with phrases such as “a large number of” and “hugely varied” in our revised manuscript.

      We thank the reviewer for highlighting the potential connection of our work to existing literature that showed the parametric encoding of task difficulty (e.g., Dagher et al., 1999; Ritz & Shenhav, 2023). For instance, in Ritz and Shenhav’s (2023) study, they parametrically manipulated target difficulty based on consistent ratios of dot color, and found that the difficulty was encoded in the caudal part of dorsal anterior cingulate cortex. Analogously, in our study, the “difficulty” pertains to the behavioral congruency effect that we modulated within the spatial Stroop and Simon dimensions. Notably, we did identify univariate effects in the right dmPFC and IPS associated with the difficulty in the Simon dimension. This parametric effect may lend support to our cognitive space hypothesis, although we exercised caution in interpreting their significance due to the absence of a clear brain-behavioral relevance in these regions. We have added the connection of our work to prior literature in the discussion: “The parametric encoding of conflict also mirrors prior research showing the parametric encoding of task demands (Dagher et al., 1999; Ritz & Shenhav, 2023).”

      However, our analyses extend beyond solely testing the parametric encoding of difficulty. Instead, we focused on the multivariate representation of different conflict types, which we believe is independent from the univariate parametric encoding. Unlike the univariate encoding that relies on the strength within one dimension, the multivariate representation of conflict types incorporates both the spatial Stroop and Simon dimensions. Furthermore, we found that similar difficulty levels did not yield similar conflict representation, as indicated by the low similarity between the spatial Stroop and Simon conditions, despite both showing a similar level of congruency effect (Fig. S1). Additionally, we also observed an interaction between conflict similarity and difficulty (i.e., congruency, Fig. 4B/D), such that the conflict similarity effect was more pronounced when conflict was present. Therefore, we believe that our findings make a contribution to the literature beyond the difficulty effect.

      Reference:

      Egner, T. (2008). Multiple conflict-driven control mechanisms in the human brain. Trends in Cognitive Sciences, 12(10), 374-380. https://doi.org/10.1016/j.tics.2008.07.001

      Yang, G., Nan, W., Zheng, Y., Wu, H., Li, Q., & Liu, X. (2017). Distinct cognitive control mechanisms as revealed by modality-specific conflict adaptation effects. Journal of Experimental Psychology: Human Perception and Performance, 43(4), 807-818. https://doi.org/10.1037/xhp0000351

      Spapé, M. M., & Hommel, B. (2008). He said, she said: Episodic retrieval induces conflict adaptation in an auditory Stroop task. Psychonomic Bulletin & Review, 15(6), 1117-1121. https://doi.org/10.3758/PBR.15.6.1117

      Hazeltine, E., Lightman, E., Schwarb, H., & Schumacher, E. H. (2011). The boundaries of sequential modulations: Evidence for set-level control. Journal of Experimental Psychology: Human Perception and Performance, 37(6), 1898-1914. https://doi.org/10.1037/a0024662

      Braem, S., Abrahamse, E. L., Duthoo, W., & Notebaert, W. (2014). What determines the specificity of conflict adaptation? A review, critical analysis, and proposed synthesis. Frontiers in Psychology, 5, 1134. https://doi.org/10.3389/fpsyg.2014.01134

      Kikumoto, A., & Mayr, U. (2020). Conjunctive representations that integrate stimuli, responses, and rules are critical for action selection. Proceedings of the National Academy of Sciences, 117(19), 10603-10608. https://doi.org/10.1073/pnas.1922166117

      Mill, R. D., & Cole, M. W. (2023). Neural representation dynamics reveal computational principles of cognitive task learning. bioRxiv. https://doi.org/10.1101/2023.06.27.546751

      Dagher, A., Owen, A. M., Boecker, H., & Brooks, D. J. (1999). Mapping the network for planning: a correlational PET activation study with the Tower of London task. Brain, 122 ( Pt 10), 1973-1987. https://doi.org/10.1093/brain/122.10.1973

      Ritz, H., & Shenhav, A. (2023). Orthogonal neural encoding of targets and distractors supports multivariate cognitive control. https://doi.org/10.1101/2022.12.01.518771

      1. (Public Reviews) The degree of Stroop vs Simon conflict is perfectly negatively correlated across conditions. This limits their interpretation of an integrated cognitive space, as they cannot separately measure Stroop and Simon effects. The authors' control analyses have limited ability to overcome this task limitation. While these results are consistent with parametric encoding, they cannot adjudicate between combined vs separated representations.

      (Recommendations) I think that it is still an issue that the task's two features (stroop and simon conflict) are perfectly correlated. This fundamentally limits their ability to measure the similarity in these features. The authors provide several control analyses, but I think these are limited.

      Response: We need to acknowledge that the spatial Stroop and Simon components in the five conflict conditions were not “perfectly” correlated, with r = –0.89. This leaves some room for the preliminary model comparison to adjudicate between these models. However, it’s essential to note that conclusions based on these results must be tempered. In line with the reviewer’s observation, we agree that the high correlation between the two conflict sources posed a potential limitation on our ability to independently investigate the contribution of spatial Stroop and Simon conflicts. Therefore, in addition to the limitation we have previously acknowledged, we have now further revised our conclusion and adjusted our expressions accordingly.
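The size of this correlation can be illustrated with a minimal sketch. It assumes, purely for illustration, that the five conflict conditions sit at evenly spaced polar angles from 0° (pure spatial Stroop) to 90° (pure Simon), and that the Stroop and Simon components are the cosine and sine of that angle. The angle values and the names `ANGLES_DEG` and `pearson` are assumptions, not taken from the manuscript.

```python
import math

# Assumed polar angles (degrees), measured from the spatial Stroop axis.
# Evenly spaced values are an illustrative assumption.
ANGLES_DEG = [0.0, 22.5, 45.0, 67.5, 90.0]


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Component of each condition along the two conflict axes.
stroop_component = [math.cos(math.radians(a)) for a in ANGLES_DEG]
simon_component = [math.sin(math.radians(a)) for a in ANGLES_DEG]

r_components = pearson(stroop_component, simon_component)
print(round(r_components, 2))  # -0.89 under the assumed angles
```

Under these assumptions the two components are strongly but not perfectly anticorrelated, because sine and cosine are not linearly related over a quarter circle.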

      Specifically, we now regard the parametric encoding of cognitive control not as direct evidence of the cognitive space view but as preliminary evidence that led us to propose this hypothesis, which requires further testing. Notably, we have also modified the title from “Conflicts are represented in a cognitive space to reconcile domain-general and domain-specific cognitive control” to “Conflicts are parametrically encoded: initial evidence for a cognitive space view to reconcile the debate of domain-general and domain-specific cognitive control”. Also, we revised the conclusion as: In sum, we showed that the cognitive control can be parametrically encoded in the right dlPFC and guides cognitive control to adjust goal-directed behavior. This finding suggests that different cognitive control states may be encoded in an abstract cognitive space, which reconciles the long-standing debate between the domain-general and domain-specific views of cognitive control and provides a parsimonious and more broadly applicable framework for understanding how our brains efficiently and flexibly represent multiple task settings.

      (Recommendations) The authors perform control analyses that test Stroop-only and Simon-only models. However, these analyses use a totally different similarity metric that is based on set intersection rather than geometry. This metric had limited justification or explanation, and it's not clear whether these models fit worse because of the similarity metric. Even here, the Simon-only model fit better than the Stroop+Simon model. The dimensionality analyses may reflect the 1d manipulation by the authors (i.e. perfectly correlated Stroop and Simon effects).

      Response: The Jaccard measure is the most suitable method we can conceive of for assessing the similarity between two conflicts when establishing the Stroop-only and Simon-only models, achieved by projecting them onto the vertical or horizontal axes, respectively (Author response image 1A). This approach offers two advantages. First, the Jaccard similarity combines both similarity (as reflected by the numerator) and distance (reflected by the difference between denominator and numerator) without bias towards either. Second, the Jaccard similarity in our design is equivalent to the cosine similarity because the denominator in the cosine similarity is identical to the denominator in the Jaccard similarity (both are the radius of the circle, Author response image 1B).

      Author response image 1.

      Definition of Jaccard similarity. A) Two conflicts (1 and 2) are projected onto the spatial Stroop/Simon axis in the Stroop/Simon-only model, respectively. The Jaccard similarity for Stroop-only and Simon-only model are and respectively. Letters a-d are the projected vectors from the two conflicts to the two axes. Blue and red colors indicate the conflict conditions. Shorter vectors are the intersection and longer vectors are the union. B) According to the cosine similarity model, the similarity is defined as , where e is the projected vector from conflict 1 to conflict 2, and g is the vector of conflict 1. The Jaccard similarity for this case is defined by , where f is the projected vector from conflict 2 to itself. Because f = g in our design, the Jaccard similarity is equivalent to the cosine similarity.

      Therefore, we believe that the model comparisons between cosine similarity model and the Stroop/Simon-Only models were equitable. However, we acknowledge the reviewer’s and other reviewers’ concerns about the correlation between spatial Stroop and Simon conflicts, which reduces the space to one dimension (1d) and limits our ability to distinguish between the Stroop-only and Simon-only models, as well as between Stroop+Simon and cosine similarity models. While these distinctions are undoubtedly important for understanding the geometry of the cognitive space, we recognize that they go beyond the major objective of this study, that is, to differentiate the cosine similarity model from domain-general/specific models. Therefore, we have chosen to exclude the Stroop-only, Simon-only and Stroop+Simon models in our revised manuscript.
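The intersection-over-union idea described above for the Stroop-only and Simon-only models can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the angles in `ANGLES_DEG`, the function names, and the handling of two zero-length projections are all assumptions. The shorter projection is treated as the intersection and the longer one as the union, as stated in the figure caption.

```python
import math

# Assumed polar angles (degrees) from the spatial Stroop axis; illustrative only.
ANGLES_DEG = [0.0, 22.5, 45.0, 67.5, 90.0]


def cosine_similarity(a_deg, b_deg):
    """Cosine of the angle between two conflict vectors on the circle."""
    return math.cos(math.radians(a_deg - b_deg))


def jaccard_1d(p, q):
    """Shorter projection (intersection) divided by longer projection (union)."""
    p, q = abs(p), abs(q)
    longer = max(p, q)
    if longer < 1e-12:
        # Assumption: two zero-length projections are treated as identical.
        return 1.0
    return min(p, q) / longer


def stroop_only_similarity(a_deg, b_deg):
    """Jaccard similarity after projecting both conflicts onto the Stroop axis."""
    return jaccard_1d(math.cos(math.radians(a_deg)), math.cos(math.radians(b_deg)))


def simon_only_similarity(a_deg, b_deg):
    """Jaccard similarity after projecting both conflicts onto the Simon axis."""
    return jaccard_1d(math.sin(math.radians(a_deg)), math.sin(math.radians(b_deg)))


# When one conflict lies on the axis, the ratio coincides with the cosine
# of the angle between the two conflicts.
print(stroop_only_similarity(0.0, 45.0), cosine_similarity(0.0, 45.0))
```

In this sketch the two measures agree when one of the two conflicts lies on the projection axis; whether that carries over to every pair depends on details of the design that are not reproduced here.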

      Something that raised additional concerns are the RSMs in the key region of interest (Fig S5). The pure stroop task appears to be represented very differently from all of the conditions that include simon conflict.

      Together, I think these limitations reflect the structure of the task and research goals, not the statistical approach (which has been meaningfully improved).

      Response: We thank the reviewer for pointing this out. It is essential to clarify that our conclusions were based on the significant similarity modulation effect identified in our statistical analysis using the cosine similarity model, where we did not distinguish between the within-Stroop condition and the other four within-conflict conditions (Fig. 7A, now Fig. 8A). This means that the representation of conflict type was not biased by the seeming disparities in the values shown here. Moreover, to specifically test the differences between the within-Stroop condition and the other within-conflict conditions, we conducted a mixed-effect model analysis including only trial pairs from the same conflict type. In this analysis, the primary predictor was the cross-condition difference (0 for within-Stroop condition and 1 for other within-conflict conditions). The results showed no significant cross-condition difference in either the incongruent (t = 1.22, p = .23) or the congruent (t = 1.06, p = .29) trials. Thus, we believe the evidence for different similarities is inconclusive in our data and decided not to interpret this numerical difference. We have added this note in the revised figure caption for Figure S5.

      Author response image 2.

      Fig. S5. The stronger conflict type similarity effect in incongruent versus congruent conditions. (A) Summary representational similarity matrices for the right 8C region in incongruent (left) and congruent (right) conditions, respectively. Each cell represents the averaged Pearson correlation of cells with the same conflict type and congruency in the 1400×1400 matrix. Note that the seeming disparities in the values of Stroop and other within-conflict cells (i.e., the diagonal) did not reach significance for either incongruent (t = 1.22, p = .23) or congruent (t = 1.06, p = .29) trials. (B) Scatter plot showing the averaged neural similarity (Pearson correlation) as a function of conflict type similarity in both conditions. The values in both A and B are calculated from raw Pearson correlation values, in contrast to the z-scored values in Fig. 4D.

      Minor:

      • In the analysis of similarity_orientation, the df is very large (~14000). Here, and throughout, the df should be reflective of the population of subjects (ie be less than the sample size).

      Response: The large degrees of freedom (df) in our analysis stem from the fact that we used a mixed-effect linear model incorporating all data points (a total of 400×35=14000). In mixed-effect models, the df is determined by subtracting the number of fixed effects (in our case, 7) from the total number of observations. This is in line with previous studies that have reported the df in this manner (e.g., Iravani et al., 2021; Schmidt & Weissman, 2016; Natraj et al., 2022).
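The arithmetic behind the reported df can be written out as a short sketch. It uses only the numbers quoted in the response and follows the residual-df convention the response describes (observations minus fixed effects); other conventions for mixed-effect models would give different values. The variable names are illustrative.

```python
# Numbers quoted in the response: 400 x 35 data points and 7 fixed effects.
points_per_unit = 400
n_units = 35
n_fixed_effects = 7

n_observations = points_per_unit * n_units

# Residual degrees of freedom under the convention described in the response.
residual_df = n_observations - n_fixed_effects

print(n_observations, residual_df)  # 14000 13993
```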

      Reference:

      Iravani B, Schaefer M, Wilson DA, Arshamian A, Lundström JN. The human olfactory bulb processes odor valence representation and cues motor avoidance behavior. Proc Natl Acad Sci U S A. 2021 Oct 19;118(42):e2101209118. https://doi.org/10.1073/pnas.2101209118.

      Schmidt, J.R., Weissman, D.H. Congruency sequence effects and previous response times: conflict adaptation or temporal learning?. Psychological Research 80, 590–607 (2016). https://doi.org/10.1007/s00426-015-0681-x.

      Natraj, N., Silversmith, D. B., Chang, E. F., & Ganguly, K. (2022). Compartmentalized dynamics within a common multi-area mesoscale manifold represent a repertoire of human hand movements. Neuron, 110(1), 154-174. https://doi.org/10.1016/j.neuron.2021.10.002.

      • it would improve the readability if there was more didactic justification for why analyses are done a certain way (eg justifying the jaccard metric). This will help less technically-savvy readers.

      Response: We appreciate the reviewer’s suggestion. However, because the Stroop/Simon-only models in our design may not be a valid approach for distinguishing the contributions of the Stroop/Simon components, we have decided not to include the Jaccard metric in our revised manuscript.

      Besides, to improve the readability, we have moved Figure S4 to the main text (labeled as Figure 7), and added the domain-general/domain-specific schematics in Figure 8.

      Author response image 3.

      Figure 8. Schematic of key RSMs. (A) and (B) show the orthogonality between conflict similarity and orientation RSMs. The within-subject RSMs (e.g., Group1-Group1) for conflict similarity and orientation are all the same, but the cross-group correlations (e.g., Group2-Group1) are different. Therefore, we can separate the contribution of these two effects when including them as different regressors in the same linear regression model. (C) and (D) show the two alternative models. Like the cosine model (A), within-group trial pairs resemble between-group trial pairs in these two models. The domain-specific model is an identity matrix. The domain-general model is estimated from the absolute difference of behavioral congruency effect, but scaled to 0(lowest similarity)-1(highest similarity) to aid comparison. The plotted matrices here include only one subject each from Group 1 and Group 2. Numbers 1-5 indicate the conflict type conditions, for spatial Stroop, StHSmL, StMSmM, StLSmH, and Simon, respectively. The thin lines separate four different sub-conditions, i.e., target arrow (up, down) × congruency (incongruent, congruent), within each conflict type.
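The two alternative model matrices described in this caption can be sketched in a few lines. This is an illustration under stated assumptions: the congruency effects in `congruency_effect_ms` are hypothetical placeholders (not values from the study), and rescaling via `1 - d / max(d)` is one plausible way to map absolute differences onto a 0 (lowest similarity) to 1 (highest similarity) range.

```python
# Hypothetical group-averaged congruency effects (ms) for the five conflict
# types; placeholder values for illustration only, NOT the study's data.
congruency_effect_ms = [60.0, 75.0, 90.0, 80.0, 65.0]


def domain_general_similarity(effects):
    """Similarity from absolute differences in congruency effect, scaled to 0-1."""
    n = len(effects)
    diff = [[abs(effects[i] - effects[j]) for j in range(n)] for i in range(n)]
    max_diff = max(max(row) for row in diff)
    if max_diff == 0:
        # All conditions identical: every pair is maximally similar.
        return [[1.0] * n for _ in range(n)]
    # Assumed scaling: largest difference maps to 0, no difference maps to 1.
    return [[1.0 - d / max_diff for d in row] for row in diff]


def domain_specific_similarity(n):
    """Identity matrix: a conflict type is similar only to itself."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


general = domain_general_similarity(congruency_effect_ms)
specific = domain_specific_similarity(len(congruency_effect_ms))
for row in general:
    print([round(v, 2) for v in row])
```

Each 5×5 matrix here covers conflict types only; expanding it to trial pairs across subjects, as in the plotted RSMs, is outside the scope of this sketch.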

      Reviewer #2:

      This study examines the construct of "cognitive spaces" as they relate to neural coding schemes present in response conflict tasks. The authors use a novel experimental design in which different types of response conflict (spatial Stroop, Simon) are parametrically manipulated. These conflict types are hypothesized to be encoded jointly, within an abstract "cognitive space", in which distances between task conditions depend only on the similarity of conflict types (i.e., where conditions with similar relative proportions of spatial-Stroop versus Simon conflicts are represented with similar activity patterns). Authors contrast such a representational scheme for conflict with several other conceptually distinct schemes, including a domain-general, domain-specific, and two task-specific schemes. The authors conduct a behavioral and fMRI study to test which of these coding schemes is used by prefrontal cortex. Replicating the authors' prior work, this study demonstrates that sequential behavioral adjustments (the congruency sequence effect) are modulated as a function of the similarity between conflict types. In fMRI data, univariate analyses identified activation in left prefrontal and dorsomedial frontal cortex that was modulated by the amount of Stroop or Simon conflict present, and representational similarity analyses (RSA) that identified coding of conflict similarity, as predicted under the cognitive space model, in right lateral prefrontal cortex.

      This study tackles an important question regarding how distinct types of conflict might be encoded in the brain within a computationally efficient representational format. The ideas postulated by the authors are interesting ones and the statistical methods are generally rigorous.

      Response: We would like to express our sincere appreciation for the reviewer’s positive evaluation of our manuscript and the constructive comments and suggestions. In response to your suggestions and concerns, we excluded the StroopOnly, SimonOnly and Stroop+Simon models, and added the schematic of domain-general/specific model RSMs. We have provided detailed responses to your comments below.

      The evidence supporting the authors' claims, however, is limited by confounds in the experimental design and by lack of clarity in reporting the testing of alternative hypotheses within the method and results.

      1. Model comparison

      The authors commendably performed a model comparison within their study, in which they formalized alternative hypotheses to their cognitive space hypothesis. We greatly appreciate the motivation for this idea and think that it strengthened the manuscript. Nevertheless, some details of this model comparison were difficult for us to understand, which in turn has limited our understanding of the strength of the findings.

      The text indicates the domain-general model was computed by taking the difference in congruency effects per conflict condition. Does this refer to the "absolute difference" between congruency effects? In the rest of this review, we assume that the absolute difference was indeed used, as using a signed difference would not make sense in this setting. Nevertheless, it may help readers to add this information to the text.

      Response: We apologize for any confusion. The “difference” here indeed refers to the “absolute difference” between congruency effects. We have now clarified this by adding the word “absolute” accordingly.

      "Therefore, we defined the domain-general matrix as the absolute difference in their congruency effects indexed by the group-averaged RT in Experiment 2."

      Regarding the Stroop-Only and Simon-Only models, the motivation for using the Jaccard metric was unclear. From our reading, it seems that all of the other models --- the cognitive space model, the domain-general model, and the domain-specific model --- effectively use a Euclidean distance metric. (Although the cognitive space model is parameterized with cosine similarities, these similarity values are proportional to Euclidean distances because the points all lie on a circle. And, although the domain-general model is parameterized with absolute differences, the absolute difference is equivalent to Euclidean distance in 1D.) Given these considerations, the use of Jaccard seems to differ from the other models, in terms of parameterization, and thus potentially also in terms of underlying assumptions. Could authors help us understand why this distance metric was used instead of Euclidean distance? Additionally, if Jaccard must be used because this metric seems to be non-standard in the use of RSA, it would likely be helpful for many readers to give a little more explanation about how it was calculated.
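The relationship between cosine similarity and Euclidean distance for points on a circle, mentioned in the comment above, can be checked numerically. For unit vectors, the squared Euclidean distance equals 2 − 2·cos(Δθ), so the two measures are monotonically related (strictly, the squared distance is linear in the cosine). The angles below are assumed for illustration.

```python
import math

# Assumed polar angles (degrees) for five points on the unit circle.
ANGLES_DEG = [0.0, 22.5, 45.0, 67.5, 90.0]


def unit_point(angle_deg):
    """Cartesian coordinates of a point on the unit circle."""
    theta = math.radians(angle_deg)
    return (math.cos(theta), math.sin(theta))


def squared_distance_from_cosine(cos_sim):
    """Squared Euclidean distance between unit vectors with given cosine."""
    return 2.0 - 2.0 * cos_sim


max_abs_error = 0.0
for a in ANGLES_DEG:
    for b in ANGLES_DEG:
        d = math.dist(unit_point(a), unit_point(b))
        cos_sim = math.cos(math.radians(a - b))
        error = abs(d ** 2 - squared_distance_from_cosine(cos_sim))
        max_abs_error = max(max_abs_error, error)

print(max_abs_error < 1e-12)  # True
```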

      Response: We believe that the Jaccard similarity measure is consistent with the Cosine similarity measure. The Jaccard similarity is calculated as the intersection divided by the union. To define the similarity of two conflicts in the Stroop-only and Simon-only models, we first project them onto the vertical or horizontal axes, respectively (as shown in Author response image 1A). The Jaccard similarity in our design is equivalent to the cosine similarity because the denominator in the Jaccard similarity is identical to the denominator in the cosine similarity (both are the radius of the circle, Author response image 1B).

      However, it is important to note that a cosine similarity cannot be defined when conflicts are projected onto spatial Stroop or Simon axis simultaneously. Therefore, we used the Jaccard similarity in the previous version of our manuscript.

      Author response image 4.

      Definition of Jaccard similarity. A) Two conflicts (1 and 2) are projected onto the spatial Stroop/Simon axis in the Stroop/Simon-only model, respectively. The Jaccard similarity for Stroop-only and Simon-only model are and respectively. Letters a-d are the projected vectors from the two conflicts to the two axes. Blue and red colors indicate the conflict conditions. Shorter vectors are the intersection and longer vectors are the union. B) According to the cosine similarity model, the similarity is defined as , where e is the projected vector from conflict 1 to conflict 2, and g is the vector of conflict 1. The Jaccard similarity for this case is defined by , where f is the projected vector from conflict 2 to itself. Because f = g in our design, the Jaccard similarity is equivalent to the cosine similarity.

      However, we agree with the reviewer’s and other reviewers’ concern that the correlation between spatial Stroop and Simon conflicts makes it less likely to distinguish the Stroop+Simon from cosine similarity models. While distinguishing them is essential to understand the detailed geometry of the cognitive space, it is beyond our major purpose, that is, to distinguish the cosine similarity model from the domain-general/specific models. Therefore, we have chosen to exclude the Stroop-only, Simon-only and Stroop+Simon models from our revised manuscript.

      When considering parameterizing the Stroop-Only and Simon-Only models with Euclidean distances, one concern we had is that the joint inclusion of these models might render the cognitive space model unidentifiable due to collinearity (i.e., the sum of the Stroop-Only and Simon-Only models could be collinear with the cognitive space model). Could the authors determine whether this is the case? This issue seems to be important, as the presence of such collinearity would suggest to us that the design is incapable of discriminating those hypotheses as parameterized.

      Response: We acknowledge that our design does not allow for a complete differentiation between the parallel encoding (StroopOnly+SimonOnly) model and the cognitive space model, given their high correlation (r = 0.85). However, it is important to note that the StroopOnly+SimonOnly model introduces more free parameters, making the model fitting poorer than the cognitive space model.

      Additionally, the cognitive space model also shows high correlations with the StroopOnly and SimonOnly models (both rs = 0.66). It is crucial to emphasize that our study’s primary goal does not involve testing the parallel encoding hypothesis (through the StroopOnly+SimonOnly model). As a result, we have chosen to remove the model comparison results with the StroopOnly, SimonOnly and StroopOnly+SimonOnly models. By contrast, the cognitive space model shows lower correlations with the purely domain-general (r = −0.16) and domain-specific (r = 0.46) models.
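A collinearity check of the kind discussed here amounts to correlating the off-diagonal entries of two model matrices. The sketch below is illustrative only: it uses assumed angles and the 1D distance parameterization the reviewer suggested (similarity as one minus the absolute difference of projections), operates on a 5×5 condition-level matrix rather than the full trial-pair design, and is therefore not expected to reproduce the correlations reported in the response.

```python
import math

# Assumed polar angles (degrees) from the spatial Stroop axis; illustrative only.
ANGLES_DEG = [0.0, 22.5, 45.0, 67.5, 90.0]


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def off_diagonal(matrix):
    """Lower-triangle entries, excluding the diagonal."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]


def cosine_model():
    """Cognitive space model: cosine of the angle between conflict types."""
    return [[math.cos(math.radians(a - b)) for b in ANGLES_DEG] for a in ANGLES_DEG]


def axis_distance_model(project):
    """Hypothetical single-axis model: 1 - |difference of projections|."""
    return [[1.0 - abs(project(a) - project(b)) for b in ANGLES_DEG]
            for a in ANGLES_DEG]


stroop_only = axis_distance_model(lambda d: math.cos(math.radians(d)))
simon_only = axis_distance_model(lambda d: math.sin(math.radians(d)))
summed = [[s + t for s, t in zip(row_s, row_t)]
          for row_s, row_t in zip(stroop_only, simon_only)]

r_sum_vs_cosine = pearson(off_diagonal(summed), off_diagonal(cosine_model()))
print(f"summed single-axis model vs cosine model: r = {r_sum_vs_cosine:.2f}")
```

A correlation close to 1 between two model vectors would indicate that a regression including both cannot reliably separate their contributions.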

      2. Issue of uniquely identifying conflict coding

      We certainly appreciate the efforts that authors have taken to address potential confounders for encoding of conflict in their original submission. We broach this question not because we wish authors to conduct additional control analyses, but because this issue seems to be central to the thesis of the manuscript and we would value reading the authors' thoughts on this issue in the discussion.

      To summarize our concerns, conflict seems to be a difficult variable to isolate within aggregate neural activity, at least relative to other variables typically studied in cognitive control, such as task-set or rule coding. This is because it seems reasonable to expect that many more nuisance factors covary with conflict -- such as univariate activation, level of cortical recruitment, performance measures, arousal --- than in comparison with, for example, a well-designed rule manipulation. Controlling for some of these factors post-hoc through regression is commendable (as authors have done here), but such a method will likely be incomplete and can provide no guarantees on the false positive rate.

      Relatedly, the neural correlates of conflict coding in fMRI and other aggregate measures of neural activity are likely of heterogeneous provenance, potentially including rate coding (Fu et al., 2022), temporal coding (Smith et al., 2019), modulation of coding of other more concrete variables (Ebitz et al., 2020, 10.1101/2020.03.14.991745; see also discussion and reviews of Tang et al., 2016, 10.7554/eLife.12352), or neuromodulatory effects (e.g., Aston-Jones & Cohen, 2005). Some of these origins would seem to be consistent with "explicit" coding of conflict (conflict as a representation), but others would seem to be more consistent with epiphenomenal coding of conflict (i.e., conflict as an emergent process). Again, these concerns could apply to many variables as measured via fMRI, but at the same time, they seem to be more pernicious in the case of conflict. So, if authors consider these issues to be germane, perhaps they could explicitly state in the discussion whether adopting their cognitive space perspective implies a particular stance on these issues, how they interpret their results with respect to these issues, and if relevant, qualify their conclusions with uncertainty on these issues.

      Response: We appreciate the reviewer’s insightful comments regarding the representation and process of conflict.

      First, we agree that the conflict is not simply a pure feature like a stimulus but often arises from the interaction (e.g., dimension overlap) between two or more aspects. For example, in the manual Stroop, conflict emerges from the inconsistent semantic information between color naming and word reading. Similarly, other higher-order cognitive processes such as task-set also underlie the relationship between concrete aspects. For instance, in a face/house categorization task, the task-set is the association between face/house and the responses. When studying these higher-order processes, it is often impossible to completely isolate them from bottom-up features. Therefore, methods like the representational similarity analysis and regression models are among the limited tools available to attempt to dissociate these concrete factors from conflict representation. While not perfect, this approach has been suggested and utilized in practice (Freund et al., 2021).

      Second, we agree that conflict can be both a representation and an emerging process. These two perspectives are not necessarily contradictory. According to David Marr’s influential three-level theory (Marr, 1982), representation is the algorithm of the process to achieve a goal based on the input. Therefore, a representation can refer to not only a static stimulus (e.g., the visual representation of an image), but also a dynamic process. Building on this perspective, we posit that the representation of cognitive control consists of an array of dynamic representations embedded within the overall process. A similar idea has been proposed that the abstract task profiles can be progressively constructed as a representation in our brain (Kikumoto & Mayr, 2020).

      We have incorporated this discussion into the manuscript:

      "Recently an interesting debate has arisen concerning whether cognitive control should be considered as a process or a representation (Freund, Etzel, et al., 2021). Traditionally, cognitive control has been predominantly viewed as a process. However, the study of its representation has gained more and more attention. While it may not be as straightforward as the visual representation (e.g., creating a mental image from a real image in the visual area), cognitive control can have its own form of representation. An influential theory, Marr’s (1982) three-level model proposed that representation serves as the algorithm of the process to achieve a goal based on the input. In other words, representation can encompass a dynamic process rather than being limited to static stimuli. Building on this perspective, we posit that the representation of cognitive control consists of an array of dynamic representations embedded within the overall process. A similar idea has been proposed that the representation of task profiles can be progressively constructed with time in the brain (Kikumoto & Mayr, 2020)."

      Reference:

      Freund, M. C., Etzel, J. A., & Braver, T. S. (2021). Neural Coding of Cognitive Control: The Representational Similarity Analysis Approach. Trends in Cognitive Sciences, 25(7), 622-638. https://doi.org/10.1016/j.tics.2021.03.011

      Marr, D. C. (1982). Vision: A computational investigation into human representation and information processing. New York: W.H. Freeman.

      Kikumoto A, Mayr U. (2020). Conjunctive representations that integrate stimuli, responses, and rules are critical for action selection. Proceedings of the National Academy of Sciences, 117(19):10603-10608. https://doi.org/10.1073/pnas.1922166117.

      3. Interpretation of measured geometry in 8C

      We appreciate the inclusion of the measured similarity matrices of area 8C, the key area the results focus on, to the supplemental, as this allows for a relatively model-agnostic look at a portion of the data. Interestingly, the measured similarity matrix seems to mismatch the cognitive space model in a potentially substantive way. Although the model predicts that the "pure" Stroop and Simon conditions will have maximal self-similarity (i.e., the Stroop-Stroop and Simon-Simon cells on the diagonal), these correlations actually seem to be the lowest, by what appears to be a substantial margin (particularly the Stroop-Stroop similarities). What should readers make of this apparent mismatch? Perhaps authors could offer their interpretation on how this mismatch could fit with their conclusions.

      Response: We thank the reviewer for bringing this to our attention. It is essential to clarify that our conclusions were based on the significant similarity modulation effect observed in our statistical analysis using the cosine similarity model, where we did not distinguish between the within-Stroop condition and the other four within-conflict conditions (Fig. 7A). This means that the representation of conflict type was not biased by the seeming disparities in the values shown here. Moreover, to specifically address the potential differences between the within-Stroop condition and the other within-conflict conditions, we conducted a mixed-effect model analysis. In this analysis, the primary predictor was the cross-condition difference (0 for within-Stroop condition and 1 for other within-conflict conditions). The results showed no significant cross-condition difference in either the incongruent (t = 1.22, p = .23) or the congruent (t = 1.06, p = .29) trials. Thus, we believe the evidence for different similarities is inconclusive in our data and decided not to interpret this numerical difference.

      We have added this note in the revised figure caption for Figure S5.

      Author response image 5.

      Fig. S5. The stronger conflict type similarity effect in incongruent versus congruent conditions. (A) Summary representational similarity matrices for the right 8C region in incongruent (left) and congruent (right) conditions, respectively. Each cell represents the averaged Pearson correlation of cells with the same conflict type and congruency in the 1400×1400 matrix. Note that the seeming disparities in the values of Stroop and other within-conflict cells (i.e., the diagonal) did not reach significance for either incongruent (t = 1.22, p = .23) or congruent (t = 1.06, p = .29) trials. (B) Scatter plot showing the averaged neural similarity (Pearson correlation) as a function of conflict type similarity in both conditions. The values in both A and B are calculated from raw Pearson correlation values, in contrast to the z-scored values in Fig. 4D.

      1. It would likely improve clarity if all of the competing models were displayed as summarized RSA matrices in a single figure, similar to (or perhaps combined with) Figure 7.

      Response: We appreciate the reviewer’s suggestion. We have now incorporated the domain-general and domain-specific models into Figure 7 (now Figure 8).

      Author response image 6.

      Figure 8. Schematic of key RSMs. (A) and (B) show the orthogonality between conflict similarity and orientation RSMs. The within-subject RSMs (e.g., Group1-Group1) for conflict similarity and orientation are all the same, but the cross-group correlations (e.g., Group2-Group1) are different. Therefore, we can separate the contribution of these two effects when including them as different regressors in the same linear regression model. (C) and (D) show the two alternative models. Like the cosine model (A), within-group trial pairs resemble between-group trial pairs in these two models. The domain-specific model is an identity matrix. The domain-general model is estimated from the absolute difference of behavioral congruency effect, but scaled to 0(lowest similarity)-1(highest similarity) to aid comparison. The plotted matrices here include only one subject each from Group 1 and Group 2. Numbers 1-5 indicate the conflict type conditions, for spatial Stroop, StHSmL, StMSmM, StLSmH, and Simon, respectively. The thin lines separate four different sub-conditions, i.e., target arrow (up, down) × congruency (incongruent, congruent), within each conflict type.

      1. Because this model comparison is key to the main inferences in the study, it might also be helpful for most readers to move all of these RSA model matrices to the main text, instead of in the supplemental.

      Response: We thank the reviewer for this suggestion. We have moved the Fig. S4 to the main text, labeled as the new Figure 7.

      1. It may be worthwhile to check how robust the observed brain-behavior association (Fig 4C) is to the exclusion of the two datapoints with the lowest neural representation strength measure, as these points look like they have high leverage.

      Response: We recalculated the Pearson correlation after excluding the two points and found that the result was largely unchanged (r = 0.50, p = .003, compared with the original r = 0.52, p = .001).

      Additionally, we found that the two axes were mistakenly swapped in Fig 4C and have corrected this error in the revised manuscript. This correction does not affect the correlation results.
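The leave-out check described in this response can be sketched generically: compute the correlation on all points, drop the k points with the lowest x-values, and recompute. The numbers below are toy values chosen only to make the sketch runnable; they are not the study's data, and the helper names are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def drop_lowest_x(x, y, k=2):
    """Remove the k pairs with the smallest x, preserving the original order."""
    by_x = sorted(range(len(x)), key=lambda i: x[i])
    keep = sorted(by_x[k:])
    return [x[i] for i in keep], [y[i] for i in keep]


# Toy values for illustration only; NOT the study's data.
neural_strength = [-0.8, -0.6, 0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
behavioral_beta = [-0.5, -0.3, 0.1, 0.0, 0.3, 0.1, 0.4, 0.3]

r_full = pearson(neural_strength, behavioral_beta)
x_trim, y_trim = drop_lowest_x(neural_strength, behavioral_beta, k=2)
r_trimmed = pearson(x_trim, y_trim)
print(f"all points: r = {r_full:.2f}; two lowest removed: r = {r_trimmed:.2f}")
```

A small change between the two coefficients suggests that the association is not driven by the excluded high-leverage points.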

      Author response image 7.

      Fig. 4. The conflict type effect. (A) Brain regions surviving the Bonferroni correction (p < 0.0001) across the regions (criterion 1). Labeled regions are those meeting the criterion 2. (B) Different encoding of conflict type in the incongruent versus congruent conditions. * Bonferroni corrected p < .05. (C) The brain-behavior correlation of the right 8C (criterion 3). The x-axis shows the beta coefficient of the conflict type effect from the RSA, and the y-axis shows the beta coefficient obtained from the behavioral linear model using the conflict similarity to predict the CSE in Experiment 2. (D) Illustration of the different encoding strength of conflict type similarity in incongruent versus congruent conditions of right 8C. The y-axis is derived from the z-scored Pearson correlation coefficient, consistent with the RSA methodology. See Fig. S4B for a plot with the raw Pearson correlation measurement. l = left; r = right.

      Reviewer #3:

      Yang and colleagues investigated whether information on two task-irrelevant features that induce response conflict is represented in a common cognitive space. To test this, the authors used a task that combines the spatial Stroop conflict and the Simon effect. This task reliably produces a beautiful graded congruency sequence effect (CSE), where the cost of congruency is reduced after incongruent trials. The authors acquired fMRI data and applied univariate, multivariate, and connectivity analyses to identify brain regions that represent the graded similarity of conflict types, the congruency of responses, and the visual features that induce conflicts. They further directly assessed the dimensionality of the represented conflict space.

      The authors identified the right dlPFC (right 8C), which shows 1) stronger encoding of graded similarity of conflicts in incongruent trials and 2) a positive correlation between the strength of conflict similarity type and the CSE on behavior. The dlPFC has been shown to be important for cognitive control tasks. As the dlPFC did not show a univariate parametric modulation based on the higher or lower component of one type of conflict (e.g., having more spatial Stroop conflict or less Simon conflict), it implies that dissimilarity of conflicts is represented by a linear increase or decrease of neural responses. Therefore, the similarity of conflict is represented in multivariate neural responses that combine two sources of conflict.

      The strength of the current approach lies in the clear effect of parametric modulation of conflict similarity across different conflict types. The authors employed a clever cross-subject RSA that counterbalanced and isolated the targeted effect of conflict similarity, decorrelating orientation similarity of stimulus positions that would otherwise be correlated with conflict similarity. A pattern of neural response seems to exist that maps different types of conflict, where each type is defined by the parametric gradation of the yoked spatial Stroop conflict and the Simon conflict on a similarity scale. The similarity of patterns increases in incongruent trials and is correlated with CSE modulation of behavior.

      The main significance of the paper lies in the evidence supporting the use of an organized "cognitive space" to represent conflict information as a general control strategy. The authors thoroughly test this idea using multiple approaches and provide convincing support for their findings. However, the universality of this cognitive strategy remains an open question.

      (Public Reviews) Taken together, this study presents an exciting possibility that information requiring high levels of cognitive control could be flexibly mapped into cognitive map-like representations that both benefit and bias our behavior. Further characterization of the representational geometry and generalization of the current results look like promising ways to understand representations for cognitive control.

      Response: We would like to thank the reviewer for the positive evaluation of our manuscript and for providing constructive comments. In response to your suggestions, we have acknowledged the potential limitations of the design and the cross-subject RSA approach, and incorporated the open questions into the discussion. Please find our detailed responses below.

      The task presented in the study involved two sources of conflict information through a single salient visual input, which might have encouraged the utilization of a common space.

      Response: We agree that the unified visual input in our design may have facilitated the utilization of a common space. However, we believe the stimuli are not necessarily unified in the construction of the common space. To further test the potential interaction between the concrete stimulus setting and the cognitive space representation, it is necessary to use varied stimuli in future research. We have left this as an open question in the discussion:

      Can we effectively map any sources of conflict with completely different stimuli into a single space?

      The similarity space was analyzed at the level of between-individuals (i.e., cross-subject RSA) to mitigate potential confounds in the design, such as congruency and the orientation of stimulus positions. This approach makes it challenging to establish a direct link between the quality of conflict space representation and the patterns of behavioral adaptations within individuals.

      Response: By setting the variables as random effects at the subject level, we have extracted individual effects that incorporate both the group-level fixed effects and the individual-level random effects. We believe this approach yields results that are as reliable as, if not more reliable than, effects calculated from individual data only. First, the linear mixed-effects (LME) model includes all the individual data, which form the basis for estimating the random effects. Therefore, the individual effects derived from this approach inherently reflect the individual-specific effects. To support this notion, we have included a simulation script (accessible in the online file “simulation_LME.mlx” at https://osf.io/rcq8w) to demonstrate the strong consistency between the two approaches (see Author response image 8). In this simulation, we generated random data (Y) for 35 subjects, each containing 20 repeated measurements across 5 conditions. To streamline the simulation, we included only one predictor (X), which was treated as both a fixed and a random effect at the subject level. We applied two methods to calculate the individual beta coefficients. The first involved extracting individual beta coefficients from the LME model by summing the fixed effect and the subject-specific random effect. The second entailed conducting a regression analysis on the data from each subject separately to obtain the slope. We tested their consistency by calculating the Pearson correlation between the two sets of beta coefficients. This simulation was repeated 100 times.

      Author response image 8.

      Consistency of individual beta coefficients between the mixed-effects model and the individual regression analysis. A) The distribution of the Pearson correlation between the two methods across 100 simulations. B) An example from the simulation showing the highly correlated results from the two methods. Each data point indicates a subject (n=35).
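
      The logic of this simulation can be approximated in a few lines of standard-library Python. The sketch below is not the authors' MATLAB script and does not fit a full LME model: it swaps in an empirical-Bayes partial-pooling (shrinkage) estimate as a simple stand-in for the subject-specific LME coefficients, and compares it with per-subject ordinary least-squares slopes. The function names, the effect sizes (`fixed`, `tau`, `noise`), and the reading of "20 repeated measurements across 5 conditions" as 100 observations per subject are all assumptions made for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def ols_slope(x, y):
    """Simple regression of y on x: return (slope, sampling variance of slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return slope, sse / (n - 2) / sxx

def simulate_once(rng, n_subj=35, n_obs=100, fixed=0.5, tau=0.5, noise=1.0):
    """One simulated data set; returns r between OLS and partially pooled slopes."""
    slopes, variances = [], []
    for _ in range(n_subj):
        true_slope = fixed + rng.gauss(0, tau)  # fixed effect + random slope
        x = [rng.gauss(0, 1) for _ in range(n_obs)]
        y = [true_slope * a + rng.gauss(0, noise) for a in x]
        b, v = ols_slope(x, y)
        slopes.append(b)
        variances.append(v)
    mu = sum(slopes) / n_subj  # estimate of the group-level (fixed) slope
    between = sum((b - mu) ** 2 for b in slopes) / (n_subj - 1)
    # Method-of-moments estimate of the random-slope variance.
    tau2 = max(between - sum(variances) / n_subj, 0.0)
    # Shrink each subject's slope toward the group mean (partial pooling).
    pooled = [mu + tau2 / (tau2 + v) * (b - mu) for b, v in zip(slopes, variances)]
    return pearson_r(slopes, pooled)

rng = random.Random(0)
correlations = [simulate_once(rng) for _ in range(100)]
print(f"mean r across 100 simulations = {sum(correlations) / len(correlations):.3f}")
```

      With many observations per subject, the shrinkage weights are close to one, so the two sets of coefficients are expected to agree closely; the agreement would weaken as per-subject data become sparser.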

      Second, the potential difference between the two methods is that the LME model also takes the group-level variance into account, such as the dissociable variances of conflict similarity and orientation across subject groups. This enabled us to extract relatively cleaner conflict similarity effects for each subject, which we believe can be better linked to individual behavioral adaptations. Moreover, we extracted the behavioral adaptation scores (i.e., the similarity modulation effect on CSE) using a similar LME approach. Conducting the behavioral analysis solely on individual data would have been less reliable, given the limited number of data points per subject (~32). This also motivated us to maintain consistency by extracting individual neural effects using LME models.

      Furthermore, it remains unclear at which cognitive stages during response selection such a unified space is recruited. Can we effectively map any sources of conflict into a single scale? Is this unified space adaptively adjusted within the same brain region? Additionally, does the amount of conflict solely define the dimensions of this unified space across many conflict-inducing tasks? These questions remain open for future studies to address.

      Response: We appreciate the reviewer’s constructive open questions. We respond to each of them based on our current understanding.

      1) It remains unclear at which cognitive stages during response selection such a unified space is recruited.

      We anticipate that the cognitive space is recruited to guide the transference of behavioral CSE at two critical stages. The first stage involves the evaluation of control demands, where the representational distance/similarity between previous and current trials influences the adjustment of cognitive control. The second stage pertains to control execution, where the switch from one control state to another follows a path within the cognitive space. It is worth noting that future studies aiming to address this question may benefit from methodologies with higher temporal resolution, such as EEG and MEG, to provide more precise insights into the temporal dynamics of cognitive space recruitment.

      2) Can we effectively map any sources of conflict into a single scale?

      It is possible that various sources of conflict can be mapped onto the same space based on their similarity, even if finding such an operationally defined similarity may be challenging. However, our results may offer two approaches to inferring the similarity between two conflicts. One is to examine their congruency sequence effect (CSE), with a stronger CSE suggesting greater similarity. The other is to test their representational similarity within the dorsolateral prefrontal cortex.

      3) Is this unified space adaptively adjusted within the same brain region?

      We do not have an answer to this question. We showed that the cognitive space does not change with time (Note S3). What is adjusted is the control demand needed to resolve the conflict conditions that change quickly from trial to trial. It is nevertheless an interesting question whether the cognitive space may be altered, for example, when the mental state changes significantly. If so, we can further test whether the change of cognitive space also occurs within the right dlPFC.

      4) Additionally, does the amount of conflict solely define the dimensions of this unified space across many conflict-inducing tasks?

      Our understanding of this comment is that the amount of conflict refers to the number of conflict sources. Based on our current findings, the dimensions of the space are indeed defined by how many different conflict sources are included. However, this requires that the different conflict sources be orthogonal. If some sources share some aspects, the cognitive space may collapse to a lower dimension. We have incorporated the first question into the discussion:

      Moreover, we anticipate that the representation of cognitive space is most prominently involved at two critical stages to guide the transference of behavioral CSE. The first stage involves the evaluation of control demands, where the representational distance/similarity between previous and current trials influences the adjustment of cognitive control. The second stage pertains to control execution, where the switch from one control state to another follows a path within the cognitive space. However, we were unable to fully distinguish between these two stages due to the low temporal resolution of fMRI signals in our study. Future research seeking to delve deeper into this question may benefit from methodologies with higher temporal resolutions, such as EEG and MEG.

      We have included the other questions in the manuscript as open questions, calling for future research.

      Several interesting questions remain to be answered. For example, is the dimension of the unified space across conflict-inducing tasks solely determined by the number of conflict sources? Can we effectively map any sources of conflict with completely different stimuli into a single space? Is the cognitive space geometry modulated by the mental state? If yes, what brain regions mediate the change of cognitive space?

      Minor comments:

      • The original comment about out-of-sample predictions to examine the continuity of the space was a suggestion for testing neural representations, not behavior (I apologize for the lack of clarity). Given the low dimensionality of the conflict space shown by the participation ratio, we expect that linear separability exists only among specific combinations of conditions. For example, the pair of conflicts 1 and 5 together is not linearly separable from conflicts 2 and 3. But combined with other results, this is already implied.

      Response: We apologize for the misunderstanding. Performing a prediction analysis using the extensive RSM in our study does present certain challenges, primarily due to its substantial size (1400x1400) and the intricate nature of the linear mixed-effects model. When we simplified the prediction process by excluding random effects, we did observe a correlation between the predicted and original values, albeit with a relatively small Pearson correlation coefficient (r = 0.024, p < .001). This small correlation can be attributed to two key factors. First, the exclusion of data points impacts not only the conflict similarity regressor but also other regressors within the model, thereby diminishing the predictive power. Second, the large number of data points in the model heightens the risk of overfitting, which reduces the model’s capacity for generalization and increases the likelihood of unreliable predictions. Given these potential problems, we have opted not to include this prediction in the revised manuscript.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment:

      This important study advances the understanding of physiological mechanisms in deep-sea Planctomycetes bacteria, revealing unique characteristics such as the only known Phycisphaerae using a budding mode of division, extensive involvement in nitrate assimilation and release phage particles without cell death. The study uses convincing evidence, based on experiments using growth assays, phylogenetics, transcriptomics, and gene expression data. The work will be of interest to bacteriologists and microbiologists in general.

      Response: Thanks for the Editor’s and Reviewers’ positive comments, which help us improve the quality of our manuscript entitled “Physiological and metabolic insights into the first cultured anaerobic representative of deep-sea Planctomycetes bacteria” (paper#eLife-RP-RA-2023-89874). The comments are all valuable, and we have studied the comments carefully and have made corresponding revisions according to the suggestions. Revised portions are marked in blue in the modified manuscript.

      Please find the detailed responses below.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors of the manuscript cultivated a Planctomycetes strain affiliated with Phycisphaerae. The strain was one of the few Planctomycetes from deep-sea environments and demonstrated several unique characteristics, such as being the only known Phycisphaerae using a budding mode of division, extensive involvement in nitrate assimilation, and being able to release phage particles without cell death. The manuscript is generally well-written. However, a few issues need to be more clearly addressed, especially regarding the identification and characterization of the phage.

      Response: Thanks for your positive comments. Please find the detailed responses below.

      Reviewer #1 (Recommendations For The Authors):

      • Line 75-77, add a reference for this statement.

      Response: Thanks for your suggestion. We have added a reference (Fuerst and Sagulenko, 2011) for this statement in the revised manuscript (Line 77).

      References related to this response:

      Fuerst, J.A., and Sagulenko, E. Beyond the bacterium: planctomycetes challenge our concepts of microbial structure and function. Nat Rev Microbiol. 2011;9:403-413.

      • Line 124-134, add key statistics (such as ANI) of strain ZRK32 and KS4 to this section.

      Response: Thanks for your suggestion. We added the key statistics for strains ZRK32 and KS4, and described them as “Based on the 16S rRNA sequence of strain ZRK32, a sequence similarity calculation using the NCBI server indicated that the closest relatives of strain ZRK32 were Poriferisphaera corsica KS4T (98.06%), Algisphaera agarilytica 06SJR6-2T (88.04%), Phycisphaera mikurensis NBRC 102666T (85.28%), and Tepidisphaera mucosa 2842T (82.94%). Recently, the taxonomic threshold for species based on 16S rRNA gene sequence identity value was 98.65% (Kim et al., 2014). Based on these criteria, we proposed that strain ZRK32 might be a novel representative of the genus Poriferisphaera. In addition, to clarify the phylogenetic position of strain ZRK32, the genome relatedness values were calculated by the average nucleotide identity (ANI), the tetranucleotide signatures (Tetra), and in silico DNA-DNA similarity (isDDH), between the genomes of strains ZRK32 and KS4. The ANIb, ANIm, Tetra, and isDDH values were 72.89%, 85.34%, 0.97385, and 20.90%, respectively (Table S1). These results together demonstrated the strain ZRK32 genome to be obviously below established ‘cut-off’ values (ANIb: 95%, ANIm: 95%, Tetra: 0.99, isDDH: 70%) for defining bacterial species, suggesting strain ZRK32 represents a novel strain within the genus Poriferisphaera.” in the revised manuscript (Lines 124-139).
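
      The threshold comparison in this response amounts to simple arithmetic, sketched below. The observed values and cut-offs are copied from the response above; the variable and function names are hypothetical and serve only as illustration.

```python
# Genome relatedness values for ZRK32 vs. KS4 and the species cut-offs,
# as quoted in the response above.
observed = {"ANIb": 72.89, "ANIm": 85.34, "Tetra": 0.97385, "isDDH": 20.90}
cutoffs = {"ANIb": 95.0, "ANIm": 95.0, "Tetra": 0.99, "isDDH": 70.0}

def below_cutoff(observed, cutoffs):
    """For each metric, report whether the observed value is below its cut-off."""
    return {metric: observed[metric] < cutoffs[metric] for metric in cutoffs}

result = below_cutoff(observed, cutoffs)
for metric, is_below in result.items():
    status = "below" if is_below else "at or above"
    print(f"{metric}: {observed[metric]} is {status} the cut-off of {cutoffs[metric]}")

# 16S rRNA identity to the closest relative vs. the 98.65% species threshold.
identity_16s, threshold_16s = 98.06, 98.65
print(f"16S identity below species threshold: {identity_16s < threshold_16s}")
```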

      • Fig. 2A missing description for figure key.

      Response: Thanks for your comments. We have modified Figure 2A, as shown below:

      Author response image 1.

      Figure. 2. Growth assay and transcriptomic analysis of P. heterotrophicis ZRK32 strains cultivated in basal medium and rich medium.

      • Regarding the phage released, could this be a membrane vesicle-engulfed phage? I would recommend checking "Spontaneous Prophage Induction Contributes to the Production of Membrane Vesicles by the Gram-Positive Bacterium Lacticaseibacillus casei BL23" and "Chronic Release of Tailless Phage Particles from Lactococcus lactis" for further references.

      Response: Thanks for your valuable comments. We carefully read these two papers and found that phage ZRK32 is most likely a membrane vesicle-engulfed phage. We added the corresponding description as “Moreover, it has recently been reported that the tailless Caudoviricetes phage particles are enclosed in lipid membrane and are released from the host cells by a nonlytic mechanism (Liu et al., 2022), and the prophage induction contributes to the production of membrane vesicles by Lacticaseibacillus casei BL23 during cell growth (da Silva Barreira et al., 2022). Considering that strain ZRK32 has a large number of membrane vesicles during cell growth (Figure S9), we speculated that Phage-ZRK32 might be a membrane vesicle-engulfed phage and its release should be related to membrane vesicles.” in the revised manuscript (Lines 381-388).

      References related to this response:

      Liu Y, Alexeeva S, Bachmann H, Guerra Martínez J.A, Yeremenko N, Abee T et al. Chronic release of tailless phage particles from Lactococcus lactis. Appl Environ Microbiol. 2022; 88: e0148321.

      da Silva Barreira, D., Lapaquette, P., Novion Ducassou, J., Couté, Y., Guzzo, J., and Rieu, A. Spontaneous prophage induction contributes to the production of membrane vesicles by the gram-positive bacterium Lacticaseibacillus casei BL23. mBio. 2022;13:e0237522.

      • How were the reference sequences for Fig. S10-S13 retrieved, was it by blasting the phage gene against the entire NCBI database, or only the virus sequence within the NCBI? Please clarify this.

      Response: Thanks for your comments. The reference sequences for Fig. S10-S13 were retrieved by blasting the phage gene against the entire NCBI database. We clarified this as “The reference sequences of four AMGs encoding amidoligase, glutamine amidotransferase, gamma-glutamylcyclotransferase, and glutathione synthase were retrieved by blasting the phage gene against the entire NCBI database, respectively.” in the revised manuscript (Lines 444-447).

      Reviewer #2 (Public Review):

      Summary:

      Planctomycetes encompass a group of bacteria with unique biological traits; their compartmentalized cells make them appear to be intermediate between prokaryotes and eukaryotes. However, only a few Planctomycetes bacteria have been cultured thus far, which hampers insight into the biological traits of these evolutionarily important organisms. This work reports in detail the methodology for isolating deep-sea bacteria that can be recalcitrant to laboratory cultivation, and further reveals distinct characteristics of a new species of deep-sea Planctomycetes bacterium, such as chronic phage release without lysis of the host and the promotion of nitrogen utilization in the host and related bacteria. Therefore, the findings of this work are important in extending our knowledge of bacteria.

      Response: Thanks for your positive comments.

      Strengths:

      Through a combination of microscopic, physiological, genomic, and molecular biological approaches, this work reports the isolation and comprehensive investigation of the first anaerobic representative of the deep-sea Planctomycetes bacteria, in particular its budding mode of division and its release of phage without lysis of the cells. Most of the results and conclusions are supported by the experimental evidence.

      Response: Thanks for your positive comments.

      Weaknesses:

      1. While EMP glycolysis is predicted to be involved in energy conservation, no experimental evidence indicated any sugar utilization by the bacterium.

      Response: Thanks for your comments. We have previously tested the sugar utilization of strain ZRK32, and now added this description as “Consistent with the presence of EMP glycolysis pathway in strain ZRK32, we found that it could use a variety of sugars including glucose, maltose, fructose, isomaltose, galactose, D-mannose, and rhamnose (Table S2).” in the revised manuscript (Lines 281-284).

      1. While "anaerobic representative" is indicated in the title, the TCA cycle is, by contrast, predicted to function in the energy metabolism of the bacterium.

      Response: Thanks for your valuable comments. Anaerobic microorganisms (such as sulfate reducers, nitrate reducers, and iron reducers) can use alternative electron acceptors in place of oxygen for the TCA cycle. For example, Proteus mirabilis uses the whole oxidative TCA cycle without using oxygen as the final electron acceptor when it performs multicellular swarming (Alteri et al., 2012). In this study, all the genes involved in the TCA cycle were present in anaerobic strain ZRK32 and most of them were upregulated; thus, we speculate that it might function through the complete TCA metabolic pathway to obtain energy. We added the related description as “Notably, when growing in the rich medium, the expressions of most genes involved in the TCA cycle and EMP glycolysis pathway in strain ZRK32 were upregulated (Figure 2B-D, Figure S5B and Figure S6), suggesting that strain ZRK32 might function through the complete TCA metabolic pathway and EMP glycolysis pathway to obtain energy for growth (Figure S8) (Zheng et al., 2021b). Consistent with the presence of EMP glycolysis pathway in strain ZRK32, we found that it could use a variety of sugars including glucose, maltose, fructose, isomaltose, galactose, D-mannose, and rhamnose (Table S2). As for the presence of TCA cycle in the anaerobic strain ZRK32, we propose that it might use other alternative electron acceptors (such as sulfate reducers, nitrate reducers, iron reducers, etc) in place of oxygen for the TCA cycle, as shown in other anaerobic bacteria (Alteri et al., 2012).” in the revised manuscript (Lines 277-287).

      References related to this response:

      Alteri CJ, Himpsl SD, Engstrom MD, Mobley HL. Anaerobic respiration using a complete oxidative TCA cycle drives multicellular swarming in Proteus mirabilis. mBio. 2012; 3(6): e00365-12.

      1. The possible mechanisms of the chronic phage release without breaking the host are not discussed.

      Response: Thanks for your valuable comments. The possible mechanism of the chronic phage release without breaking the host might be that it was enclosed in lipid membrane and released from the host cells by a nonlytic mechanism. We added the corresponding description as “Moreover, it has recently been reported that the tailless Caudoviricetes phage particles are enclosed in lipid membrane and are released from the host cells by a nonlytic mechanism (Liu et al., 2022), and the prophage induction contributes to the production of membrane vesicles by Lacticaseibacillus casei BL23 during cell growth (da Silva Barreira et al., 2022). Considering that strain ZRK32 has a large number of membrane vesicles during cell growth (Figure S9), we speculated that Phage-ZRK32 might be a membrane vesicle-engulfed phage and its release should be related to membrane vesicles.” in the revised manuscript (Lines 381-388).

      References related to this response:

      Liu Y, Alexeeva S, Bachmann H, Guerra Martínez J.A, Yeremenko N, Abee T et al. Chronic release of tailless phage particles from Lactococcus lactis. Appl Environ Microbiol. 2022; 88: e0148321.

      da Silva Barreira, D., Lapaquette, P., Novion Ducassou, J., Couté, Y., Guzzo, J., and Rieu, A. Spontaneous prophage induction contributes to the production of membrane vesicles by the gram-positive bacterium Lacticaseibacillus casei BL23. mBio. 2022;13:e0237522.

      Reviewer #2 (Recommendations For The Authors):

      • Have you tested whether strain ZRK32 uses any sugars? If not, why would it use the EMP pathway to obtain energy?

      Response: Thanks for your comments. We have previously tested the sugar utilization of strain ZRK32, and now added this description as “Consistent with the presence of EMP glycolysis pathway in strain ZRK32, we found that it could use a variety of sugars including glucose, maltose, fructose, isomaltose, galactose, D-mannose, and rhamnose (Table S2).” in the revised manuscript (Lines 281-284).

      • Further discussion on possible mechanisms of the chronic phage release without breaking the host is expected.

      Response: Thanks for your valuable comments. The possible mechanism of the chronic phage release without breaking the host might be that it was enclosed in lipid membrane and released from the host cells by a nonlytic mechanism. We added the corresponding description as “Moreover, it has recently been reported that the tailless Caudoviricetes phage particles are enclosed in lipid membrane and are released from the host cells by a nonlytic mechanism (Liu et al., 2022), and the prophage induction contributes to the production of membrane vesicles by Lacticaseibacillus casei BL23 during cell growth (da Silva Barreira et al., 2022). Considering that strain ZRK32 has a large number of membrane vesicles during cell growth (Figure S9), we speculated that Phage-ZRK32 might be a membrane vesicle-engulfed phage and its release should be related to membrane vesicles.” in the revised manuscript (Lines 381-388).

      References related to this response:

      Liu Y, Alexeeva S, Bachmann H, Guerra Martínez J.A, Yeremenko N, Abee T et al. Chronic release of tailless phage particles from Lactococcus lactis. Appl Environ Microbiol. 2022; 88: e0148321.

      da Silva Barreira, D., Lapaquette, P., Novion Ducassou, J., Couté, Y., Guzzo, J., and Rieu, A. Spontaneous prophage induction contributes to the production of membrane vesicles by the gram-positive bacterium Lacticaseibacillus casei BL23. mBio. 2022;13:e0237522.

      • It is recommended that the writing is improved, including presentation style and grammar.

      Response: Thanks for your comments. We have invited a native English speaker (Dr. Diana Walsh from Life Science Editors, USA) to revise our manuscript, and we hope the revision meets with your approval.

    1. Author Response

      We are delighted that eLife has assessed our study as a valuable contribution and has appreciated the importance of working on asymptomatic reservoirs of P. falciparum in high-transmission settings, where not just children but also adolescents and adults harbor multiclonal infections. The constructive public reviews will serve to improve our manuscript.

      Detailed responses to referees’ comments and a revised manuscript are forthcoming. Here we make a provisional response to three key areas addressed by the referees:

      (1) census population size

      Referee 1 raises important questions although we respectfully disagree on the terminology we have adopted (of “census”) and on the unclear utility of the proposed quantity.

      We consider the quantity a census in that it is a total enumeration or count of the infections in a given population sample and over a given time period. In this sense, it gives us a tangible notion of the size of the parasite population, in an ecological sense, distinct from the formal effective population size used in population genetics. Given the low overlap between var repertoires of parasites (as observed in monoclonal infections), the population size we have calculated translates to a diversity of strains or repertoires. But our focus here is on a measure of population size itself. The distinction between population size in terms of infection counts and effective population size from population genetics has been made before for pathogens (see for example Bedford et al. 2011 for the seasonal influenza virus and for the measles virus) and is a clear one in the ecological literature for non-pathogen populations (Palstra et al. 2012).

      Both referees 1 and 2 point out that census population size will be sensitive to sample size. We completely agree with the dependence of our quantity on sample size. We used it for comparisons across time of samples of the same depth, to describe the large population size characteristic of high transmission, and persistent across the IRS intervention. Of course, one would like to be able to use this notion across studies that differ in sampling depth.

      Here, referee 1 makes an insightful and useful suggestion. It is true that we can use mean MOI, and indeed there is a simple map between our population size and mean MOI (as we just need to divide or multiply by sample size). We can do even more, as with mean MOI we can presumably extrapolate to the full sample size of the host population, or the population size of another sample in another location. What is needed for this purpose is a mean MOI that is stable relative to sample size. We can show that mean MOI is indeed stable in this way in our study, by subsampling our original sample to different depths. In the revision, we will include a discussion of this point and result, which allows an extrapolation of the census population size to the whole population of hosts in the local area. We’ll also clarify the time denominator, as given the typical duration of infections, we expect our population size to be representative of a per-generation measure.
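
      The relationship between census population size and mean MOI described above can be illustrated with a short sketch. It is not the authors' pipeline: the MOI values are hypothetical, the function names are invented for illustration, and the extrapolation assumes that `n_hosts` counts infected hosts and that mean MOI is stable with respect to sample size, which is the property the subsampling is meant to check.

```python
import random

def census_size(moi_values):
    """Census population size: total count of infections across sampled hosts."""
    return sum(moi_values)

def mean_moi(moi_values):
    """Mean multiplicity of infection across sampled hosts."""
    return sum(moi_values) / len(moi_values)

def subsampled_mean_moi(moi_values, depth, n_draws, rng):
    """Mean MOI averaged over random subsamples (without replacement) of size depth."""
    draws = [mean_moi(rng.sample(moi_values, depth)) for _ in range(n_draws)]
    return sum(draws) / n_draws

def extrapolate_census(moi_values, n_hosts):
    """Scale the census to n_hosts infected hosts, assuming a stable mean MOI."""
    return mean_moi(moi_values) * n_hosts

# Hypothetical MOI values for 12 infected hosts (illustration only).
moi = [1, 3, 2, 5, 1, 4, 2, 2, 6, 1, 3, 2]
rng = random.Random(1)

print("census size:", census_size(moi), "| mean MOI:", round(mean_moi(moi), 3))
for depth in (4, 8, 12):
    print("depth", depth, "-> mean MOI", round(subsampled_mean_moi(moi, depth, 500, rng), 2))
print("extrapolated census for 1000 infected hosts:", round(extrapolate_census(moi, 1000)))
```

      If the subsampled means stay close to the full-sample mean across depths, multiplying mean MOI by the number of infected hosts gives a census estimate that is comparable across studies with different sampling depths.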

      Referee 2 suggests we adopt the term “census count”, but since a census is, in our mind, a count, we prefer to use “census”.

      Referee 3 considers that the genetic data tracking parasite MOI and census changes give the same result as prevalence, which tracks infected hosts. Respectfully, we disagree and will provide an expanded response.

      (2) the importance of lineages (in response to referee 2)

      We do not think that lineages moving exclusively through a given type of host or “patch” is a requirement for enumerating the size of the total infections in such a subset. It is true that what we have is a single parasite population, but we are enumerating for the season the respective size in host classes (children and adults). This is akin to enumerating subsets of a population in ecological settings.

      We are also not clear on the concept of lineage for these highly recombinant parasites, as we struggle to find highly related repertoires. In fact, we see the use of the var fingerprinting methodology as a means to capture changes in strain or var repertoire dynamics as a result of changing transmission conditions.

      (3) var methodology

      Comments and queries were made by all three referees about aspects of var methodology, including the Bayesian approach. These will be addressed in our full response.

      Here we respond to a very good point made by referee 2: “Thinking about the applicability of this approach to other studies, I would be interested in a larger treatment of how overlapping DBLa repertoires would impact MOIvar estimates. Is there a definable upper bound above which the method is unreliable? Alternatively, can repertoire overlap be incorporated into the MOI estimator?”

      There is no predefined threshold one can present a priori. Intuitively, the approach to estimate MOI would appear to break down as overlap moves away from extremely low, and therefore, for locations with lower transmission intensity. Interestingly, we have observed that this is not the case in our paper by Labbé et al. 2023, where we used model simulations in a gradient of three transmission intensities, from high to low. The original varcoding method performed well across the gradient. This may arise from a nonlinear and fast transition from low overlap to high overlap that is accompanied by the MOI transitioning quickly from primarily multiclonal (MOI > 1) to monoclonal (MOI = 1). This issue needs to be investigated further, including ways to extend the estimation to explicitly include the distribution of DBL repertoire overlap.

      References: Bedford T, Cobey S, Pascual, M. 2011. Strength and tempo of selection revealed in viral gene genealogies. BMC Evol Biol 11, 220. https://doi.org/10.1186/1471-2148-11-220

      Labbé F, He Q, Zhan Q, Tiedje KE, Argyropoulos DC, Tan MH, Ghansah A, Day KP, Pascual M. 2023. Neutral vs. non-neutral genetic footprints of Plasmodium falciparum multiclonal infections. PLoS Comput Biol 19:e1010816. doi:doi.org/10.1101/2022.06.27.49780

      Palstra FP, Fraser DJ. 2012. Effective/census population size ratio estimation: a compendium and appraisal. Ecol Evol. Sep;2(9):2357-65. doi:10.1002/ece3.329.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This research advance article describes a valuable image analysis method to identify individual cells (neurons) within a population of fluorescently labeled cells in the nematode C. elegans. The findings are solid and the method succeeds to identify cells with high precision. The method will be valuable to the C. elegans research community.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this paper, the authors developed an image analysis pipeline to automatically identify individual neurons within a population of fluorescently tagged neurons. This application is optimized to deal with multi-cell analysis and builds on a previous software version, developed by the same team, to resolve individual neurons from whole-brain imaging stacks. Using advanced statistical approaches and several heuristics tailored for C. elegans anatomy, the method successfully identifies individual neurons with a fairly high accuracy. Thus, while specific to C. elegans, this method can become instrumental for a variety of research directions such as in-vivo single-cell gene expression analysis and calcium-based neural activity studies.

      The analysis procedure depends on the availability of an accurate atlas that serves as a reference map for neural positions. Thus, when imaging a new reporter line without fair prior knowledge of the tagged cells, such an atlas may be very difficult to construct. Moreover, usage of available reference atlases, constructed based on other databases, is not very helpful (as shown by the authors in Fig 3), so for each new reporter line a de-novo atlas needs to be constructed.

      We thank the reviewer for pointing out a place where we can use some clarification. While in principle every new reporter line would need fair prior knowledge, atlases are either already available or not difficult to construct. If one can make the assumption that the anatomy of a particular line is similar to existing atlases (Yemini 2021, Nejatbakhsh 2023, Toyoshima 2020), the cell ID can be immediately performed. Even in the case that one suspects the anatomy may have changed from existing atlases (e.g. in the case of examining mutants), existing atlases can serve as a starting point to provide a draft ID, which facilitates manual annotation. Once manual annotations on ~5 animals are available, as we have shown in this work (which is a manageable number in practice), this new dataset can be used to build an updated atlas that can be used for future inferences. We have added this discussion in the manuscript: “If one determines that the anatomy of a particular animal strain is substantially different from existing atlases, new atlases can be easily constructed using existing atlases as starting points.” (page 18).

      I have a few comments that may help to better understand the potential of the tool to become handy.

      1. I wonder about the degree to which strain mosaicism affects the analysis (Figs 1-4), as it was performed on a non-integrated reporter strain. As stated, for constructing the reference atlas, the authors used worms in which they could identify the complete set of tagged neurons. But how sensitive is the analysis when assaying worms with different levels of mosaicism? Do the results shown in the paper stem from animals with a full neural set expression? Could the authors add results for which the assayed worms show partial expression, where only 80%, 70%, 50% of the cell population are observed, and how this will affect identification accuracy? This may be important as many non-integrated reporter lines show high mosaic patterns and may therefore not be suitable for using this analytic method. In that sense, could the authors describe the mosaic degree of their line used for validating the method?

      We appreciate the reviewer for this comment. We want to clarify that most of the worms used in the construction of the atlas are indeed affected by mosaicism and thus do not express the full set of candidate neurons. We have added such a plot as requested (Figure 3 – figure supplement 2, copied below). Our data show that there is no correlation between the fraction of cells expressed in a worm and neuron ID correspondence. We agree with the reviewer this additional insight may be helpful; we have modified the text to include this discussion: “Note that we observed no correlation between the degree of mosaicism and neuron ID correspondence (Figure 3- figure supplement 2).” (page 10).

      Author response image 1.

      No correlation between the degree of mosaicism (fraction of cells expressed in the worm) and neuron ID correspondence.

      2. For the gene expression analysis (Fig 5), where was the intensity of the GFP extracted from? As it has no nuclear tag, the protein should be cytoplasmic (as seen in Fig 5a), but in Fig 5c it is shown as if the region of interest to extract fluorescence was nuclear. If fluorescence was indeed extracted from the cytoplasm, then it will be helpful to include in the software and in the results description how this was done, as a huge hurdle in dissecting such multi-cell images is avoiding cross-reads between adjacent/intersecting neurons.

      For this work, we used nuclear-localized RFP co-expressed in the animal, and the GFP intensities were extracted from the same region from which the RFP intensities were extracted. If cytosolic reporters are used, one would imagine a membrane label would be necessary to discern the border of the cells. We clarified our reagents and approach in the text: “The segmentation was done on the nuclear-localized mCherry signals, and GFP intensities were extracted from the same region.” (page 21).

      3. In the same matter: In the methods, it is specified that the strain expressing GCaMP was also used in the gene expression analysis shown in Figure 5. But the calcium indicator may show transient intensities depending on spontaneous neural activity during the imaging. This will introduce a significant variability that may affect the expression correlation analysis as depicted in Figure 5.

      We apologize for the error in the text. The strain used in the gene expression analysis did not express GCaMP. We did not analyze GCaMP expression in Figure 5. We have corrected the error in the methods.

      Reviewer #2 (Public Review):

      The authors succeed in generalizing the pre-alignment procedure for their cell identification method to allow it to work effectively on data with only small subsets of cells labeled. They convincingly show that their extension accurately identifies head angle, based on finding autofluorescent tissue and looking for a symmetric l/r axis. They demonstrate that the method works to identify known subsets of neurons with varying accuracy depending on the nature of underlying atlas data. Their approach should be a useful one for researchers wishing to identify subsets of head neurons in C. elegans, for example in whole brain recording, and the ideas might be useful elsewhere.

      The authors also strive to give some general insights on what makes a good atlas. It is interesting and valuable to see (at least for this specific set of neurons) that 5-10 ideal examples are sufficient. However, some critical details would help in understanding how far their insights generalize. I believe the set of neurons in each atlas version are matched to the known set of cells in the sparse neuronal marker, however this critical detail isn't explicitly stated anywhere I can see.

      This is an important point. We have made text modifications to make it clear to the readers that for all atlases, the number of entities (candidate list) was kept consistent as listed in the methods. In the results section under “CRF_ID 2.0 for automatic cell annotation in multi-cell images,” we added the following sentence: “Note that a truncated candidate list can be used for subset-specific cell ID if the neuronal expression is known” (page 3). In the methods section, we added the following sentence: “For multi-cell neuron predictions on the glr-1 strain, a truncated atlas containing only the above 37 neurons was used to exclude neuron candidates that are irrelevant for prediction” (Page 20).

      In addition, it is stated that some neuron positions are missing in the neuropal data and replaced with the (single) position available from the open worm atlas. It should be stated how many neurons are missing and replaced in this way (providing weaker information).

      We modified the text in the result section as follows: “Eight out of 37 candidate neurons are missing in the neuroPAL atlas, which means 40% of the pairwise relationships of neurons expressing the glr-1p::NLS-mcherry transgene were not augmented with the NeuroPAL data but were assigned the default values from the OpenWorm atlas” (page 10).
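      The relation between 8 missing neurons and roughly 40% of pairwise relationships can be checked with a short calculation. This sketch assumes that a pairwise relationship counts as not augmented whenever at least one neuron of the pair is missing from the NeuroPAL atlas; the response does not state how the figure was derived, and the function name `fraction_pairs_unaugmented` is hypothetical.

```python
from math import comb

def fraction_pairs_unaugmented(n_candidates, n_missing):
    """Fraction of neuron pairs that include at least one missing neuron."""
    total_pairs = comb(n_candidates, 2)
    # Pairs in which both neurons are present in the atlas.
    pairs_fully_present = comb(n_candidates - n_missing, 2)
    return (total_pairs - pairs_fully_present) / total_pairs

frac = fraction_pairs_unaugmented(37, 8)
print(f"{frac:.1%}")  # 260 of 666 pairs, prints 39.0%
```

      Under that assumption, 260 of the 666 possible pairs involve a missing neuron, which is consistent with the approximate 40% quoted above.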

      It also is not explicitly stated that the putative identities for the uncertain cells (designated with Greek letters) are used to sample the neuropal data. Large numbers of openworm single positions, or uncertain cells being misidentified and forcing alignment against the positions of nearby but different cells, would both handicap the neuropal atlas relative to the matched fluorescence atlas. This is an important question since sufficient performance from an ideal neuropal atlas (subsampled) would avoid the need for building custom atlases per strain.

      The putative identities are not used to sample the NeuroPAL data. They were used in the glr-1 multi-cell case to indicate low confidence in manual identification/annotation. For all steps of manual annotation and CRF_ID predictions, we used real neuron labels, and the Greek labels were used for reporting purposes only. It is true that the OpenWorm values (40% of the atlas) would be a handicap for the neuroPAL atlas. This is mainly due to the difficulty of obtaining NeuroPAL data as it requires 3-color fluorescence microscopy and significant time and labor to annotate the large set of neurons. This is one reason to take a complementary approach as we do in this paper.

      Reviewer #1 (Recommendations For The Authors):

      1. Figure 3, there is a confusion in the legend relating to panels c-e (e.g. panel c is neuron ID accuracy but it is described per panel e in the legend).

      We made the necessary changes.

      2. Figure 3, were statistical tests performed for panels d-e? If so, and the outcome was not significant, then it might be good to indicate this in the legend.

      We have added results of statistical tests in the legend as the following sentence: “All distributions in panel d and e had a p-value of less than 0.0001 for one sample t-test against zero.” One sample t-tests were performed because what is plotted already represents each atlas’ differences to the glr-1 25 dataset atlas; we did not think the statistical analyses between the other atlases would add significant value.

      3. Figure 4, no asterisks are shown in the figure so it is possible to remove the sentence in the legend describing what the asterisk stands for.

      Thank you. We made the necessary changes.

      Reviewer #2 (Recommendations For The Authors):

      Comparison with deep learning approaches could be more nuanced and structured. The authors' (prior) approach, extended here, combines a specific set of comparative relationship measurements with a general optimization approach for matching based on comparative expectations. Other measurements could be used, whether explicit (like neighbor expectations) or learned differences in embeddings. These alternate measurements would both need to be extensively re-calibrated for different sets of cells but might provide significant performance gains. In addition, deep learning approaches don't solve the optimization part of the matching problem, so the authors' approach seems to bring something strong to the table even if one is committed to learned methods (necessary, I suspect, for human-level performance in denser cell sets than the relatively small number here). A more complete discussion of these themes might better frame the impact of the work and help readers think about the advantages and disadvantages of different methods for their own data.

      We thank the reviewer for bringing up this point. We apologize for perhaps not making the point clearer in the original submission. This extension of the original work (Chaudhary et al) is not changing the CRF-based framework, but only augmenting the approach with a better defined set of axes (solely because in multicell and not whole-brain datasets, the sparsity of neurons degrades the axis definition and consequently the neuron ID predictions). We are not fundamentally changing the framework, and therefore all the advantages (over registration-based approaches for example) also apply here. The other purpose of this paper is to demonstrate a couple of use-cases for gene expression analysis, which is common in studies in C. elegans (and other organisms). We hope that by showing a use-case others can see how this approach is useful for their own applications.

      We have clarified these points in the paper (page 18). “The fundamental framework has not been changed from CRF_ID 1.0, and therefore the advantages of CRF_ID outlined in the original work apply for CRF_ID 2.0 as well.”

      The attribution of anatomical differences to strain is interesting, but seems purely speculative, and somewhat unlikely. I would suspect the fundamentally more difficult nature of aligning N items to M>>N items in an atlas accounts for the differences in using the neuroPAL vs custom atlas here. If this is what is meant, it could be stated more clearly.

      It is important to note that the same neuron candidate list (listed in methods) was used for all atlases, so there is no difference among the atlases in terms of the number of cells in the query vs. candidate list. In other words, the same values for M and for N are used regardless of the reference atlas used.

      We have preliminary data indicating differences between the NeuroPAL and custom atlas. For instance, the NeuroPAL atlas scales smaller than the custom glr-1 atlas. Since direct comparisons of the different atlases are beyond the scope of this paper, we will leave the exact comparisons for future work. We suspect that the differences are from a combination of differences in anatomy and imaging conditions. While NeuroPAL atlas may not be exactly fitting for the custom dataset, it can serve as a good starting point for guesses when no custom atlases are available, as we have discussed earlier (response to Public Comments from Reviewer 1 Point 1). As explained earlier, we have added these discussions in the paper (see page 18).

      I was also left wondering if the random removal of landmarks had to be adjusted in this work given it is (potentially) helping cope with not just occasional weak cells but the systematic loss of most of the cells in the atlas. If the parameters of this part of the algorithm don't influence the success for N to M>>N alignment (here when the neuroPAL or OpenWorm atlas is used) this seems interesting in itself and worth discussing. Conversely, if these parameters were optimized for the matched atlas and used for the others, this would seem to bias performance results.

      We may have failed to make this clear in the main text. As we have stated in our responses in the public review section, we do systematically limit the neuron labels in the candidate list to neurons that are known to be expressed by the promoter. The candidate list, which is kept consistent for all atlases, has more neurons than cells in the query, so it is always an N-to-M matching where M>N. We did not use landmarks, but such usage is possible and will only improve the matching.

      We have attempted to clarify these points in the manuscript. In the results section under “CRF_ID 2.0 for automatic cell annotation in multi-cell images,” we added the following sentence: “Note that a truncated candidate list can be used for subset-specific cell ID if the neuronal expression is known” (page 3). In the methods section, we added the following sentence: “For multi-cell neuron predictions on the glr-1 strain, a truncated atlas containing only the above 37 neurons was used to exclude neuron candidates that are irrelevant for prediction” (Page 20).

    2. eLife assessment

      This research advance article describes a valuable image analysis method to identify individual cells (neurons) within a population of fluorescently labeled cells in the nematode C. elegans. The findings are solid and the method succeeds to identify cells with high precision. The method will be valuable to the C. elegans research community.

    3. Reviewer #1 (Public Review):

      In this paper, the authors developed an image analysis pipeline to automatically identify individual neurons within a population of fluorescently tagged neurons. This application is optimized to deal with multi-cell analysis and builds on a previous software version, developed by the same team, to resolve individual neurons from whole-brain imaging stacks. Using advanced statistical approaches and several heuristics tailored for C. elegans anatomy, the method successfully identifies individual neurons with a fairly high accuracy. Thus, while specific to C. elegans, this method can become instrumental for a variety of research directions such as in-vivo single-cell gene expression analysis and calcium-based neural activity studies.

    4. Reviewer #2 (Public Review):

      The authors succeed in generalizing the pre-alignment procedure for their cell identification method to allow it to work effectively on data with only small subsets of cells labeled. They convincingly show that their extension accurately identifies head angle, based on finding autofluorescent tissue and looking for a symmetric l/r axis. They demonstrate that the method works to allow the identification of a particular subset of neurons. Their approach should be a useful one for researchers wishing to identify subsets of head neurons in C. elegans, and the ideas might be useful elsewhere.

      The authors also assess the relative usefulness of several atlases for making identity predictions. They attempt to give some additional general insights on what makes a good atlas, but here the insights seem less clear, as the available data do not allow for experiments that cleanly decouple: (1) the number of examples in the atlas, (2) the completeness of the atlas, and (3) the match in strain and imaging modality discussed. In the presented experiments the custom atlas, besides the strain and imaging modality mismatches discussed, is also the only complete atlas with more than one example. The neuroPAL atlas is an imperfect stand-in, since a significant fraction of cells could not be identified in these data sets, making it a 60/40 mix of OpenWorm and a hypothetical perfect neuroPAL comparison. This waters down general insights, since it is unclear whether the performance is driven by strain/imaging modality or by these difficulties in creating a complete neuroPAL atlas. The experiments do usefully explore the volume of data needed. Though generalization remains to be shown, the insight that 5-10 examples are sufficient to build an accurate atlas for the specific (small) set of cells labeled in the experiments is useful for future atlas building.

    1. eLife assessment

      This study provides important findings on the evolution and function of the X-linked miR-506 miRNA cluster. The evidence supporting the conclusions is convincing, including the generation and characterization of an impressive number of the miRNA deletion mutants. This work will be of interest to RNA biologists, evolution biologists and reproductive biologists.

    2. Reviewer #1 (Public Review):

      Wang et al investigated the evolution, expression, and function of the X-linked miR-506 miRNA family. They showed that the miR-506 family underwent rapid evolution. They provided evidence that miR-506 appeared to have originated from the MER91C DNA transposons. Human MER91C transposon produced mature miRNAs when expressed in cultured cells. A series of mouse mutants lacking individual clusters, a combination of clusters, and the entire X-linked cluster (all 22 miRNAs) were generated and characterized. The mutant mice lacking four or more miRNA clusters showed reduced reproductive fitness (litter size reduction). They further showed that the sperm from these mutants were less competitive in polyandrous mating tests. RNA-seq revealed the impact of deletion of miR-506 on the testicular transcriptome. Bioinformatic analysis examined the relationship among miR-506 binding, transcriptomic changes, and target sequence conservation. The miR-506-deficient mice showed no apparent defects in sperm production, motility, or morphology. Lack of severe phenotypes is typical for miRNA mutants in other species as well. However, the miR-506-deficient males did exhibit reduced litter size; such an effect would have been quite significant on an evolutionary time scale. The number of mouse mutants and sequencing analysis represent a tour de force. This study is a comprehensive investigation of the X-linked miR-506 miRNA family. It provides important insights into the evolution and function of the miR-506 family.

      The conclusions of this preprint are mostly supported by the data, except as noted below. Some descriptions need to be revised for accuracy.

      L219-L285: The conclusion that X-linked miR-506 family miRNAs are expanded via LINE1 retrotransposition is not supported by the data. LINE1s and SINEs are very abundant, accounting for nearly 30% of the genome. In addition, the LINE1 content of the mammalian X chromosome is twice that of the autosomes. One can easily find flanking LINE1/SINE repeat. Therefore, the analyses in Fig. 2G, Fig. 2H and Fig. S3 are not informative. In order to claim LINE1-mediated retrotransposition, it is necessary to show the hallmarks of LINE1 retrotransposition, which are only possible for new insertions. The X chromosome is known to be enriched for testis-specific multi-copy genes that are expressed in round spermatids (PMID: 18454149). The conclusion on the LINE1-mediated expansion of miR-506 family on the X chromosome is not supported by the data and does not add additional insights. I think that the LINE1 related figure panels and description (L219-L285) need to be deleted. In discussion (L557-558), "...and subsequently underwent sequence divergence via LINE1-mediated retrotransposition during evolution" should also be deleted. This section (L219-L285) needs to deal only with the origin of miR-506 from MER91C DNA transposons, which is both convincing and informative.

      Fig. 3A: can you speculate/discuss why the miR-506 expression in sperm is higher than in round spermatids?

    3. Reviewer #2 (Public Review):

      In this paper, Wang and collaborators characterize the rapid evolution of the X-linked miR-506 cluster in mammals and characterize the functional consequences of depleting a few or most of the miRNAs in the cluster. The authors show that the cluster originated from the MER91C DNA transposon and provide some evidence that it might have expanded through the retrotransposition of adjacent LINE1s. Although the animals depleted of most miRNAs in the cluster show normal sperm parameters, the authors observed a small but significant reduction in litter size. The authors then speculate that the depletion of most miRNAs in the cluster could impair sperm competitiveness in polyandrous mating. Using a successive mating protocol, they show that, indeed, sperm lacking most X-linked miR-506 family members are outcompeted by wild-type sperm. The authors then analyze the evolution of the miR-506 cluster and its predicted targets. They conclude that the main difference between mice and humans is the expansion of the number of target sites per transcript in humans.

      The conclusions of the paper are, in most cases, supported by the data; however, a more precise and in-depth analysis would have helped build a more convincing argument in most cases.

      1) In the abstract and throughout the manuscript, the authors claim that "... these X-linked miRNA-506 family miRNA [...] have gained more targets [...] " while comparing the human miRNA-506 family to the mouse. An alternative possibility is that the mouse has lost some targets. A proper analysis would entail determining the number of targets in the mouse and human common ancestor.

      2) The authors claim that the miRNA cluster expanded through L1 retrotransposition. However, the possibility of an early expansion of the cluster before the divergence of the species while the MER91C DNA transposon was active was not evaluated. Although L1 likely contributed to the diversity within mammals, the generalization may not apply to all species. For example, SINEs are closer on average than L1s to the miRNAs in the SmiR subcluster in humans and dogs, and the horse SmiR subcluster seems to have expanded by a TE-independent mechanism.

      3) Some results are difficult to reconcile and would have benefited from further discussion. The miR-465 sKO has over two thousand differentially expressed transcripts and no apparent phenotype. Also, the authors show a sharp downregulation of CRISP1 at the RNA and protein level in the mouse. However, most miRNAs of the cluster increase the expression of Crisp1 on a reporter assay. The only one with a negative impact has a very mild effect. miRNAs are typically associated with target repression; however, most of the miRNAs analyzed in this study activate transcript expression.

      4) More information is required to interpret the results of the differential RNA targeting by the murine and human miRNA-506 family. The materials and methods section needs to explain how the authors select their putative targets. In the text, they mention the use of four different prediction programs. Are they considering all sites predicted by any method, all sites predicted simultaneously by all methods, or something in between? Also, what are they considering as a "shared target" between mice and humans? Is it a mRNA that any miR-506 family member is targeting? Is it a mRNA targeted by the same miRNA in both species? Does the targeting need to occur in the same position determined by aligning the different 3'UTRs?

      5) The authors highlight the particular evolution of the cluster derived from a transposable element. Given the tendency of transposable elements to be expressed in germ cells, the family might have originated to repress the expression of the elements while still active but then remained to control the expression of the genes where the element had been inserted. The authors did not evaluate the expression of transcripts containing the transposable element or discuss this possibility. The authors proposed an expansion of the target sites in humans. However, whether this expansion was associated with the expansion of the TE in humans was not discussed either. Clarifying whether the transposable element was still active after the divergence of the mouse and human lineages would have been informative to address this outstanding issue.

      Post-transcriptional regulation is exceptionally complex in male haploid cells, and the functional relevance of many regulatory pathways remains unclear. This manuscript, together with recent findings on the role of piRNA clusters, starts to clarify the nature of the selective pressure that shapes the evolution of small RNA pathways in the male germ line.

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors conducted a comprehensive study of the X-linked miR-506 family miRNAs in mice on its origin, evolution, expression, and function. They demonstrate that the X-linked miR-506 family, predominantly expressed in the testis, may be derived from MER91C DNA transposons and further expanded by retrotransposition. By genetic deletion of different combinations of 5 major clusters of this miRNA family in mice, they found these miRNAs are not required for spermatogenesis. However, on further examination, the mutant mice show a mild fertility problem and inferior sperm competitiveness. The authors conclude that the X-linked miR-506 miRNAs finetune spermatogenesis to enhance sperm competition.

      Strengths:

      This is a comprehensive study with extensive computational and genetic dissection of the X-linked miR-506 family, providing a holistic view of its evolution and function in mice. The finding that this family of miRNAs could enhance sperm competition is interesting and could explain their roles in finetuning germ cell gene expression to regulate reproductive fitness.

      Weaknesses:

      The authors specifically addressed the function of 5 clusters of the X-linked miR-506 family containing 19 miRNAs. There is another small cluster containing 3 miRNAs close to the Fmr1 locus. Would this small cluster act in concert with the 5 clusters to regulate spermatogenesis? In addition, any autosomal miR-506-like miRNAs may compensate for the loss of the X-linked miR-506 family. These possibilities should be discussed.

      Direct molecular link to sperm competitiveness defect remains unclear but is difficult to address.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, the authors examined the putative functions of hypothalamic groups identifiable through Foxb1 expression, namely the parvofox Foxb1 of the LHA and the PMd Foxb1, with emphasis on innate defensive responses. First, they reported that chemogenetic activation of Foxb1 hypothalamic cell groups led to tachypnea. The authors tend to attribute this effect to the activation of hM3Dq expressed in the parvofox Foxb1 but did not rule out the participation of the PMd Foxb1 cell group, which may as well have expressed hM3Dq, particularly considering the large volume (200 nl) of the viral construct injected. It is also noteworthy that the activation of the Foxb1 hypothalamic cell groups in this experiment did not alter gross locomotor activity, such as time spent immobile. This contrasts with the authors' finding on the optogenetic activation of the Foxb1 hypothalamic fibers projecting to the dorsolateral PAG. In the second experiment, the authors applied optogenetic ChR2-mediated excitation of the Foxb1+ cell bodies' axonal endings in the dlPAG, leading to freezing and, in a few cases, bradycardia as well. The effective site to evoke freezing was the rostral PAGdl, and fibers positioned either ventral or caudal to this target produced no response. Considering the pattern of projection of the Foxb1 hypothalamic cell groups to the PAG, the fibers projecting to the rostral PAGdl are likely to arise from the PMd Foxb1 cell group, and not from the parvofox Foxb1 of the LHA. Here it is important to consider that optogenetic ChR2-mediated excitation of the axonal endings is likely to have activated the cell bodies originating these fibers, and one cannot ascertain whether the behavioral effects are related to the activation of the terminals in the PAGdl or the cell bodies originating the projection.

      Authors’ reply: We acknowledge and agree about the possibility of backpropagation in ChR2-mediated terminal stimulation experiments. We have introduced a paragraph in the discussion section discussing this issue. In short, the observation of an opposing phenotype in ArchT3.0 animals indicates that the ChR2-mediated phenotype is indeed specific to the Foxb1 projection to the PAG. This is because the use of light-activated proton pumps for terminal stimulation cannot induce backpropagation of an inhibitory effect to the soma. Potential downsides of using proton pumps in small compartments, such as the axon, are also discussed.

      Moreover, activation of the PMd CCK cell group, which makes up around 90% of the PMd cells, evokes escape, and not freezing. According to the present findings, a specific population of PMd Foxb1 cells may be involved in producing freezing. In addition, only a small number of the animals with correct fiber placement presented sudden onset of bradycardia in response to the photostimulation. Considering the authors' findings, the Foxb1+ hypothalamic groups are likely to mediate behavioral responses related to innate defensive responses, where the parvofox Foxb1 of the LHA would be involved in promoting tachypnea and the PMd Foxb1 group in mediating freezing and bradycardia. These findings are very interesting, and, at this point, they need to be tested in a scenario of real exposure to a natural predator.

      Authors’ reply: We fully agree with the proposed experiments. Due to the previously mentioned retirement of Prof. Celio and the concomitant expiration of licenses for animal experimentation, we are prevented from conducting these experiments on our own. We have integrated a statement in the discussion regarding these potential future experiments.

      Reviewer #2 (Public Review):

      The authors aimed to examine the role of a group of neurons expressing Foxb1 in behaviors through projections to the dlPAG. Standard chemogenetic activation or inhibition and optogenetic terminal activation or inhibition at the local PAG were used, and results suggested that, while activation led to reduced locomotion and breathing, inhibition led to a small degree of increased locomotion.

      The observed effects on breathing are evident and dramatic. However, this study needs significant improvements in terms of data analysis and presentation, and some of the studies seem incomplete; therefore, the data may not yet support the conclusion.

      1. Fig.1 has no experimental data and needs to be replaced with detailed pictures from the viral injected mice showing the projections diagrammed.

      Authors’ reply: We believe that this graphic illustration is helpful to the reader to comprehend the spatial relationship between the parvafox Foxb1 nucleus, the mammillary nuclei, and the PAG. In a previous study we have characterized the projections of the parvafox Foxb1 nucleus in detail (using the same Foxb1-Cre mouse line as in the present study) and, in this regard, would like to refer Reviewer #2 to this publication (https://onlinelibrary.wiley.com/doi/10.1002/cne.24057).

      2. Fig. 3 needs control pictures and statistical comparison with different conditions in c-Fos. Also, expression in other nearby regions needs to be presented to demonstrate the specificity of the expression.

      Authors’ reply: We have modified the original Fig. 3 with more pictures across all three conditions used in the chemogenetic experiments. Since the new figure now takes up a whole page, and because the data in this figure serve to validate the DREADD experiments, we have decided to move it to the supplementary files. The figure is now labelled as “Supplementary File S1”. All figure and file numberings throughout the text have been adjusted accordingly.

      3. Fig. 5, a great effort has been made to illustrate the point that CCK and Foxb1 are differentially expressed. Why not just perform a double in situ experiment to directly illustrate the point?

      Authors’ reply: We have addressed this comment in the initial release of the eLife manuscript. In short, we agree that a double ISH experiment would have been an alternative approach, but would like to state that scRNAseq is a well-established and valid method for this purpose.

      4. Fig. 7 data on optogenetic stimulation on immobility and breathing: since not all mice showed the same phenotype, what is the criterion for allocating these mice to hit or no hit groups? Given the dramatically reduced breathing and locomotion, what is the temperature response? More data needs to be gathered to support that this is a defense behavior.

      Authors’ reply: The criteria for allocation of animals to the experimental groups are described in section “Optogenetic modulation of Foxb1 terminals in the dlPAG induces immobility” and are based on the stereotaxic coordinates of the tips of the glass fiber implants. We did not perform any experiments in which we recorded body temperatures or temperature preferences in optogenetic animals. Such experiments were outside the scope of the study. As mentioned in a previous comment above, we have added an additional paragraph to the discussion section regarding future investigations of these hypothalamic Foxb1 neurons during exposure to natural predators. Such experiments would certainly allow more insight into the defensive nature of the described phenotype.

      5. The authors claim to target the dlPAG. However, in the picture shown in Fig. 8C, almost all of the PAG contains ChR2 fibers and it is likely that all the fibers will be activated by light. Thus, as presented, the data does not support the claim of specificity for the dlPAG. Also, c-Fos data needs to be presented on the degree of activation of downstream PAG neurons after light exposure.

      Authors’ reply: We attach the original image 8c, without arrows and indications, in which the localization of ChR2-positive fibers in the dlPAG is better visible. They are located exactly under the tip of the optical fiber. We do not know the functional characteristics of the post-synaptic PAG neurons and have not experimentally determined their downstream targets. Investigating the downstream targets was outside the scope of the current publication.

      Author response image 1.

      6. Fig. 9 only showed one case. A statistical comparison needs to be presented.

      Authors’ reply: Our cardiovascular experiments are of exploratory and descriptive nature (i.e. pilot experiments). It was a conscious decision to not perform hypothesis tests on these experiments. We did not have enough mice to perform statistical tests with sufficient statistical power. Providing results from hypothesis tests on these data would lead to statistically unjustified conclusions. To clarify this issue, we have added a paragraph to the relevant results section.

      7. Optogenetic terminal activation in the PAG will likely elicit back-propagation and subsequent activation of additional downstream brain sites of Foxb1 neurons. More experiments need to be done to assess this and, as presented, the data does not necessarily support the role of the PAG.

      Authors’ reply: Please see our answer to Reviewer #1 regarding the same issue.

      8. The authors claim negative data from PVH-Cre mice. More data need to be presented to make this case.

      Authors’ reply: We would like to refer to our answer to point 6) that was raised by Reviewer #2.

      The conclusion, even as presented, adds to the known evidence of the PAG in the defense behavior.

      Reviewer #1 (Recommendations For The Authors):

      In the pharmacogenetic experiments, the authors need to clarify which Foxb1 hypothalamic cell group presented the activation of hM3Dq. It is important to know whether this activation-producing tachypnea was restricted to the parvofox Foxb1 or also included the PMd Foxb1 group. It would be important to isolate the effect of the pharmacogenetic activation of each one of these Foxb1 hypothalamic cell groups.

      After determining which cell group would be involved in mediating this respiratory effect, it would be nice to discuss the possible pathways involved in this effect.

      In the optogenetic experiments, the authors should differentiate between the effects of the PAG projecting fibers from the PMd and those from the parvofox groups. As it stands, it seems that the freezing and bradycardia depend on projection from the PMd Foxb1 group to the rostral PAGdl. However, considering the large volume (200 nl) of the viral construct injected, both groups were likely to express channelrhodopsin, and it would be important if the authors could restrict the viral injections to each one of the Foxb1 hypothalamic cell groups.

      Authors’ reply: We fully agree with the suggestion, but due to the recent retirement of Prof. Celio we are unfortunately not allowed to conduct any further animal experiments.

      The authors also reported that photoactivation ventral to the PAGdl, possibly in the PAGl did not yield any clear behavioral response. However, as pointed out in the discussion, a recent publication found that the parvofox Foxb1 projection to the lateral PAG drives social avoidance, and we were wondering whether there was any avoidance behavior during the photoactivation of the PAGl fibers.

      Authors’ reply: We did not conduct any social avoidance experiments ourselves. However, we did perform ultrasonic vocalization experiments (unpublished data) in which we optogenetically stimulated Foxb1+ terminals in the PAG. Due to experimental issues related to the age of the tested mice, we did not obtain conclusive results regarding the ultrasonic vocalizations. By a purely observational account, we did not observe any active avoidance during optogenetic stimulation, but rather a cessation of interaction. We are unable to judge whether this was more pronounced in the PAGl targeted mice or not.

      Another important point is that optogenetic ChR2-mediated excitation of the axonal endings is likely to activate the cell bodies originating these fibers, and one cannot ascertain whether the behavioral effects depend on the activation of the terminals in the PAGdl or the activation of the cell bodies originating these terminals. Note, in the present case, PMd cell bodies may also project elsewhere, such as the cuneiform nucleus, known to mediate freezing responses. To circumvent this problem, during photoactivation of the PAGdl terminals, the authors should inhibit the cell bodies originating these terminals.

      Authors’ reply: We would like to refer to the answer we provided above regarding the issue of backpropagation or ChR2-mediated phenotypes and projection-specificity.

      Another important issue is related to the fact that around 90% of the PMd cells express CCK (Wang et al., 2021), and previous work showed that activation of these cells yielded escape and not freezing (Wang et al., 2021). Although the authors claim that the single-cell RNA sequencing dataset reveals distinct Foxb1 expression in the PMd, these results derive from tissues collected in the posterior hypothalamus, not exactly restricted to the PMd. Therefore, it would be desirable if the authors could show CCK and Foxb1 double-labeled PMd sections to evaluate the exact percentage of cells expressing either one of these peptides.

      Authors’ reply: The tissues for the scRNAseq data were obtained from hypothalamic tissues between stereotaxic coordinates of AP-2.54 to AP-3.16 (please see Fig. 1b in Mickelsen et al. 2020) and not purely from the posterior hypothalamic nucleus. These tissues hence include a large proportion of the PMd neurons. We would like to point out that the expression profile of the PMd cluster matches well with the ISH data from the Allen Brain Atlas that we have put together in “Supplementary File S6” (originally “Supplementary File S5”).

      The authors should also explain why only a small number of animals that received PAGdl photoactivation presented bradycardia. Moreover, they should also discuss the possible pathways mediating this effect. Here, it is important to point out that the cuneiform nucleus, suggested by the authors as one possible mediator of this effect, promotes sympathetic vasomotor activity (Verberne, 1995).

      We have added the sentence: “The projections of the cuneiform nucleus to the rostral ventrolateral medulla promote sympathetic vasomotor activity (Verberne 1995).” to the Discussion section.

      Reviewer #2 (Recommendations For The Authors):

      In this reviewer's view, this study needs substantial improvement:

      1. The writing is very sloppy and difficult to follow. There is no clear logic flow in the main text and the figures need substantial realigning for panels, additions of labelling etc.

      We have added the sentence.

      2. Fig. 6: the hot plate data is out of place and should be placed in supplementary or removed completely.

      Authors’ reply: We and others have previously shown that the parvalbumin+ population of the parvafox nucleus is involved in nociceptive behavior. Hence, we believe it is of interest to show that we do not see the same phenotype with the stimulation of the Foxb1+ population of the parvafox nucleus. This data shows that the nociceptive component of the parvafox nucleus is confined to its parvalbumin+ population.

      3. The authors discussed social behavior data in the Discussion, but no such data is presented, which is very confusing.

      Authors’ reply: Indeed, we did not perform any experiments to investigate social behavior. However, we point out that the locomotor phenotype we observed upon optogenetic stimulation of Foxb1+ terminals could have led to a bias in the interpretation of the social behavior experiments published elsewhere by others.

      4. The authors discussed a great deal on potential differences between parvafox and PMd Foxb1 neurons; however, no clear data was presented to show a functional difference between them, which is also confusing.

      Authors’ reply: Even though investigations on the functional differences of parvafox and PMd Foxb1 neurons would be highly interesting, it was outside the scope of the current study. Due to the recent retirement of Prof. Celio, we are not allowed to perform any additional animal experiments.

    2. Reviewer #1 (Public Review):

      In this study, the authors examined the putative functions of hypothalamic groups identifiable through Foxb1 expression, namely the parvofox Foxb1 of the LHA and the PMd Foxb1, emphasizing innate defensive responses. First, they reported that chemogenetic activation of Foxb1 hypothalamic cell groups led to tachypnea. The authors tend to attribute this effect to the activation of hM3Dq expressed in the parvofox Foxb1 but did not rule out the participation of the PMd Foxb1 cell group, which may as well have expressed hM3Dq, particularly considering the large volume (200 nl) of the viral construct injected. Notably, the activation of the Foxb1 hypothalamic cell groups in this experiment did not alter gross locomotor activity, such as time spent immobile. This contrasts with the authors' finding on the optogenetic activation of the Foxb1 hypothalamic fibers projecting to the dorsolateral PAG. In the second experiment, the authors applied optogenetic ChR2-mediated excitation of the Foxb1+ cell bodies' axonal endings in the dlPAG, leading to freezing and, in a few cases, bradycardia. The effective site to evoke freezing was the rostral PAGdl, and fibers positioned either ventral or caudal to this target produced no response. Considering the pattern of projection of the Foxb1 hypothalamic cell groups to the PAG, the fibers projecting to the rostral PAGdl are likely to arise from the PMd Foxb1 cell group and not from the parvofox Foxb1 of the LHA. Here, it is important to consider that activation of the PMd CCK cell group, which makes up around 90% of the PMd cells, evokes escape, not freezing. According to the present findings, a specific population of PMd Foxb1 cells may be involved in producing freezing. In addition, only a few of the animals with correct fiber placement presented sudden onset of bradycardia in response to the photostimulation. Considering the authors' findings, the Foxb1+ hypothalamic groups are likely to mediate behavioral responses related to innate defensive responses, where the parvofox Foxb1 of the LHA would be involved in promoting tachypnea and the PMd Foxb1 group in mediating freezing and bradycardia. These findings are exciting, and, at this point, they need to be tested in a scenario of actual exposure to a natural predator.

    3. Reviewer #2 (Public Review):

      The authors aimed to examine the role of a group of neurons expressing Foxb1 in behaviors through projections to the dlPAG. Standard chemogenetic activation or inhibition and optogenetic terminal activation or inhibition at the local PAG were used, and results suggested that, while activation led to reduced locomotion and breathing, inhibition led to a small degree of increased locomotion.

      The observed effects on breathing are evident and dramatic. However, because circumstances do not permit the authors to perform additional experiments, the conclusion is not as strong as it could be.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This is an important study that leverages a human-chimpanzee tetraploid iPSC model to test whether cis-regulatory divergence between species tends to be cell type-specific. The evidence supporting the study's primary conclusion--that species differences in gene regulation are enriched in cell type-specific genes and regulatory elements--is compelling, although attention to biases introduced by sequence conservation is merited, and the case that is made for cell type-specific changes reflecting adaptive evolution is incomplete. This work will be of broad interest in evolutionary and functional genomics.

      Public Reviews:

      Reviewer #1 (Public Review):

      This study aims to identify gene expression differences exclusively caused by cis-regulatory genetic changes by utilizing hybrid cell lines derived from human and chimpanzee. While previous attempts have focused on specific tissues, this study expands the comparison to six different tissues to investigate tissue specificity and derive insights into the evolution of gene expression.

      One notable strength of this work lies in the use of composite cell lines, enabling a comparison of gene expression between human and chimpanzee within the same nucleus and shared trans factors environment. However, a potential weakness of the methodology is the use of bulk RNA-seq in diverse tissues, which limits the ability to determine cell-type-specific gene expression and chromatin accessibility regions.

      We agree that profiling single cells could lead to additional exciting discoveries. Although heterogeneity in cell types within samples will indeed reduce our power to detect cell-type-specific divergence, thankfully any heterogeneity will not introduce false positives, since our use of interspecies hybrids controls for differences in cell-type abundance. As a result, we think that the molecular differences we identified in this study represent a subset of the true cell-type specific cis-regulatory differences that would be identified with deep single-cell profiling. We have included a new paragraph in the discussion on future directions, highlighting the utility of single-cell profiling as an exciting future direction (lines 482-490): “In addition to following up on our findings on GAD1 and FABP7, there are other exciting future directions for this work. First, additional bulk assays such as those that measure methylation, chromatin conformation, and translation rate could lead to a better understanding of what molecular features ultimately lead to cell type-specific changes in gene expression. Furthermore, the use of deep single cell profiling of hybrid lines derived from iPSCs from multiple individuals of each species during differentiation could enable the identification of many more highly context-specific changes in gene expression and chromatin accessibility such as the differences in GAD1 we highlighted here. Finally, integration with data from massively parallel reporter assays and deep learning models will help us link specific variants to the molecular differences we identified in this study.”

      Another concern is the use of two replicates derived from the same pair of individuals. While the authors produced cell lines from two pairs of individuals in a previous study (Agoglia et al., 2021), I wonder why only one pair was used in this study. Incorporating interindividual variation would enhance the robustness of the species differences identified here.

      We agree that additional replicates, especially from lines from other individuals, would have improved the robustness of the species differences we identified. In our experience with these hybrid cells (as well as related work from many other labs), inter-species differences typically have much larger magnitudes than intra-species differences, so we expect that the vast majority of differences we identified would be validated with data from additional individuals. Unfortunately, differentiating additional cells and generating these data for this study would be cost-prohibitive. We now mention the use of additional replicates in lines 485-488 of the discussion: “Furthermore, the use of deep single cell profiling of hybrid lines derived from iPSCs from multiple individuals of each species during differentiation could enable the identification of many more highly context-specific changes in gene expression and chromatin accessibility such as the differences in GAD1 we highlighted here.”

      Furthermore, the study offers the opportunity to relate inter-species differences to trends in molecular evolution. The authors discovered that expression variance and haploinsufficiency score do not fully account for the enrichment of divergence in cell-type-specific genes. The reviewer suggests exploring this further by incorporating external datasets that bin genes based on interindividual transcriptomics variation as a measure of extant transcriptomics constraint (e.g., GTEx reanalysis by Garcia-Perez et al., 2023 - PMID: 36777183). Additionally, stratifying sequence conservation on ASCA regions, which exhibit similar enrichment of cell-type-specific features, using the Zoonomia data mentioned also in the text (Andrews et al., 2023 -- PMID: 37104580) could provide valuable insights.

      To address this, we used PhastCons scores computed from a 470-way alignment of mammals, as we could not find publicly available PhastCons data from Zoonomia. When stratifying by the median PhastCons score of all sites in a peak, we observe very similar results to those obtained when stratifying by the constraint metric from the gnomAD consortium (see below). The one potential difference is that peaks in the top two bins have slightly weaker enrichment relative to the other bins when using PhastCons, but this is not the case when using gnomAD’s metric. We have elected to include this in the public review but not the manuscript, as we are reluctant to add to the complexity of what is already a complex analysis.

      Author response image 1.
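
      The stratification described in this reply (summarize each peak by the median of its per-site conservation scores, split peaks into bins, then compare the fraction of divergent peaks per bin) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `stratify_by_median_score`, the tuple layout, the rank-based equal-size bins, and the toy values are all assumptions made for illustration.

```python
from statistics import median


def stratify_by_median_score(peaks, n_bins=5):
    """Bin peaks by the median of their per-site conservation scores.

    peaks: list of (peak_id, per_site_scores, is_divergent) tuples
           (hypothetical layout, chosen for this sketch only).
    Returns one (n_peaks, fraction_divergent) tuple per bin, ordered from the
    least to the most conserved bin.
    """
    # Summarize each peak by the median score of its sites, then sort by it.
    summarized = sorted(
        ((median(scores), flag) for _, scores, flag in peaks),
        key=lambda item: item[0],
    )
    n = len(summarized)
    bins = []
    for b in range(n_bins):
        # Rank-based boundaries give bins of (nearly) equal size.
        chunk = summarized[b * n // n_bins:(b + 1) * n // n_bins]
        n_divergent = sum(1 for _, flag in chunk if flag)
        fraction = n_divergent / len(chunk) if chunk else float("nan")
        bins.append((len(chunk), fraction))
    return bins


# Toy input with made-up scores; not data from the study.
toy_peaks = [
    ("p1", [0.1, 0.2, 0.3], True),
    ("p2", [0.9, 0.8, 1.0], False),
    ("p3", [0.5, 0.4, 0.6], False),
    ("p4", [0.0, 0.1, 0.05], False),
]
result = stratify_by_median_score(toy_peaks, n_bins=2)
```

      Using the median rather than the mean keeps a few highly conserved sites from dominating the summary of a long peak; whether equal-size or fixed-threshold bins are more appropriate depends on the score distribution.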

      Finally, we think that comparing the properties of gene expression variance computed from ASE (as done by Starr et al.) and from total expression (as done by Garcia-Perez et al.) is a very interesting, potentially complex question that is beyond the scope of this paper but an exciting direction for future work.

      Another potential strength of this study is the identification of specific cases of paired allele-specific expression (ASE) and allele-specific chromatin accessibility (ASCA) with biological significance.

      Prioritizing specific variants remains a challenge, and the authors apply a machine-learning approach to identify potential causative variants that disrupt binding sites in two examples (FABP7 and GAD1 in motor neurons). However, additional work is needed to convincingly demonstrate the functionality of these selected variants. Strengthening this section with additional validation of ASE, ASCA, and the specific putative causal variants identified would enhance the overall robustness of the paper.

      We strongly agree with the reviewer that additional work validating our results would be of considerable interest. We hope to perform follow-up experiments in the future. For now, we have been careful to present these variants only as candidate causal variants.

      Additionally, the authors support the selected ASE-ASCA pairs by examining external datasets of adult brain comparative genomics (Ma et al., 2022) and organoids (Kanton et al., 2019). While these resources are valuable for comparing observed species biases, the analysis is not systematic, even for the two selected genes. For example, it would be beneficial to investigate if FABP7 exhibits species bias in any cell type in Kanton et al.'s organoids or if GAD1 is species-biased in adult primate brains from Ma et al. Comparing these datasets with the present study, along with the Agoglia et al. reference, would provide a more comprehensive perspective.

      We agree with the reviewer’s suggestion that investigating GAD1 and FABP7 expression in other datasets is worthwhile. Unfortunately, the difference in human vs. chimpanzee organoid maturation rates and effects of culture conditions in Kanton et al. makes it unsuitable for plotting the expression of FABP7 as its expression is highly dependent on neuronal maturation. We therefore plotted bulk RNAseq data from multiple cortical regions from Sousa et al. 2017 (see below). This corroborates our claim that FABP7 has human-biased expression in adult humans compared to chimpanzees and rhesus macaques. We also investigated expression of GAD1 in the Ma et al. data as the reviewer suggested.

      Author response image 2.

      While there are differences in GAD1 expression between adult humans and chimpanzees, they are unlikely to be linked to the HAR we highlight as it is likely a transiently active cis-regulatory element (see below). In addition, some cell types seem to have chimpanzee-derived changes in GAD1 expression (e.g. SST positive neurons) whereas others seem to have human-derived changes in GAD1 expression (e.g. LAMP5 positive neurons).

      Author response image 3.

      While these are potentially interesting observations, we think that their inclusion in the manuscript might distract from our emphasis on the cell type-specific and developmental stage-specific nature of the changes in FABP7 and GAD1 expression we observe, so we have not included them in the manuscript.

      The use of the term "human-derived" in ASE and ASCA should be avoided since there is no outgroup in the analysis to provide a reference for the observed changes.

      We agree with the reviewer that the term human-derived should be used with care and have changed the phrasing of line 230 to “human-chimpanzee differences in expression”. With regard to FABP7 we think that our analysis of the Ma et al. data—which includes data from rhesus macaques as an outgroup—justifies our use of “human-derived” in lines 360 and 457. As chimpanzee and macaque expression of FABP7 are similar but human expression is quite different, the most parsimonious explanation for our observations is that FABP7 upregulation occurred in the human lineage.
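
      The outgroup reasoning in this reply can be written out as a small sketch: when two ingroup species differ, the one whose expression departs from the outgroup is the more parsimonious candidate for the derived state. The function name `polarize_change`, the `min_diff` threshold, and the example values are hypothetical and are not taken from the study.

```python
def polarize_change(human, chimp, outgroup, min_diff=1.0):
    """Assign a human-chimpanzee expression difference to a lineage.

    Inputs are expression values on a common scale (e.g. log2), with the
    outgroup standing in for the ancestral state. min_diff is an arbitrary
    illustrative threshold for calling a difference at all.
    """
    if abs(human - chimp) < min_diff:
        return "no human-chimpanzee difference"
    human_shift = abs(human - outgroup)
    chimp_shift = abs(chimp - outgroup)
    if chimp_shift < human_shift:
        # Chimpanzee resembles the outgroup, so the change most likely
        # arose on the human lineage.
        return "human-derived"
    if human_shift < chimp_shift:
        return "chimpanzee-derived"
    return "ambiguous"


# Made-up values: chimpanzee and outgroup similar, human higher.
call = polarize_change(human=8.0, chimp=5.0, outgroup=5.2)
```

      Without an outgroup value, only the first branch can be evaluated, which is why a neutral phrasing such as “human-chimpanzee differences” is the safer wording in that case.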

      Finally, throughout the paper, the authors refer to "hybrid cell lines." It has been suggested to use the term "composite cell lines" instead to address potential societal concerns associated with the term "hybrid," which some may associate with reproductive relationships (Pavlovic et al., 2022 -- PMID: 35082442). It would be interesting to know the authors' perspective on these concerns and recommendations presented in Pavlovic et al., given their position as pioneers in this field.

      We appreciate this question. Whether to refer to our fused cells as “hybrids” or not was indeed a question we considered at great length, starting from the very beginning of this project in 2015. From consultations with multiple bioethicists-- both formal and informal-- we have long been aware of the possibility of misunderstanding based on the word “hybrid”. However, we felt this possibility was outweighed by the long and well-established history of other scientists referring to interspecies fused cells as hybrids. This convention-- which is based on hundreds of papers about heterokaryons, somatic cell hybrids, and radiation hybrids-- goes back over 50 years (e.g. Bolund et al, Exp Cell Res 1969). Soon after the establishment of this nomenclature, cell fusion became widespread and ever since then it has become commonplace to generate interspecies hybrid cells from animals, plants and fungi.

      It is also important to note that in the more than two years since we published the first two papers on human-chimpanzee fused cells, we have been unable to find any misunderstanding of our use of the term “hybrid”. We have searched blogs, media articles, and social media, all with no evidence of misunderstanding. Therefore, in the current manuscript, rather than creating confusion by renaming a well-established approach, we have opted to clearly and prominently define hybrid cells: in the abstract of our paper we introduce the hybrid cells as “the product of fusing induced pluripotent stem (iPS) cells of each species in vitro.”

      Reviewer #2 (Public Review):

      In this paper, Wang and colleagues build on previous technical and analytical achievements in establishing tetraploid human-chimpanzee hybrid iPSCs to investigate the cell type-specificity of allele-specific expression and allele-specific chromatin accessibility across six differentiated cell types (here, "allele-specific" indicates species differences with a cis-regulatory basis). The combined body of work is remarkable in its creativity and ambition and has real potential for overcoming major challenges in understanding the evolutionary genetics of between-species differences. The present paper contributes to these efforts by showing how differentiated cells can be used to test a long-standing hypothesis in evolutionary genetics: that cis-regulatory changes may be particularly important in divergence because of their potential for modularity.

      In my view, the paper succeeds in making this case: allele (species)-specific expression (ASE) and allele-specific chromatin accessibility (ASCA) are enriched in genes asymmetrically expressed in one cell type, and many cases of ASE/ASCA are cell type-specific. The authors do an excellent job showing that these results are robust across a set of possible analysis decisions. It is somewhat less clear whether these enrichments are primarily a product of relaxed constraint on cell type-specific genes or primarily result from positive selection in the human or chimp lineage. While the authors attempt to control for constraint using several variables (variance in ASE in humans and the sequence-based probability of haploinsufficiency score, pHI), these are imperfect proxies for constraint. For the pHI scores, enrichments for ASE also appear to be strongest in the least constrained genes. Overall, the relative role of relaxation of constraint versus positive selection is unresolved, although the manuscript's language leans in favor of an important role for selection.

      We agree with the reviewer and apologize for the wording that indeed focused more on positive selection than relaxed constraint. We have added language clarifying that our stance is that our analyses suggest some role for positive selection, but that we do not claim that positive selection plays a larger role than reduced constraint (lines 432-437): “Overall, this suggests that broad changes in expression in cell type-specifically expressed genes may be an important substrate for evolution but it remains unclear whether positive selection or lower constraint plays a larger role in driving the faster evolution of more cell type-specifically expressed genes. Future work will be required to more precisely quantify the relative roles of positive selection and evolutionary constraint in driving changes in gene expression.”

      The remainder of the manuscript draws on the cell type-specific ASE/ASCA data to nominate candidate genes and pathways that may have been important in differentiating humans and chimpanzees. Several approaches are used here, including comparing human-chimp ASE to the distribution of ASE observed in humans and investigating biases in the direction of ASE for genes in the same pathway. The authors also identify interesting candidate genes based on their role in development or their proximity to human accelerated regions (where many changes have arisen on the human lineage in otherwise deeply conserved sequence) and use a deep neural network to identify sequence changes that might be causally responsible for ASE/ASCA. These analyses have value and highlight potential strategies for using ASE/ASCA and hybrid cell line data as a hypothesis-generating tool. Of course, functional follow-up that experimentally tested these hypotheses or linked sequence/expression changes in the candidate pathways to organismal phenotype would have strengthened the paper further, but this is a lot to ask in an already technically and analytically challenging piece of work.

      We thank the reviewer for the kind words and strongly agree that follow-up experiments and orthogonal analyses will be key in validating our results and establishing links to human-specific phenotypes.

      As a minor critique, the present paper is very closely integrated with other manuscripts that have used the hybrid human-chimp cell lines for biological insight or methods development. Although its contributions make it a strong stand-alone contribution, some aspects of the methods are not described in sufficient detail for readers to understand (even on a general conceptual level) without referencing that work, which may somewhat limit reader understanding.

      We agree with the points the reviewer raises regarding the clarity of our methods. We have amended several sections to provide more conceptual information while pointing the reader to other publications for the technical details. For convenience, we include the text here as well as in the new draft.

      Lines 207-214 now provide more intuition for the method used to detect lineage-specific selection: “Next, we sought to use our RNA-seq data to identify instances of lineage-specific selection. In the absence of positive selection, one would expect that an approximately equal number of genes in a pathway would have human-biased vs. chimpanzee-biased ASE. Significant deviation from this expectation (as determined by the binomial test) rejects the null hypothesis of neutral evolution, instead providing evidence of lineage-specific selection on this pathway. Using our previously published modification of this test that incorporates a tissue-specific measure of constraint on gene expression, we detected several signals of lineage-specific selection, some of which were cell type-specific (Starr et al., 2023, Additional file 2).” This is also reflected in the Methods in lines 729-731: “Positive selection on a gene set is only inferred if there is statistically significant human- or chimpanzee-biased ASE in that gene set (using an FDR-corrected p-value from the binomial test).”
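
      The sign-test logic described above can be sketched in a few lines. This is only an illustration of the plain binomial test against a 50/50 expectation, followed by a Benjamini-Hochberg correction across pathways; it does not include the constraint-aware modification from Starr et al. (2023), and the function names and example counts are hypothetical.

```python
from math import comb


def binomial_sign_test(n_human_biased, n_chimp_biased):
    """Two-sided exact binomial test against p = 0.5.

    Under neutrality, human-biased and chimpanzee-biased ASE genes in a
    pathway are expected in roughly equal numbers.
    """
    n = n_human_biased + n_chimp_biased
    if n == 0:
        return 1.0
    larger = max(n_human_biased, n_chimp_biased)
    # Upper-tail probability P(X >= larger) for X ~ Binomial(n, 0.5).
    upper_tail = sum(comb(n, k) for k in range(larger, n + 1)) / 2 ** n
    # The null distribution is symmetric, so double the tail and cap at 1.
    return min(1.0, 2.0 * upper_tail)


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical pathways: (human-biased genes, chimpanzee-biased genes).
pathways = {"pathway_A": (10, 0), "pathway_B": (8, 2), "pathway_C": (5, 5)}
raw = [binomial_sign_test(h, c) for h, c in pathways.values()]
fdr = benjamini_hochberg(raw)
```

      A pathway would be flagged only if its adjusted p-value falls below the chosen FDR threshold, mirroring the criterion quoted from the Methods.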

      Reviewer #3 (Public Review):

      The authors utilize chimpanzee-human hybrid cell lines to assess cis-regulatory evolution. These hybrid cell lines offer a well-controlled environment, enabling clear differentiation between cis-regulatory effects and environmental or other trans effects.

      In their research, Wang et al. expand the range of chimpanzee-human hybrid cell lines to encompass six new developmental cell types derived from all three germ layers. This expansion allows them to discern cell type-specific cis-regulatory changes between species from more pleiotropic ones. Although the study investigates only two iPSC clones, the RNA- and ATAC-seq data produced for this paper is a valuable resource.

      The authors begin their analysis by examining the relationship between allele-specific expression (ASE) as a measure of species divergence and cell type specificity. They find that cell-type-specific genes exhibit more divergent expression. By integrating this data with measures of constraint within human populations, the authors conclude that the increased divergence of tissue-specific genes is, at least in part, attributable to positive selection. A similar pattern emerges when assessing allele-specific chromatin accessibility (ASCA) as a measure of divergence of cis-regulatory elements (CREs) in the same cell lines.

      By correlating these two measures, the authors identify 95 CRE-gene pairs where tissue-specific ASE aligns with tissue-specific ASCA. Among these pairs, the authors select two genes of interest for further investigation. Notably, the authors employ an intriguing machine-learning approach in which they compare the inferred chromatin state of the human sequence with that of the chimpanzee sequence to pinpoint putatively causal variants.

      Overall, this study delves into the examination of gene expression and chromatin accessibility within hybrid cell lines, showcasing how this data can be leveraged to identify potential causal sequence differences underlying between-species expression changes.

      We appreciate this assessment.

      I have three major concerns regarding this study:

      1. The only evidence that the cells are indeed differentiated in the right direction is the expression of one prominent marker gene per cell type. Especially for the comparison of conservation between the differentiated cell types, it would be beneficial to describe the cell type diversity and the differentiation success in more detail.

      We appreciate this assessment. We agree that evidence beyond a single marker gene is necessary to demonstrate that the differentiations were successful and that a discussion of the limitations of these differentiations in the manuscript is worthwhile. We included figures showing additional marker genes and a thorough discussion of the differentiations in the supplement. For convenience, we have copied the supplemental figure and text here:

      “Before continuing with the analysis, we tested whether the differentiations were successful and contained primarily our target cell types. The very low expression of NANOG, a marker for pluripotency, across all differentiations indicates that the samples contain very few iPSCs (Agoglia et al., 2021). For cardiomyocytes (CM), NKX2-5, MYBPC3, and TNNT2 definitively distinguish CM from other heart cell types and their high expression indicates successful differentiations (Burridge et al., 2014). For motor neurons, the high expression of ELAVL2, a pan-neuronal marker, indicates a high abundance of neurons in the sample (Mickelsen et al., 2019). The expression of ISL1 and OLIG2 further demonstrates that these are motor neurons and not other types of neurons (Maury et al., 2015). For retinal pigment epithelium (RPE), the combined expression of MITF, PAX6, and TYRP1 provides strong evidence that the differentiations were successful in producing RPE cells (Sharma et al., 2019). For skeletal muscle, the very high expression of MYL1, MYLPF, and MYOG indicates that these samples contain a high proportion of skeletal muscle cells (Chal et al., 2016). In general, all these populations of cells contain some proportion of progenitors as there is detectable expression of MKI67 in all samples.

      The low expression of ALB (a marker for mature hepatocytes) and the high expression of TTR and GPC3 (markers for hepatocyte progenitors) combined with the high expression of HNF1B indicate that the bulk of the cells in the HP samples are hepatocyte progenitors rather than mature hepatocytes or endoderm cells, although there are likely some endoderm cells and immature hepatocytes in the sample (Hay et al., 2008; Mallanna & Duncan, 2013). Similarly, the combined expression of PDX1 and NKX6-1 and the low expression of NEUROG3 (a marker of endocrine progenitors which differentiate from pancreatic progenitors) in the PP samples indicates that these primarily contain pancreatic progenitors but likely contain some endocrine progenitors and endoderm cells (Cogger et al., 2017; Korytnikov & Nostro, 2016).

      Notably, HP and PP are closely related cell types that are derived from the same lineage. Indeed, heterogeneous multipotent progenitors can contribute to both the adult liver and adult pancreas in mice (Willnow et al., 2021). Progenitors that express PDX1 (often used as a marker for the pancreatic lineage) can differentiate into hepatocytes (Willnow et al., 2021). As a result, some overlap in the transcriptomic signature of both cell types is expected and we cannot rule out that the HP samples contain cells that could differentiate into pancreatic cells or that the PP samples contain cells that could differentiate into hepatocytes. However, the expression of NKX6-1 and GP2, markers for pancreatic progenitors, in the PP samples but not the HP samples indicates that these two populations of cells are distinct. Overall, the similarity of PP and HP likely explains the lower number of cell type-specific genes and genes showing cell type-specific ASE for these cell types. This similarity does not alter the conclusions presented in the main text.”

      Author response image 4.

      Author response image 5.

      Marker gene expression in different cell types. In order, the panels show: a marker for pluripotency, a marker gene for dividing cells, marker genes for cardiomyocytes, marker genes for hepatocytes and hepatocyte progenitors, marker genes for motor neurons, marker genes for pancreatic progenitors and more mature pancreatic cell types, marker genes for retinal pigment epithelial cells, and marker genes for skeletal myocytes. Hepatocyte progenitors and pancreatic progenitors generally show similar gene expression profiles. TPM: transcript per million.

      2. Check for a potential confounding effect of sequence similarity on the power to detect ASE or ASCA.

      We agree that checking for confounding by power to detect ASE or ASCA would increase confidence in our results. We have added supplementary figures 29-33 to show the results as well as a discussion of these figures in the text (lines 318-326):

      “Finally, it is possible that CREs and genes that are less conserved will have more SNPs, and therefore more power to call ASCA and ASE, leading to systematically biased estimates. There is a weak positive correlation between the number of SNPs and the -log10(FDR) for ASE and a weak negative or no correlation for ASCA (Supp Fig. 29). Similarly, we observe a weak relationship between the number of SNPs in CREs or genes and absolute log fold-change estimates (Supp Fig. 30). Although the relationship between the number of SNPs and ASE/ASCA is weak, we confirmed that cell type-specific genes and peaks are still strongly enriched for ASE and ASCA when stratifying by number of SNPs (Supp Fig. 31-32). Overall, our analysis suggests that the result that more cell type-specific genes and CREs are more evolutionarily diverged is robust to a variety of possible confounders.”

      Author response image 6.

      Relationship between number of SNPs and a) -log10(FDR) for ASE and b) -log10(p-value) for ASCA. These scatter plots show the relationship between the number of SNPs in a gene or peak and the -log10(FDR) for ASE or the -log10(p-value) for ASCA. Genes with significant ASE (FDR < 0.05) and peaks with significant ASCA (binomial p-value < 0.05) were annotated as blue dots, and all other genes and peaks were annotated as gray dots. All genes in each cell type in RNA-seq are shown. For clarity, the few outlier peaks with more than 200 SNPs are excluded from these plots.

      Author response image 7.

      Relationship between number of SNPs and absolute log2 fold-change in a) ASE and b) ASCA. These scatter plots show the relationship between the number of SNPs in a gene or peak and the estimated absolute log2 fold-change for ASE or ASCA. Genes with significant ASE (FDR < 0.05) and peaks with significant ASCA (binomial p-value < 0.05) were annotated as blue dots, and all other genes and peaks were annotated as gray dots. All genes in each cell type in RNA-seq are shown. For clarity, the few outlier peaks with more than 200 SNPs are excluded from these plots.

      Author response image 8.

      Cell type-specifically expressed genes are enriched for genes with ASE when stratifying by the number of SNPs per gene. a) Results when SKM is included. Genes were put into five bins with an equal number of genes in each bin. Genes with the fewest SNPs are in the 0-20% bin and genes with the most SNPs are in the 80-100% bin. Significance (using the Wald test) is indicated by asterisks where *** indicates p < 0.005, ** indicates p < 0.01, and * indicates p < 0.05. b) The same as in (a) but excluding SKM.

      Author response image 9.

      Cell type-specific peaks are enriched for ASCA when stratifying by the number of SNPs per peak. a) Peaks with an absolute log2 fold-change greater than or equal to 0.5 were called as having ASCA. Peaks were put into five bins with an equal number of peaks in each bin. Peaks with the fewest SNPs are in the 0-20% bin and peaks with the most SNPs are in the 80-100% bin. Significance (using the Wald test) is indicated by asterisks where *** indicates p < 0.005, ** indicates p < 0.01, and * indicates p < 0.05. b) The same as in (a) but peaks with a binomial p-value less than or equal to 0.05 were called as having ASCA.
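
      The stratification described in these legends (five equal-size bins by SNP count, then a comparison within each bin) could look roughly like the sketch below. It only reports the fraction of features with ASE/ASCA among cell type-specific versus other features in each bin; it does not reproduce the Wald test used for the asterisks, and all names and inputs are hypothetical.

```python
def equal_size_bins(snp_counts, n_bins=5):
    """Assign each feature to one of n_bins rank-based bins of equal size.

    Bin 0 holds the features with the fewest SNPs (the 0-20% bin when
    n_bins is 5). Ties are broken by input order because sorting is stable.
    """
    n = len(snp_counts)
    order = sorted(range(n), key=lambda i: snp_counts[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // n
    return bins


def fraction_by_bin(snp_counts, is_significant, is_specific, n_bins=5):
    """Fraction of significant features per bin, split by specificity."""
    bins = equal_size_bins(snp_counts, n_bins)
    # counts[bin][group] = [number significant, total]
    counts = [{"specific": [0, 0], "other": [0, 0]} for _ in range(n_bins)]
    for b, sig, spec in zip(bins, is_significant, is_specific):
        group = "specific" if spec else "other"
        counts[b][group][0] += int(bool(sig))
        counts[b][group][1] += 1
    return [
        {g: (hits / total if total else None) for g, (hits, total) in c.items()}
        for c in counts
    ]


# Hypothetical toy input: ten genes with increasing SNP counts.
snps = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
sig = [True, False] * 5
spec = [True, False] * 5
per_bin = fraction_by_bin(snps, sig, spec)
```

      If the enrichment among cell type-specific features persists in every bin, differences in SNP count (and hence power) are unlikely to explain it, which is the argument made in the response.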

      3. In the last part the authors showcase 2 examples for which the log2 fold changes in chromatin state scores as inferred by the machine learning model Sei are used. This is an interesting and creative approach, however, more sanity checks on this application are necessary.

      We agree with the reviewer about the importance of sanity checks and apologize for omitting these from the manuscript. Below we highlight several such checks from previous publications:

      In the original Sei paper (Chen et al. 2022), the authors included several tests of their model’s ability to predict the effects on individual genetic variants. Using eQTL data from GTEx, they found that variants predicted to increase enhancer activity were more likely to be up-regulating eQTLs, and those predicted to increase polycomb repression had the expected repressive effect. These relationships became stronger when restricting the analysis only to fine-mapped eQTLs with >95% posterior probabilities of causality. Chen et al. also found that previously known disease-causing noncoding variants from the Human Gene Mutation Database were far more likely to reduce predicted enhancer/promoter activity than matched variants not linked to any disease.

      In addition, we note that a similar approach to ours was recently used to analyze all HARs and included considerable efforts to validate the utility of the Sei predictions in identifying causal variants (Whalen et al. 2023 in Neuron). For example, Whalen et al. found that the Sei output correlated with the effects of genetic variants on expression in a massively parallel reporter assay. They also found that the effect sizes predicted by Sei were much higher for variants in HARs than polymorphic variants in the human population, which is consistent with the idea that variants in HARs lie in highly conserved bases that are more likely to disrupt cis-regulatory elements. Finally, Whalen et al. found that effects on chromatin state predicted by Sei were generally highly correlated across tissues, supporting our approach that leverages all Sei outputs regardless of which cell type or tissue they correspond to. Overall, we think that Sei is a potentially powerful way to prioritize causal variants and that improved machine learning models trained on more extensive and context-specific data will be even more powerful.

    2. eLife assessment

      This is an important study that leverages a human-chimpanzee tetraploid iPSC model to test whether cis-regulatory divergence between species tends to be cell type-specific. The analyses supporting the study's primary conclusions together provide convincing evidence for enrichment of species differences in gene regulation in cell type-specific genes and regulatory elements, motivating future work with larger sample sizes of cell lines. This work will be of broad interest in evolutionary and functional genomics.

    3. Reviewer #1 (Public Review):

      This study aims to identify gene expression differences exclusively caused by cis-regulatory genetic changes by utilizing hybrid cell lines derived from human and chimpanzee. While previous attempts have focused on specific tissues, this study expands the comparison to six different tissues to investigate tissue specificity and derive insights into the evolution of gene expression.

      One notable strength of this work lies in the use of composite cell lines, enabling a comparison of gene expression between human and chimpanzee within the same nucleus and a shared trans-factor environment. However, a potential weakness of the methodology is the use of bulk RNA-seq in diverse tissues, which limits the ability to determine cell-type-specific gene expression and chromatin accessibility regions. Their approach, using hybrid lines, naturally accounts for cell type heterogeneity, avoiding the risk of false positives introduced by the otherwise confounding differences in cell type abundances between species, although false negatives remain an issue. The authors now duly acknowledge this limitation in the manuscript.

      Another concern is the use of two replicates derived from the same pair of individuals, even though the authors produced cell lines from two pairs of individuals in a previous study (Agoglia et al., 2021). The reason for this experimental design is cost limitations. The authors now acknowledge that the use of replicates could enhance the ability to detect "more" species-specific changes in expression and chromatin accessibility. I would emphasize that replicates would add robustness to the present findings, given that they are derived from a single pair of individuals.

      Furthermore, the study offers the opportunity to relate inter-species differences to trends in molecular evolution. The authors discovered that expression variance and haploinsufficiency score do not fully account for the enrichment of divergence in cell-type-specific genes. The reviewer suggested exploring this further by incorporating external datasets that bin genes based on interindividual transcriptomics variation as a measure of extant transcriptomics constraint (e.g., GTEx reanalysis by Garcia-Perez et al., 2023 - PMID: 36777183). The authors considered this question to be out of the scope of the paper, yet in my opinion this would enhance one of the main findings of this study.

      Additionally, stratifying sequence conservation on ASCA regions, which exhibit similar enrichment of cell-type-specific features, using the Zoonomia data mentioned also in the text (Andrews et al., 2023 -- PMID: 37104580) could provide valuable insights. While the authors did not find Zoonomia PhastCons values available, they used PhastCons derived from a 470-way alignment of mammals. I commend the authors for their diligent efforts, which undoubtedly bolster their findings that an enrichment in ASCA is evident across all levels of sequence conservation. However, this recent analysis indicates the presence of a potential relationship between sequence conservation and ASCA. It may be advantageous to consider evaluating more quantile subdivisions of maxZ values and PhastCons values, with the inclusion of these results in the supplementary materials. This approach would be preferable, even if the precise reasons behind the observed discrepancy are not fully elucidated.

      Another potential strength of this study is the identification of specific cases of paired allele-specific expression (ASE) and allele-specific chromatin accessibility (ASCA) with biological significance. Prioritizing specific variants remains a challenge, and the authors apply a machine learning approach to identify potential causative variants that disrupt binding sites in two examples (FABP7 and GAD1 in motor neurons). However, additional work is needed to convincingly demonstrate the functionality of these selected variants. Strengthening this section with additional validation of ASE, ASCA, and the specific putative causal variants identified would enhance the overall robustness of the paper. The authors have opted to defer these validations to future studies.

      Additionally, the authors support the selected ASE-ASCA pairs by examining external datasets of adult brain comparative genomics (Ma et al., 2022) and organoids (Kanton et al., 2019). While these resources are valuable for comparing observed species biases, the analysis is not systematic, even for the two selected genes. For example, it would be beneficial to investigate if FABP7 exhibits species bias in any cell type in Kanton et al.'s organoids or if GAD1 is species-biased in adult primate brains from Ma et al. Comparing these datasets with the present study, along with the Agoglia et al. reference, would provide a more comprehensive perspective. In the revised version of the manuscript, the authors have evaluated the expression of GAD1 in Ma et al. and FABP7 in Sousa et al. 2017. For instance, GAD1 shows cell type-specific species biases in the latter. The authors opted not to show this in the manuscript. However, it remains unclear why certain datasets were favored over others, or why FABP7 should not be evaluated in Kanton et al.

      The use of the term "human-derived" in ASE and ASCA has now been avoided.

      Finally, throughout the paper, the authors refer to "hybrid cell lines." It has been suggested to use the term "composite cell lines" instead to address potential societal concerns associated with the term "hybrid," which some may associate with reproductive relationships (Pavlovic et al., 2022 -- PMID: 35082442). The authors have presented an eloquent and persuasive explanation that I found to be highly informative.

    4. Reviewer #3 (Public Review):

      The authors utilize chimpanzee-human hybrid cell lines to assess cis-regulatory evolution. These hybrid cell lines offer a well-controlled environment, enabling clear differentiation between cis-regulatory effects and environmental or other trans effects.

      In their research, Wang et al. expand the range of chimpanzee-human hybrid cell lines to encompass six new developmental cell types derived from all three germ layers. This expansion allows them to discern cell type-specific cis-regulatory changes between species from more pleiotropic ones. Although the study investigates only two iPSC clones, the RNA- and ATAC-seq data produced for this paper is a valuable resource.

      The authors begin their analysis by examining the relationship between allele-specific expression (ASE) as a measure of species divergence and cell type specificity. They find that cell-type-specific genes exhibit more divergent expression. By integrating this data with measures of constraint within human populations, the authors conclude that the increased divergence of tissue-specific genes is, at least in part, attributable to positive selection. A similar pattern emerges when assessing allele-specific chromatin accessibility (ASCA) as a measure of divergence of cis-regulatory elements (CREs) in the same cell lines.

      By correlating these two measures, the authors identify 95 CRE-gene pairs where tissue-specific ASE aligns with tissue-specific ASCA. Among these pairs, the authors select two genes of interest for further investigation. Notably, the authors employ an intriguing machine learning approach in which they compare the inferred chromatin state of the human sequence with that of the chimpanzee sequence to pinpoint putatively causal variants.

      Overall, this study delves into the examination of gene expression and chromatin accessibility within hybrid cell lines, showcasing how this data can be leveraged to identify potential causal sequence differences underlying between-species expression changes.

      All in all, most conclusions appear solid, with the exception of the interpretation of a cell type/state identification machine learning model to pinpoint putatively causal variants. The described variants lack any functional validation and there are no data that measure the certainty of the results.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      The study isolated extracellular vesicles (EV) from healthy controls (HCs) and Parkinson patients (PwP), using plasma from the venous blood of non-fasting people. Such EVs were characterized and validated by the presence of markers, their size, and their morphology. The main aim of the manuscript is to correlate the presence of synaptic proteins, namely SNAP-25, GAP-43, and SYNAPTOTAGMIN-1, normalized with HSP70, with the clinical progression of PwP. Changes in synaptic proteins have been documented in the CSF of Alzheimer's and Parkinson's patients. The demographics of participants are adequately presented.

      • One important limiting, as well as puzzling, aspect is that the authors did not find differences between groups at the beginning of the study or after one year, after age and sex adjustment.

      Response: Thanks for your comments. We acknowledge your observation that the absence of a discernible difference in plasma EV synaptic protein levels between the PD and control subjects constitutes a significant limitation of our study. This outcome could be attributed to the fact that the controls were recruited from the neurology outpatient clinic, representing a group that could be considered "sub-healthy." Moreover, these individuals are not exempt from aging-related neurodegenerative processes. Considering that our PD subjects are in the early stages of the disease (with a mean disease duration of less than 3 years) and that synaptic dysfunction is a broader indicator rather than specific to PD, these factors could collectively contribute to the lack of distinction between the PD and control groups.

      However, our primary intention was also to explore the potential of plasma EV synaptic proteins as predictive markers for disease progression in PD. In this regard, we have identified their applicability within the current PD cohort. We are committed to conducting further follow-up with these study subjects over an extended duration to delve deeper into these findings.

      We revised the discussion to address this issue as follows: “Additionally, synaptic dysfunction is a frequently observed phenomenon in several neurological diseases, and it is not exclusive to PD. Consequently, the HC group in our current study may have included individuals with coexisting neurological conditions, potentially explaining the lack of a significant difference between the PD group and the HCs. However, this approach also illuminates the significance of synaptic dysfunction in the advancement of PD. This insight can be invaluable for monitoring disease progression, particularly in the context of clinical trials focused on disease modification.”

      • Tables in general are hard to follow. Specifically, Table 2 does not convey a clear message nor in the text of the Table itself, and the per 100% of change needs to be explained in the corresponding legend.

      Response: Thanks for your comment. In Table 2, our aim was to demonstrate the association between the change in plasma EV synaptic proteins and the change in clinical severity, presented as coefficient (p value). We apologize for any prior ambiguity in the main text's description of these results and have since made revisions to enhance clarity.

      Regarding the "per 100% change," this is due to the quantification of plasma EV synaptic proteins being based on a semi-quantitative Western blot method. Each measurement was normalized by the average baseline plasma synaptic protein levels of healthy controls (HCs). The term "per 100% change" denotes the increase or decrease in plasma EV synaptic protein abundance relative to the average baseline levels observed in healthy controls. We apologize for any confusion caused and have removed this term. In addition, we rephrased the Table legend in the revised manuscript to improve clarity and readability, as follows: “The association between the change of plasma EV synaptic proteins abundance (between baseline and follow-up) with the change of clinical severity in motor and cognitive domains (between baseline and follow-up) in people with Parkinson’s disease. A generalized linear model was employed and the data was presented as coefficient (p value).”
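
      To make the normalization concrete, the arithmetic can be sketched as below. This is an illustrative reading of the description above, not the authors' analysis code: the function names and numbers are hypothetical, and any earlier normalization step (for example to a loading control) is assumed to have been applied already.

```python
from statistics import mean


def normalize_to_hc_baseline(values, hc_baseline_values):
    """Express band intensities relative to the mean baseline level of HCs.

    A normalized value of 1.0 equals 100% of the average healthy-control
    baseline abundance.
    """
    reference = mean(hc_baseline_values)
    return [v / reference for v in values]


def change_in_hc_units(baseline, follow_up, hc_baseline_values):
    """Follow-up minus baseline, in units of the HC baseline mean.

    A change of 1.0 corresponds to a shift of 100% of the average
    healthy-control baseline level.
    """
    base = normalize_to_hc_baseline(baseline, hc_baseline_values)
    later = normalize_to_hc_baseline(follow_up, hc_baseline_values)
    return [b2 - b1 for b1, b2 in zip(base, later)]


# Hypothetical intensities for two healthy controls and two patients.
hc_baseline = [2.0, 4.0]        # mean = 3.0
patient_baseline = [3.0, 6.0]   # 100% and 200% of the HC mean
patient_follow_up = [4.5, 3.0]  # 150% and 100% of the HC mean
changes = change_in_hc_units(patient_baseline, patient_follow_up, hc_baseline)
```

      Under this reading, a regression coefficient reported "per 100% change" is the expected change in the clinical score for each unit of change on this normalized scale.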

      • It is only when PwP were classified as a first quartile that a significantly greater deterioration was found. However, in the case of tremor, the top 25% had values going from 0.46-0.47 to 0.32-0.35, whereas the lower three quarters went from 0.33-0.34 to 0.27-0.28 depending on the protein analyzed. This needs to be clarified in the text.

      Response: Thanks for your comments. On the Unified Parkinson's Disease Rating Scale (UPDRS), a higher score indicates greater severity of symptoms. Regarding tremor, we observed a general trend of improvement in both groups. PwP with elevated baseline plasma EV proteins had a trend toward worse tremor scores at baseline, and their improvement was significantly greater than that of the rest of the PwP. This improvement seems to contradict the progressive nature of PD, and one possible explanation could be the alleviation of symptoms due to medication usage. The assessment of motor symptoms took place within the hospital setting, where we refrained from requesting patients to withhold their anti-PD medications due to concerns about safety issues such as falls. Consequently, certain motor symptoms might have been effectively controlled by the anti-PD medication. Traditionally, symptoms like tremor and rigidity (as reflected by the akinetic rigidity score) respond well to medications, while postural instability and gait disturbance (PIGD) are less responsive. In our cohort, we noted an improvement in tremor scores and stability in akinetic rigidity (AR) scores. Conversely, PD patients with higher baseline plasma EV synaptic protein levels exhibited notable progression in PIGD scores. These findings have been documented in the results section and discussed comprehensively within the revised manuscript as follows: “On the other hand, the evaluation of motor symptoms occurred in a hospital setting where we did not ask patients to stop taking their anti-PD medications due to safety concerns like the risk of falls. As a result, specific motor symptoms, particularly tremor and AR, which are more sensitive to medication compared to PIGD, may have been effectively managed by the anti-PD medications. This could potentially explain the improvement in tremor observed between the baseline and one-year follow-up, especially among PwP with elevated baseline plasma EV synaptic proteins.”

      • Table 3 is hard to read and some of the values seem repetitive, especially for tremor, AR, and PIGD. It looks as if Figure 2 represents the same information as Table 3.

      Response: Thanks for your information. We have ensured the accuracy of the results presented in Table 2. While some of the entries may appear similar, they do indeed possess distinct differences.

      To enhance readability, we streamlined the information in Table 3 by removing the p-values from the intra-group comparisons between baseline and the 1-year follow-up within each domain. We retained the original p-values for trend related to the inter-group comparisons for changes. Detailed information has been relocated to the supplementary section of the revised manuscript. In Figure 2, we illustrated the relationship between baseline plasma extracellular vesicle (EV) synaptic protein levels and the clinical assessment parameters during follow-up in patients with Parkinson's disease (PwP). This portrayal is distinct from the information depicted in Table 3.

      If you had concerns about the resemblance between Table 3 and Figure 3, please note that the values in Table 3 represent raw scores, while the values in Figure 3, namely the estimated marginal means, are the "adjusted" scores for UPDRS-II and PIGD at baseline and follow-up. These adjustments encompass age, sex, and disease duration. We sincerely apologize for any lack of clarity in our previous description and have since revised it accordingly.

      • The text and figure legends are not helpful in guiding the reader to understand the presented information.

      Response: Thanks for your comments, and we apologize for the unclear statements. We have revised the figure legends and the main text to make them easier for readers to follow.

      Reviewer #2 (Public Review):

      Hong and collaborators investigated variations in the amount of synaptic proteins in plasma extracellular vesicles (EV) in Parkinson's Disease (PD) patients on one-year follow-up. Their findings suggest that plasma EV synaptic proteins may be used as clinical biomarkers of PD progression.

      • It is a preliminary study using semi-quantitative analysis of synaptic proteins.

      Response: Thanks for your comments. The present study represents the initial phase of our investigation into the role of plasma EV synaptic proteins within our PD cohort. Our findings have revealed the potential predictive significance of these synaptic proteins in relation to PD progression. We are committed to conducting further follow-up with these study subjects over an extended period.

      Furthermore, it's important to acknowledge that the semi-quantitative approach employed to assess protein abundance was a limitation of this study. This limitation stems from the low concentration of plasma EV synaptic proteins, which restricts the feasibility of utilizing techniques such as ELISA or other quantitative methods for protein assessment. We have acknowledged this limitation in the present study as follows: “Semiquantitative assessment of plasma EV synaptic protein (SNAP-25, GAP-43, and synaptotagmin-1) levels was performed using western blot analysis. The lack of absolute values limits further clinical application.”

      Moving forward, we intend to adopt alternative EV isolation methods that enable the extraction of a larger abundance of plasma EV proteins, facilitating more accurate quantitative assessments. In addition, a longer longitudinal follow-up is warranted to clearly assess the prognostic efficacy of plasma EV synaptic proteins in PwP, which we had mentioned in the manuscript.

      • The authors have a cohort of PD patients with clinical examination and a know-how on EV purification. Regarding this latter part, they may improve their description of EV purification. EV may be broken into smaller size EV after freezing. Does it explain the relatively small size in their EV preparation? Do the authors refer to the MISEV guidelines for EV purity?

      Response: Thanks for your comments. In the previous manuscript, we provided a relatively detailed account of the procedures related to EV isolation and validation (https://doi.org/10.1096/fj.202100787R). In the revised manuscript, we added some information about the principle of the EV isolation kit and the validation antibodies as follows: “Plasma EVs were isolated from 1 mL of plasma by exoEasy Maxi Kit (Qiagen, Valencia, CA, USA), a membrane-based affinity binding step to isolate exosomes and other EVs without relying on a particular epitope, in accordance with the manufacturer’s instructions, and stored in a −80 °C freezer. The isolated plasma EVs were then eluted and stored. Usually, 400 μL of eluate is obtained per mL of plasma. The isolated plasma EVs were validated according to the International Society for Extracellular Vesicles guidelines, which include: (1) markers, namely the presence of CD63 (ab59479, Abcam, Cambridge, UK), CD9 (ab92726, Abcam, Cambridge, UK), and tumor susceptibility gene 101 protein (GTX118736, GeneTex, CA, USA) and the absence of cytochrome c (ab110325; Abcam, Cambridge, UK); (2) physical characterization by nanoparticle tracking analysis, which showed that most EVs were within 50-100 nm; and (3) morphology from electron microscopy analysis. The validation has been described previously [29-31].”

      It's important to note that our primary focus was on exosomes, the smallest subtype of EVs. Through nanoparticle tracking analysis, we observed that the majority of isolated EVs fell within the diameter range of 50-150nm, exhibiting significant surface marker (i.e. CD63 and CD9) expression. Moreover, electron microscopy confirmed their vesicular morphology. These meticulously validated EVs were promptly analysed post-isolation.

      However, we acknowledge that the plasma obtained from study participants might have undergone freezing prior to EV isolation. This freezing process has the potential to diminish the yield of EVs and result in some degree of fragmentation. We have included this issue as a limitation in our revised manuscript as follows: “The final technical issue in the present study was the relatively small size of the isolated EVs. Despite the primary focus on isolating exosomes, which are the smallest type of EVs, it's important to consider that the presence of small-sized EVs could potentially be attributed to EV fragmentation that occurs during the freezing and thawing processes.”

      • Regarding synaptic protein quantification, the choice of western blotting may not be the best one. ELISA and other multiplex arrays are available. How do the authors justify their choice?

      Response: Thanks for your comments. We appreciate your input regarding the semi-quantitative western blot analysis not being the most optimal approach. Owing to the limited quantity of isolated plasma EVs and the significant protein abundance of synaptic proteins within these EVs, we did explore the use of an ELISA assay. However, it's worth noting that for a specific subset of the samples, the readout obtained was below the lower limit of detection of the ELISA kit. In response, we have incorporated this point as a limitation in the discussion section of the revised manuscript as follows: “Semiquantitative assessment of plasma EV synaptic protein (SNAP-25, GAP-43, and synaptotagmin-1) levels was performed using western blot analysis. The lack of absolute values, i.e. from the results of enzyme-linked immunosorbent assay, limits further clinical application.”

      • Do the authors try to sort plasma EV by membrane-associated neuronal EV markers using either vesicle sorting or immunoprecipitation?

      Response: Thanks for your comments. The current study did not specifically isolate neuron-derived extracellular vesicles (EVs), potentially introducing some bias to the results. However, it's important to note that synaptic proteins, such as SNAP-25, exhibit a high degree of neuron-specific expression, with a predominant presence in the brain (as indicated by https://www.proteinatlas.org/ENSG00000132639-SNAP25/tissue). Given this context, the limitation of not analyzing neuron-derived EVs could be mitigated to some extent. In response, we have incorporated this point as a limitation in the discussion section of the revised manuscript as follows: “Furthermore, this study evaluated the overall plasma EVs rather than specifically focusing on neuron-derived exosomes, potentially introducing a bias towards somatic-origin EVs. Nonetheless, it is worth noting that synaptic proteins primarily originate from neurons. Even when considering neuron-derived exosomes, it's important to recognize that they are not exclusively derived from the brain, which can lead to contamination from the peripheral nervous system.”

      • Many technical aspects may be improved. Such technical questions weakened the authors' conclusions.

      Response: Thanks for your comments. We recognize that the aforementioned issues represent limitations of our current study. In response, we have incorporated these points as limitations, including the semi-quantitative assessments, the isolation of total but not neuron-derived exosomes in the plasma, and the short follow-up time within the discussion section of the revised manuscript.

      • The discussion is pretty long to justify the data. It may be shortened by adding some information in the introduction.

      Response: Thanks for your comments. We have repositioned a statement from the second paragraph of the discussion to the introduction. This adjustment serves to enrich the background understanding of the link between synaptic dysfunction and neurodegenerative diseases.

    2. Reviewer #1 (Public Review):

      The study isolated extracellular vesicles (EV) from healthy controls (HCs) and Parkinson patients (PwP), using plasma from the venous blood of non-fasting people. Such EVs were characterized and validated by the presence of markers, their size, and their morphology. The main aim of the manuscript is to correlate the presence of synaptic proteins, namely SNAP-25, GAP-43, and SYNAPTOTAGMIN-1, normalized with HSP70, with the clinical progression of PwP. Changes in synaptic proteins have been documented in the CSF of Alzheimer's and Parkinson's patients. The demographics of participants are adequately presented. One important limitation, which is also puzzling, is that the authors did not find differences between groups at the beginning of the study or after one year, after age and sex adjustment.

    3. Reviewer #2 (Public Review):

      Hong and collaborators investigated variations in the amount of synaptic proteins in plasma extracellular vesicles (EV) in Parkinson's Disease (PD) patients on one-year follow-up. Their findings suggest that plasma EV synaptic proteins may be used as clinical biomarkers of PD progression.

      It is a preliminary study using semi-quantitative analysis of synaptic proteins.

      The authors have a cohort of PD patients with clinical examination and a know-how on EV purification. Regarding this latter part, they may improve their description of EV purification. EV may be broken into smaller size EV after freezing. Does it explain the relatively small size in their EV preparation? Do the authors refer to the MISEV guidelines for EV purity? Regarding synaptic protein quantification, the choice of western blotting may not be the best one. ELISA and other multiplex arrays are available. How do the authors justify their choice? Do the authors try to sort plasma EV by membrane-associated neuronal EV markers using either vesicle sorting or immunoprecipitation?

      Many technical aspects may be improved. Such technical questions weakened the authors' conclusions.

    1. Reviewer #1 (Public Review):

      In their manuscript, Arjun et al. investigate the role of the histone acetyltransferase Gcn5 in the control of Drosophila blood cell homeostasis in the larval lymph gland. They use gcn5 zygotic mutants as well as targeted knock-down and over-expression of Gcn5 in various lymph gland populations to show that these modulations impact (in a rather haphazard manner) niche cell number, blood cell progenitor maintenance, plasmatocyte differentiation, crystal cell differentiation or DNA damage accumulation. Their results suggest that Gcn5 controls autophagy and they show that decreasing the expression of the autophagy machinery increases blood cell differentiation. Using drugs to modulate the mTOR pathway, they conclude that Gcn5 levels are regulated by mTOR but that the impact of this pathway on blood cell homeostasis can override Gcn5 function.

      While the authors did a lot of experiments and good quantifications of the blood cell phenotypes, many results do not make much sense or do not bring valuable information about Gcn5's mode of action. Several conclusions of the manuscript are not backed by solid data (e.g. that Gcn5 action is mediated by TFEB and the autophagy machinery) and different aspects of the literature are not well taken into consideration. Some results (such as the validation of the knockdown and overexpression of Gcn5) seem flawed. There are some concerns about the results obtained with gcn5 zygotic mutants and an interpretation of the phenotypes observed upon manipulation of Gcn5 expression in different cell types is missing.

      Important revisions are needed to improve the quality of the manuscript and confirm the authors' findings.

    2. eLife assessment

      This manuscript shows that manipulating the expression of the histone acetyltransferase Gcn5 affects blood cell homeostasis in the Drosophila larval hematopoietic organ. The data suggest a link between autophagy and the mTOR pathway, as could be expected from the literature. The authors use several genetic manipulations as well as some chemical modulators to generate solid evidence supporting most of their conclusions, but some of the analyses are inadequate and would benefit from improvement.

    3. Reviewer #2 (Public Review):

      Summary:

      Drosophila hematopoiesis has been shown to be governed by a number of signaling pathways such as JAK/STAT and Dpp. This important study shows the role of nutrient sensing and autophagy in determining blood cell differentiation. The authors show that General control non-derepressible 5 (Gcn5), a histone acetyltransferase affects blood cell differentiation. Gcn5 also negatively regulates autophagy through its effector TFEB which directly regulates autophagy genes. The authors also show that mTORC1 modulates Gcn5 levels and through it, TFEB activity thus acting as a fine-tuning mechanism that maintains optimal levels of autophagy.

      Strengths:

      The main strength of the work lies in the interesting finding that cellular metabolic processes such as autophagy have a direct role in blood cell differentiation and has the potential to be of interest to those working on vertebrate haematopoiesis as well. The report has generated intriguing data, using promoters specific for sub-sections of the lymph gland, that different cellular subsets of the lymph gland contribute differently towards haematopoiesis, but this is not followed up in detail and the final conclusions are derived from a combination of whole lymph gland perturbations as well as those from specific promoters.

      Weaknesses:

      1. Gcn5 seems to be expressed throughout the lymph gland but modulating it in the subsections does not have the same result. It is very striking that the knockdown of Gcn5 in the prohemocyte population does not have an effect on differentiation whereas overexpression does. The modulations of Gcn5 in PSC also have variable effects across hemocyte subpopulations which is not explored in the manuscript. Interestingly, also the domain deletion constructs show a differential effect on blood cell differentiation when altered solely in the prohemocytes which is not explained. While Gcn5 can be seen in all sections of the lymph gland in the first figure, under the HHLT-Gal4 and Hml-Gal4, Gcn5 looks cytoplasmic and almost completely excluded from the nucleus strikingly unlike Gcn5 expression under the Collier-Gal4 and Dome-Gal4. The rest of the experiments in the manuscript are done with multiple promoters, with autophagy flux measured by modulating Gcn5 with a pan hemocyte promoter, but the mTORC1-Gcn5 axis is explored using chemical modulators which affect the whole of the lymph gland (Fig. 7) or using two pro-hemocyte promoters (Fig. 8).

      2. The knockdown of Gcn5 seems to affect the gland size (A compared to B and C). Since mTORC1 is a central regulator of cell size, it is possible that some of the effects seen in these knockdowns are potentially through mTORC1 affecting size suggesting that the signalling axis between mTORC1 and Gcn5 might not be a one-way axis as suggested in Figure 9. Also, this would mean that in experiments where absolute cell counts of crystal cells or niche cells are used to assess blood cell differentiation, further analysis to consider total cell numbers in the lymph gland would strengthen the manuscript.

      3. A genetic manipulation of mTORC1 specifically in the pro hemocytes would strengthen the role of mTORC1 in the pathway rather than the chemical modulation which affects the whole of the lymph gland.

    1. Reviewer #1 (Public Review):

      Summary:

      This study describes all tangential neurons of the lobula plate (LOPs) of the fruit fly Drosophila melanogaster. Importantly, this is done in a complete manner, for the first time in any species. This means that for the first time, all neurons involved in transmitting wide-field optic flow information to the central brain are known. Exploiting known structure-function relations in these neurons (which are based on solid physiological data in different species of flies), the authors provide estimates of the physiological properties of all described neurons. Combined with transmitter predictions of these cells, this yields a full account of what information about wide-field motion is available to the central fly brain in order to derive behavioral commands from. The study goes one step further and includes anatomical descriptions and physiological property predictions for all major downstream target cells of LOPs.

      Main strengths:

      The paper is exceptional in three ways. First, it is the first comprehensive account of all tangential neurons of the lobula plate of an insect. This now provides the ground truth for similar studies in other insects. In particular, these results will allow neurons emerging in other species to be confidently described as novel/different from Drosophila, if they were not found in the current study. This is a major change from previously, when confidence in the non-existence of neuronal cell types in this system was impossible, as that system was not fully described.

      Second, the rigorous prediction of physiological characteristics (flow-field encoding) in all anatomically described neurons provides a solid basis for system-wide modeling of optic flow encoding in Drosophila. Importantly, the presented physiological predictions include the downstream partner cells of the LOPs in the central brain, neurons for which only very few physiological descriptions exist, but which are essential for transforming optic flow input into behavioral outputs. This paper therefore opens a path towards closing the gap between sensory processing and behavior not only for a few identified and well-studied pathways, but for all wide-field motion processing that exists in a species.

      Third, the connectomics work is not only based on one individual sample, but incorporates two EM volumes, analyzed with two different methods (manual tracing and auto segmentation/proofreading), using interhemispheric correspondence and inter-individual correspondence to validate the obtained neuron catalogue. Additionally, light microscopical data was used to validate the EM data. All of this provides exceptional levels of confidence in the presented results.

      Main weaknesses:

      While the authors compare their results with data from both larger flies and other work in Drosophila, a recent paper (Henning et al 2022) that presented novel data on the distribution of preferred motion directions in the fly lobula plate is not mentioned. This is unfortunate, as the claim of that paper is that the lobula plate contains six instead of four main tuning directions, both at the level of LOPs and T4/T5 input cells - a claim that could likely be directly confirmed or dismissed, or at least incorporated in the data presented in the current study. How would the flow-field predictions change if the data from Henning et al on T4 neurons was used as an input for the modeling rather than the classic four tuning directions?

      While the authors nicely perform comparison to other fly species, a more general discussion of how the found cells relate to other insects, e.g. cells known from bees (e.g. Honkanen et al., 2023) or older work from locusts, could give the data more general relevance. While the comparison can likely not be done on a cell type level, given that the structure of the lobula complex is very different between those insects, the types of projections found and their physiologies, i.e. the overall patterns of how wide field motion is sent to the central brain, might be comparable and informative for highlighting general principles of motion processing.

    2. eLife assessment

      This important study presents the first comprehensive catalog of the large neurons that compute optic flow in any insect. The morphological reconstructions from volume electron microscopy of the large arbors of these neurons, the Lobula Plate Tangential Neurons, were followed by the examination of their spatial arrangement to estimate their individual receptive fields and predict their optimal motion sensitivity. This compelling, rigorous data set, which includes the synaptic connectivity of the neurons under study with major target neurons in the fly brain, establishes a foundation for future studies on visual processing on the basis of a known connectome plus genetic driver lines to manipulate its constituent neurons. It will be of interest beyond insect vision to those studying sensory processing and neural circuit function.

    3. Reviewer #2 (Public Review):

      Summary:

      In this study, Zhao, Nern, et al. investigate a population of neurons in the optic lobe of Drosophila melanogaster that process optic flow, relative motion between the insect's eyes and its environment that is generated during flight and provides useful information to the fly about its own self-motion. Although a sample of these Lobula Plate Tangential (LPT) neurons has been studied across Diptera in prior work, the full population has not been exhaustively and thoroughly cataloged in a single species, limiting our understanding of how LPT tuning properties across the population convey features of optic flow fields relevant to downstream motor regions.

      Through extensive manual reconstructions in a fly electron microscopy volume, the authors of this study identify 58 LPT neurons in the fruit fly encompassing previously studied Horizontal and Vertical cells and novel cells that have not been previously characterized. Using the detailed anatomy of each cell and knowledge of upstream T4/T5 selectivity, the authors derive the predicted motion pattern map (PMPM) of each neuron. To understand how optic flow field tunings of individual LPTs align with global optic flow patterns flies are expected to encounter during flight such as translation and rotation, the authors compute the average angular difference between each PMPM and idealized rotation and translation optic flow fields. The authors also map individual LPTs to their counterparts in a second fly brain to explore LPT-LPT connectivity and downstream connectivity to central brain neuropils. They find that distinct groupings of LPTs have diverse downstream connectivity patterns and that downstream neurons align more closely to global optic flow fields that are expected during flight. This study is a valuable resource to researchers studying motion vision in the insect brain and is of interest to researchers studying sensorimotor processing by providing hypotheses for how optic flow information is integrated downstream to guide fly behavior.
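      The comparison described above, between a neuron's predicted motion pattern map and an idealized rotation flow field, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the unweighted averaging over viewing directions, and the sign convention for rotational flow (flow = -(axis × direction)) are all assumptions made for illustration.

```python
import math

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))

def rotation_flow(axis, direction):
    """Idealized optic flow seen at a unit viewing direction during
    self-rotation about `axis` (assumed convention: -(axis x direction))."""
    return tuple(-c for c in cross(axis, direction))

def angle_between(u, v):
    """Angle in degrees between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    cosine = max(-1.0, min(1.0, dot / (norm(u) * norm(v))))
    return math.degrees(math.acos(cosine))

def mean_angular_difference(directions, predicted, axis):
    """Unweighted mean angle between predicted local motion vectors and
    the idealized rotation field; zero-length vectors are skipped."""
    angles = []
    for d, p in zip(directions, predicted):
        ideal = rotation_flow(axis, d)
        if norm(ideal) == 0.0 or norm(p) == 0.0:
            continue  # direction undefined here (e.g. on the rotation axis)
        angles.append(angle_between(p, ideal))
    if not angles:
        raise ValueError("no viewing direction with a defined flow vector")
    return sum(angles) / len(angles)
```

      A small mean angle would indicate that the predicted map resembles the flow generated by rotation about the chosen axis; scanning over candidate axes would then give a best-matching axis under these assumptions.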

      Strengths:

      A key strength of this study is the thoroughness with which the authors comprehensively identify individual LPT neurons in the FAFB volume. They not only conduct an impressive number of careful manual reconstructions to recover individual LPTs, but they also attempt, and often succeed, to map each individual neuron to its counterpart in light microscopy, studies across Diptera, and available auto-segmented connectome datasets such as FlyWire, FAFB-FFN1, and Hemibrain. The authors are similarly thorough when surveying individual LPT properties such as neurotransmitter identities, in some cases using multiple datasets to reconcile ambiguous neurotransmitter predictions. The care with which the complete LPT population has been identified establishes this study as a useful resource for future studies of insect motion.

      In addition to providing a comprehensive catalog of individual LPTs, the authors also contextualize their findings within broader sensorimotor circuitry by considering connectivity between LPTs and from LPTs to downstream regions. Exploration of structure in downstream connectivity suggests that optic flow information is directed to various central brain neuropils through specific groups of LPTs. With some additional analyses, these results broaden the scope of this study by providing useful hypotheses for sensorimotor circuit organization.

      Weaknesses:

      A novel method introduced in this study is the derivation of individual LPT-predicted motion pattern maps (PMPMs) using T4 preferred directions and LPT morphology. Although this method underlies core findings in this study, such as alignment to global optic flow fields and properties of downstream integration, aspects of the methods used to derive PMPMs are not explained sufficiently well, particularly in the main text. For example, in the Methods, the authors briefly describe the process of computing a weighted sum of T4 preferred directions to obtain the PMPM for each LPT, but a detailed explanation of how these preferred directions are combined is missing from Figure 2 and the associated descriptions in the main text. It is also not clear how PMPMs are derived in cases where LOP layer coverages are overlapping (for example VS 13-1 in Figure 3) to yield smooth PMPMs. In addition, it is not clear how the PMPMs of bilateral LPTs such as LPT-45 and LPT-50 in Figure 4 were integrated to compute downstream target composite PMPMs. Finally, all the PMPMs were derived from the T4 preferred direction that relies on the ommatidial viewing directions ("Eyemap") introduced in Zhao et al. 2022. It is also important for the current study to give an indication of how sensitive their results are to possible inaccuracies in this map and derived T4/T5 direction selectivities.
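      To make concrete what a "weighted sum of T4 preferred directions" could look like, here is a minimal sketch. It is a hypothetical illustration, not the procedure from the manuscript: the data layout (a mapping from each visual-field location to a list of 2D preferred-direction vectors with weights) and the idea that weights stand for something like dendritic coverage per LOP layer are assumptions.

```python
def weighted_direction_sum(preferred_dirs, weights):
    """Weighted vector sum of 2D preferred directions at one location."""
    sx = sum(w * d[0] for d, w in zip(preferred_dirs, weights))
    sy = sum(w * d[1] for d, w in zip(preferred_dirs, weights))
    return (sx, sy)

def predicted_motion_map(samples):
    """Build a location -> predicted motion vector map.

    `samples` maps each location to a list of (preferred_dir, weight)
    pairs, one pair per contributing input layer (hypothetical layout).
    """
    pmpm = {}
    for location, contributions in samples.items():
        dirs = [d for d, _ in contributions]
        weights = [w for _, w in contributions]
        pmpm[location] = weighted_direction_sum(dirs, weights)
    return pmpm
```

      In this toy form, overlapping layer coverage simply contributes several weighted vectors at the same location, so opposing directions with equal weights cancel. Whether the actual method treats overlaps this way is exactly the point the review asks the authors to clarify.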

      Although the authors explore some features of connectivity from LPT to downstream partners (Figure 6), there is a lack of reconciliation of these findings with individual LPT properties explored earlier in the study, such as those presented in Figures 2-4. In that sense, there is a disconnect between the two parts of the manuscript (and a missed opportunity). For instance, an important follow-up analysis would be to use knowledge about LPT-LPT connectivity to better predict effective PMPMs of LPTs taking into account network effects. This extension would lead to a better understanding of how LPT-LPT interactions shape optic flow responses in the LOP. In addition, in Figure 6 Supplement 2 (which I recommend moving to the main figures), the authors show that LPTs can be grouped together based on similarity of output connectivity (Panel B-D) and that this structure corresponds to output synapses located in different groups of central brain neuropils. However, they do not attempt to explicitly link these groupings with individual LPT PMPMs, alignment to global optic flow patterns, LPT layer innervation, cell morphologies, and input connectivity patterns. Such an analysis would be an important step to bring the manuscript together and to get a better understanding of the organization of the whole system.

    4. Reviewer #3 (Public Review):

      Summary:

      The fruit fly visual system has provided a powerful context in which to investigate fundamental questions in neural development, phototransduction, and systems neuroscience. Of recent interest is motion processing, particularly how visual motion cues are estimated locally, and then pooled to derive behaviorally meaningful signals. Many of these pooling operations have been shown to take place in the wide-field neurons in the lobula plate, cell types that have been explored using electrophysiological recordings for more than 50 years in a variety of Diptera. However, our understanding of the diversity and connectivity of these cells remains incomplete, and is of interest to many.

      In this context, Reiser and colleagues describe the anatomy and connectivity of the complete set of Lobula Plate Tangential neurons in Drosophila, using a careful and systematic reconstruction of the FAFB dataset. Leveraging a previous study of retinal geometry, combined with their characterization of the anatomical inputs to the elementary motion detectors, T4 and T5, they then predict the motion sensitivities of each cell, their neurotransmitter identities, and map the connections of many of these cells into the central brain and contralateral optic lobe.

      Strengths:

      The quality of the connectomic analysis is exceptional, and the quantitative analysis that links connectivity to function is rigorous and impressive. This paper will be an important resource for the community.

      Weaknesses:

      Some of the findings could be better linked to previously published work in this field, and there may be a minor limitation to the predicted optimal motion axes, given one of the simplifying assumptions made.

    1. eLife assessment

      This study presents new data highlighting the importance of appropriate coenzyme A handling in the mitochondria for maintaining appropriate energy production capacity. Several findings regarding the role of a key metabolic enzyme in how skeletal muscle cells use different substrates for energy production are valuable and supported by solid evidence, but there are concerns whether the data support the conclusion that ACOT2 regulates mitochondrial matrix acyl-CoA levels in white skeletal muscle to facilitate fatty acid β-oxidation.

    2. Reviewer #1 (Public Review):

      This study examined whether mitochondrial acyl-CoA thioesterase-2 (ACOT2) regulates mitochondrial matrix acyl-CoA levels. Acot2 deletion in murine skeletal muscle (SM) resulted in acyl-CoA build-up. When energy demand and pyruvate availability were elevated, a lack of ACOT2 activity promoted glucose oxidation. This preference for glucose over fatty acid oxidation was recapitulated in C2C12 myotubes with acute depletion of Acot2. In mice fed a high-fat diet, ACOT2 enabled the accretion of acyl-CoAs and ceramide derivatives in glycolytic SM, and this was associated with worse glucose homeostasis compared to when ACOT2 was absent. The authors suggest that ACOT2 supports CoASH availability to facilitate β-oxidation in glycolytic SM when lipid supply is modest. However, when lipid supply is high, ACOT2 enables acyl-CoA and lipid accumulation, CoASH sequestration, and poor glucose homeostasis. Thus, ACOT2 regulates matrix acyl-CoA concentration in glycolytic muscle, and its impact depends on lipid supply.

      Based on the data provided in this study, the authors propose that ACOT2 regulates mitochondrial matrix acyl-CoA levels in white skeletal muscle to facilitate fatty acid β-oxidation. However, I do not believe the data support this concept, since ACOT2 deletion actually increased fatty acid oxidation in the mitochondrial JO2 studies. In addition, there are some problems with the experimental data that the authors need to address. These include the experimental conditions used to assess JO2 in the mitochondria and the absence of Cre control mice.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript from Bekeova et al. entitled "Acyl-CoA thioesterase-2 facilitates β-oxidation in glycolytic skeletal muscle in a lipid supply dependent manner" examines whether loss of acyl-CoA thioesterase-2 (ACOT2) in the mitochondrial matrix of skeletal muscle alters mitochondrial fatty acid metabolism. The authors generate data demonstrating that under normal chow conditions, loss of ACOT2 increases mitochondrial respiration of long-chain fatty acid, but also increases susceptibility to the build-up of metabolic intermediates. However, during short-term high-fat feeding (7 days), mice with knockout of skeletal muscle ACOT2 had better glucose and insulin tolerance. Interestingly, skeletal muscle ACOT2 knockout mice on chow and high-fat diet utilized more glucose during the active (dark cycle) portion of the day. These data suggest that ACOT2 may be a potential therapeutic target to improve glucose homeostasis.

      Strengths:

      The use of creatine kinase Cre recombinase to specifically target striated muscle localizes the genetic manipulation, thus increasing the rigor of these experiments by limiting potential off-site changes in ACOT2 expression. Also, the assessment of mitochondrial respiration and its response to changes in energy demand via the creatine kinase clamp technique is a strength. These data provide a measurement of isolated mitochondrial respiration at physiologically relevant concentrations of ATP and ADP, while also allowing for assessment of how these mitochondria respond to changes in free energy (Fisher-Wellman et al. 2018). The indirect calorimetry data provide systemic physiological context to the striated muscle-specific genetic manipulation, while also allowing for the examination of how this change in skeletal muscle ACOT2 impacts systemic responses to different energy challenges. Finally, the extensive metabolomics, transcriptomics, and lipidomics analysis not only provides a wealth of data but is also used to further the authors' investigation of skeletal muscle ACOT2 activity in mitochondrial fatty acid oxidation and glucose homeostasis.

      Weaknesses:

      Several general confounding factors exist in the experimental design that could potentially impact the interpretation of the observed outcomes. First, all mice were housed at a temperature (22 °C) below the thermoneutral zone, which has been well described by many investigators to result in dramatically increased energy expenditure. Changes in total and resting energy expenditure could alter the skeletal muscle and systemic utilization of lipids, response to high-fat diet, and glucose homeostasis. Second, no dietary control was used in these experiments. While this did not impact outcomes when the diets were not compared, once the authors began to compare normal chow to high-fat diet, numerous differences in the composition of these diets could impact the outcomes. Third, the extended food withdrawal before the glucose and insulin tolerance tests puts the mouse in a state of extreme energy stress more akin to starvation than fasting, which can negatively impact outcomes (Ayala et al. 2010, Virtue & Vidal-Puig 2021). Fourth, the use of the Seahorse platform for the assessment of respiration of isolated mitochondria is highly debatable (Schmidt et al. 2021), particularly when the investigators also used high-resolution respirometry specifically designed for the purpose of measuring isolated mitochondrial oxygen consumption. Importantly, the use of the Seahorse platform to assess cellular respiration in this investigation is quite appropriate. Finally, while the authors present data demonstrating that ACOT2 expression is highest in Type I fibers compared to the various Type II fiber types, a large number of the experiments are performed in a muscle that is primarily composed of Type II fibers. The authors briefly acknowledge this limitation, but it is important for the reader to keep it in mind when considering how these findings would translate to humans.

      Impact:

      The authors have generated data that implicates skeletal muscle mitochondrial coenzyme A handling as a therapeutic target in the improvement of glucose homeostasis. While the exact role of increased tissue lipid burden on insulin action, glucose uptake, and substrate metabolism is still debated, the association between increased tissue lipid and impaired tissue- and systemic glucose handling is very strong. The data herein suggest that ACOT2 represents a pharmaceutical target to improve systemic glucose homeostasis in the population with obesity.

    4. Reviewer #3 (Public Review):

      Cells can oxidize diverse substrates in the mitochondria to sustain cellular energy metabolism. However, all of these substrates require covalent thioester linkage to coenzyme A (CoA). Thus, multiple energy metabolism substrates could potentially compete for a limited pool of mitochondrial CoA. Cells encode a set of mitochondrial acyl-CoA thioesterases (ACOTs) that free CoA up by removing attached substrates. The authors hypothesized that ACOT2, a mitochondrial ACOT with a preference for long-chain acyl-CoA substrates that arise during the oxidation of lipids as a fuel source, could regulate the balance of substrates used in the mitochondria by reducing the oxidation of lipids by removing them from CoA and freeing the mitochondrial pool of CoA for use by other substrates.

      To test this hypothesis, the authors generated mice with loss of ACOT2 in the skeletal muscle, where it is most highly expressed, and assayed the CoA composition of muscle and glucose/fatty acid catabolism in mice that were challenged with different diets, fasting, or exercise to expose the muscle to different substrate conditions. These experiments were complemented with biochemical analysis of mitochondria isolated from the muscle of control and ACOT2-deficient animals, exposed to a variety of substrates and challenged with different simulated energy demands.

      On the basis of these convincing experiments, the authors argue that loss of ACOT2, both in vivo and in vitro, interestingly increases glucose oxidation while not increasing oxidation of lipids. This is particularly surprising, as the CoA competition model would predict that ACOT2 loss would increase lipid oxidation while hindering glucose oxidation. The authors argue that ACOT2 facilitates lipid oxidation because its reversal of lipid ligation to CoA prevents the feedback inhibition of the lipid oxidation pathway that occurs when lipid supply outstrips the capacity of the pathway to metabolize the lipids. These findings will be valuable for the field of metabolism, providing insight into how ACOTs regulate substrate catabolism in cells and tissues.

    1. eLife assessment

      This manuscript presented convincing single-cell transcriptomic data of hematopoietic cells and immunocytes in zebrafish kidney marrow and showed that these cells have distinctive responses to viral infection. The findings in this study suggest that zebrafish kidney is a secondary lymphatic organ and hematopoietic stem cells in zebrafish may exhibit trained immunity. This represents a valuable discovery of the unique features of the fish immune system.

    2. Reviewer #1 (Public Review):

      Hu et al. performed sc-RNA-seq analyses of kidney cells from pooled adult zebrafish with or without virus infection, vaccination, and vaccination plus virus infection. They compared these experimental groups with one another, as well as kidney vs spleen. Their analyses identified expected populations but also revealed new hematopoietic stem/progenitor cell (HSPC) populations, even in the spleen. Their analyses show that HSPCs in the kidney can respond to virus infection differentially and can be trained to recognize the same infection, and the authors argue that the zebrafish kidney can serve as a secondary immune organ. The findings are important and interesting. The manuscript is well written and a pleasure to read. However, there are several issues with figure presentation and quality, as well as a lack of clarity in some of the figure legends. Some of the data presentation can be improved for better clarity. It is also important to outline what is conserved and what is unique to fish.

      Major concerns:

      1. The visualization for several figure panels is very poor. Please provide high-resolution images and larger font sizes for gene lists and X- and Y-axis labels. This includes Figure 1B; Figure 1-figure supplement 2; Figures 2B-2C, 3A-3D, 4F, 5B, and 6G; Figure 6-figure supplement 1B; Figure 6-figure supplement 2; Figures 7B and 8C-8E; Figure 8-figure supplement 1; Figures 10F and 10G-10J; and Figure 10-figure supplement 1.

      2. What are the figures at the end of the manuscript without any figure legends?

      3. It would be better to use a table to organize the gene signatures that define each unique population of immune cells such as T, B, NK, etc.

      4. What are the similarities in HSPC and immune cell populations between fish and man based on this research? A table comparing and discussing these would be helpful.

      5. It is highly likely that sex and age are sources of biological variation in how HSPCs respond to virus infection and vaccination. The authors should clearly state the sex and age of the fish in their samples and discuss their results taking these variations into consideration.

      6. The authors claim that the spleen and kidney share HSPCs. However, their data do not demonstrate this result clearly in Figure 4A. Perhaps they should use different colors to make the overlap more obvious, or include a table to show which HSPCs are shared between the kidney and spleen? Can they rule out that these are simply HSPCs seeding the spleen to differentiate into B cells or other immune cells?

    3. Reviewer #2 (Public Review):

      In this manuscript, the authors have meticulously constructed a comprehensive atlas delineating hematopoietic stem/progenitor cell (HSPC) and immune-cell types within the zebrafish kidney, employing single-cell transcriptome profiling analysis. Notably, these cell populations exhibited distinctive responses to viral infection. Intriguingly, the investigation revealed that HSPCs manifest positive reactivities to viral infection, indicating the effective induction of trained immunity in select HSPCs. Furthermore, the study unveiled the capacity for the generation of antigen-stimulated adaptive immunity within the kidney, suggesting a role for the zebrafish kidney as a secondary lymphoid organ. This research elucidates the distinctive features of the fish immune system and underscores the multifaceted biology of the kidney in ancient vertebrates.

    1. eLife assessment

      This useful work provides insight into agonist binding to nicotinic acetylcholine receptors, which is the stimulus for channel activation that regulates muscle contraction at the neuromuscular junction. The authors use in silico methods to explore the transient conformational change from a low to high affinity agonist-bound conformation as occurs during channel opening, but for which structural information is lacking owing to its transient nature. The evidence supporting the main conclusion that ligands flip ~180 degrees in the binding site as it transitions from a low to high affinity bound conformation is incomplete because little support is available for the starting low affinity docked conformations, and the rather approximate methods for computing binding free energies differ significantly from experimental measures for two of the four tested ligands. Nonetheless, this work presents an intriguing possibility for the nature of a transient conformational change at the agonist binding site correlated with channel opening. If the ligand flip observed in these simulations can be reproduced or verified by other studies, then this work would stand as a significant advance in our knowledge of nicotinic receptor gating.

    2. Reviewer #1 (Public Review):

      Summary:

      This useful work provides insight into agonist binding to muscle nicotinic receptors. The authors want to understand the fundamental steps in ligand binding to muscle nicotinic receptors using computational methods. The study builds on a large basis of empirical studies of the various states involved in receptor activation. However, the evidence supporting the conclusions is incomplete, because little support is available for the starting structures that are derived from ligand docking. This work is a useful starting point for more detailed work on ligand binding to this important class of receptors.

      Strengths:

      The strengths include the number of ligands tried, and the relation to the mature analysis of the receptor function.

      Weaknesses:

      The weaknesses are the brevity of the simulations, the concomitant lack of scope of the simulations, the lack of depth in the analysis, and the incomplete relation to other relevant work.

    3. Reviewer #2 (Public Review):

      Summary:

      The aim of this manuscript is to use molecular dynamics (MD) simulations to describe the conformational changes of the neurotransmitter binding site of a nicotinic receptor. The study uses a simplified model including the alpha-delta subunit interface of the extracellular domain of the channel and describes the binding of four agonists to observe conformational changes during the weak-to-strong affinity transition.

      Strength:

      The 200 ns-long simulations of this model suggest that the agonist rotates about its centre in a 'flip' motion, while loop C 'flops' to restructure the site. The changes appear to be reproduced across simulations and different ligands and are thus a strong point of the study.

      Weaknesses:

      After carrying out all-atom molecular dynamics, the authors revert to a model of binding using continuum Poisson-Boltzmann, surface area, and vibrational entropy. The motivations for, and limitations of, this approximate model for the thermodynamics of binding could be provided, given that modern atomistic MD free energy methods (which would fully incorporate configurational sampling of the protein, ligand, and solvent) were not used. Despite this, the authors report a correlation between their free energy estimates and those inferred from experiment. This did, however, reveal shortcomings for two of the agonists. The authors mention their trouble getting correlation to experiment for Ebt and Ebx and refer to up to 130% errors in free energy. But this is far worse than a simple proportional error, because -24 vs. -10 kcal/mol is a massive overestimation of free energy, as would be evident if the authors were to instead express results in terms of KD values (which would have an error exceeding a billion-fold). The MD analysis could be improved with better measures of convergence, as well as a more careful discussion of free energy maps as a function of identified principal components, as described below. Overall, however, the study has provided useful observations and interpretations of agonist binding that will help understand pentameric ligand-gated ion channel activation.
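
      The size of the KD discrepancy mentioned above can be checked with a short calculation. The sketch below assumes the standard relation KD = exp(ΔG/RT) and a temperature of 298.15 K (an assumption, since the review does not state one); the function name `kd_fold_error` is hypothetical and the -24 and -10 kcal/mol values are the ones quoted in the review.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def kd_fold_error(dg_calc, dg_exp, temp_k=298.15):
    """Fold error in K_D implied by a binding free energy mismatch (kcal/mol).

    Uses K_D = exp(dG / RT), so the ratio of two K_D values depends only on
    the difference between the two free energies.
    """
    return math.exp(abs(dg_calc - dg_exp) / (R_KCAL * temp_k))

# Values quoted in the review: calculated -24 vs. experimental -10 kcal/mol.
fold = kd_fold_error(-24.0, -10.0)
print(f"K_D fold error: {fold:.2e}")
```

      Under these assumptions, a 14 kcal/mol mismatch corresponds to a KD ratio above 10^9, consistent with the reviewer's "billion-fold" remark.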

      Main points:

      Regarding the choice of model, some further justification of the reduced two-subunit ECD-only model could be given. On page 5 the authors argue that, because binding free energies are independent of energy changes outside the binding pocket, they could remove the TMD and study only an ECD subunit dimer. While the assumption of distant interactions being small seems somewhat reasonable, provided conformational changes are limited and localised, how do we know the packing of TMD onto the ECD does not alter the ability of the alpha-delta interface to rearrange during weak or strong binding? They further write that "fluctuations observed at the base of the ECD were anticipated because the TMD that offers stability here was absent.". As the TMD-ECD interface is the "gating interface" that is reshaped by agonist binding, surely the TMD-ECD interface structure must affect binding. It seems a little dangerous to completely separate the agonist binding and gating infrastructure, based on some assumption of independence. Given the model was only the alpha and delta subunits and not the pentamer with TMD, I am surprised such a model was stable without some heavy restraints. The authors state that "as a further control we carried out MD simulation of a pentamer docked with ACh and found similar structural changes at the binding pocket compared to the dimer." Is this sufficient proof of the accuracy of the simplified model? How similar was the model itself with and without agonist in terms of overall RMSD and RMSD for the subunit interface and the agonist binding site, as well as the free energy of binding to each model to compare?

      Although the authors repeatedly state that they have good convergence with their MD, I believe the analysis could be improved to convince us. On page 8 the authors write that the RMSD of the system converged in under 200 ns of MD. However, I note that the graph is of the entire ECD dimer, not a measure for the local binding site region. An additional RMSD of local binding site would be much more telling. You could have a structural isomerisation in the site and not even notice it in the existing graph. On page 9 the authors write that the RMSF in Figure S2 showed instability mainly in loops C and F around the pocket. Given this flexibility at the alpha-delta interface, this is why collecting those regions into one group for the calculation of RMSD convergence analysis would have been useful. They then state "the final MD configuration (with CCh) was well-aligned with the CCh-bound cryo-EM desensitized structure (7QL6)... further demonstrating that the simulation had converged." That may suggest a change occurred that is in common with the global minimum seen in cryo EM, which is good, but does not prove the MD has "converged". I would also rename Figure S3 accordingly.
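
      The point about global versus local RMSD can be illustrated with a toy calculation. This is a minimal sketch, not the authors' analysis: it assumes frames are already superposed on the reference, uses made-up coordinates, and the helper name `rmsd` and the choice of "binding-site" atom indices are hypothetical.

```python
import math

def rmsd(frame, ref, indices):
    """RMSD over the selected atom indices; frames are assumed pre-aligned."""
    sq = sum(
        (frame[i][k] - ref[i][k]) ** 2
        for i in indices
        for k in range(3)
    )
    return math.sqrt(sq / len(indices))

# Toy system: 10 atoms at the origin in the reference structure.
ref = [(0.0, 0.0, 0.0)] * 10
# In the frame, only atoms 0 and 1 (the hypothetical binding site) move by 3 A.
frame = [(3.0, 0.0, 0.0)] * 2 + [(0.0, 0.0, 0.0)] * 8

global_rmsd = rmsd(frame, ref, range(10))
site_rmsd = rmsd(frame, ref, [0, 1])
print(f"global: {global_rmsd:.2f} A, binding site: {site_rmsd:.2f} A")
```

      Because the global value averages over many unchanged atoms, a rearrangement confined to a few residues is diluted, which is why a separate binding-site RMSD trace is the more telling convergence check.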

      The authors draw conclusions about the dominant states and pathways from their PCA component free energy projections that need clarification. It is important first to show data demonstrating that the two PCA components chosen were dominant and accounted for most of the variance, and then, when mapping free energy as a function of those two components, to show that the maps are sufficiently converged to be interpreted. Moreover, if the free energies themselves cannot be used to measure state stability (as seems to be the case), the limitations should be carefully explained. First, was PCA done on all MD trajectories combined to find a common PC1 & PC2, or was it done separately on each simulation? If the latter, how similar are the resulting components? The authors write "the first two principal components (PC-1 and PC-2) that capture the most pronounced Cα displacements". How much of the total variance did these two components capture? The authors write that the changes mostly concern loop C and loop F, but which data prove this? For example, a plot of PC1 and PC2 over residue number might help.

      The authors map -kT ln ρ as a free energy for each simulation as a function of PC1 & PC2. It is important to reveal how well that PC1-2 space was sampled, and how those maps converged over time. The shapes of the maps and the relative depths of the wells look very different for each agonist. If the maps were sampled well and converged, the free energies themselves would tell us the stabilities of each state. Instead, the authors do not even mention this and talk about "variance" being the indicator of stability, stating that m3 is most stable in all cases. While I can believe 200 ns could not converge a PC1-2 map and that meaningful delta G values might not be obtained from them, the issue of lack of sampling must be dealt with. On page 12 they write "Although the bottom of the well for 3 energy minima from PCA represent the most stable overall conformation of the protein, they do not convey direct information regarding agonist stability or orientation". The reasons why not must be explained, as the minima should convey just that if the two order parameters PC1 and PC2 captured the slowest degrees of freedom for binding and sampling was sufficient. The authors write that "For all agonists and trajectories, m3 had the least variance (was most stable), again supporting convergence by 200 ns." Again, the issue of actual free energy values in the maps needs to be dealt with. The probabilities expressed as -kT ln ρ in kcal/mol might suggest that m2 is the most stable. Instead, the authors base stability only on variance (I guess breadth of the well?), where m3 may be more localised in the chosen PC space, despite apparently having less preference during the MD (not the lowest free energy in the maps).
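
      For readers unfamiliar with this kind of map, the sketch below shows how a free energy surface is commonly estimated from sampled (PC1, PC2) values as G = -kT ln ρ, with ρ the fraction of frames in each bin. It is an illustration under stated assumptions, not the authors' pipeline: the temperature (about 298 K), the bin width, the function name `free_energy_map`, and the toy data are all assumptions.

```python
import math
from collections import Counter

KT_KCAL = 0.593  # kT in kcal/mol near 298 K (assumed temperature)

def free_energy_map(pc1, pc2, bin_width=0.5):
    """Return {(i, j): G} with G = -kT ln(rho), shifted so the minimum is 0."""
    counts = Counter(
        (math.floor(x / bin_width), math.floor(y / bin_width))
        for x, y in zip(pc1, pc2)
    )
    total = sum(counts.values())
    g = {cell: -KT_KCAL * math.log(n / total) for cell, n in counts.items()}
    g_min = min(g.values())
    return {cell: value - g_min for cell, value in g.items()}

# Toy trajectory: 30 frames in one well, 10 frames in a second well.
pc1 = [0.1] * 30 + [2.1] * 10
pc2 = [0.1] * 40
g_map = free_energy_map(pc1, pc2)
for cell, value in sorted(g_map.items()):
    print(cell, round(value, 3))
```

      In this construction the relative well depth depends only on how many frames fall in each bin, so the most populated well is by definition the lowest in free energy. One simple convergence check, in the spirit of the reviewer's request, is to build the map separately from the first and second halves of each trajectory and compare the well depths.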

      The motivations and justifications for the use of approximate PBSA energetics instead of atomistic MD free energies should be dealt with in the manuscript, with limitations more clearly discussed. Rather than using modern all-atom MD free energy methods for relative or absolute binding free energies, the authors select clusters from their identified states and compute Poisson-Boltzmann estimates (electrostatic, vdW, surface area, vibrational entropy). I do not believe the following sentence begins to deal with the limitations of that method: "there are limitations with regard to MM-PBSA accurately predicting absolute binding free energies (Genheden & Ryde, 2015; Hou et al., 2011) that depends on the parameterization of the ligand (Oostenbrink et al., 2004)." What are the assumptions and limitations in taking continuum electrostatics (presumably with parameters for dielectric constants and their assignments to regions after discarding solvent), surface area (with its assumptions and limitations), and of course assuming vibration of a normal mode can capture entropy? On page 30, regarding their vibrational entropy estimate, they write that the "entropy term provides insights into the disorder within the system, as well as how this disorder changes during the binding process". It is important that the extent of disorder captured by the vibrational estimate be discussed, as it is not obvious that it has captured entropy involving multiple minima on the system's true 3N-dimensional energy surface, and especially the contribution from solvent disorder in bound vs. dissociated states.

      As discussed above, errors in the free energy estimates need to be more faithfully represented, as fractional errors are not meaningful. On page 21 the authors write "The match improved when free energy ratios rather than absolute values were compared." But a ratio of free energies is not a typical or expected measure of error in delta G. They also write "For ACh and CCh, there is good agreement between ΔGm1 and GLA and between ΔGm3 and GHA. For these agonists, in silico values overestimated experimental ones only by ~8% and ~25%. The agreement was not as good for the other 2 agonists, as calculated values overestimated experimental ones by ~45% (Ebt) and ~130% (Ebx). However, the fractional overestimation was approximately the same for GLA and GHA." See the above comment on how this may misrepresent the error. On page 21 they write, in relation to their large fractional errors, that they "do not know the origin of this factor but speculate that it could be caused by errors in ligand parameterization". However, the estimates from the PBSA approach are, by design, only approximate. Both errors in parameterisation (and their likely origin) and the approximate model used need discussion.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors use docking and molecular dynamics (MD) simulations to investigate transient conformations that are otherwise difficult to resolve experimentally. The docking and simulations suggest an interesting series of events whereby agonists initially bind to the low-affinity site and then flip 180 degrees as the site contracts to its high-affinity conformation. This work will be of interest to the ion channel community and to biophysical studies of pentameric ligand-gated channels.

      Strengths:

      I find the premise for the simulations to be good, starting with an antagonist-bound structure as an estimate of the low affinity binding site conformation, then docking agonists into the site and using MD to allow the site to relax to a higher affinity conformation that is similar to structures in complex with agonists. I cannot speak to the details of the simulation methods, but the predictions are interesting and provide a view into what a transient conformation that is difficult to observe experimentally might be like.

      Weaknesses:

      Although the match in simulated vs experimental energies for two ligands was very good, the calculated energies for two other ligands were significantly different than the experiment. It is unclear to what extent the choice of method for the energy calculations influenced the results.

      A control simulation, such as for an apo site, is lacking.

    5. Reviewer #4 (Public Review):

      Summary:

      In their manuscript "Conformational dynamics of a nicotinic receptor neurotransmitter binding site," Singh and colleagues present cogent molecular docking and dynamics simulations to explore the initial conformational changes associated with agonist binding in the muscle nicotinic acetylcholine receptor, aligned with the extensive experimental literature on this system. Their central findings are of a consistently preferred pose for agonists upon initial association with a resting channel, followed by a dramatic rotation of the ligand and contraction of a critical loop over the binding site. Principal component analysis also suggests the formation of an intermediate complex, not yet captured in structural studies. Binding free energy calculations are consistent with the evolution of a higher-affinity complex following agonist binding, with a ligand efficiency notably similar to experimental values. Snapshot comparisons provide a structural rationale for these changes on the basis of pocket volume, hydration, and rearrangement of key residues at the subunit interface.

      Strengths:

      Docking results are clearly presented and remarkably consistent. Simulations are produced in triplicate with each of four different agonists, providing an informative basis for internal validation. They identify an intriguing transition in ligand pose, not well documented in experimental structures, and potentially applicable to mechanistic or even pharmacological modeling of this and related receptor systems. The paper seems a notable example of integrating quantitative structure-function analysis with systematic computational modeling and simulations, likely applicable to the wider journal audience.

      Weaknesses:

      Timescales (200 ns) do not capture global rearrangements of the extracellular domain, let alone gating transitions of the channel pore, though this work may provide a launching point for more extended simulations. A more general concern is the reproducibility of the simulations, and how representative states are defined. It is not clear whether replicates were included in principal component analysis or subsequent binding energy calculations, nor how simulation intervals were associated with specific states. Structural analysis largely focuses on snapshots, with limited direct evidence of consistency across replicates or clusters. Figure legends and tables could be clarified.

    1. eLife assessment

      This manuscript presents important findings that inform the genetic underpinnings of the model plant Arabidopsis' resistance to turnip mosaic virus (TuMV). The strength of the evidence in the manuscript is exceptional, with very large sample sizes, careful controls, multiple follow-up experiments, and broadening to the evolutionary context. The evidence provides robust support for each of the manuscript's conclusions and could pave the way for functional studies.

    1. eLife assessment

      This is an important study on the role of phenotypic aging in cancer risk. It presents results that show that Phenotypic Age Acceleration (PhenoAgeAccel) can predict cancer incidence of different types and could be used with genetic risk to facilitate the identification of cancer-susceptible individuals. This article presents solid results that would be of broad interest to the research community and clinicians.

    2. Reviewer #1 (Public Review):

      Bian et al. showed that biomarker-informed PhenoAgeAccel was consistently related to an increased risk of site-specific cancer and overall cancer within and across genetic risk groups. The results showed that PhenoAgeAccel and genetic liability to multiple cancers serve as productive tools to facilitate the identification of cancer-susceptible individuals under an additive model. People with a high genetic risk for cancer may benefit from PhenoAgeAccel-informed interventions.

      As the authors pointed out, the large sample size, the prospective design of the UK Biobank study, and the effective application of PhenoAgeAccel in predicting the risk of overall cancer are the major strengths of the study. Meanwhile, the CPRS seems to be a solid and comprehensive score based on incidence-weighted site-specific polygenic risk scores across 20 well-powered GWAS for cancers.

      The association between PhenoAgeAccel and cancer risk is not very surprising, since PhenoAgeAccel was constructed as a predictor of mortality, to which cancer contributes substantially. Although cancer is an essential mediator of the association, sensitivity analyses using cancer-free mortality may provide an additional angle. It would be interesting to see to what extent PhenoAgeAccel could be reversed by environmental or lifestyle factors. A gene-by-environment (G by E) analysis for PhenoAgeAccel might be worth a try.

    3. Reviewer #2 (Public Review):

      Summary:

      Bian et al. calculated Phenotypic Age Acceleration (PhenoAgeAccel) via a linear model regressing Phenotypic Age on chronological age. They examined the associations between PhenoAgeAccel and cancer incidence using 374,463 individuals from the UK Biobank and found that older PhenoAge was consistently related to an increased risk of incident cancer, even among each risk group defined by genetics.
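
      The definition of PhenoAgeAccel summarized above (the residual from regressing Phenotypic Age on chronological age) can be sketched in a few lines. This is an illustration only, assuming ordinary least squares with a single predictor; the function name `age_acceleration` and the four toy data points are hypothetical and do not come from the study.

```python
def age_acceleration(pheno_age, chron_age):
    """Residuals of an ordinary least-squares fit of pheno_age on chron_age.

    A positive residual means the individual is phenotypically older than
    expected for their chronological age.
    """
    n = len(chron_age)
    mean_x = sum(chron_age) / n
    mean_y = sum(pheno_age) / n
    sxx = sum((x - mean_x) ** 2 for x in chron_age)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(chron_age, pheno_age))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(chron_age, pheno_age)]

# Toy data (hypothetical): chronological and phenotypic ages in years.
chron = [40.0, 50.0, 60.0, 70.0]
pheno = [42.0, 49.0, 63.0, 70.0]
accel = age_acceleration(pheno, chron)
print([round(a, 2) for a in accel])
```

      By construction the residuals average to zero and are uncorrelated with chronological age, which is what allows the measure to be read as ageing faster or slower than one's peers.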

      Strengths:

      The study is well designed and uses a large sample from the UK Biobank.

      Weaknesses:

      Since the UK Biobank has a large sample size, it should have enough power to split the dataset into discovery and validation sets. Why did the authors use 10-fold cross-validation instead of splitting the dataset?

    1. eLife assessment

      This valuable study addresses the question of whether pupil size can serve as a sensitive physiological marker of information anticipation in human infants. The authors present solid experimental findings indicating that pupil size differs depending on the expected information content of a visual signal and that this effect might rapidly generalize to new visual information. The results could be further strengthened by additional eye movement processing and statistical analyses to rule out confounding effects of saccades and other artifacts as well as a stronger and more consistent rationale for excluding data.

    2. Reviewer #1 (Public Review):

      This study investigates the underlying mechanisms of information-seeking in infancy. Eight-month-old Dutch infants were tested in a screen-based eye-tracking task in which one of two geometrical shape cues (differing in their shape and motion) either announced the location of an upcoming reward cartoon (informative) or not (non-informative). The authors measured the infants' pupil size before the cartoon appeared. Infants showed smaller pupil sizes when presented with the informative cue as compared to the noninformative cue. The decrease in pupil size in the informative condition emerged over the course of trials whereas infants' pupil size remained unchanged in the noninformative condition. The authors interpret their findings as supportive evidence of statistical learning and generalization processes organizing infants' information-seeking.

      It was a pleasure to read the paper and I think the study makes a valuable contribution to our understanding of information-seeking in infancy. The manuscript is very well written and the study is cleverly designed. My following comments are based on my reading of the manuscript and the supplemental materials. It should be noted that evaluating the details of the statistical procedure the authors used lies outside my expertise. The same applies to some decisions of the authors related to pre-processing and filtering the pupil data. I very much appreciate that the authors shared all their raw data and analysis scripts openly accessible on the Open Science Framework. The study was unfortunately not preregistered, making it difficult to trace when in the study process certain decisions or assumptions were made.

      My two main concerns relate to the conceptualization and definition of information-seeking and the proposed speed of the mechanisms explaining infants' behavior. I outline my general comments below before listing some more concrete issues.

      1) While reading the manuscript, I was sometimes confused about what the authors refer to when talking about information-seeking - both in terms of the broader conceptualization of the phenomenon as well as when referring to their own study. What information are infants seeking? The informative value of the cue shape in terms of their motion (because it carries information about the location of a rewarding animation)? Or is the target (the rewarding video) the information being sought? From how the study is set up, I assume the authors refer mainly to the first aspect, but I think the manuscript would benefit from some clearer distinctions and definitions of terms.

      More specifically, I think it could help if the authors would specify the different aspects involved in information-seeking in the introduction (e.g., seeking information "directly", seeking cues guiding them towards information, etc.). Secondly, it would help if they would sharpen their (already in some parts existing) definitions for their study and then keep consistent with their definitions throughout the methods, results, and discussion. Is the cue the information being sought or the "behavior" (motion) of the cue? Or is the target animation the information being sought and guided via the cueing?

      2) Speed of the generalization process:<br /> From my understanding of the study design, the shape of the geometrical cue gains informative value over time (serving as an informative cue) and the *motion* of the shape is the actual informative or non-informative visual cue in that it either reliably highlights the actual target region (or all regions). In the generalization trials, only the shape was manipulated while the motion aspect remained consistent with the previous trials. Based on infants' behavior across learning and generalization trials, the authors make an argument about two distinct processes taking place: a slower process that allows infants to learn where to find information, and a faster generalization process. Apologies if I missed something, but given that the motion remains consistent, it is perhaps not surprising that the generalization trials are "faster"? Maybe the generalization process would have been slower if not only the shape had changed but a novel informative motion had also been introduced. Also, it would be helpful if the authors could clarify what they mean by the statistical learning process being more "data-hungry" (line 274).

      3) I would find it very helpful if the authors would distinguish statistical learning and information-seeking processes from other possible mechanisms such as reward learning. For example, the authors use a "rewarding" (not informative) stimulus as the target; wouldn't it be possible that the results can also be explained by reinforcement learning processes? Relatedly, in line 396 they write that they used TD learning to predict whether "information will be delivered" and contrast this with the approach being used to predict whether a reward will be delivered. But in their study reward was being delivered, too (in the form of the target), in addition to the informative motion of the cue.

    3. Reviewer #2 (Public Review):

      Summary<br /> The study used eye tracking with a focus on pupillometry to examine how infants can learn to distinguish between informative and uninformative visual cues. Infants (n = 30, mean age = 8.2 months) viewed displays consisting of a sequence of stimuli: a fixation point, a central cue that predicted a subsequent informative or uninformative signal, the signal itself, and the target event (a cartoon animal, referred to as the reward). The key results are that: (1) pupil size differs depending on whether the infants anticipated an informative or uninformative signal, (2) this difference develops across trials, consistent with a slow learning process, and (3) there is rapid generalization when new shapes were introduced that shared features with the informative vs uninformative cues. The study complements a rich literature, including from this same group, showing that children are sensitive to information gains, and is interesting and important in revealing that pupil size is a physiological marker of information anticipation. We have several comments and concerns and believe that addressing them would substantially strengthen the manuscript.

      Major points are related to interpretation, statistical robustness, and clarity

      1. There is a tendency to overinterpret the findings.<br /> a. Throughout, the authors interpret the findings as meaning that pupil size tracks the "value" of information; however, the results do not demonstrate conclusively whether, or what kind of, value information has in this task. A natural hypothesis is that infants are intrinsically motivated to predict - i.e., value the ability to predict the target event as early as possible. In a supplementary figure, the authors present evidence that infants indeed fixate on the target event sooner after seeing informative vs uninformative cues, consistent with the idea that they use the information for improving predictions. However, those results are not fully convincing, as we detail in point 2. Most importantly, the analysis is not integrated or even mentioned in the main analysis. Making the link between the pupil reaction and the use of the information would greatly strengthen the paper (whether or not the supplementary findings hold up to more thorough scrutiny). Either this link should be made and discussed, or the authors should soften their conclusions about the utility of the informative cues.

      b. On line 236, the text states that the evidence "...supports the growing body of evidence indicating that infants are proactive in shaping their learning environment by searching for and focusing on information-rich stimuli". The results do not show that the infants search for information, only that they have a pupil reaction that differentiates between informative and uninformative stimuli.

      c. On lines 248-249, it seems a stretch to relate the changes in pupil dilation to a shift in information value onto the cue. Without some other measure (e.g., EEG), this remains speculative. While I believe the suggestion is plausible, the language should be softened to highlight this as a follow-up research question that the present research cannot directly speak to.

      2. Several findings are statistically weak and several analyses are insufficiently controlled.

      a. The analysis in Supplementary Figure 2, which shows that the latencies of target fixations are shorter after informative vs uninformative cues, raises several questions.<br /> i. We were unable to fully test these analyses as the OSF project seems to only contain latency data for 33 participants (including 22 of the 30 that remain in the final sample).<br /> ii. The results are described as revealing a significant difference, but the 89% confidence interval of the difference contains 0. How did the authors establish significance here?<br /> iii. How do the authors distinguish incidental fixations (which just happened to land near the target) from true predictive gaze shifts? Fixations were pooled if they occurred from 1.25 seconds before to 1 second after target onset. This is sufficient time for the eye to move in and out of the window several times. The authors should analyse the distributions of fixation durations to rule out various artifacts unrelated to target prediction.<br /> iv. Latencies to fixation were standardized, bringing the mean across each participant to 0, and yet the statistical model includes a random intercept; is there a justification for this?<br /> v. Standardizing removes information about whether fixations were proactive or reactive. It would be very interesting to see if/how information affects these two differently.<br /> vi. Since informativeness was learned across trials, it seems desirable that the model should include as random effects a trial number and an interaction between trial number and informativeness. This would allow a comparison between learning to predict and the pupil reaction. Are infants who have a stronger (or earlier) pupil reaction also more likely to show stronger learning to anticipate?

      b. The main finding that pupil size differs between informative and uninformative cues is based on a 3-second analysis window. This long window most likely spans many saccades, which can affect pupil size on its own or by bringing the eye on or off visual stimuli. There is no analysis to show that the statistics of saccades or fixation locations are equivalent between the two trial types - but this is necessary to convincingly rule out a spurious artifact.

      c. The second main finding that the effect of informativeness grows across trials seems statistically weak. The text (line 138) states that the interaction had a beta of 0.002, which was equal to the lower border of the 89% HDI ([0.002, 0.003]). For the second claim that pupil size decreased across informative trials, the beta is -0.002, and the 89% HDI is non-existent - i.e., [-0.002, -0.002]. (In general, the authors should check their numbers more carefully and make sure they are presented with a degree of precision that allows the reader to interpret them meaningfully.)

      d. The analyses do not indicate how well the TD model fits; we are shown only that it fits better than a linear model. On line 177 a correlation analysis is mentioned between the data and model, but the statistic cited for this test on line 179 is a mean beta coefficient, so it is impossible to know what this means. An analysis of goodness of fit or, at the very least, a figure superimposing the model and data, would be much more convincing.
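
      Points 2a(iv) and 2a(v) above can be made concrete with a short sketch. It assumes the standardization was a within-participant z-score (the reviews describe it only as bringing each participant's mean to 0); the helper name `zscore_within` and the simulated latencies are hypothetical.

```python
import numpy as np

def zscore_within(latencies, participant_ids):
    """Z-score each latency against its own participant's mean and SD."""
    latencies = np.asarray(latencies, dtype=float)
    participant_ids = np.asarray(participant_ids)
    z = np.empty_like(latencies)
    for pid in np.unique(participant_ids):
        mask = participant_ids == pid
        vals = latencies[mask]
        z[mask] = (vals - vals.mean()) / vals.std(ddof=1)
    return z

# Simulated latencies in seconds relative to target onset (hypothetical data):
# 5 participants x 12 trials, each participant with their own baseline offset.
rng = np.random.default_rng(1)
ids = np.repeat(np.arange(5), 12)
baseline = rng.normal(0.0, 0.3, size=5)[ids]
lat = baseline + rng.normal(0.0, 0.2, size=ids.size)

z = zscore_within(lat, ids)
participant_means = np.array([z[ids == p].mean() for p in np.unique(ids)])
```

      After this transformation every participant's mean is zero, so a random intercept per participant has no between-participant variance left to absorb, and the sign of a latency relative to target onset (proactive vs reactive) is no longer recoverable from the standardized values.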

      3. The descriptions are very unclear in some key parts of the paper

      a. The description of the TD model applied to pupil learning (starting on line 391) is very unclear. The model has to include some measure of informativeness - i.e., the match between the cued and true target location - but it is unclear how this was formalized. It is also very unclear how time within the trial is incorporated (the meaning of the TDE equation).

      b. The description of the generalization analysis (Fig. 5) is also very unclear. Every single sentence in it evoked some confusion, so I will go through them one by one. "A Bayesian additive model showed that infants' pupil dilation was reduced for novel cues." Reduced relative to what? "This was specific to those novel cues that shared the features of the familiar informative cues (estimated mean difference = -0.05, 89%HDI = [-0.062, -0.038])." All the novel cues shared features with the informative cues; do the authors mean the novel cues that had the critical feature indicative of the informative cue? "The size of this effect approximated the difference between conditions that were observed for familiar stimuli (estimated mean difference = -0.067, 89% HDI = [-0.077, -0.057])." What is "this effect"? "Crucially, this difference was not observable at the start of the task, when the familiar stimuli were first introduced (estimated mean difference = -0.007, 89%HDI = [-0.015, 0.001])." At the start of the task, the stimuli were novel, and not familiar.
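
      To illustrate what point 3a asks the authors to spell out, here is a generic tabular TD(0) sketch in which time within the trial is represented as two steps (cue, then signal). It is not the authors' model, whose formalization the review describes as unclear. The payoff of 1 for an informative signal and 0 for an uninformative one, the learning rate `alpha`, the discount `gamma`, and the two-step trial structure are all assumptions made for the example.

```python
def td0_cue_values(n_trials=17, alpha=0.3, gamma=0.9):
    """Track the learned value of the cue step for each cue type across trials."""
    # One value per (cue type, within-trial step); all start at zero.
    values = {
        "informative": {"cue": 0.0, "signal": 0.0},
        "uninformative": {"cue": 0.0, "signal": 0.0},
    }
    # Assumed information payoff delivered at the signal step.
    outcome = {"informative": 1.0, "uninformative": 0.0}
    history = {"informative": [], "uninformative": []}
    for _ in range(n_trials):
        for cue_type, v in values.items():
            # Cue step: no payoff yet, so bootstrap from the signal step's value.
            delta_cue = 0.0 + gamma * v["signal"] - v["cue"]
            v["cue"] += alpha * delta_cue
            # Signal step: payoff delivered, trial ends (terminal value is 0).
            delta_signal = outcome[cue_type] - v["signal"]
            v["signal"] += alpha * delta_signal
            history[cue_type].append(v["cue"])
    return history

history = td0_cue_values()
```

      In this toy version the value attached to the informative cue stays at zero on the first trial, then rises across trials toward `gamma`, while the uninformative cue's value never moves. A curve of this kind is what would be superimposed on the pupil data in the goodness-of-fit figure requested in point 2d.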

    4. Reviewer #3 (Public Review):

      Summary:<br /> The study attempts to shed light on the mechanisms underlying information-seeking in infants by investigating whether infants distinguish between informative and uninformative stimuli to resourcefully allocate their attention. The authors show that 8-month-old infants can learn whether a visual stimulus is informative or uninformative about the location of a later-appearing rewarding stimulus by employing statistical regularities from the input. Specifically, infants showed decreased pupil dilation for informative over uninformative cues, which developed over the course of trials as more and more information was gathered from the input. The pattern of learning was in line with a reinforcement learning model which employed a steep learning curve in the beginning followed by a more shallow but steady learning growth over trials. After 17 trials, the authors presented novel cues that shared certain visual features with the previous stimuli and showed that pupil dilation was reduced for novel cues that shared features with the previous informative stimuli, suggesting that infants were able to generalize their acquired knowledge about the informativeness of certain features to novel stimuli. The present study adds to the existing literature about the underlying mechanisms of learning by showing that infants can predict not only an upcoming stimulus based on statistical regularities of a preceding cue but also the informativeness of the cue itself.

      Strengths:<br /> The authors use a suitable method to test the highly relevant question of whether and how infants infer the informativeness of stimuli from experience and whether they can generalize this knowledge to new stimuli. Their experiment is carefully designed and well controlled with conditions closely matched (e.g., the shape and color of objects and the structure of each trial). Their measure of interest (i.e., pupil dilation) is also examined at a time point in each trial when the conditions are the most similar, which further points to a thought-through and careful design. This empirical data is backed up with a computational approach (using a Bayesian model and training a reinforcement learning algorithm) to elucidate the learning mechanisms at play. This approach is explained concisely to readers not familiar with the models.

      The results are convincing, showing a clear difference between the informative and uninformative conditions and its development over trials. Specifically, this difference is not apparent in the first trial (Fig. 2c) but develops over time, which supports a learning trajectory. The data support the authors' conclusion that infants learn about the informativeness of the object cue from the input, and the employed learning algorithms give further insights into the learning trajectory of the infants. Overall, the statistical analyses seem solid and the priors for the Bayesian models are well reported.

      Data and scripts are openly available, fostering transparency.

      Overall, the manuscript is very well and concisely written.

      Weaknesses:<br /> The authors' conclusion that infants can generalize the acquired knowledge to similar but novel stimuli is weakened by methodological concerns regarding the analysis. It is not fully clear which trials the authors excluded and analyzed as they do not consistently report the trials in the manuscript (e.g., it is stated that after trial 17 the first generalization trial started, but also that trial 17 was excluded as the first trial of the generalization phase). As there are only a few novel trials and novel and familiar trials alternated, the inclusion or exclusion of individual trials from the analyses might have a significant impact on the results. Thus, this needs further clarification. The authors also mentioned that the novel stimuli shared relevant as well as irrelevant features, but it was not clear to me whether the authors could establish that only the relevant features contributed to the observed generalization effect.

      Some methodological decisions were not explained and need justification, in particular, as the study is not preregistered. This includes, for example, the exclusion criteria and the choice not to analyze all generalization trials. Further, the authors did not perform model comparison (e.g., their model against a null model) and therefore do not report the strength of evidence for a difference in conditions.

      Another weakness is that the sample sizes of 30 infants for the initial part and 19 infants for the generalization part of the experiment are rather small (especially with regard to the chosen weakly informative priors).

    1. eLife assessment

      By leveraging optical coherence tomography, this study provides important insight into the deformation of human fingertip ridges when contacting raised features such as edges and contours. The study provides solid evidence that such features tend to cause deformation and relative movement of what the authors term ridge flanks rather than bending of the ridges themselves. Clarification about the anatomical structures under study is needed to fully interpret the claims and the resulting implications for both skin mechanics and the neural coding of touch.

    2. Reviewer #1 (Public Review):

      Summary:<br /> This manuscript uses optical coherence tomography (OCT) to visualize tissue microstructures about 1-2 mm under the finger pad skin surface. Their geometric features are tracked and used to generate tissue strains upon skin surface indentation by a series of transparent stimuli both normal and tangential to the surface. Then movements of the stratum corneum and the upper portion of the viable epidermis are evaluated. Based upon this data, across a number of participants and ridges, around 300 in total, the findings report upon particular movements of these tissue microstructures in various loading states. A better understanding of the mechanics of the skin microstructures is important to understand how surface forces propagate toward the locations of mechanoreceptive end organs, which lie near the edge of the epidermis and dermis, from which tactile responses of at least two peripheral afferents originate. Indeed, the microstructures of the skin are likely to be important in shaping how neural afferents respond and enhance their sensitivity, receptive field characteristics, etc.

      Strengths:<br /> The use of OCT in the context of analyzing the movements of skin microstructures is novel. Also novel and powerful is the use of distinct loading cases, e.g., normal, tangential, and stimulus features, e.g., edges, and curves. I am unaware of other empirical visualization studies of this sort. They are state-of-the-art in this field. Moreover, in addition to the empirical imaging observations, strain vectors in the tissues are calculated over time.

      Weaknesses:<br /> The interpretation of the results and their framing relative to the overall hypotheses/questions and prior works could be articulated more clearly. In particular, the major findings of the manuscript are in newly describing a central concept regarding "ridge flanks," but such structures are neither anatomically nor mechanistically defined in a clear fashion. For example, "... it appears that the primary components of ridge deformation and, potentially, neural responses are deformations of the ridge flanks and their relative movement, rather than overall bending of the ridges themselves." From an anatomical perspective, I think what the authors mean by "ridge flanks" is a differential in strain from one lateral side of a papillary ridge to the other. But it is unclear what about the continuous layers of tissue would cause such behaviors. Perhaps a sweat duct or some other structure (not visible to OCT) would subdivide the "flanks" of a papillary ridge somehow? If not due to particular anatomy, then is the importance of the "ridge flank" due to a mechanistic phenomenon of some sort? Given that the findings of the manuscript center upon the introduction of this new concept, I think a greater effort should be made to define what exactly the "ridge flanks" are. It is clear from the results, especially the sliding case, that there is something important that the manuscript is getting at with this concept.

      The OCT used herein cannot visualize deep and fully into what the manuscript refers to as a "ridge" (note that others have previously broken this concept apart into "papillary", "intermediate" and "limiting" ridges), near where the mechanoreceptive end organs lie at the epidermal-dermal border. Therefore, the OCT must make inferences about the movements of these deeper tissues, but cannot see them directly, and it is the movements of these deeper tissues that are likely driving the intricacies of neural firing. Note the word "ridge" is used often in the manuscript's abstract, introduction, and discussion but the definition in Fig. 1 and elsewhere differs in important ways from prior works of Cauna (expert in anatomy). Therefore, the manuscript should clarify if "ridge" refers to the papillary ridge (visible at the exterior of the skin), intermediate ridge (defined by Cauna as what the authors refer to as the primary ridge), or limiting ridge (defined by Cauna as what the authors refer to as the secondary ridge). What the authors really mean (I think) is some combination of the papillary and intermediate ridge structures, but not the full intermediate ridge. The manuscript acknowledges this in the "Limitations and future work" section, stating that these ridges cannot be resolved. This is important because the manuscript is oriented toward tracking this structure. It sets up the narrative and hypotheses to evaluate the prior works of Cauna, Gerling, Swensson, and others who all directly addressed the movement of this anatomical feature, which is key to understanding ultimately how stresses at these locations might move the peripheral end organs (i.e., Merkel cells, Meissner corpuscles).

    3. Reviewer #2 (Public Review):

      Summary:<br /> The authors investigate sub-skin surface deformations to a number of different, relevant tactile stimuli, including pressure and moving stimuli. The results demonstrate and quantify the tension and compression applied from these types of touch to fingerprint ridges, where pressure flattens the ridges. Their study further revealed that on lateral movement, prominent vertical shearing occurred in ridge deformation, with somewhat inconsistent horizontal shear. This also shows how much the deeper skin layers are deformed in touch, meaning the activation of all cutaneous mechanoreceptors, as well as the possibility of other deeper non-cutaneous mechanoreceptors.

      Strengths:<br /> The paper has many strengths. As well as being impactful scientifically, the methods are sound and innovative, producing interesting and detailed results. The results reveal the intricate workings of the skin layers to pressure touch, as well as sliding touch over different conditions. This makes it applicable to many touch situations and provides insights into the differential movements of the skin, and thus the encoding of touch in regards to the function of fingerprints. The work is very clearly written and presented, including how their work relates to the literature and previous hypotheses about the function of fingerprint ridges. The figures are very well-presented and show individual and group data well. The additional supplementary information is informative and the video of the skin tracking demonstrates the experiments well.

      Weaknesses:<br /> There are very few weaknesses in the work, rather the authors detail well the limitations in the discussion. Therefore, this opens up lots of possibilities for future work.

      Impact/significance:<br /> Overall, the work will likely have a large impact on our understanding of the mechanics of the skin. The detail shown in the study goes beyond current understanding, to add profound insights into how the skin actually deforms and moves on contact and sliding over a surface, respectively. The method could be potentially applied in many other different settings (e.g. to investigate more complex textures, and how skin deformation changes with factors like dryness and aging). This fundamental piece of work could therefore be applied to understand skin changes and how these impact touch perception. It can further be applied to understand skin mechanoreceptor function better and model these. Finally, the importance of fingertip ridges is well-detailed, demonstrating how these play a role in directly shaping our touch perception and how they can shape the interactions we have with surfaces.

    4. Reviewer #3 (Public Review):

      Summary:<br /> The publication presents unique in-vivo images of the upper layer of the epidermis of the glabrous skin when a flat object compresses or slides on the fingertip. The images are captured using OCT, and are processed to recover the strain that fingerprints experience during the mechanical stimulation.

      The most important finding is, in my opinion, that fingerprints undergo pure compression/tension without horizontal shear, hinting at the fact that the shear stress caused by the tangential load is transferred to the deeper tissues and ultimately to the mechanoreceptors (SA-I / RA-I).

      Strengths:<br /> - Fascinating new insights into the mechanics of glabrous skin. To the best of my knowledge, this is the first experimental evidence of the mechanical deformation of fingerprints when subjected to dynamic mechanical stimulation. The OCT measurement allows an unprecedented measurement of the depth of the skin whereas previous works were limited to tracking the surface deformation.<br /> - The robust data analysis reveals the continuum mechanics underlying the deformation of the fingerprint ridges.

      Weaknesses:<br /> I do not see any major weaknesses. The work is mainly experimental and is rigorously executed. Two points pique my curiosity, however:

      1. How do the results presented in this study compare with previous finite element analysis? I am curious to know if the claim that the horizontal shear strain is transferred to the previous layer is also captured by these models. The reason is that the FEA models typically use homogeneous materials and whether or not the behavior in-silico and in-vivo matches would offer an idea of the nature of the stratum corneum.<br /> 2. Was there a specific reason why the authors chose to track only one fingerprint? From the method section, it seems that nothing would have prevented tracking a denser point cloud and reconstructing the strain on a section of the skin rather than just one ridge. With such data, the authors could extend their analysis to interactions between multiple ridges and get a better sense of the behavior of the entire strip of skin.

    1. Joint Public Review:

      In this manuscript, the authors introduced an explicit ion model using the coarse-grained modelling approach to model the interactions between nucleosomes and evaluate their effects on chromatin organization. The strength of this method lies in the explicit representation of counterions, especially divalent ions, which are notoriously difficult to model. To achieve their aims and validate the accuracy of the model, the authors conducted coarse-grained molecular dynamics simulations and compared predicted values to the experimental values of the binding energies of protein-DNA complexes and the free energy profile of nucleosomal DNA unwinding and inter-nucleosome binding. Additionally, the authors employed umbrella sampling simulations to further validate their model, reproducing experimentally measured sedimentation coefficients of chromatin under varying salt concentrations of monovalent and divalent ions.

      The significance of this study lies in the authors' coarse-grained model which can efficiently capture the conformational sampling of molecules while maintaining a low computational cost. The model reproduces the scale and, in some cases, the shape of the experimental free energy profile for specific molecule interactions, particularly inter-nucleosome interactions. Additionally, the authors' method resolves certain experimental discrepancies related to determining the strength of inter-nucleosomal interactions. Furthermore, the results from this study support the crucial role of intrinsic physicochemical interactions in governing chromatin organization within the nucleus.

      The authors have successfully addressed the majority of my key concerns. I appreciate the clarification regarding the parameterization from Pablo's lab and the addition of comparisons of energy profiles as a function of inter-nucleosome distances.

      However, the statement "The agreement is evident" may not sufficiently capture the essence of Figure S4, as substantial agreement is lacking. The authors rightly acknowledge this but should delineate the nature of the observed discrepancies.

    1. eLife assessment

      This important work reports on the transcriptomic analysis of leukocytes in the brain of adult zebrafish. A specific novel finding is the identification of dendritic cells distinct from microglia or macrophages; regional distribution of these subsets is described using transgenic lines and immunohistochemistry. The dependence of these subsets on specific transcription factors or receptors is addressed with mutants. This is a thorough and compelling analysis, of interest for scientists using the zebrafish models for neurology, immunology, and infectiology, as well as for those interested in the evolution of the brain and immune system.

    2. Reviewer #1 (Public Review):

      Summary:<br /> The authors used several zebrafish reporter lines to demonstrate the presence, regional distribution, and transcriptional profile of the immune cells in adult zebrafish brains. They identified DC-like cells distinct from microglia or other macrophages, resembling murine cDC1s. Analysis of different mutants further revealed that this DC population was dependent on Irf8, Batf3, and Csf1rb, but did not rely on Csf1ra.

      Strengths:<br /> It is an elegantly designed study providing compelling evidence for further heterogeneity among brain mononuclear phagocytes in zebrafish, consisting of microglia, macrophages, and DC-like cells. This will provide a better understanding of the immune landscape in the zebrafish brain and will help to better distinguish the different cell types from microglia, and to assign specific functions.

      Weaknesses:<br /> While scRNA-seq data clearly revealed different subsets of microglia, macrophages, and DCs in the brain, it remains somewhat challenging to distinguish DC-like cells from P2ry12- macrophages by immunohistochemistry or flow cytometry.

    3. Reviewer #2 (Public Review):

      The authors built a single-cell transcriptome atlas of a pure population of leukocytes isolated by flow cytometry from the brain of adult Tg(cd45:DsRed) transgenic animals. Seven major leukocyte populations were identified, comprising microglia, macrophages, dendritic-like cells, T cells, natural killer cells, innate lymphoid-like cells, and neutrophils. Each cluster was analyzed to characterize subclusters. Among lymphocytes, in addition to 2 subclusters expressing typical T cell markers, a group of il4+ il13+ gata3+ cells was identified as possible ILC2. This hypothesis is supported by the presence of this population in rag2KO fish, in which the frequency of lck+ and zap70+ cells is strongly reduced. The use of KO lines for such validations is a strength of this work (and the zebrafish model).

      The subcluster analysis of mpeg1.1+ myeloid cells identified 4 groups of microglial cells, one novel group of macrophage-like cells (expressing s100a10b, sftpbb, icn, fthl27, anxa5b, f13a1b and spi1b), and several groups of DC-like cells expressing the markers siglec15l, ccl19a.1, ccr7, id2a, xcr1a.1, batf3, flt3, chl1a and hepacam2. Combining these new markers and transgenic reporter fish lines, the authors then clarified the location of leukocyte subsets within the brain, showing for example that DC-like cells stand as a parenchymal population along with microglia. Reporter lines were also used to perform a detailed analysis of cell subsets, and a cross with a batf3 mutant demonstrated that DC-like cells are batf3 dependent, similar to mouse and human cDC1. Finally, analysis of classical mononuclear phagocyte deficient zebrafish lines showed they have reduced numbers of microglia but exhibit distinct DC-like cell phenotypes. A weakness of this study is that it is mainly based on FACS sorting, which might modify the proportion of different subtypes.

      This atlas of zebrafish brain leukocytes is an important new resource for scientists using the zebrafish models for neurology, immunology, and infectiology, and for those interested in the evolution of the brain and immune system.

    4. Reviewer #3 (Public Review):

      Rovira et al. aim to characterize immune cells in the brain parenchyma and identify a novel macrophage population referred to as "dendritic-like cells". They use a combination of single-cell transcriptomics, immunohistochemistry, and genetic mutants to conclude the presence of this "dendritic-like cell" population in the brain. The strength of this manuscript is the identification of dendritic cells in the brain, which are typically found in the meningeal layers and choroid plexus. A weakness is the lack of specific reporters or labeling of this dendritic cell population using specific genes found in their single-cell dataset. Additionally, it is difficult to remove the meningeal layers from the brain samples, which can lead to confounded conclusions. Overall, I believe this study should be accepted contingent on sufficient labeling of this population and addressing comments.

    1. eLife assessment

      This fundamental work substantially advances our understanding of the molecular basis of long-term memory formation. The study identifies PKCδ as a major molecular player in long-term memory formation and describes its translocation to mitochondria to promote pyruvate metabolism, specifically after spaced training. The evidence supporting the conclusions is compelling and the work will be of broad interest to neuroscience and medicine.

    2. Reviewer #1 (Public Review):

      Summary:<br /> This is a detailed description of the role of PKCδ in Drosophila learning and memory. The work is based on a previous study (Placais et al. 2017) that has already shown that for the establishment of long-term memory, the repetitive activity of MP1 dopaminergic neurons via the dopamine receptor DAMB is essential to increase mitochondrial energy flux in the mushroom body.

      In this paper, the role of PKCδ is now introduced. PKCδ is a molecular link between the dopaminergic system and the mitochondrial pyruvate metabolism of mushroom body Kenyon cells. For this purpose, the authors establish a genetically encoded FRET-based fluorescent reporter of PKCδ-specific activity, δCKAR.

      Strengths:<br /> This is a thorough study of the long-term memory of Drosophila. The work is based on the extensive, high-quality experience of the senior authors. This is particularly evident in the convincing use of behavioral assays and imaging techniques to differentiate and explore various memory phases in Drosophila. The study also establishes a new reporter to measure the activity of PKCδ - the focus of this study - in behaving animals. The authors also elucidate how recurrent spaced training sessions initiate a molecular gating mechanism, linking a dopaminergic punishment signal with the regulation of mitochondrial pyruvate metabolism. This advancement will enable a more precise molecular distinction of various memory phases and a deeper comprehension of their formation in the future.

      Weaknesses:<br /> Apart from a few minor technical issues, such as the not entirely convincing visualisation of the localisation of a PKCδ reporter in the mitochondria, there are no major weaknesses. Likewise, the scientific classification of the results seems appropriate, although a somewhat more extensive discussion in relation to Drosophila would have been desirable.

    3. Reviewer #2 (Public Review):

      Summary<br /> This study deepens the authors' earlier investigations of the mechanisms involved in gating the long-term consolidation of an associative memory (LTM) in Drosophila melanogaster. After having previously found that LTM consolidation 1. costs energy (Plaçais and Préat, Science 2013) provided through pyruvate metabolism (Plaçais et al., Nature Comm 2017) and 2. is gated by increased tonic activity in a type of dopaminergic neuron ('MP1 neurons') only after the training protocol relevant for LTM, i.e. spaced in time (Plaçais et al., Nature Neuro 2012), they here dig into the intracellular signalling triggered by dopamine input and eventually responsible for the increased mitochondrial activity in Kenyon cells. They identify a particular PKC, PKCδ, as a major molecular interface in this process and describe its translocation to mitochondria to promote pyruvate metabolism, specifically after spaced training.

      Methodological approach<br /> To that end, they use RNA interference against the isozyme PKCδ, in a time-controlled way and in the whole Kenyon cell population or in the subpopulation forming the α/β lobe. This knock-down decreased the total PKCδ mRNA level in the brain by ca. 30%, which is enough to impair the flies' LTM performance. Using Pyronic, a pyruvate sensor for in vivo imaging, and pharmacological disruption of mitochondrial function, the authors then show that PKCδ knock-down prevents a high level of pyruvate from accumulating in the Kenyon cells at the time of LTM consolidation, pointing towards a role of PKCδ in promoting pyruvate metabolism. They further identify the PDH kinase PDK as a likely target for PKCδ, since knocking down both PKCδ and PDK led to normal LTM performance, likely counterbalancing the effect of PKCδ knock-down alone.

      To understand the timeline of PKCδ activation and to visualise its mitochondrial translocation in a subpart of the mushroom body lobes, they imported into the fruit fly the genetically encoded FRET reporters of PKCδ, δCKAR and mitochondria-δCKAR (Kajimoto et al 2010). They show that PKCδ is activated to the sensor's saturation only after spaced training, and not after other types of training that are 'irrelevant' for LTM. Further, adding thermogenetic activation of dopaminergic neurons and RNA interference against the Gq-coupled dopamine receptor to FRET imaging, they identify that a dopamine-triggered cascade is sufficient for the elevated PKCδ activation.

      Strengths and weaknesses<br /> The authors use a combination of new fluorescent sensors and behavioral, imaging, and pharmacological protocols they already established to successfully identify the molecular players that bridge the requirement for spaced training and oscillatory activity of MP1 dopaminergic neurons with the increased metabolic activity observed during long-term memory consolidation.

      The study is dense in new exciting findings and each methodological step is carefully designed. Almost all possible experiments one could think of to make this link have been done in this study, with a few exceptions that do not prevent the essential conclusions from being drawn.

      The discussion is well conducted, with interesting parallels with mammals, where it is not yet known whether this process also takes place.

      Impact<br /> Their findings should interest a large audience:<br /> They discover and investigate a new function for PKCδ in regulating memory processes in neurons in conjunction with other physiological functions, making this molecule a potentially valid target for neuropathological conditions. They also provide new tools in Drosophila to measure PKCδ activation in cells. They identify the major players for lifting the energetic limitations preventing the formation of a long-term memory.

    1. eLife assessment

      Shore et al. report important findings on the impact of a gain-of-function mutation (Y777H) in the Kcnt1 gene on ion currents and firing behavior in both excitatory and inhibitory neurons in the mouse cortex. The KCNT1 gene encodes a subunit of the Na+-activated K+ (KNa) channel, and the authors substantiate their claims with solid evidence from electrophysiological patch-clamp recordings of dissociated cortical neurons. Nevertheless, the majority of reviewers recommended additional studies to reinforce key findings, proposing the replication of experiments using a more physiologically intact preparation, such as an ex vivo slice.

    2. Reviewer #1 (Public Review):

      Summary:<br /> This manuscript reports the effects of a heterozygous mutation in the KCNT1 potassium channel on the properties of ion currents and the firing behavior of excitatory and inhibitory neurons in the cortex of mice expressing KCNT1-Y777H. In humans, this mutation, as well as multiple other heterozygous mutations, produces very severe early-onset seizures and a major disruption of all intellectual function. In contrast, in mice, this heterozygous mutation appears to have no behavioral phenotype or any increased propensity to seizures. A relevant phenotype is, however, evident in mice with the homozygous mutation, and the authors have previously published the results of similar experiments with the homozygotes. As perhaps expected, the neuronal effects of the heterozygous mutation presented in this manuscript are generally similar to but markedly smaller than the previously published findings on homozygotes. There are, however, some interesting differences, particularly in PV+ interneurons, which appear to be more excitable than wild type in the heterozygotes but less excitable in the homozygotes. This raises the interesting question (which could be more explicitly discussed by the authors) as to whether the reported changes represent homeostatic events that suppress the seizure phenotype in the mouse heterozygotes or simply changes in excitability that do not reach the threshold for behavioral outcomes.

      Strengths and Weaknesses:<br /> 1) The authors find that the heterozygous mutation in PV+ interneurons increases their excitability, a result that is opposite from their previous observation in neurons with the corresponding homozygous mutation. They propose that this results from the selective upregulation of a persistent sodium current INaP in the PV+ interneurons. While the observations are very interesting, there are three issues concerning this interpretation that should be addressed:<br /> A) The protocol for measuring the INaP current could potentially lead to results that could be (mis)interpreted in different ways in different cells. First, neither K currents nor Ca currents are blocked in these experiments. Instead, TTX is applied to the cells relatively rapidly (within 1 second) and the ramp protocol is applied immediately thereafter. It is stated that, at this time, Na currents and INaP are fully blocked but that any effects on Na-activated K currents are minimal. In theory, this would allow the pre- to post-difference current to represent a relatively uncontaminated INaP. This would, however, only work if activation of KNa currents following Na entry is very slow, taking many seconds. A good deal of literature has suggested that the kinetics of activation of KNa currents by Na influx vary substantially between cell types, such that single action potentials and single excitatory synaptic events rapidly evoke KNa currents in some cell types. This is, of course, much faster than the time of TTX application. Most importantly, the kinetics of KNa activation may be different in different neuronal types, which would lead to errors that could produce different estimates of INaP in PV+ interneurons vs other cell types.<br /> B) As the authors recognize, INaP current provides a major source of cytoplasmic sodium ions for the activation. 
An expected outcome of increased INaP is, therefore, further activation of KNa currents, rather than a compensatory increase in an inward current that counteracts the increase in KNa currents, as is suggested in the discussion.<br /> C) Numerical simulations, in general, provide a very useful way to evaluate the significance of experimental findings. Nevertheless, while the in-silico modeling suggests that increases in INaP can increase firing rate in models of PV+ neurons, there is as yet insufficient information on the relative locations of the INaP channels and the kinetics of sodium transfer to KNa channels to evaluate the validity of this specific model.

      2) The greatest effect of TTX application would be expected to be the elimination of large transient inward sodium currents. Why are no such currents visible in the control (pre-TTX) or the difference currents (Fig. 2)? Is it possible I missed something in the methods?

      3) As expected, the changes in many of the measured parameters are smaller in the present study with heterozygotes than those previously reported for the homozygous mutation. Some of the statements on the significance of some of the present findings need to be stated more clearly. For example, in the results section describing Fig. 2, it is stated that "In glutamatergic and NFS GABAergic YH-HET neurons, the overall KNa current was increased ...as measured by a significant effect of genotype ...." Later in the same paragraph it is stated that the increases in KNa current are not significant. Apparently, different tests lead to different conclusions. Both for the purpose of understanding the pathophysiological effects of changes in KNa current and for making further numerical simulations, more explicit clarifying statements should be made.

      4) The effects of the KCNT1 channel blocker VU170 on potassium currents are somewhat larger and different from those of TTX, suggesting that additional sources of sodium may contribute to activating KCNT1, as suggested by the authors. Because VU170 is, however, a novel pharmacological agent, it may be appropriate to make more careful statements on this. While the original published description of this compound reported no effect on a variety of other channels, there are many that were not tested, including Na and cation channels that are known to activate KCNT1, raising the possibility of off-target effects.

      5) The experiments were carried out at room temperature. Is it possible that different effects on firing patterns in heterozygotes and homozygotes would be observed at more physiological temperatures?

    3. Reviewer #2 (Public Review):

      Summary:<br /> In this manuscript, Shore et al. investigate the consequent changes in excitability and synaptic efficacy of diverse neuronal populations in an animal model of juvenile epilepsy. Using electrophysiological patch-clamp recordings from dissociated neuronal cultures, the authors find diverging changes in two major populations of inhibitory cell types, namely somatostatin (SST)- and parvalbumin (PV)-positive interneurons, in mice expressing a variant of the KCNT1 potassium channel. They further suggest that the differential effects are due to a compensatory increase in the persistent sodium current in PV interneurons in pharmacological and in silico experiments.

      Strengths:<br /> 1) A heterozygous KCNT1 gain-of-function variant was used, which more accurately models the human disorder.<br /> 2) The manuscript is clearly written, and the flow is easy to follow. The authors explicitly state the similarities and differences between the current findings and the previously published results in the homozygous KCNT1 gain-of-function variant.<br /> 3) This study uses a variety of approaches including patch clamp recording, in silico modeling, and pharmacology that together make the claims stronger.<br /> 4) Pharmacological experiments are fraught with off-target effects and thus it bolsters the authors' claims when multiple channel blockers (TTX and VU170) are used to reconstruct the sodium-activated potassium current. Having said that, it would be helpful to see the two drug manipulations used in the same experiment. Notably, does the more selective blocker VU170 mimic the results of TTX for NFS GABAergic cells in Figure 2? And does it unmask a genotype difference for FS GABAergic cells like the one seen in PV interneurons in Figure 5C3?

      Weaknesses:<br /> 1) This study relies on recordings in dissociated cortical neurons. Although specific WT interneurons showed intrinsic membrane properties like those reported for acute brain slices, it is unclear whether the same will be true for those cells expressing KCNT1 variants. This reviewer highly recommends confirming some of the key findings using an ex vivo slice preparation. This is especially important given the discrepant result of reduced excitability of PV cells reported by Gertler et al., 2022 (cited here in the manuscript but not discussed in this context) in acute hippocampal slices for a different KCNT1 gain-of-function variant.<br /> 2) It is unclear how different pieces of results fit together to form a story about the disease pathophysiology. For example, hyperexcitability of PV cells would suggest more inhibition, which would counter seizure propensity. However, spontaneous inhibitory postsynaptic currents show no change in pyramidal neurons. Moreover, how do the authors reconcile the reductions in synaptic inputs onto interneurons in Figure 3B with the increases in Figure 8? This should be discussed.<br /> 3) Similarly, the results in this work are not entirely internally consistent. For example, given the good correspondence between FS and NFS GABAergic cells with PV and SST expression, why are FS GABAergic cells not hyperexcitable in Figure 1? If anything, there is a tendency to show reduced excitability like the NFS GABAergic cells. Also, why do the WT I-V curves look so different between Figures 2 and 5? This reviewer suggests at least a brief explanation in the discussion.<br /> 4) Given the authors' claim that the KCNT1 activation curve is a major contributor to the observed excitability differences in specific GABA cell subtypes, it would be helpful to directly measure the activation curve in the variants experimentally as was done for WT KCNT1 in Figure 6A and use the derived kinetics in the compartmental model.

    4. Reviewer #3 (Public Review):

      Summary:<br /> The present manuscript by Shore et al. entitled "Reduced GABAergic Neuron Excitability, Altered Synaptic Connectivity, and Seizures in a KCNT1 Gain-of-Function Mouse Model of Childhood Epilepsy" describes in vitro and in silico results obtained in cortical neurons from mice carrying the KCNT1-Y777H gain-of-function (GOF) variant in the KCNT1 gene encoding for a subunit of the Na+-activated K+ (KNa) channel. This variant corresponds to the human Y796H variant found in a family with Autosomal Dominant Nocturnal Frontal lobe epilepsy. The occurrence of GOF variants in potassium channel encoding genes is well known, and among potential pathophysiological mechanisms, impaired inhibition has been documented as responsible for KCNT1-related DEEs. Therefore, building on a previous study by the same group performed in homozygous KI animals, and considering that the large majority of pathogenic KCNT1 variants in humans occur in heterozygosis, the Authors have investigated the effects of heterozygous Kcnt1-Y777H expression on KNa currents and neuronal physiology among cortical glutamatergic and the 3 main classes of GABAergic neurons, namely those expressing vasoactive intestinal polypeptide (VIP), somatostatin (SST), and parvalbumin (PV), crossing KCNT1-Y777H mice with VIP-, SST- and PV-cre mouse lines, and recording from GABAergic neurons identified by their expression of mCherry (but negative for GFP used to mark excitatory neurons).

      The results obtained revealed heterogeneous effects of the variant on KNa and action potential firing rates in distinct neuronal subpopulations, ranging from no change (glutamatergic and VIP GABAergic) to decreased excitability (SST GABAergic) to increased excitability (PV GABAergic). In particular, modelling and in vitro data revealed that an increase in persistent Na current occurring in PV neurons was sufficient to overcome the effects of KCNT1 GOF and cause an overall increase in AP generation.

      Strengths:<br /> The paper is very well written, the results clearly presented and interpreted, and the discussion focuses on the most relevant points.

      The recordings performed in distinct neuronal subpopulations are a clear strength of the paper. The finding that the same variant can cause opposite effects and trigger specific homeostatic mechanisms in distinct neuronal populations is very relevant for the field, as it narrows the existing gap between experimental models and clinical evidence.

      Weaknesses:<br /> My main concern is the epileptic phenotype of the heterozygous mice investigated. In fact, in their previous paper the Authors state that "...Kcnt1-Y777H heterozygous mice did not exhibit any detectable epileptiform activity" (first sentence on page 4). However, in the present manuscript, they indicate twice in the discussion section that these mice exhibit "infrequent seizures". This relevant difference needs to be clarified to correctly attribute to the novel pathophysiological mechanism a role in seizure occurrence. Were such infrequent seizures clearly identified on the EEG, or were they behavioral seizures? Could the authors quantify this "infrequent" value? This is also crucial to place in the proper perspective the Discussion statement regarding "... the increased INaP contribution to ... network hyperexcitability and seizures".

      Also, some statistical analysis seems to be missing. For example, I could not find any for the data shown in Fig. 6. Thus, the following statement: "the model PV neurons responded to KCNT1 GOF with decreased AP firing and an increased rheobase" requires proper statistical evaluation.

    1. eLife assessment

      This important study provides convincing evidence of artifactual calcium micro-waves during calcium imaging of populations of neurons in the hippocampus using methods that are common in the field. The evidence that this artifact occurs in the data is convincing; however, the evidence for the particular conditions under which the calcium waves occur is incomplete. The work raises awareness of these artifacts so that any research labs planning to do calcium imaging in the hippocampus can avoid them by using alternative strategies that the authors propose.

    2. Reviewer #1 (Public Review):

      Summary: This paper reported interesting aberrant calcium microwaves in the hippocampus when synapsin promoter driven GCaMPs were expressed for a long period of time. These aberrant hippocampal Ca2+ micro-waves depend on the viral titre of the GECI. The microwave of Ca2+ was not observed when GECI was expressed only in a sparse set of neurons.

      Strengths: These findings are important to the wide neuroscience community, especially considering a great number of investigators are using similar approaches. Results look convincing and are consistent across several laboratories.

      Weaknesses: One important question, described below, needs to be addressed to further clarify the mechanisms of the aberrant Ca2+ microwaves.

      The synapsin promoter labels both excitatory pyramidal neurons and inhibitory neurons. To avoid aberrant Ca2+ microwaves, a combination of Flex virus and CaMKII-Cre, or Thy-1-GCaMP6s and 6f mice, were tested. However, all these approaches limit the number of infected pyramidal neurons. While the comprehensive display of these results is appreciated, a crucial question remains unanswered. To distinguish whether the Ca2+ microwaves are caused selectively by an abnormality of interneurons, or are just a matter of pyramidal neuron density, testing Flex-GCaMP6 in interneuron-specific mouse lines such as PV-Cre and SOM-Cre will be critical.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors describe and quantify a phenomenon in the CA1 and CA3 of the hippocampus that they call aberrant Ca2+ micro-waves. Micro-waves are sometimes seen during 2-photon calcium imaging of populations of neurons under certain conditions. They are spatially confined slow calcium events that start in a few cells and slowly spread to neighboring groups of cells. This phenomenon has been discussed among researchers in the field at conferences, but no one has taken the time to carefully capture and quantify micro-waves and pin down the causes. The authors show that micro-waves are dependent on the viral titre of the genetically encoded calcium indicators (GECIs), the genetic promoter (synapsin), the neuronal subtype (granule cells in the dentate gyrus do not produce micro-waves and they are not seen in the neocortex), and the density of GECI expression. The authors should be commended for their work and for raising awareness among all labs doing any form of calcium imaging in populations of neurons. The authors also come up with alternative approaches to avoid artifactual micro-waves such as reducing the transduction titre (1:2 dilution of virus) and a transduction method employing sparser and cre-dependent GECI expression in principal cells using a CaMKII promoter.

      Strengths:

      The micro-waves reported in the paper were robustly observed across 4 laboratories and 3 different countries with various experimenters and calcium imaging set-ups. This adds significant strength to the work.

      The age of mice used covered a broad range (from 6 to 43 weeks). This is a strength because it covers most ages that are used in labs that regularly do calcium imaging.

      Another strength is that they used different GCaMP variants (GCaMP6m, GCaMP6s, GCaMP7f), as well as a red indicator, RCaMP. This shows the micro-waves are not an issue with any particular GECI, as the authors suggest.

      The authors include many movies of micro-waves. This is extremely useful for researchers in the field to view them in real-time so they can identify them in their own data.

      They provide a useful table with specific details of the virus injected, titre, dilution, and other information along with the incidence of micro-waves. A nice look-up table for researchers to see if their viral strategy is associated with a high or low incidence of micro-waves.

      Weaknesses:

      Whether micro-waves are associated with the age of mice was not quantified. This would be good to know and the authors do have this data.

      The effect of micro-waves on single cell function was not analyzed. It would be useful, for example, if we knew the influence of micro-waves on place fields. Can a place cell still express a place field in a hippocampus that produces micro-waves? What effect might a micro-wave passing over a cell have on its place field? Mice were not trained in these experiments, so the authors do not have the data.

      The CaMKII-Cre approach for flexed-syn-GCaMP expression shows no micro-waves and is convincing, but it is only from 2 animals, even though both had no micro-waves.

      The authors state in their Discussion that even without observable microwaves, a syn-Ca2+-indicator transduction strategy could still be problematic. This may be true, but they do not check this in their analysis, so it remains unknown.

    4. Reviewer #3 (Public Review):

      Summary:<br /> The work by Masala and colleagues highlights a striking artifact that can result from a particular viral method for expressing genetically encoded calcium indicators (GECIs) in neurons. In a cross-institutional collaboration, the authors find that viral transduction of GECIs in the hippocampus can result in aberrant slow-traveling calcium (Ca2+) micro-waves. These Ca2+ micro-waves are distinct from previously described ictal activity but nevertheless are likely a pathological consequence of overexpression of virally transduced proteins. Ca2+ micro-waves will most likely obscure the physiology that most researchers are interested in studying with GECIs, and their presence indicates that the neural circuit is in an unintended pathological state. Interestingly this pathology was not observed using the same viral transduction methods in the visual cortex. The authors recommend several approaches that may help other experimenters avoid this confound in their own data such as reducing the titer of viral injections or using recombinase-dependent expression. The intent of this manuscript is to raise awareness of the potential unintended consequences of viral overexpression, particularly for GECIs. A rigorous investigation into the exact causes of Ca2+ micro-waves or the mech

      Strengths:

      The authors clearly demonstrate that Ca2+ micro-waves occur in the CA1 and CA3 regions of the hippocampus following large volume, high titer injections of adeno-associated viruses (AAV1 and AAV9) encoding GECIs. The supplementary videos provide undeniable proof of their existence.

      By forming an inter-institutional collaboration, the authors demonstrate that this phenomenon is robust to changes in surgical techniques or imaging conditions.

      Weaknesses:

      I believe that the weaknesses of the manuscript are appropriately highlighted by the authors themselves in the discussion. I would, however, like to emphasize several additional points.

      As the authors state, the exact conditions that lead to Ca2+ micro-waves are unclear from this manuscript. It is also unclear if Ca2+ micro-waves are specific to GECI expression or if high-titer viral transduction of other proteins such as genetically encoded voltage indicators, static fluorescent proteins, recombinases, etc could also cause Ca2+ micro-waves.

      The authors almost exclusively tested high titer (>5x10^12 vg/mL) large volume (500-1000 nL) injections using the synapsin promoter and AAV1 serotypes. It is possible that Ca2+ micro-waves are dramatically less frequent when titers are lowered further but still kept high enough to be useful for in vivo imaging (e.g. 1x10^12 vg/mL) or smaller injection volumes are used. It is also possible that Ca2+ micro-waves occur with high titer injections using other viral promoter sequences such as EF1α or CaMKIIα. There may additionally be effects of viral serotype on micro-wave occurrence.

      The number of animals in any particular condition is fairly low (Table 1), with the exception of V1 imaging and thy1-GCaMP6 imaging. This precludes rigorous comparison of the frequency of pathological calcium activity across conditions.

    1. eLife assessment

      This study presents a valuable confirmation of the roles of Dact1 and Dact2, two factors involved in Wnt signaling, during zebrafish gastrulation and craniofacial development. The limitation of the study is that its examination of genetic interactions with other Wnt factors does not conclusively distinguish primary from secondary effects for each factor. Addressing this weakness is essential for supporting claims on interactions between dact1/2 and any Wnt factors examined. The findings of a new potential target of dact1/2-mediated Wnt signaling are potentially of value; however, experimental evidence supporting the veracity of this finding is incomplete due to an apparent lack of reproducibility.

    2. Reviewer #1 (Public Review):

      Summary:

      This study delves into the roles of dact1 and dact2 during zebrafish embryonic axis formation and craniofacial morphogenesis. The researchers seek to unravel the mechanisms by which dact1/2 influences Wnt signaling modulation throughout embryonic development and patterning. They propose distinct spatiotemporal roles for Dact1 and Dact2 proteins in zebrafish embryonic development, highlighting their involvement in modulating noncanonical Wnt signaling during convergent extension events. Their findings demonstrate that dact1 and dact2 exhibit distinct spatiotemporal expression domains during development and that dact1/2 mutation leads to convergent extension defects. Furthermore, the study attempts to establish a link between convergent extension defects resulting from dact1/2 mutation and subsequent craniofacial abnormalities during development. To investigate the connection between dact1 and dact2, compound mutants were employed since single mutants did not exhibit craniofacial phenotypes. Additionally, the research encompasses comprehensive transcriptomics and pathway analyses of differentially expressed genes in dact1/2 mutants. This analysis reveals the overexpression of a calcium-dependent cysteine protease, calpain 8. The study suggests a connection between the upregulation of calpain 8 and the observed craniofacial dysmorphology in dact1/2 mutants, implying a potential link between the altered expression of calpain 8 and the craniofacial abnormalities observed in these mutants.

      Strengths:

      The study beautifully recapitulates previous findings on the role of dact1/2 in modulating convergent extension during zebrafish embryogenesis.

      A combination of multiple approaches, including in vivo time-lapse imaging, has been employed to elucidate the etiology of the rod-like neurocranial phenotype in dact1/2 double mutant.

      This study utilizes and discusses several 'traditional' mutant lines and newly created ones, analyzing them through single-cell transcriptomics.

      Weaknesses:

      1. Enhancing Reproducibility and Robustness:

      To enhance the reproducibility and robustness of the findings, it would be valuable for the authors to provide specific numbers of animals used in each experiment.

      Explicitly stating the penetrance of the rod-like neurocranial shape in dact1/2-/- animals would provide a clearer understanding of the consistency of this phenotype.

      2. Strengthening Single-Cell Data Interpretation:

      To further validate the single-cell data and strengthen the interpretation of the gene expression patterns, I recommend the following:

      - Provide a more thorough explanation of the rationale for comparing dact1/2 double mutants with gpc4 mutants.
      - Employ genotyping techniques after embryo collection to ensure the accuracy of animal selection based on phenotype and address the potential for contamination of wild-type "delayed" animals.
      - Supplement the single-cell data with secondary validation using RNA in situ or immunohistochemistry techniques.

      3. Directly Investigating Non-Cell-Autonomous Effects:

      To directly assess the proposed non-cell-autonomous role of dact1/2, I suggest conducting transplantation experiments to examine the ability of ectodermal/neural crest cells from dact1/2 double mutants to form wild-type-like neurocranium.

      4. Further Elucidating Calpain 8's Role:

      To strengthen the evidence supporting the critical role of Calpain 8, I recommend conducting overexpression experiments using a sensitized background to enhance the statistical significance of the findings.

    3. Reviewer #2 (Public Review):

      Summary:

      Non-canonical Wnt signaling plays an important role in morphogenesis, but how different components of the pathway are required to regulate different developmental events remains an open question. This paper focuses on elucidating the overlapping and distinct functions of dact1 and dact2, two Dishevelled-binding scaffold proteins, during zebrafish axis elongation and craniofacial development. By combining genetic studies, detailed phenotypic analysis, lineage tracing, and single-cell RNA-sequencing, the authors aimed to understand (1) the relative function of dact1/2 in promoting axis elongation, (2) their ability to modulate phenotypes caused by mutations in other non-canonical wnt components, and (3) pathways downstream of dact1/2.

      Strong qualitative evidence was provided to support dact1/2's role in genetically modulating non-canonical wnt signaling to regulate body axis elongation and the morphology of the anterior neurocranium (ANC). However, there is currently insufficient evidence supporting the authors' claim that suppression of calpain 8 by dact1/2 is important for craniofacial development and that "embryonic fields determined during gastrulation affect the CNCC ability to contribute to the craniofacial skeleton".

      Strengths:

      (1) The generation of dact1/2 germline mutants and the use of genetic approaches to dissect their genetic interactions with wnt11f2 and gpc4 provide unambiguous and consistent results that inform the relative functions of dact1 and dact2, as well as their combined effects.

      (2) Because the anterior neurocranium exhibits a spectrum of phenotypes in different genetic mutants, it is a useful system for studying how tissue morphology can be modulated by different components of the same pathway, as demonstrated in this study.

      (3) The authors leveraged lineage tracing by photoconversion to dissect how dact1/2 differentially impacts the ability of different cranial neural crest populations to contribute to the anterior neurocranium. This revealed that distinct mechanisms can lead to similar phenotypes in different mutants.

      Weaknesses:

      (1) While the qualitative data show altered morphologies in each mutant, quantifications of these phenotypes are lacking in several instances, making it difficult to gauge reproducibility and penetrance, as well as to assess the novel ANC forms described in certain mutants.

      (2) Germline mutations limit the authors' ability to study a gene's spatiotemporal functional requirement. They therefore cannot concretely attribute nor separate early-stage phenotypes (during gastrulation) to/from late-stage phenotypes (ANC morphological changes).

      (3) Given that dact1/2 can regulate both canonical and non-canonical wnt signaling, this study did not specifically test which of these pathways is altered in the dact1/2 mutants, and it is currently unclear whether disrupted canonical wnt signaling contributes to the craniofacial phenotypes, even though these phenotypes are typical non-canonical wnt phenotypes.

      (4) The use of single-cell RNA sequencing unveiled genes and processes that are uniquely altered in the dact1/2 mutants, but not in the gpc4 mutants during gastrulation. However, how these changes lead to the manifested ANC phenotype later during craniofacial development remains unclear. The authors showed that calpain 8 is significantly upregulated in the mutant, but the fact that only 1 out of 142 calpain-overexpressing animals phenocopied dact1/2 mutants indicates the complexity of the system.

      (5) Craniofacial phenotypes observed in this study are attributed to convergent extension defects but convergent extension cell movement itself was not directly examined, leaving open if changes in other cellular processes, such as cell differentiation, proliferation, or oriented division, could cause distinct phenotypes between different mutants.

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors explore the roles of dact1 and dact2 during zebrafish gastrulation and craniofacial development. Previous studies used morpholino (MO) knockdowns to show that these scaffolding proteins, which interact with Dishevelled (Dsh), are expressed during zebrafish gastrulation and suggested that dact1 promotes canonical Wnt/β-catenin signaling, while dact2 promotes non-canonical Wnt/PCP-dependent convergent-extension (Waxman et al 2004). This study goes beyond this work by creating loss-of-function mutant alleles for each gene and, unlike the MO studies, finds little (dact2) to no (dact1) phenotypic defects in the homozygous mutants. Interestingly, dact1/2 double mutants have a more severe phenotype, which resembles those reported with MOs as well as homozygous wnt11/silberblick (wnt11/slb) mutants that disrupt non-canonical Wnt signaling (Heisenberg et al., 1997; 2000). Further analyses in this paper try to connect gastrulation and craniofacial defects in dact1/2 mutants with wnt11/slb and other wnt-pathway mutants. scRNAseq conducted in mutants identifies calpain 8 as a potential new target of dact1/2 and Wnt signaling.

      Strengths:

      When considered separately the new mutants are an improvement over the MOs and the paper contains a lot of new data.

      Weaknesses:

      The hypotheses are very poorly defined and misinterpret key previous findings surrounding the roles of wnt11 and gpc4, which results in a very confusing manuscript. Many of the results are not novel and focus on secondary defects. The most novel result of overexpressing calpain8 in dact1/2 mutants is preliminary and not convincing.

      Major Comments:

      1) One major problem throughout the paper is that the authors misrepresent the fact that wnt11f2 and gpc4 act in different cell populations at different times. Gastrulation defects in these mutants are not similar: wnt11 is required for anterior mesoderm CE during gastrulation but not during subsequent craniofacial development while gpc4 is required for posterior mesoderm CE and later craniofacial cartilage morphogenesis (LeClair et al., 2009). Overall, the non-overlapping functions of wnt11 and gpc4, both temporally and spatially, suggest that they are not part of the same pathway.

      2) There are also serious problems surrounding attempts to relate single-cell data with the other data in the manuscript and many claims that lack validation. For example, in Fig 1 it is entirely unclear how the Daniocell scRNA-seq data have been used to compare dact1/2 with wnt11f2 or gpc4. With no labeling in panel 1E of this figure these comparisons are impossible to follow. Similarly, the comparisons between dact1/2 and gpc4 in scRNA-seq data in Fig. 6 as well as the choices of DEGs in dact1/2 or gpc4 mutants in Fig. 7 seem arbitrary and do not make a convincing case for any specific developmental hypothesis. Are dact1 and gpc4 or dact2 and wnt11 co-expressed in individual cells? Eyeballing similarity is not acceptable.

      3) Many of the results in the paper are not novel and either confirm previous findings, particularly Waxman et al (2004), or even contradict them without good evidence. The authors should make sure that dact2 loss-of-function is not compensated for by an increase in dact1 transcription or vice versa. Testing genetic interactions, including investigating the expression of wnt11f2 in dact1/2 mutants, dact1/2 expression in wnt11f2 mutants, or the ability of dact1/2 to rescue wnt11f2 loss of function would give this work a more novel, mechanistic angle.

      4) The identification of calpain 8 overexpression in Dact1/2 mutants is interesting, but getting 1/142 phenotypes from mRNA injections does not meet reproducibility standards.

    1. Author Response

      Reviewer #1 (Public Review)

      The manuscript by Singh et al proposes a new theoretical model for the phenomenon of planar cell polarity (PCP). The new model simulates the emergence of the subcellular polarity of the Fat-Ds pathway, based on the interactions of the protocadherins Fat and Ds at the boundary between cells and in response to external gradients. Several mathematical models for PCP have been previously developed focusing on different aspects of PCP, including domineering non-autonomy (Amonlirdviman et al.), the effect of stochasticity on polarity (Burak et al.), gradient sensing (Mani et al), and formation of molecular bridges (Fisher et al.), to name a few. The current modeling approach suggests a new model, based on a relatively simple set of equations for membrane Fat and Ds and their interactions, both in 1D (line of cells) and in 2D (hexagonal array). The equations are relatively simple on one hand, allowing tractable computational analysis as well as analytical approximations, while on the other hand allowing tracking of membrane protein levels, which is what is measured experimentally. It has been previously shown that achieving polarity requires local feedback that amplifies complexes in one orientation at the expense of complexes in the opposite orientation (e.g. Mani et al.). Interestingly, the current manuscript shows that a simple assumption, that Fat-Ds complexes are stabilized when bound, is sufficient to induce PCP when concentrations are high enough. The authors use the model to show how it captures several experimental observations, as well as to analyze the sensitivity to noise, the response to gradients, and the response to local perturbations (mutant clones). The manuscript is clear and the analysis is mostly coherent and sensible (although some parts need to be clarified, see below). The main issue I have with the manuscript is that it mostly describes how it captures different features that were mostly explained in previous models. I do think the authors should do more with their model to explain features that were not explained by other models, and/or generate non-trivial predictions that can be tested experimentally.

      We thank the reviewer for the positive feedback and valuable comments. We have comprehensively modified the manuscript by including new results and detailing the specific model predictions and their potential experimental tests to address the concerns.

      Reviewer #2 (Public Review):

      The setting of planar cell polarity in epithelial tissues involves a complex interplay of chemical interactions. While local interactions can spontaneously give rise to cell polarity, planar cell polarity also involves tissue-scale gradients whose effects are not clear. To understand their role, the authors built a minimal mechanistic model considering two atypical cadherins, Fat (Ft) and Dachsous (Ds), which can associate at cell-cell interfaces to form hetero-dimers in which the monomers belong to adjacent cells. This association can be seen as a local interaction between cells and is also sensitive to overall concentration gradients. From their model, which appears to capture diverse experimental observations, the authors conclude that tissue-scale gradients provide planar cell polarity with a directional cue and some robustness to cellular stochasticity. While this model comes after similar works reaching similar predictions, the quality of this model is in its simplicity, its convenience for experimental testing, and the diversity of experimental observations it recapitulates.

      A strength of this work is that it recapitulates many experimental observations made on planar cell polarity. It seems, for example, to capture the response of tissues to perturbations such as local downregulation of some important proteins, and the polarity patterns observed in the presence of noise in synthesis or cell-to-cell heterogeneity. It also gives a mechanistic description of planar cell polarity, making its experimental interpretation simple. Finally, the simplicity of the model facilitates its exploration and makes it easily testable because of the small number of free model parameters.

      A weakness of this work is that it comes after several models with similar hypotheses and similar predictions.

      Another weakness is that some conclusions of this work rely on visual appreciation rather than quantification. This is particularly true of the 2D patterns. For example, the authors argue that their model reproduces a variety of known spatial patterns, but the comparison with experiments is only visual and would be more convincing if it were quantitative.

      We are grateful to the reviewer for a critical evaluation of the manuscript and for giving important suggestions. We have incorporated all the comments and revised the manuscript accordingly by including quantitative analysis of all the results presented.

      Reviewer #3 (Public Review):

      Using theory, the authors study mechanisms for establishing planar cell polarity (PCP) through local and global modules. These modules refer to the interaction between neighbouring cells and tissue-wide gradients, respectively. Whereas local interactions alone can lead to tissue-wide alignment of PCP, a global gradient can set the direction of PCP and maintain the pattern in the presence of noise. In contrast, the authors argue that a global gradient can only generate PCP to an extent that is proportional to the gradient magnitude.

      The authors formulate a discrete model in one and two spatial dimensions that describes the assembly dynamics of PCP proteins on membranes. The number of proteins per cell remains constant. Additive noise is introduced to account for stochasticity in the attachment/detachment kinetics of proteins. Furthermore, ’quenched’ noise is introduced to account for variations of protein numbers between cells. The authors perform simulations of the stochastic discrete model in various situations. In addition, they derive a continuum description to perform some analytical computations.

      The strength of this analysis relies clearly on showing that simple dynamics can lead to tissue-wide PCP even in absence of a gradient in protein expression. A number of phenomena observed in tissues are qualitatively reproduced. In two spatial dimensions, they find swirling patterns that resemble patterns found in tissues when a global gradient is absent. The model also captures qualitative effects due to the down-regulation of one of the PCP proteins in a certain region of the tissue.

      The main weak point is that, from a physical point of view, the findings are not particularly surprising. Furthermore, some assumptions underlying the model need more justification. This holds notably for the question of why additive noise is appropriate to account for the effect of stochasticity in the attachment-detachment dynamics of the proteins. Finally, the authors consider a situation that they consider to be one of the most interesting features of PCP, namely, the formation of PCP in the presence of a region with a down-regulated PCP protein and in the presence of a gradient. Unfortunately, the effect is not very clear and the data provided remain limited.

      We thank the reviewer for the valuable comments and critique of the work. We have considered all the concerns and revised the manuscript comprehensively. In particular, we have elaborated the sections on model assumptions and added new figures/figure-panels to quantitatively present the model predictions. We have also revised the details of the one-dimensional continuum theory for PCP which, we feel, presents a detailed quantitative picture of PCP and its dependence on model parameters.

    1. Author Response

      Reviewer #2 (Public Review):

      In this study, Leiba et al. aim at establishing the developing zebrafish embryo as a suitable infection model to study Salmonella persistence in vivo. Under environmental stress (e.g., macrophage phagosomes) a proportion of bacteria switch to a slow/arrested growth state conferring increased resistance to antibiotic treatments. Persisters are increasingly being linked to infection relapses. Understanding how persistent infections emerge and how bacteria survive in an organism for a long time without replicating before switching back to a replicative state is essential. Zebrafish represents an alternative model to mice offering the possibility to image the whole organism and capture persistency with an amazing spatio-temporal resolution.

      In this paper, the authors demonstrate that persistent infections of Salmonella can be reproduced in the developing zebrafish. The kinetics of infection have been well characterized and show a very nice heterogeneity between animals demonstrating the complex host-pathogen interactions (Fig 1). From the perspective of persistence, the presence of Salmonella survivors to host clearing is reported until 14dpi demonstrating the possibility to induce persistent infection in this model. Through the manuscript, the authors have used a variety of state-of-the-art techniques illustrating the flexibility of this model including microscopy and imaging of specific immune populations, various transgenic animals and selective depletion of macrophages or neutrophils to assess their relative contributions. Overall, the conclusions of the authors are well supported by the presented data. This said, the authors should strengthen the conclusions of the paper by providing a better characterization of the infection.

      Major comments:

      1) Figure 1: What is the general life-span of the fish?

      The general life-span of the zebrafish is approximately 3 years on average. Persistent infection is determined by the existence of a fraction of bacteria that endure over an extended period (after 96 hpi). Further, we observed Salmonella persistence for 14 days. In figure 1, we don’t think that the information of the general life-span of the zebrafish is critical.

      2) Figure 2: It would be nice to clearly state what infection scenario we are looking at. Have the authors studied "high proliferation", "infected" or "cleared" zebrafish?

      In Figure 2 we have studied the "infected" group. Both "high proliferation" and "cleared" larvae were excluded from the analysis. This is now clearly stated in the legend of Figure 2.

      3) Figure 3 and 4: It would be very informative if the authors can tell us what proportion of Salmonella is associated with macrophages and neutrophils. From panel C and D (Figure 3) and Figure 4 C and D and Suppl Fig 1, it seems that a lot of bacteria are extracellular. Maybe an EM image of the tissue would help to understand if the bacteria are "all" intracellular or extracellular.

      We apologize for any misunderstanding regarding the presence of intra- and extracellular bacteria depicted in Figure 3 C and D, Figure 4 C and D and Figure 3 -Suppl Fig 1. These figures illustrate infection experiments conducted in single-reporter larvae, limiting our analysis to bacteria associated with a single cell type. In Figure 3G and Figure 4E-G, the panels depict infection experiments carried out in dual-reporter larvae, showing bacteria associated or not with macrophages and neutrophils. The present study aimed to establish the role of neutrophils and macrophages in the control of early and persistent Salmonella infection, but further studies will focus on the exact localization of Salmonella during the course of the infection. Despite being a challenging technique for zebrafish, electron microscopy could be of great interest, allowing visualization of any type of cell at high resolution (to determine if all bacteria are intracellular).

      4) Figure 3 and 4: It would be very useful if the authors can tell us if the intracellular bacteria are mainly found individually (like in Figure 3C) or whether host cells harbor many intracellular bacteria. Looking at figure 4G: it is not clear to me how many intracellular bacteria can be counted on this image.

      This is an interesting suggestion. At present, an accurate quantification of the intracellular bacteria on microscopy 3D-datasets is challenging because bacteria aggregate inside the cells. At 4 hpi, single bacteria can occasionally be observed outside leukocytes, while most infected macrophages harbored several intracellular bacteria (bacterial aggregates). To compare the levels of intracellular bacteria between acute and persistent stages, we measured the size of E2Crimson-positive (E2Crimson+) events. At 5 hpi, the median volume of E2Crimson+ events was lower than that at 4 dpi. The size distribution analysis of E2Crimson+ events indicated a higher representation of smaller volumes (0.5-1.5 µm³ and 1.5-10 µm³) at 5 hpi compared to 4 dpi, a stage during which very large E2Crimson+ events were observed (between 100-1000 µm³, with some exceeding 1000 µm³). This observation suggests an elevated presence of intracellular bacteria within the cells during persistent stages and that intracellular bacteria are predominantly observed as multiple rather than as solitary entities. This analysis has been incorporated in new Figure 5.

      5) Figure 3 and 4: The authors should also perform an experiment with a Salmonella strain harboring a growth reporter to quantify the amount of replicating and non-replicating bacteria. This experiment is not absolutely necessary for the story, but if possible, it would provide a very nice add-up to the story and impact to the paper.

      We welcome the reviewer's suggestion, which we have indeed considered and plan to carry out in the future, along with experiments more oriented toward the bacterial side.

      6) Figure 6: The authors should provide in suppl. the flow cytometry scatter plots used to delineate the different subpopulations.

      We agree with the reviewer that the flow cytometry scatter plots used to delineate the different subpopulations were missing; they are now incorporated in new Fig 7 - figure supplement 2.

      7) Figure 6: A specific characterization of macrophages harboring Salmonella persisters at 4dpi is missing. As shown by the authors in Figure 6, the tnfa- populations of macrophages at 4dpi are very similar for both infected and non-infected larvae. Persisters should indeed reside within tnfa- macrophages but they should also induce a specific signature through the actions of Salmonella effectors. Measuring this signature will allow a direct comparison with published data in mice and assess how accurately the zebrafish model recapitulates the manipulation of macrophages by Salmonella.

      We agree with the reviewer that a specific characterization of macrophages harboring persistent Salmonella at 4 dpi is missing. However, due to a technical limitation inherent to the model (limited recovery of infected cells following FACS sorting), we were not able to specifically sort infected macrophages at 4 dpi.

    1. Author Response

      Reviewer #1 (Public Review):

      This paper combines a number of cutting-edge approaches to explore the role of a specific mouse retinal ganglion cell type in visual function. The approaches used include calcium imaging to measure responses of RGC populations to a collection of visual stimuli and CNNs to predict the stimuli that maximally activate a given ganglion cell type. The predictions about feature selectivity are tested and used to generate a hypothesized role in visual function for the RGC type identified as interesting. The paper is impressive; my comments are all related to how the work is presented.

      We thank the reviewer for appreciating our study and for the interesting comments.

      Is the MEI approach needed to identify these cells?

      To briefly summarize the approach, the paper fits a CNN to the measured responses to a range of stimuli, extracts the stimulus (over time, space, and color) that is predicted to produce a maximal response for each RGC type, and then uses these MEIs to investigate coding. This reveals that G28 shows strong selectivity for its own MEI over those of other RGC types. The feature of the G28 responses that differentiate it appears to be its spatially-coextensive chromatic opponency. This distinguishing feature, however, should be relatively easy to discover using more standard approaches.

      The concern here is that the paper could be read as indicating that standard approaches to characterizing feature selectivity do not work and that the MEI/CNN approach is superior. There may be reasons why the latter is true that I missed or were not spelled out clearly. I do think the MEI/CNN approach as used in the paper provides a very nice way to compare feature selectivity across RGC types - and that it seems very well suited in this context. But it is less clear that it is needed for the initial identification of the distinguished response features of the different RGC types. What would be helpful for me, and I suspect for many readers, is a more nuanced and detailed description of where the challenges arise in standard feature identification approaches and where the MEI/CNN approaches help overcome those challenges.

      Thank you for the opportunity for clarification. In fact, the MEI (or an alternative nonlinear approach) is strictly necessary to discover this selectivity: as we show above (response #1 to editorial summary), the traditional linear filter approach does not reveal the color opponency. We realize that this fact was not made sufficiently clear in the initial submission. In the revised manuscript, we now include this analysis. Moreover, throughout the manuscript, we added explanations on the differences between MEIs and standard approaches and more intuitions about how to interpret MEIs. We also added a section to the discussion dedicated to explaining the advantages and limitations of the MEI approach.

      Interpretation of MEI temporal structure

      Some aspects of the extracted MEIs look quite close to those that would be expected from more standard measurements of spatial and temporal filtering. Others - most notably some of the temporal filters - do not. In many of the cells, the temporal filters oscillate much more than linear filters estimated from the same cells. In some instances, this temporal structure appears to vary considerably across cells of the same type (Fig. S2). These issues - both the unusual temporal properties of the MEIs and the heterogeneity across RGCs of the same type - need to be discussed in more detail. Related to this point, it would be nice to understand how much of the difference in responses to MEIs in Figure 4d is from differences in space, time, or chromatic properties. Can you mix and match MEI components to get an estimate of that? This is particularly relevant since G28 responds quite well to the G24 MEI.

      One advantage of the MEI approach is that it allows us to distinguish between transient and sustained cells in a way that is not possible with the linear filter approach: Because we seek to maximize activity over an extended period of time, transient cells need to be repetitively stimulated whereas sustained cells will also respond in the absence of multiple contrast changes. In the revised manuscript, we add a section explaining this, together with Figure 3-supplement 2, illustrating this point by showing that oscillations disappear when we optimize the MEI for a short time window. The benefit of a longer time window lies in the increased discriminability between transient and sustained cells, which is also shown in the new supplementary figure.

      Regarding the heterogeneity of MEIs, this is most likely due to heterogeneity within the RGC group: “The mixed non-direction-selective groups G17 and G31 probably contain more than one type, as supported by multiple distinct morphologies and genetic identities (for example, G31,32, Extended Data Fig. 5) or response properties (for example, G17, see below)” (Baden et al. Nature 2016). We added a paragraph in the Results section.

      Concerning the reviewer’s last point: We agree that it is important to know whether the defining feature - i.e., the selectivity for chromatic contrast - is robust against variations in other stimulus properties. New electrophysiological data included in the manuscript (Fig. 6e,f) offers some insights here. We probed G28/tSbC cells with full-field flashed stimuli that varied in chromatic contrast. Despite not matching the cell’s preferred spatial and temporal properties, this stimulus still recovered the cell’s preference for chromatic contrast. While we think it is an interesting direction to systematically quantify the relative importance of temporal, spatial and chromatic MEI properties for an RGC type’s responses, we think this is beyond the scope of this manuscript.

      Explanation of RDM analysis

      I really struggled with the analysis in Figure 5b-c. After reading the text several times, this is what I think is happening. Starting with a given RGC type (#20 in Figure 5b), you take the response of each cell in that group to the MEI of each RGC type, and plot those responses in a space where the axes correspond to responses of each RGC of this type. Then you measure euclidean distance between the responses to a pair of MEIs and collect those distances in the RDM matrix. Whether correct or not, this took some time to arrive at and meant filling in some missing pieces in the text. That section should be expanded considerably.

      We appreciate the reviewer’s efforts to understand this analysis and confirm that they interpreted it correctly. However, we decided to remove the analysis. The point we were trying to make with it is that the transformation implemented by G28/tSbC cells “warps” stimulus space and increases the discriminability of stimuli whose characteristics are similar to the cell’s MEI. We now make this point in what we think is a more accessible manner, through the new analysis of the nonlinearity of the G28/tSbC cells’ color opponency (see above).
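
      For readers who want a concrete picture of the procedure as the reviewer summarized it, the following is a minimal sketch, not the code used in the study. It assumes the responses of one RGC group are stored as a matrix with one row per MEI stimulus and one column per cell; the function name and the array layout are hypothetical.

```python
import numpy as np


def representational_dissimilarity_matrix(responses):
    """Pairwise Euclidean distances between population responses.

    responses: array of shape (n_stimuli, n_cells). Row i holds the response
    of every cell in the group to stimulus i (here, the MEI of RGC type i).
    Returns an (n_stimuli, n_stimuli) symmetric matrix with a zero diagonal.
    """
    r = np.asarray(responses, dtype=float)
    if r.ndim != 2:
        raise ValueError("responses must be a 2-D array (stimuli x cells)")
    # Broadcast to all stimulus pairs, then take the norm over the cell axis.
    diff = r[:, None, :] - r[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))
```

      Each row of the input is one point in the space whose axes are the individual cells of the group, so entry (i, j) of the output is the distance between the population responses to MEI i and MEI j.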

      Centering of MEIs

      How important is the lack of precise centering of the MEIs when you present them? It would be helpful to have some idea about that - either from direct experiments or using a model.

      In the electrophysiological experiments, the MEIs were centered precisely (now Fig. 5 in revised manuscript) and these experiments yielded almost identical results to the 2P imaging experiments, where the MEIs were presented on a grid to approach the optimal position for the recorded cells. Additionally, all model simulations work with perfectly centered MEIs. We hence conclude that our grid approach to presenting stimuli provided sufficient precision in stimulus positioning.

      We added this information to the revised manuscript.

      Reviewer #2 (Public Review):

      This paper uses two-photon imaging of mouse ganglion cells responding to chromatic natural scenes along with convolutional neural network (CNN) models fit to the responses of a large set of ganglion cells. The authors analyze CNN models to find the most effective input (MEI) for each ganglion cell as a novel approach to identifying ethological function. From these MEIs they identify chromatic opponent ganglion cells, and then further perform experiments with natural stimuli to interpret the ethological function of those cells. They conclude that a type of chromatic opponent ganglion cell is useful for the detection of the transition from the ground to the sky across the horizon. The experimental techniques, data, and fitting of CNN models are all high quality. However, there are conceptual difficulties with both the use of MEIs to draw conclusions about neural function and the ethological interpretations of experiments and data analyses, as well as a lack of comparison with standard approaches. These bear directly both on the primary conclusions of the paper and on the utility of the new approaches.

      We thank the reviewer for the detailed comments.

      1) Claim of feature detection.

      The color opponent cells are cast as a "feature detector" and the term 'detector' is in the title. However insufficient evidence is given for this, and it seems likely a mischaracterization. An example of a ganglion cell that might qualify as a feature detector is the W3 ganglion cell (Zhang et al., 2012). These cells are mostly silent and only fire if there is differential motion on a mostly featureless background. Although this previous work does not conduct a ROC analysis, the combination of strong nonlinearity and strong selectivity are important here, giving good qualitative support for these cells as participating in the function of detecting differential motion against the sky. In the present case, the color opponent cells respond to many stimuli, not just transitions across the horizon. In addition, for the receiver operator characteristic (ROC) analysis as to whether these cells can discriminate transitions across the horizon, the area under the curve (AUC) is on average 0.68. Although there is not a particular AUC threshold for a detector or diagnostic test to have good discrimination, a value of 0.5 is chance, and values between 0.5 and 0.7 are considered poor discrimination, 'not much better than a coin toss' (Applied Logistic Regression, Hosmer et al., 2013, p. 177). The data in Fig. 6F is also more consistent with a general chromatic opponent cell that is not highly selective. These cells may contribute information to the problem of discriminating sky from ground, but also to many other ethologically relevant visual determinations. Characterizing them as feature detectors seems inappropriate and may distract from other functional roles, although they may participate in feature detection performed at a higher level in the brain.

      The reviewer apparently uses a rather narrow definition of a feature detector. We, however, argue for a broader definition, which, in our view, better captures the selectivities described for RGCs in the literature. For example, while W3 cells have been studied quite extensively, one can probably agree that so far only a fraction of the possible stimulus space has been explored. Therefore, it cannot be excluded that W3 cells also respond to features other than small dark moving dots, but we (like the reviewer) still refer to them as feature detectors. Similarly, direction-selective (DS) RGCs are commonly considered feature detectors (i.e., responsive to a specific motion direction), although they also respond to flashes and spike when null-direction motion is paused (Barlow & Levick J Physiol 1965).

      The G28/tSbC cells’ selectivity for full-field changes in chromatic contrast enables them to encode ground-sky horizon transitions reliably across stimulus parameters (e.g., see new Fig. 7i panel). This cell type is thus well-suited to contribute to detecting context changes, as elicited by ground-sky transitions.

      Therefore, we think that the G28/tSbC RGC can be considered a feature detector and as such, could be used at a higher level in the brain to quickly detect changes in visual context (see also Kerschensteiner Annu Rev Vis Sci 2022). Still, their signals may also be useful for other computations (e.g., defocus, as discussed in our manuscript).

      Regarding the ROC analysis, we acknowledge that an average AUC of .68 may seem comparatively low; however, this is based on the temporally downsampled information (i.e., by way of Ca2+ imaging) gathered from the activity of a single cell. A downstream area would have access to the activity of a local population of cells. This AUC value should therefore be considered a lower bound on the discrimination performance of a downstream area. We now comment on this in the manuscript.
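
      As a reference point for the AUC values discussed here, the following minimal sketch computes the area under the ROC curve directly from two sets of single-cell response scores. It is an illustration under stated assumptions, not the analysis code of the study: the function name is hypothetical, and the inputs are assumed to be one scalar response per stimulus, split into target stimuli (e.g., ground-to-sky transitions) and non-target stimuli.

```python
def roc_auc(target_scores, nontarget_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly drawn target response exceeds a
    randomly drawn non-target response, counting ties as one half.
    """
    pos = [float(x) for x in target_scores]
    neg = [float(x) for x in nontarget_scores]
    if not pos or not neg:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # target response ranked above non-target
            elif p == n:
                wins += 0.5      # ties contribute half a win
    return wins / (len(pos) * len(neg))
```

      With this definition, indistinguishable response distributions give 0.5 (chance) and perfectly separated ones give 1.0. The same function could be applied to scores pooled across a local population of cells, which is the situation a downstream area would face, although how such pooling is done is an assumption outside this sketch.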

      2) Appropriateness of MEI analysis for interpretations of the neural code.

      There is a fundamental incompatibility between the need to characterize a system with a complex nonlinear CNN and then characterizing cells with a single MEI. MEIs represent the peak in a complex landscape of a nonlinear function, and that peak may or may not occur under natural conditions. For example, MEIs do not account for On-Off cells, On-Off direction selectivity, nonlinear subunits, object motion sensitivity, and many other nonlinear cell properties where multiple visual features are combined. MEIs may be a useful tool for clustering and distinguishing cells, but there is not a compelling reason to think that they are representative of cell function. This is an open question, and thus it should not be assumed as a foundation for the study. This paper potentially speaks to this issue, but there is more work to support the usefulness of the approach. Neural networks enable a large set of analyses to understand complex nonlinear effects in a neural code, and it is well understood that the single-feature approach is inadequate for a full understanding of sensory coding. A great concern is that the message that the MEI is the most important representative statistic directs the field away from the primary promise of the analysis of neural networks and takes us back to the days when only a single sensory feature is appreciated, now the MEI instead of the linear receptive field. It is appropriate to use MEI analyses to create hypotheses for further experimental testing, and the paper does this (and states as much) but it further takes the point of view that the MEI is generally informative as the single best summary of the neural code. The representation similarity analysis (Fig. 5) acts on the unfounded assumption that MEIs are generally representative and conveys this point of view, but it is not clear whether anything useful can be drawn from this analysis, and therefore this analysis does not support the conclusions about changes in the representational space. Overall this figure detracts from the paper and can safely be removed. In addition, in going from MEI analysis to testing ethological function, it should be made much more clear that MEIs may not generally be representative of the neural code, especially when nonlinearities are present that require the use of more complex models such as CNNs, and thus testing with other stimuli are required.

      The reviewer correctly characterizes MEIs as representing the peak in a nonlinear loss landscape that, in this case, describes the neurons’ tuning. As such, the MEI approach is indeed capable of characterizing nonlinear neuronal feature selectivities that are captured by a nonlinear model, such as the CNN we used here. We therefore disagree with the suggestion that MEIs should not be used “when nonlinearities are present that require the use of more complex models such as CNNs”. It is unclear which other “analysis of neural networks” the reviewer refers to; MEIs are one approach to analyzing the predictive neural network.

      We also want to clarify that, while the reviewer is correct in stating that the MEI approach as used here only identifies a single peak, this does not mean that it cannot capture neuronal selectivities for a combination of features, as long as this combination of features can be described as a point in high-dimensional stimulus space. In fact, this is demonstrated in our manuscript for the case of G28/tSbC cell’s selectivity for large or full-field, sustained changes in chromatic contrast (a combination of spatial, temporal, and chromatic features). While approaches similar to the one used here generate several diverse exciting inputs (Ding et al. bioRxiv 2023) and could therefore also fully capture On-Off selectivities, we pointed out the limitation of MEIs when describing On-Off cells in the manuscript (both original and revised).

      Regarding the reviewer’s concern that “[...] the message that the MEI is the most important representative statistic [...] takes us back to the days when only a single sensory feature is appreciated”: it was certainly not our intention to proclaim MEIs as the ultimate representation of a cell’s response features, and we have clarified this in the revised manuscript. However, we also think that (i) in applying a nonlinear method to extract chromatic, temporal, and spatial response properties from natural movie responses, we go beyond many characterizations that use linear methods to extract only spatial or temporal, achromatic response properties from static, white-noise stimuli. That said, we agree that (ii) expanding around the peak is desirable, and we do so in an additional analysis (new Fig. 6); still, reducing complexity to a manageable degree (at least at first) is useful and even necessary when discovering novel response properties.

      Concerning the representational similarity analysis (RSA): the point we were trying to make with this analysis is that the transformation implemented by G28 “warps” stimulus space and increases the discriminability of stimuli whose characteristics are similar to the cell’s MEI. We have now made this point in a more accessible fashion through the above-mentioned analysis, where we extended the estimate around the peak. We therefore agree to remove the RSA from the paper.

      In the revised manuscript, we (a) discuss the advantages and limitations of the MEI approach in more detail (in Results and Discussion; see also our reply #1) and (b) replaced the RSA analysis.

      3) Usefulness of MEI approach over alternatives. It is claimed that analyzing the MEI is a useful approach to discovering novel neural coding properties, but to show the usefulness of a new tool, it is important to compare results to the traditional technique. The more standard approach would be to analyze the linear receptive field, which would usually come from the STA of white noise measurement, but here this could come from the linear (or linear-nonlinear) model fit to the natural scene response, or by computing an average linear filter from the natural scene model. It is important to assess whether the same conclusion about color opponency can come from this standard approach using the linear feature (average effective input), and whether the MEIs are qualitatively different from the linear feature. The linear feature should thus be compared to MEIs for Fig. 3 and 4, and the linear feature should be compared with the effects of natural stimuli in terms of chromatic contrast (Fig. 6b). With respect to the representation analysis (Fig. 5), although I don't believe this is meaningful for MEIs, if this analysis remains it should also be compared to a representation analysis using the linear feature. In fact, a representation analysis would be more meaningful when performed using the average linear feature as it summarizes a wider range of stimuli, although the most meaningful analysis would be directly on a broader range of responses, which is what is usually done.

      We agree that the comparison with a linear model is an important validation. Therefore, we performed an additional analysis (see also reply #1, as well as Fig. 6 and corresponding section in the manuscript) which demonstrates that an LN model does not recover the chromatic feature selectivity. This finding supports our claims about the usefulness of the MEI approach over linear approaches.

      Regarding the comment on the representation analysis, as mentioned above, we consider it replaced by the analysis comparing results from an LN model and a nonlinear CNN.

      4) Definition of ethological problem. The ethological problem posed here is the detection of the horizon. The stimuli used do not appear to relate to this problem as they do not include the horizon and only include transitions across the horizon. It is not clear whether these stimuli would ever occur with reasonable frequency, as they would only occur with large vertical saccades, which are less common in mice. More common would be smooth transitions across the horizon, or smaller movements with the horizon present in the image. In this case, cells which have a spatial chromatic opponency (which the authors claim are distinct from the ones studied here) would likely be more important for use in chromatic edge detection or discrimination. Therefore the ethological relevance of any of these analyses remains in question.

      It is further not clear if detection is even the correct problem to consider. The horizon is always present, but the problem is to determine its location, a conclusion that will likely come from a population of cells. This is a distinct problem from detecting a small object, such as a small object against the background of the sky, which may be a more relevant problem to consider.

      Thank you for giving us the opportunity to clear these things up. First, we would like to clarify that we propose that G28/tSbC cells contribute to detecting context changes, such as transitions across the horizon from ground to sky, not to detecting the horizon itself. We acknowledge that we were not clear enough about this in the manuscript and have corrected this. To back up our hypothesis that G28 RGCs contribute to detecting context changes, we performed an additional simulation analysis, which is described in our reply #3 (see above).

      5) Difference in cell type from those previously described. It is claimed that the chromatic opponent cells are different from those previously described based on the MEI analysis, but we cannot conclude this because previous work did not perform an MEI analysis. An analysis should be used that is comparable to previous work; the linear spatiotemporal receptive field should be sufficient. However, there is a concern that because linear features can change with stimulus statistics (Hosoya et al., 2005), a linear feature fit to natural scenes may be different than those from previous studies even for the same cell type. The best approach would likely be presenting a white noise stimulus to the natural scenes model to compute a linear feature, which still carries the assumption that this linear feature from the model fit to a natural stimulus would be comparable to previous studies. If the previous cells have spatial chromatic opponency and the current cells only have chromatic opponency in the center, there should be both types of cells in the current data set. One technical aspect relating to this is that MEIs were space-time separable. Because the center and surround have a different time course, enforcing this separability may suppress sensitivity in the surround. Therefore, it would likely be better if this separability were not enforced in determining whether the current cells are different than previously described cells. As to whether these cells are actually different than those previously described, the authors should consider the following uncited work: (Ekesten & Gouras, 2005), which identified chromatic opponent cells in mice in approximate numbers to those here (~ 2%). In addition, (Yin et al., 2009) in guinea pigs and (Michael, 1968) in ground squirrels found color-opponent ganglion cells without effects of a spatial surround as described in the current study.

      First of all, we did not intend to claim to have discovered a completely new type of color-opponent tuning in general; what we were trying to say is that tSbC cells display spatially co-extensive color opponency, a feature selectivity previously not described in this mouse RGC type, and which may be used to signal context changes as elicited by ground-sky transitions.

      Concerning the reviewer’s first argument about a lack of comparability of our results to results previously obtained with a different approach: We think that this is now addressed by the new analysis (new Fig. 6), where we show why linear methods are limited in their capability to recover the type of color opponency that we discovered with the MEI approach.

      Regarding the argument about center-surround opponency, we agree that “if the previous cells have spatial chromatic opponency and the current cells only have chromatic opponency in the center, there should be both types of cells in the current data set”. We did not focus on analyzing center-surround opponency in the present study, but from the MEIs, it is visible that many cells have a stronger antagonistic surround in the green channel compared to the UV channel (see Fig. 4a, example RGCs of G21, G23, G24; Figure 3-supplement 1 example RGCs of G21, G23, G24, G31, G32). Importantly, the MEIs shown in Fig. 4a were also shown in the verification experiment, and had G28 RGCs preferred this kind of stimulus, they would have responded preferentially to these MEIs, which was not the case (Fig. 4f).

      It should also be noted here that, while the model’s filters were space-time separable, we did not impose a restriction on the MEIs to be space-time separable during optimization. However, we analyzed only the rank 1 components of the MEIs (see Methods section Validating MEIs experimentally), since our analysis focused on aspects of retinal processing not contingent on spatiotemporal interactions in the stimulus.
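
      To make the notion of a rank 1 (space-time separable) component concrete, the following is a minimal sketch based on the singular value decomposition. It is not the procedure from the Methods; it assumes that one color channel of an MEI has been arranged as a 2-D array with time along the rows and flattened space along the columns, and the function name is hypothetical.

```python
import numpy as np


def rank1_component(mei):
    """Best rank 1 (space-time separable) approximation of a 2-D MEI.

    mei: array of shape (n_time, n_space) for a single color channel.
    Returns (temporal, spatial, approximation), where approximation is the
    outer product of the temporal and spatial components.
    """
    m = np.asarray(mei, dtype=float)
    if m.ndim != 2:
        raise ValueError("mei must be a 2-D array (time x space)")
    # Reduced SVD; singular values are returned in descending order.
    u, s, vt = np.linalg.svd(m, full_matrices=False)
    temporal = u[:, 0] * s[0]   # temporal kernel, scaled by the top singular value
    spatial = vt[0]             # unit-norm spatial kernel
    return temporal, spatial, np.outer(temporal, spatial)
```

      The signs of the two returned vectors are only defined jointly, but their outer product is unique. For a stimulus that is already separable, the approximation reproduces the input; any residual reflects spatiotemporal interactions that a rank 1 analysis discards.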

      In summary, we are convinced that our finding of center-opponency in G28 is not an artifact of the methodology.

      We discuss this in the manuscript and add the references mentioned by the reviewer to the respective part of the Discussion.

      Reviewer #3 (Public Review):

      This study aims to discover ethologically relevant feature selectivity of mouse retinal ganglion cells. The authors took an innovative approach that uses large-scale calcium imaging data from retinal ganglion cells stimulated with both artificial and natural visual stimuli to train a convolutional neural network (CNN) model. The resulting CNN model is able to predict stimuli that maximally excite individual ganglion cell types.

      The authors discovered that modeling suggests that the "transient suppressed-by-contrast" ganglion cells are selectively responsive to Green-Off, UV-On contrasts, a feature that signals the transition from the ground to the sky when the animal explores the visual environment. They tested this hypothesis by measuring the responses of these suppressed-by-contrast cells to natural movies, and showed that these cells are preferentially activated by frames containing ground-to-sky transitions and exhibit the highest selectivity of this feature among all ganglion cell types. They further verified this novel feature selectivity by single-cell patch clamp recording.

      This work is of high impact because it establishes a new paradigm for studying feature selectivity in visual neurons. The data and analysis are of high quality and rigor, and the results are convincing. Overall, this is a timely study that leverages rapidly developing AI tools to tackle the complexity of both natural stimuli and neuronal responses and provides new insights into sensory processing.

      We thank the reviewer for appreciating our study.

    1. Author Response

      Reviewer #3 (Public Review):

      This manuscript uses ASO to inhibit the self-cleaving ribozyme within CPEB intron 3 and test its effect on CPEB3 expression and memory consolidation. The authors conclude that the intronic ribozyme negatively affects CPEB3 mRNA splicing and expression, and suggests its implications for experience-induced gene expression underlying learning and memory.

      The strength of the manuscript is in its exploration of a potentially novel mechanism of regulating CPEB3 expression in learning and memory, a combination of both biochemical and behavioral approaches to gain a wide perspective of this regulatory mechanism, and the application of ASO in this context. The introduction is sufficiently detailed. Statistics are thorough and appropriate. If the results could be more robust, the mechanism would provide a novel target and venue to modify learning and memory paradigm.

      The weakness of the manuscript is that the magnitude of the activity-dependent regulation of ribozyme, the effects of ASOs on CPEB3 expression (mRNA and protein) and downstream target gene expression, in vitro and in vivo, are generally weak, raising concerns about the robustness of the result. This may have caused some of the inconsistencies between the data presentation (see below). Also unclear is whether the ribozyme activity is physiologically regulated by experience without ASO interference.

      While the statistical tests support the corresponding figure panels and their conclusions, the manuscript can be significantly strengthened by additional evidence, clarification of some methodologies, and reconciling some inconsistent results.

      The premise of a comparable timescale between transcription and ribozyme activity as the foundation of the whole thesis was based on in vitro measurement of self-scission half-life and a broadly generalized transcription rate (which actually varies significantly between genes). This premise is weak and needs direct experimental support.

      The physiological relevance of the proposed mechanism has yet to be demonstrated without ASO interference.

      Fig2b: how were total and uncleaved Ribozymes measured by qRT-PCR? Where are the primers' locations? If the two products were amplified using different primers, their subtraction to derive % cleavage would not be appropriate.

      We thank the reviewer for the thoughtful review. We measured the levels of the total ribozyme by measuring a 220-bp amplicon that starts 18 nts downstream from the ribozyme cleavage site. The uncleaved ribozyme levels were measured using oligos that amplify a region of the intron that starts 45 nts upstream and ends 238 nts downstream of the ribozyme cleavage site. We added this information to the Table of primers in the manuscript. For all PCR oligos we established independent standard curves and calculated RNA levels independently of other amplicons, as noted in the Methods section and now specified in the Results section as well (Page 15). The measurements were thus appropriate for the calculation of the cleaved ribozyme fractions in the various experiments. The fraction ribozyme cleaved was calculated from the uncleaved fraction as the difference between uncleaved fraction and unity (1 – fraction uncleaved), now specified on page 16 of the manuscript. Fraction uncleaved was calculated as [uncleaved ribozyme]/[total ribozyme], as was done previously (see Salehi-Ashtiani et al. Science 313:1788-1792 or Webb et al. Science 326:953).
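
      The calculation described above can be written out as a short sketch. The fraction formulas follow the text directly; the conversion from a Ct value to an RNA quantity assumes the conventional log-linear standard curve (Ct = slope × log10(quantity) + intercept), which is an assumption on our part about the form of the curves, and all function and parameter names are hypothetical.

```python
def quantity_from_ct(ct, slope, intercept):
    """Convert a qPCR Ct value to a quantity using one amplicon's standard curve.

    Assumes Ct = slope * log10(quantity) + intercept, fitted independently
    for each amplicon.
    """
    if slope == 0:
        raise ValueError("standard-curve slope must be non-zero")
    return 10.0 ** ((ct - intercept) / slope)


def fraction_cleaved(uncleaved, total):
    """Fraction cleaved = 1 - [uncleaved ribozyme] / [total ribozyme]."""
    if total <= 0:
        raise ValueError("total ribozyme level must be positive")
    if uncleaved < 0:
        raise ValueError("uncleaved ribozyme level cannot be negative")
    fraction_uncleaved = uncleaved / total
    return 1.0 - fraction_uncleaved
```

      Because each amplicon is first converted to a quantity through its own standard curve, the ratio of uncleaved to total ribozyme compares RNA levels rather than raw Ct values from different primer pairs.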

      Line 400-403: shouldn't ribozyme-blocking ASO prevent ribozyme self-cleavage, and as a result should further increase ribozyme levels? This would contradict the result in fig3a.

      We showed that the ribozyme is inhibited in vitro (Fig. 1F and 1G) and all our data are consistent with ASO inhibition of the ribozyme in cellulo and in vivo. However, we do not have direct evidence for this ribozyme inhibition in vivo, because such an experiment would require a single-molecule FRET-type sensitivity in cells and this assay has not been developed for ribozyme cleavage in cellulo or in vivo. We measured the ribozyme levels by RT-qPCR and observed lower ribozyme levels in the presence of ASO in cultured neurons (Fig. 3A) as well as in vivo (Fig. 5B), which is nominally in contrast to the observations in vitro. However, in these situations we do not measure the co-transcriptional fate of the intron or the ribozyme; rather, we measure the levels of the intron after splicing (evidenced by the increased levels of spliced exons 2–3), when the intron is likely already being degraded. We also do not know what effect the ribozyme ASO has on the intron stability once splicing occurs. We acknowledge that this is a weakness of the study, and we are fully open about this result. However, given the abundance of evidence that the ribozyme ASO leads to an increase in CPEB3 mRNA under all conditions tested, we feel that there is strong, if indirect, evidence that our model for the ribozyme function is correct. Future studies will examine this issue more closely, but a definitive experimental investigation of the mechanism and timing of ribozyme inhibition and intron degradation is outside the scope of this study.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer 1 (Public Review):

      Weakness: Although the cross-links stimulate ATP hydrolysis, further controls are needed to convince me that the TM1 conformations observed in the structures are physiologically relevant, since they have been trapped by "large" substrates covalently-tethered by crosslinks.

      Our response: Reviewer 1 raised concerns about the relatively large size of our covalently attached AAC substrate that would potentially distort TM1 in Pgp. We would like to clarify that AAC has a molecular weight of 462 Da, which, in comparison to many known Pgp substrates ranging from 250 to over 1,000 Da, is not a large compound. For instance, the few other Pgp substrates mentioned in our manuscript all have a comparable or larger size: verapamil, 455 Da; doxorubicin, 544 Da; FK506, 804 Da; valinomycin, 1,111 Da; cyclosporin A, 1,203 Da.

      Furthermore, AAC was strategically attached to a site distant from TM1 in the inward-facing Pgp conformation. After it was exported to the outward-facing state, several TM helices accommodate the compound. The observation that only TM1 exhibited significant conformational changes suggests its potential role in the transport mechanism. This hypothesis is supported by our findings, where a conservative substitution (G72A) in TM1 resulted in a dramatic loss of transport function for various drug substrates and impaired verapamil-stimulated ATPase activity.

      Reviewer 1 (Recommendations for the Authors):

      I understand the need for an unconventional approach to understanding the translocation pathway. What would help to support this model is to cross-link a much smaller substrate, as the one used is quite large and could potentially distort TM1 in the outward-state when cross-linked.

      Our response: We thank the reviewer for this recommendation, and we have outlined plans for future experiments involving other substrates, including smaller ones, to further investigate our proposed model. However, it is important to acknowledge that conducting these studies will require a significant amount of effort and resources, which we believe extend beyond the scope of our current manuscript.

      In unbiased MD simulations starting from the IF state are there any simulations where the substrate follows the same path as proposed here?

      Our response: All our MD simulations were performed in the outward-facing state to focus on potential substrate release pathways. Starting MD simulations from the inward-facing state would introduce complexities in capturing the necessary domain motions and nucleotide binding and hydrolysis required for substrate translocations. Therefore, we opted not to perform MD studies starting from the inward-facing state.

      Reviewer 2 (Public Review):

      Weakness: There is much to like about the experimental work here but I am less sanguine on the interpretation. The main idea is to covalently link via disulfide bonds a model tripeptide substrate under different conditions that mimic transport and then image the resulting conformations. The choice of the Pgp cysteine mutants here is critical but also poses questions regarding the interpretation. What seems to be missing, or not reported, is a series of control experiments for further cysteine mutations.

      Our response: Reviewer 2 raised concerns about the interpretation of our results and suggested the need for additional mutant designs to validate our proposed TM1 mechanism. Firstly, we believe that the observed TM1 conformational changes are valid in our cryoEM structures, despite the use of different conditions and several mutants to capture Pgp in the outward-facing state.

      Regarding the G72A mutant, we consider it conclusive that this single point mutation in the TM1 has a profound effect. Importantly, the G72A mutant was readily expressed and purifiable as a stable protein. We were able to resolve a high-resolution structure of the G72A mutant (without the substrate), confirming that the protein is not generally destabilized but properly folded.

      Above all, we appreciate the Reviewer’s suggestion to explore additional mutations and intend to do so in future studies.

      Reviewer 2 (Recommendations for the Authors):

      I am sold on the results regarding TM1 conformational changes as they are evident in the cryoEM structures. However, the set of states compared between mutants are not biochemically equivalent: for 335 and 978 they used an ATP-impaired Pgp whereas for 971 they used what appears to be WT, and the conformation was imaged presumably subsequent to ATP hydrolysis and Vanadate trapping. This is significant if the authors were unable to trap the OF in the impaired mutant background and should be highlighted. I have to believe that they tried that condition but I could be wrong.

      Our response: We acknowledge the point made by the Reviewer about the biochemical equivalence of mutant states and the potential significance of using an ATP-impaired mutant for trapping the outward-facing conformation of 971. We have not yet attempted to use the ATPase-deficient 971C mutant for crosslinking and intend to address this question in future studies.

      In our current approach, we used the ATPase-active 971C for two specific reasons:

      1) Our biochemistry data, as shown in Fig 1C, indicates that 971C only crosslinks in the presence of ATP hydrolysis conditions. Vanadate trapping was employed to stabilize the outward-facing conformation.

      2) In our experience, the conformations of the ATP-bound (mutant) and vanadate-trapped states of an ABC transporter are structurally equivalent at the resolution of our study (see ref. 21: Hoffmann et al. NATURE 2019).

      The authors propose a new model for substrate translocation. It is based on three mutants and a number of structures. If the authors were not challenging the current dogma I would not have written the next comment. Considering the impact of the findings, I would have designed a couple more cysteine mutants based on their model. For instance, this pathway has a number of stabilizing interactions, can't they make a mutant that preserves conformational switching but eliminates substrate translocation? I like the G97A mutant result but I am worried that the effect could just be a general destabilization or misfolding as part of the cryoEM particles seem to suggest. The authors advance one interpretation of the disorder observed in this mutant but it could easily be my interpretation.

      Our response: We thank the reviewer for the suggestion to design additional mutants to further validate our proposed model for substrate translocation. We agree that this would be highly valuable, considering the potential impact of our findings. However, given the time-intensive nature of our approach, we believe that presenting these additional designs in a future study is a reasonable course of action.

      Regarding the G72A mutation, we believe that our current data fully supports our model and the role of TM1 in regulating the Pgp activity. Importantly, we would like to emphasize that the G72A mutant was readily expressed and purifiable as a stable protein. Additionally, our cryoEM structural determination of the G72A mutant at high resolution confirmed that the protein is not generally destabilized but properly folded.

      There are a couple of troubling methodological questions that I want the authors to address or clarify:

      1. In the methods they report that the final sample for cryoEM was prepared on a SEC devoid of detergent. It is obvious that the sample was folded but I was wondering why the detergent was removed? Was that critical for observing these structures with multiple ligands? Did they observe any lipids in their cryoEM?

      Our response: We avoid detergent in the buffer on final SEC purification. This step is to remove free detergent from the background which helps during cryoEM imaging. Of course, this cannot be done with every detergent but due to the very low CMC of LMNG it is possible. By now, we have verified this method for several other transporters with the same success. While this procedure helps us to obtain better images it is not necessary to obtain specific conformations or ligand bound states, nor does it affect these states or conformations.

      In our cryoEM structures, we did observe multiple cholesterol hemisuccinate (CHS) molecules on the outer transmembrane surface of Pgp.

      2. Can the authors comment on why labeling was carried out in the presence of ATP? Does it matter if the substrate was added prior to ATP and incubated for a few minutes?

      Our response: For every dataset, we first added the substrate to be cross-linked and afterwards added the ATP. In the cases of 335C and 978C, labeling was successful before ATP was added, as evidenced by the inward-facing structures with cross-linked substrate. However, for 971C, cross-linking only occurred after the addition of ATP. We interpret this data to suggest that the 971 site is inaccessible to the substrate in the inward-facing state, and cross-linking can only occur after the transporter transitions to outward-facing state. This is in line with our inward-facing structure which does not show a cross-linked substrate, and our biochemical data shown in Fig 1C, where 971C only crosslinked in the presence of ATP.

      3. I am not an expert on MD simulations and I understand that carrying out simulations at higher temperatures used to be a trick to accelerate the process. Is this still necessary? Why didn't the author use approaches such as WESTPA?

      Our response: Most so-called enhanced sampling methods, including WESTPA, explicitly define a reaction coordinate for the process of interest, usually based on intuition or prior studies. If this coordinate is chosen poorly, enhanced sampling usually fails, either because the sampling becomes inefficient or because the sampling biases the transition pathway (or both). Lacking reliable intuition or prior knowledge on which motions would result in substrate release, we chose temperature to speed up the process. High temperature largely avoids the introduction of any bias through the definition of a progress coordinate. By contrast, the weighted ensemble method underlying WESTPA is a great method to simulate unbiased dynamics of a process with a known progress coordinate, but unfortunately it requires choosing a progress coordinate prior to the simulation and will then mostly sample the process along this progress coordinate, because this is the only direction in which sampling is improved. High-temperature MD, on the other hand, accelerates all processes in the system under study. Indeed, we have now confirmed that the pathway found at high temperature is also feasible at near-ambient conditions.

      In new simulations, we have now observed a similar release pathway at T=330 K. The only difference is that the substrate has not fully dissociated from the protein after 2.5 μs, with weak interactions persisting at the top part of TM1 from the extracellular side. Importantly, this configuration is also observed in higher-temperature simulations, but with a much shorter lifetime.

      In response, we have now included these new findings and a new Extended Data Fig. 15 in the revised manuscript.

      4. One way to show that the two substrates binding mode is biochemically relevant is to measure Vmax at different substrate concentrations. One would expect a cooperative transition if that interaction is mechanistically important.

      Our response: We have measured Vmax as a function of QZ-Ala concentration in a previous report (ref. 24), supporting positive cooperativity for binding to two sites.

      Reviewer 3 (Public Review):

      We thank Reviewer 3 for recommending the acceptance of our manuscript as is.

      Reviewer 3 (Recommendations for the Authors):

      Page 4, last line: Pgp302 should be Pgp1302. In addition, I can only encourage the authors to add an additional table to the manuscript. Here, the mutation, the obtained structure(s), IF or OF, the resolution, and the main message should be summarized.

      Our response: Following the reviewer’s suggestion, we have added Extended Data Table 2 summarizing the Pgp mutants and respective structural data in the revised manuscript.

      We verified that Pgp302 is the correct term on Page 4, last line.

      Pg. 5, section 'Covalent ligand design for Pgp labeling', it is mentioned that even in the presence of Mg2+ATP, Pgp302 could not react with AAC-DNPT. Maybe it would be worthwhile to add the data either in Supplementary Information or state 'data not shown'.

      Our response: We stated ‘data not shown’ in the text.

      Pg. 47, last line: A space is missing between M68, and M74.

      Our response: Space was added.

      Pg. 7, line 2: The authors mention that a single dataset of ATP-bound Pgp335 revealed three different OF conformations: ligand-free, single-ligand-bound, and double-ligand-bound. However, the percentage fraction of each dataset sums up to be more than 100%. Would request the authors to recalculate the fraction size of each conformation.

      Our response: We have corrected the error in our calculation, based on the particle distribution in our dataset (OF335-nolig: 1,437,110 particles, 40.4%; OF335-1lig: 1,184,253 particles, 33.3%; and OF335-2lig: 939,924 particles, 26.4%).
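      The corrected fractions can be checked directly from the particle counts quoted in the response. The following is a minimal sketch of that arithmetic; the function name `class_fractions` is illustrative and not taken from the authors' processing pipeline.

```python
# Particle counts as quoted in the authors' response.
particle_counts = {
    "OF335-nolig": 1_437_110,
    "OF335-1lig": 1_184_253,
    "OF335-2lig": 939_924,
}


def class_fractions(counts):
    """Return each class's share of the total particle count, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


fractions = class_fractions(particle_counts)
for name, pct in fractions.items():
    print(f"{name}: {pct:.1f}%")
```

      The unrounded shares sum to exactly 100%. After rounding each to one decimal place, the three quoted values add up to 100.1%, which is a rounding effect rather than a counting error.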

      Pg 53, Figure legend of Extended Data Fig. 11: Please include the color coding for the helix TM1 and also the residues colored plum.

      Our response: We added the color coding for TM1 and other residues in the figure legend.

      Pg. 8, line 3: While referring to the structure of OF971-1lig, the authors nicely point towards the conserved residues M74 and F78 which coordinate the ligand. However, in Fig. 3b, residues M74 and F78 should also be indicated.

      Our response: We updated Fig. 3b by adding arrows pointing towards the residues M74 and F78.

      Pg. 54, Extended data Fig. 12: The authors should adopt a single writing style. In some places, Pgp is referred to as P-gp while in others as Pgp.

      Our response: We updated the protein labels in Extended Data Fig. 12.

      Pg. 54, Extended data Fig. 12: The authors should clearly mention which OF335 structure (1st panel) was used for visualizing the interactions.

      Our response: To clarify, we added the following sentences in the figure legend: “Pgp335 OF in the top panel refers to OF335-1lig. In the bottom panel describing OF335-2lig, the left and right diagrams refer to the binding positions of non-covalent and covalent ligand, respectively”.

      Pg. 18, section 'synthesis of dipeptide 8': In the text it is mentioned that for the synthesis of thiazole acid 6, compound 3 was dissolved in a mixture of THF/MeOH/H2O (3:1:1), while in the corresponding figure (Extended Data Fig. 1), the ratio is stated as 5:1:2.

      Our response: 3:1:1 ratio is correct. We made the correction in Extended Data Fig. 1.

      Pg. 19, section 'synthesis of linear tripeptide 10': Same as above for compounds 10 and 4, respectively.

      Our response: We corrected the conditions in the Extended Data Fig. 1 accordingly.

      Pg. 20, section 'Synthesis of cyclic peptide 11': There seems to be a discrepancy in the synthesis protocol between the text and the extended figure 1, especially regarding the use of THF/MeOH/H2O, followed by NaOH and TFA or only NaOH and TFA.

      Our response: We further clarified the conditions of using NaOH in THF/MeOH/H2O (3:1:1) and TFA in DCM in the text for synthesis and Extended Data Fig. 1.

      Pg. 40, Extended Data Fig. 1: In the bottom last panel showing the synthesis of peptide 11, the authors have missed showing peptide 10 as the starting material for the reaction.

      Our response: Label for the peptide 10 was added following the suggestion.

      Pg. 26, third last line: 'o' is missing from the last word cry'o'

      Our response: We corrected the typo.

      Pg. 63 and 64, Extended Data Table 1: The Cryo-EM data collection, refinement, and validation statistics for OF971-1lig, IF971-1lig, OF978-1lig, and IF978-2lig are mentioned twice in the table.

      Our response: This was now corrected in the revision.

    2. eLife assessment

      P-glycoprotein is a major ABC-transporter that exports drugs used in chemotherapy and affects the pharmacokinetics of other drugs. Here the authors have determined cryo-EM structures of drug complexes in previously unforeseen outward-facing conformations. These convincing findings are mechanistically important and reveal potential regions to be exploited by rational-based drug design.

    3. Reviewer #1 (Public Review):

      Summary: Here the authors have tethered a Pgp substrate to strategically placed cysteine residues in the protein. Notably, the cysteine-linked substrates (ANC-DNPT) stimulate ATP hydrolysis and so are able to undergo IF to OF transitions. The authors then determined cryo-EM structures of these complexes and performed MD simulations of bound states. By capturing unforeseen OF conformations with substrate, they propose that TM1 undergoes local conformational changes that are sufficient to translocate substrates, rather than large bundle movements.

      Strengths: This paper provides the first substrate (ANC-DNPT)-bound conformations of Pgp and a new mechanistic model of how substrates are translocated.

      Weaknesses: Although the cross-links stimulate ATP hydrolysis, it is unclear if the TM1 conformations are exactly the same under physiological conditions, since they have been covalently-trapped to the substrate.

    4. Reviewer #3 (Public Review):

      Summary: The authors used cross-linking of a known P-gp substrate in combination with single-particle cryo-EM to investigate the translocation pathway of this important ABC transporter. Based on the results of this study, a new translocation mechanism is proposed that is supported by the data. While only one substrate was used, the data obtained are convincing. In addition, the proposed model will stimulate new experiments from other laboratories to prove or disprove the model.

      Strengths: The combination of cross-linking and structural biology allowed novel insights into the translocation pathway of ABCB1.

      Weaknesses: While only one substrate was used, the data obtained are convincing. In addition, the proposed model will stimulate new experiments from other laboratories to prove or disprove the model.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the Authors):

      The authors have addressed my recommendations in the previous review round in a satisfactory way. I only have one additional comment to the authors:

      In the manuscript abstract, lines 31-32, the authors state that: "Using NIH data for the period 2006-2022, we report that ~230 K99 awards were made every year, representing ~$25 million annually." The "~$25 million" understates the actual funds spent, because this sum covers only the first year of newly made K99 awards, while in any given year the NIH is also paying years 2, 3, 4, etc. of K99 awards (~90% conversion rate to R00) made in previous years. The NIH is actually spending ~$230-$250 million a year on the K99 award mechanism, so the authors need to amend the stated amount in the manuscript.

      Thank you for pointing this out. The reviewer is correct, that we had incorrectly only calculated the investment $ in new K99 awards made. We have corrected this in the revised manuscript. We appreciate your careful reading of our manuscript and the edits made based on your comments have improved the final version.

      Reviewer #2 (Recommendations for the Authors):

      Thank you for taking the time to revise this important work. I learned a lot reading this paper a second time, and appreciate the improvements you have made.

      My only major thought while re-reading this is that I wish you all had written two papers! I see two themes in this work: one looking at faculty hiring networks from the Wapman et al. dataset, and another at K99/R00 conversions by institution, gender, and researcher mobility and its impact on subsequent funding success. After reading, I felt like I had many follow-up questions about both analyses, but it would be impractical for me to suggest all these follow-up analyses without making your paper unreasonably long.

      Thank you for these comments. We agree that there are 2 general themes in this paper. We feel that significantly expanding on both themes will be important in future research. Our hope is that this work continues to inspire others to critically examine funding practices and inequity in the same way that the work of Wapman, Pickett, etc. inspired the present work.

      For example, regarding the results that more R00 are activated at different institutions, and that moving institutions improves subsequent funding success, I wonder: Do proportionally more women or men move institutions? Do proportionally more K99 awardees at less-funded places move for their R00, or less? The Cox proportional hazard models illustrate the impact of various characteristics on subsequent funding success, but they do not illustrate disparate impacts of mobility on different groups (if I am understanding them correctly). (You sort of dive into these questions in the very interesting subsection, "K99/R00 awardee self-hires are more common at institutions with top NIH funding." I wanted to read more!)

      Thank you for these kind comments. These are fantastic follow-up questions. We do not feel that we can adequately address them within the present manuscript without potentially splitting it into 2 separate manuscripts. However, we may examine these in future analyses. We are particularly interested in examining additional aspects such as how the K99 MOSAIC funding mechanism may differ from the traditional K99 mechanism. Since the K99 MOSAIC mechanism is newer, there may not be enough K99 MOSAIC awards made for a thorough exploration.

      As another example, for your analysis on faculty hiring networks, the prevalence of self-hiring amongst institutions and regions was one finding. However, this finding seems somewhat at odds with the previous takeaway about how researcher mobility improves subsequent funding success. Are institutions doing themselves a disfavor by hiring their own, then? I suspect there is more to say here about this pattern... maybe there are important differences between PhD institution and postdoc institution and its impact on hiring/subsequent funding success? Or is this a story about upward mobility into the top 25 well-funded NIH institutions?

      Again, these are very insightful comments and follow-up questions. We hope to address these in potential future manuscripts. We also hope that others may become interested in finding answers to these questions by exploring our dataset as well as other publicly available datasets such as the Wapman et al. dataset.

      I can completely understand how combining the faculty hiring network analysis with the K99/R00 conversions would seem like a natural fit, but I personally feel - emphasis on this being a personal opinion - that there would have been benefits to giving more space to the details of both analyses separately. Perhaps this is a "hindsight is 20/20" issue. Or an issue with the current times in which ones' brain can only hold so many main takeaways from a single body of work. (For example, I struggled to summarize your paper in my public review because I find so many takeaways important.)

      I suppose this is all to say that I find your work important enough to warrant additional follow-up work! :)

      Thank you for these very kind remarks. This work evolved over 8-10 months as evidenced by the updates to the bioRxiv preprint. With unlimited time and foresight, it would probably be best to have separated the 2 themes into separate manuscripts and expanded both. Given current constraints, we plan to make some changes/updates to the present manuscript and hopefully include more in-depth analyses on each theme in future works. Thank you again for the thoughtful reading and critique of both our original manuscript and the revised version.

      Minor comments/questions:

      "K99 to R00 conversions are increasing in time"

      • Assuming I am interpreting the figures correctly, in my opinion, the most important takeaway is that the number of R00 awards has increased, but only for awardees moving to another institution. This key result, best illustrated by panels A and C of Figure 1, is buried in the long paragraph in this section. The organization of content in this section could be improved and more focused. Consider renaming this subsection to be more declarative: "K99 to R00 conversions have increased, but only for awardees moving to another institution."

      This is a very concise interpretation of this data. We have edited the paragraph referenced by the reviewer, split it into 2 paragraphs, and changed the title to “K99 awardees increasingly move to other institutions for R00 awards from 2008 to 2022” and the final sentence to “Thus, the number of K99 to R00 conversions is consistent over time, but increasingly more R00 awardees have moved to other institutions since 2013”

      • Similarly, I personally found the current title of the subsection, "K99 to R00 conversions are increasing with time" is mildly confusing. An R00 award indicates a successful conversion, so why not simply call this an R00 award instead of saying K99-to-R00 conversion? Also, when I look at Figure 1B and exclude the conversion rates for 2007 and 2008 (because this is a 3 year rolling average), I see that conversion rates (or R00 awards) have remained stagnant. This comment is very much in-the-weeds and is mainly to do with clarity of language.

      Thank you for these comments. We had “K99 to R00 conversion” to highlight the unique nature of this award mechanism, in that a person can only receive an R00 if they previously had a K99 award. Nevertheless, we have edited the text to “R00 awards” and “R00 awardees” to simplify things. We also want to note that we did not compute a 3-year rolling average. The function we used was: (X/(Y -1))x100 where X is the number of R00 awards made in a year and Y is the number of K99 awards made in a year. We did note an error in our calculation in the previous version of the manuscript. Previously, we included all R00 awards and K99 awards for each year from the NIH Reporter dataset; however, this is a flawed methodology. NIH Reporter includes only extramural K99 award data and extramural R00 awards, but intramural K99 awardees can receive extramural R00 awards and thus are only included in the R00 dataset. There were 141 R00 awardees in our dataset from NIH Reporter that did not have K99 data, so we assume these are intramural K99 awards, since a K99 is required to be eligible for the R00 award. Since we do not know the awarding year for intramural K99 awardees or have data on intramural K99 awardees that fail to activate the R00 award (or stay internal at NIH), we have excluded these 141 R00 awardees. In the previous version, this miscalculation exaggerated the rolling conversion rate (we had correctly calculated the 78% total conversion rate). We re-analyzed our rolling conversion rate and found the average is 81.8% (excluding the first 2 years of the K99 program and the last 2 years).

      This is a long explanation, but essentially, we overestimated the number of R00 awards which inadvertently increased the rolling conversion rate. We have corrected this and simplified the first 2 paragraphs of the Results section.

      • I was also mildly confused looking at Figure 1c. The caption says that the percentages represent the K99 awardees that stayed at the same institution for the R00 activation, but the percentages are next to the solid circles which the legend labels as "different institution." Perhaps another or different way to show this is a stacked bar chart, where one bar represents the percentage of R00 awards activated at the same institution and another bar represents the percentage of R00 awards activated at a different institution. The bars always add to 100% but the change in proportions illustrates that proportionally fewer awards are being made to those remaining at the same institution.

      Great idea. We have included a stacked bar chart here. Since the stacked bar chart shows percentages, we felt it was important to also show the total numbers, so we kept the previous chart as well but removed the percentage numbers from it. We also changed the departmental analysis to stacked bar charts. This shows the stark difference between 2008-2012 and 2013 onward. These changes were made in the revised Fig. 1.

      • Minor question: I would love to see Table 3 and Table 4 as a time-series. Has the proportion of recipients at various institution types changed with time?

      This is a great suggestion and we felt it fit best in Figure 5, so we’ve added it there.

      • Table 3 is useful but only indirectly addresses my first "Recommendation to the Authors" from my previous review. I did some number crunching myself from the data provided. Assuming I did this correctly: If you're a K99 awardee at a private institute, you had a 76.3% chance of getting an R00 compared to 80.4% for a K99 awardee at a public institution. If you're a K99 awardee at a top-funded institution, you had a 76.8% chance of R00 compared to 78.6% for a lower-funded institution. I would have liked to see more figures and tables to illustrate conversion rates by institution type in this way. Interestingly, to me, these data suggest that there are not enormous conversion rate differences by institution type (though looking at these now, I am confused at the 89% statistic in line 174 and where that comes from, since it is much higher than what I've calculated).

      Thank you for this suggestion and these comments. Please see above where we describe how we incorrectly overestimated the 89% statistic. This has been corrected. As the reviewer suggested, we now show yearly percent of grants to specific institution types in the revised Figure 5. We agree with the reviewer that showing the conversion rate by institution type is interesting; however, it is fairly obvious from the new panels in Figure 5 that there is not much difference in conversion rate. Thus, to avoid crowding too many panels into the figure, we opted to keep the stacked bar plot.

      Reviewer #3 (Recommendations for the Authors):

      -One minor change to Figure 1C would be to switch the color coding for the lines so that they match with 1D whereby "same institution" would be white circles, or whatever the authors decide would be best for consistency since they are similar comparisons.

      Thank you for this suggestion. We have corrected this to be consistent.

      -Minor note for lines 459-461: I would suggest changing the wording to "intersectional inequalities" as it is not that a scientist's identities impact their careers as much as how those identities are positioned within an unequal opportunity structure and differentially treated that produce varying career trajectories and experiences of marginalization and cumulative (dis)advantages.

      Thank you and we agree with you. We have made this correction.

      -To carry forward a suggestion for the authors in my previous review, future research that more fully explores the research infrastructure of institutions for how top NIH funded institutions continue to be top funded institutions year after year could help clarify some of the career mobility and same/similar institution hiring found in the data. Rather than hand coding institutions for some of the infrastructure, the National Center for Education Statistics' Integrated Postsecondary Education Data System (IPEDS) has data on colleges and universities including whether they operate a hospital, have a medical degree, and many other interesting data about student and faculty demographics, institutional expenditures (including research budgets), and degrees awarded in different fields of study (undergrad and grad) that may be helpful to the authors as they continue their research stream in this area.

      Thank you very much. We will look into this data set as we continue our investigations in this area.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      The discussion seems to imply that the ball-and-chain peptide is or is related to the common gate. (Although it isn't stated explicitly, it is implied based on the presentation of the gating model in Figure 8 immediately after the discussion of common gating, and the simultaneous opening of both pores in Figure 8). What does the asymmetric structure say about the relationship between the N-term peptide and common gating in ClC-2? It seems like this structure suggests that the CTDs can independently rotate, and independently bind N-terminal peptide, which might not be expected to impact both pores. Some additional clarification and/or discussion of these ideas could be helpful here.

      We thank the reviewer for raising these very important points. We agree we should have been more explicit and have now expanded our discussion on this topic, highlighting the independent movement of the N-term peptide and CTDs and clarifying that it is currently unknown whether CLC-2 has a common gate (lines 431-484).

      Discussion of "Revised Framework for CLC-2 gating": I think this would be a little easier to follow if most of the legend from Figure 8 was in the main text at the end of that section. Also, additional labels in Figure 8 (of the glutamates, the N-terminal peptide, and what the CTD arrows represent).

      We have revised this section of the text and added labels to the (revised) Figure as suggested.

      Line 261: typo, misspelling of "hydrogen"

      Fixed. (Now line 279.)

      Figure 6 - supplement 2B: Looks like an error in numbering y-axis - should be 90/120/150, I think. Can you show the three data points for the WT initial current rectification? Can you clarify whether the 3 that you are analyzing are the ones where the AK42 "zero current" level is not more than the initial positive current?

      We apologize for this error, which arose from the Y-axis label overlapping the tick labels, so 90/120/150 showed as 90/20/50. We have fixed this error and have added a new panel (C) to show three data points for the WT initial current rectification. In the Figure legend to panel C, we clarify that the 3 experiments we analyzed are the ones where the AK-42 current level is not more than the initial current at 80 mV.

      Reviewer #2 (Recommendations For The Authors):

      1. It appears from a close inspection of Figure 2 that the TM dimer is not quite symmetric, but I couldn't tell for sure from the figures as presented. No comment is made in the methods about symmetry imposed, and the authors explicitly comment on asymmetry in the cytoplasmic domain. It would be useful to have an explicit discussion of the TM dimer symmetry.

      We have now explicitly stated that the TM dimer is symmetric, and we have clarified the wording in the Methods:

      Main text, line 81: "The TM region of CLC-2 displays a typical CLC family symmetric homodimeric structure, with each subunit containing an independent Cl– pathway (Figure 2A, B)."

      Methods (lines 557-558): "The following ab initio reconstruction and 3D refinement (for all structures presented in this paper) were performed with C1 symmetry (no symmetry imposed)."

      1. For the simulations in Figure 5 Supplement 2, the N terminus flexibility is shown, but this of course can't be compared to a control. However, given the structural results, one might expect the JK helix to show changes in flexibility/mobility in the apo vs inactivated structures. Is this observed?

      We agree that the structures strongly suggest the JK-helix is not as stable without the N-terminus bound. We did not perform comparative simulations on the JK helix in the apo vs inactivated structures. While we agree this could be of interest, we don’t think it is essential to our conclusions, and the simulations might need to be quite long to adequately capture dynamics of the JK helix. [In the simulation results shown in Figure 5 Supplement 2, our aim was to test the validity of the structure by determining whether the N-terminus remains bound to the channel in simulations. The plot shows that the N-terminus stays in the same binding pose with an average RMSD (to the initial structure) of less than 2 angstroms, which is generally considered to be relatively stable.]
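
      For readers unfamiliar with the metric, the RMSD to an initial structure can be sketched as follows. This is a minimal illustration that assumes the two coordinate sets are already superimposed and list the same atoms in the same order; the function name `rmsd` and the toy coordinates are hypothetical and are not taken from the authors' analysis.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. angstroms)
    between two pre-aligned lists of (x, y, z) atom coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between matching atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy example: every atom displaced by 1 angstrom along x
initial = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
frame = [(x + 1.0, y, z) for x, y, z in initial]
print(rmsd(initial, frame))
```

      In a real trajectory analysis each frame would first be superimposed on the reference, and the value would be averaged over frames.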

      1. I find the section "revised framework for ClC-2 gating" to be wanting. The ideas are illustrated in the cartoon, but should also be laid out in the text. In what ways are you revising the framework, and in what aspects are you carrying through ideas already proposed?

      Thank you for raising this point, which was also raised by Reviewer 1. We have revised this section and the accompanying Figure (Figure 8 and Lines 431-484).

      1. The authors mention in passing the idea that the hairpin could contribute to inward rectification (lines 227/8), but also suggest a role for the gating glutamate in this process. They also mention the idea of a common gate, but don't flesh out its function very much. These possibilities are very interesting and should be substantially fleshed out in the "framework" section, even if they cannot be fully answered yet.

      We have expanded on these points in the “framework” section.

      1. Figure 6E. points representing individual experiments should be shown.

      We added points representing individual experiments for Delta N (normalized to WT) in the surface-expression experiments in Figure 6E. Individual data points for the electrophysiology experiments are in panel C; we did not replot these in panel E because some of the points would have been off scale.

      1. The density in Figure 2A is hard to see, is there a better way to display it? Also, the orientation of the rightmost panel in Figure 2C is difficult to interpret.

      We revised 2A to make the density easier to see. We revised Figure 2C so that the middle and rightmost panels have the same orientation.

      1. P6. Line 87. This sentence is a little confusing, and perhaps could be a little clearer-the density is consistent with a Cl- ion, but no experiments have been done to support this, no?

      We have clarified the wording as suggested (now line 89) and added references supporting Clˉ binding to the Sext site in CLCs (line 90).

      1. P6 lines 89-98. Two lines of evidence, the conformation of the gate and the pinch point, both point to the structure representing a closed state. The wording as presented is a little hard to follow.

      We have revised the wording in this paragraph (lines 92-111).

      1. It's hard to distinguish water protons and oxygens in the lower right panel (QQQ).

      We revised this panel (in Figure 3 – figure supplement 2) to better distinguish the water protons and oxygens.

      Reviewer #3 (Recommendations For The Authors):

      A few points to consider for improving the manuscript

      1. It is intriguing that in the AK-42 structure, there is no density for the hairpin loop even though the CTD is in a symmetrical conformation as the apo. The authors could perhaps comment on whether there is any difference in the rectification properties of currents (or run-up) upon unblocking of AK-42 which may suggest that the hairpin binding is prevented by AK-42.

      We have not yet performed the suggested experiment nor any experiments to examine state-dependence, though we agree such experiments would be informative. We have added a note on this point in the discussion, lines 334-337.

      1. Although the conformation-dependent placement of the hairpin loop is convincing based on the density, the sequence assigned to this region is not conclusive.

      To strengthen our conclusion concerning the hairpin assignment, we investigated fits of peptide segments from the disordered sections of the C-terminal cytoplasmic domain to the hairpin density. We found that these fits are not as good as that with the N-terminal peptide. This analysis is described in lines 179-181 and a new figure (Figure 5 – figure supplement 1). We appreciate the reviewer’s point that it is extremely difficult to conclusively assign residues that are not contiguous with the rest of the structure. Nevertheless, given the wide variety of evidence all pointing to the conclusion that the hairpin loop corresponds to residues 14-28, we think the assignment is on strong footing. We respectfully ask that you consider removing this criticism from the public review, as we think it will hinder the casual reader from recognizing the strength of the evidence: (1) of unresolved regions in CLC-2, residues 14-28 fit best; (2) residues 14-28 were previously identified as part of the ball blocking region (lines 158-161); (3) MD simulations support that the N-terminal residues stay stably bound (Figure 5 – figure supplement 4); (4) gain-of-function disease-causing mutations map onto either the N-terminal residues or interacting residues on the TM domain (Figure 5 – figure supplement 6). Thank you for considering this request.

      1. The authors should comment on the physiological relevance of the CBS domain rearrangements during gating.

      We have added this sentence (lines 131-133): “The physiological relevance of C-terminal domain rearrangements is suggested by disease-causing mutations that alter channel gating (Estevez et al., 2004; Brenes et al., 2023).”

      1. For the figures with cryo-EM maps, indicate the contour levels.

      Contour levels are now indicated in the Figure legends.

      1. It will be useful to show the electrostatic map of the N-terminal peptide and the docking site.

      This is now shown in Figure 5 – figure supplement 3 and Video 5.

      1. Include a comment on the recent CLC-2 /AK-42 structure and if there are any differences in the structural features.

      We added this text to lines 273-274: “The RMSD between our CLC2-TM-AK42 structure and that of Ma et al. is 0.655 Å, and the RMSD between the apo TM structures is 0.756 Å.”

    1. Author Response

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The paper contains some useful analysis of existing data but there are concerns regarding the conclusion that there might be alternative mechanisms for determining the location of origins of DNA replication in human cells compared to the well known mechanism known from many eukaryotic systems, including yeast, Xenopus, C. elegans and Drosophila. The lack of overlap between binding sites for ORC1 and ORC2, which are known to form a complex in human cells, is a particular concern and points to the evidence for the accurate localization of their binding sites in the genome being incomplete.

      Public Reviews:

      Reviewer #1 (Public Review):

      In the best genetically and biochemically understood model of eukaryotic DNA replication, the budding yeast, Saccharomyces cerevisiae, the genomic locations at which DNA replication initiates are determined by a specific sequence motif. These motifs, or ARS elements, are bound by the origin recognition complex (ORC). ORC is required for loading of the initially inactive MCM helicase during origin licensing in G1. In human cells, ORC does not have a specific sequence binding domain and origin specification is not specified by a defined motif. There have thus been great efforts over many years to try to understand the determinants of DNA replication initiation in human cells using a variety of approaches, which have gradually become more refined over time.

      In this manuscript Tian et al. combine data from multiple previous studies using a range of techniques for identifying sites of replication initiation to identify conserved features of replication origins and to examine the relationship between origins and sites of ORC binding in the human genome. The authors identify a) conserved features of replication origins e.g. association with GC-rich sequences, open chromatin, promoters and CTCF binding sites. These associations have already been described in multiple earlier studies. They also examine the relationship of their determined origins and ORC binding sites and conclude that there is no relationship between sites of ORC binding and DNA replication initiation. While the conclusions concerning genomic features of origins are not novel, if true, a clear lack of colocalization of ORC and origins would be a striking finding. However, the majority of the datasets used do not report replication origins, but rather broad zones in which replication origins fire. Rather than refining the localisation of origins, the approach of combining diverse methods that monitor different objects related to DNA replication leads to a base dataset that is highly flawed and cannot support the conclusions that are drawn, as explained in more detail below.

      Response: We are using the narrowly defined SNS-seq peaks as the gold standard origins and making sure to focus in on those that fall within the initiation zones defined by other methods. The objective is to make a list of the most reproducible origins. Unlike what the reviewer states, this actually refines the dataset to focus on the SNS origins that have also been reproduced by the other methods in multiple cell lines. We have changed the last box of Fig. 1A to make this clearer: Shared origins = reproducible SNS-seq origins that are contained in initiation zones defined by Repli-seq, OK-seq and Bubble-seq. This and the Fig. 2B (as it is) will make our strategy clearer.
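
      The containment rule described in this response can be illustrated with a minimal sketch. It assumes peaks and zones are given as (chromosome, start, end) tuples; the function names, the `min_methods` parameter (which lets the required support be every method or only some), and the toy intervals are hypothetical and do not reproduce the authors' pipeline.

```python
def contained(peak, zones):
    """True if the peak lies entirely inside at least one zone."""
    chrom, start, end = peak
    return any(zc == chrom and zs <= start and end <= ze
               for zc, zs, ze in zones)

def shared_origins(sns_peaks, zones_by_method, min_methods=None):
    """Keep SNS-seq peaks supported by initiation zones from
    at least `min_methods` methods (default: all methods)."""
    if min_methods is None:
        min_methods = len(zones_by_method)
    shared = []
    for peak in sns_peaks:
        # Count how many methods have a zone containing this peak
        support = sum(contained(peak, zones)
                      for zones in zones_by_method.values())
        if support >= min_methods:
            shared.append(peak)
    return shared

# Toy data (hypothetical coordinates)
peaks = [("chr1", 100, 200), ("chr1", 5000, 5100), ("chr2", 100, 200)]
zones = {
    "OK-seq": [("chr1", 0, 1000)],
    "Bubble-seq": [("chr1", 50, 6000)],
}
print(shared_origins(peaks, zones))
print(shared_origins(peaks, zones, min_methods=1))
```

      Because only whole SNS-seq peaks are kept or discarded, the kilobase-scale resolution of the peaks is unchanged by the filtering step.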

      Methods to determine sites at which DNA replication is initiated can be divided into two groups based on the genomic resolution at which they operate. Techniques such as bubble-seq, ok-seq can localise zones of replication initiation in the range ~50kb. Such zones may contain many replication origins. Conversely, techniques such as SNS-seq and ini-seq can localise replication origins down to less than 1kb. Indeed, the application of these different approaches has led to a degree of controversy in the field about whether human replication does indeed initiate at discrete sites (origins), or whether it initiates randomly in large zones with no recurrent sites being used. However, more recent work has shown that elements of both models are correct i.e. there are recurrent and efficient sites of replication initiation in the human genome, but these tend to be clustered and correspond to the demonstrated initiation zones (Guilbaud et al., 2022).

      These different scales and methodologies are important when considering the approach of Tian et al. The premise that combining all available data from five techniques will increase accuracy and confidence in identifying the most important origins is flawed for two principal reasons. First, as noted above, of the different techniques combined in this manuscript, only SNS-seq can actually identify origins rather than initiation zones. It is the former that matters when comparing sites of ORC binding with replication origin sites, if a conclusion is to be drawn that the two do not co-localise.

      Response: We agree. So the reviewer should agree that our method of finding SNS-seq peaks that fall within initiation zones actually refines the origins to find the most reproducible origins. We are not losing the spatial precision of the SNS-seq peaks.

      Second, the authors give equal weight to all datasets. Certainly, in the case of SNS-seq, this is not appropriate. The technique has evolved over the years and some earlier versions have significantly different technical designs that may impact the reliability and/or resolution of the results, e.g., in Foulk et al. (2015), lambda exonuclease was added to single-stranded DNA from a total genomic preparation rather than to purified nascent strands, which may lead to significantly different digestion patterns (i.e., underdigestion). Curiously, the authors do not make the best use of the largest SNS-seq dataset (Akerman et al., 2020) by ignoring these authors' separation of core and stochastic origins. By blending all data together any separation of signal and noise is lost. Further, I am surprised that the authors have chosen not to use data and analysis from a recent study that provides subsets of the most highly used and efficient origins in the human genome, at high resolution (Guilbaud et al., 2022).

      Response: 1) We are using the data from Akerman et al., 2020: Dataset GSE128477 in Supplemental Table 1. We have now separately examined the core origins defined by the authors to check its overlap with ORC binding (Supplementary Fig. S8b).

      2) To take into account the refinement of the SNS-seq methods through the years, we actually included in our study only those SNS-seq studies after 2018, well after the lambda exonuclease method was introduced. Indeed, all 66 of SNS-seq datasets we used were obtained after the lambda exonuclease digestion step. To reiterate, we recognize that there may be many false positives in the individual origin mapping datasets. Our focus is on the True positives, the SNS-seq peaks that have some support from multiple SNS-seq studies AND fall within the initiation zones defined by the independent means of origin mapping (described in Fig. 1A and 2B). These True positives are most likely to be real and reproducible origins and should be expected to be near ORC binding sites.

      We have changed the last box of Fig. 1A to make this clearer: Shared origins = reproducible SNS-seq origins that are contained in initiation zones defined by Repli-seq, OK-seq or Bubble-seq.

      Ini-seq by Torsten Krude and co-workers (Guilbaud, 2022) does NOT use Lambda exonuclease digestion. So using Ini-seq defined origins is at odds with the suggestion above that we focus only on SNS-seq datasets that use Lambda exonuclease. However, Ini-seq identifies a much smaller subset of SNS-seq origins, so, as requested, we have also done the analysis with just that smaller set of origins, and it does show a better proximity to ORC binding sites, though even then the ORC proximate origins account for only 30% of the Ini-seq2 origins (Supplementary Fig. S8d). Note Ini-seq2 identifies DNA replication initiation sites seen in vitro on isolated nuclei.

      Update in response to authors' comments on the original review:

      While the authors have clarified their approach to some aspects of their analysis, I believe they and I are just going to have to disagree about the methodology and conclusions of this work. I do not find the authors' responses sufficiently compelling to change my mind about the significance of the study or veracity of the conclusions. In my opinion, the method for identification of strong origins is not robust and of insufficient resolution. In addition, the resolution and the overlap of the MCM ChIP-seq datasets is poor. While the conclusion of the paper would indeed be striking and surprising if true, I am not at all persuaded that it is based on the presented data.

      Reviewer #2 (Public Review):

      Tian et al. performed a meta-analysis of 113 genome-wide origin profile datasets in humans to assess the reproducibility of experimental techniques and shared genomics features of origins. Techniques to map DNA replication sites have quickly evolved over the last decade, yet little is known about how these methods fare against each other (pros and cons), nor how consistent their maps are. The authors show that high-confidence origins recapitulate several known features of origins (e.g., correspondence with open chromatin, overlap with transcriptional promoters, CTCF binding sites). However, surprisingly, they find little overlap between ORC/MCM binding sites and origin locations.

      Overall, this meta-analysis provides the field with a good assessment of the current state of experimental techniques and their reproducibility, but I am worried about: (a) whether we've learned any new biology from this analysis; (b) how binding sites and origin locations can be so mismatched, in light of numerous studies that suggest otherwise; and (c) some methodological details described below.

      • I understand better the inclusion/exclusion logic for the samples. But I'm still not sure about the fragments. As the authors wrote, there is both noise and stochasticity; the former is not important but the latter is essential to include. How can these two be differentiated, and what may be the expected overlap as a function of different stochasticity rates?

      It is difficult to separate the effect of noise from the effect of stochastic firing of origins. We therefore took the simplest approach: focus only on the most reproducible origins (shared origins) and ignore the non-reproducible origins. At least the most reproducible origins can be used to test the hypotheses regarding origin firing.

      • Many of the major genomic features analyzed have already been found to be associated with origin sites. For example, the correspondence with TSS has been reported before:

      https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6320713/

      https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6547456/

      • Line 250: The most surprising finding is that there is little overlap between ORC/MCM binding sites and origin locations. The authors speculate that the overlap between ORC1 and ORC2 could be low because they come from different cell types. Equally concerning is the lack of overlap with MCM. If true, these are potentially major discoveries that butt heads with numerous other studies that have suggested otherwise.

      The key missing dataset is ORC1 and ORC2 ChIP-seq from the same cell type. This shouldn't be too expensive to perform, and I hope someone performs this test soon. Without this, I remain on the fence about how much existing datasets are "junk" vs how much the prevailing hypothesis about replication needs to be revisited. Nonetheless, the authors do perform a nice analysis showing that existing techniques should be carefully used and interpreted.

      We agree that a thorough set of ChIP-seq data (with multiple antibodies or with equivalent techniques that do not use antibodies) for all six subunits of ORC in mammalian cells will be very useful for the field. Note, though, that just by simple cell lysis, it is very easy to divide human ORC into at least three different parts: ORC1, ORC2-5, and ORC6. The subunits do not form as robust a complex as seen in the yeasts and in flies.

      Reviewer #3 (Public Review):

      Summary: The authors present a thought-provoking and comprehensive re-analysis of previously published human cell genomics data that seeks to understand the relationship between the sites where the Origin Recognition Complex (ORC) binds chromatin, where the replicative helicase (Mcm2-7) is loaded, and where DNA replication actually begins (origins). The view that these should coincide is influenced by studies in yeast where ORC binds site-specifically to dedicated nucleosome-free origins where Mcm2-7 can be loaded and remains stably positioned for subsequent replication initiation. However, this is most certainly not the case in metazoans where it has already been reported that chromatin binding sites of ORC and Mcm2-7 do not necessarily overlap, nor do they always overlap with origins. This is likely due to Mcm2-7 possessing linear mobility on DNA (i.e., it can slide) such that other chromatin-contextualized processes can displace it from the site in which it was originally loaded. Additionally, Mcm2-7 is loaded in excess and thus only a fraction of Mcm2-7 would be predicted to coincide with replication start sites. This study reaches a very similar conclusion to these previous studies: they find a high degree of discordance between ORC, Mcm2-7, and origin positions in human cells.

      Strengths: The strength of this work is its comprehensive and unbiased analysis of all relevant genomics datasets. To my knowledge, this is the first attempt to integrate these observations. It also is an important cautionary tale to not confuse replication factor binding sites with the genomic loci where replication actually begins, although this point is already widely appreciated in the field.

      Response: Thank you for recognizing the comprehensive and unbiased nature of our analysis. Our findings will prevent the unwise adoption of ORC or MCM binding sites as surrogate markers of origins and will stimulate the field to try and improve methods of identifying ORC or MCM binding until the binding sites are found to be proximal to the most reproducible origins. The last possibility is that there are ORC- or MCM-independent modes of defining origins, but we have no evidence of that.

      Weaknesses: The major weakness of this paper is the lack of novel biological insight and that the comprehensive approach taken failed to provide any additional mechanistic insight regarding how and why ORC, Mcm2-7, and origin sites are selected or why they may not coincide.

      Response: We agree that we cannot provide a novel biological insight from this kind of meta-analysis. The importance of this study is in highlighting that either there are significant problems with the data collected till now (preventing the co-localization of ORC or MCM binding sites with the most reproducible origins) or ORC and MCM binding sites are often far away from where the most reproducible origins fire, which should make us consider ways in which origins could be activated kilobases away from ORC and MCM binding sites.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      All suggestions and recommendations were described in a previous review.

      Reviewer #3 (Recommendations For The Authors):

      The most significant omission is a contextualization of the results in the discussion and an explanation of why these results matter for the biology of replication, disease, and/or our confidence in the genomic techniques reported on in this study. As written, the discussion simply restates the results without any interpretation towards novel insight. I suggest that the authors revise their discussion to fill this important gap.

      A second important, unresolved point is whether replication origins identified by the various methods differ due to technical reasons or because different cell types were analyzed. Given the correlation between TSS and origins (reported in this study but many others too), it is somewhat expected that origins will differ between cell types as each will have a distinct transcriptional program. This critique is partly addressed in Figure S1C. However, given the conclusion that the techniques are only rarely in agreement (only 0.27% origins reproducibly detected by the four techniques), a more in-depth analysis of cell type specific data is warranted. Specifically, I would suggest that cell type-specific data be reported wherever origins have been defined by at least two methods in the same cell type, specifically reporting the percent of shared origins amongst the datasets. This type of analysis may also inform on whether one or more techniques produces the highest (or lowest) quality list of true origins.

      We have done what has been suggested: used K562 cell type-specific data because here the origins have been defined by at least two methods in the same cell type and reported the percent of shared origins amongst the datasets (Supp. Fig. S4).

      Other MINOR comments include:

      • Line 215: the authors show that shared origins overlap with TF binding hotspots more often than union origins, which they claim suggests "that they are more likely to interact with transcription factors." As written, it sounds like the authors are proposing that ORC may have some direct physical interaction with transcription factors. Is this intended? If so, what support is there for this claim?

      The reviewer is correct. We have rephrased because we have no experimental support for this claim.

      • In the text, Figure 3G is discussed before Figure 3F. I suggest switching the order of these panels in Figure 3.

      Done.

      • It's not clear what Figure 5H to Figure 6 accomplishes. What specifically is added to the story by including these data? Is there something unique about the high confidence origins? If there is nothing noteworthy, I would suggest removing these data.

      We want to keep them to highlight the small number of origins that meet the hypothesis that ORC and MCM must bind at or near reproducible origins. These would be the origins that the field can focus in on for testing the hypothesis rigorously. They also show the danger of evaluating proximity between ORC or MCM binding sites with origins based on a few browser shots. If we only showed this figure, we could conclude that ORC and MCM binding sites are very close to reproducible origins.

      • Line 394: "Since ORC is an early factor for initiating DNA replication, we expected that shared human origins will be proximate to the reproducible ORC binding sites." This is only expected if one disbelieves the prior literature that shows that ORC and origins are not, in many cases, proximal. This statement should be revised, or the previous literature should be cited, and an explanation provided about why this prior work may have missed the mark.

      We do not know of any genome-wide study in mammalian cell lines where ORC binding sites and MCM binding have been compared to highly reproducible origins, or that show that these binding sites and highly reproducible origins are mostly not proximal to each other. Most studies cherry-pick a few origins and show by ChIP-PCR that ORC and/or MCM bind near those sites. Alternatively, studies sometimes show a selected browser shot, without a quantitative measure of the overlap genome-wide and without doing a permutation test to determine if the observed overlap or proximity is higher than what would be expected at random with similar numbers of sites of similar lengths. In the revised manuscript we have discussed Dellino, 2013; Kirstein, 2021; Wang, 2017; Mas, 2023. None of them has addressed the question we are addressing: is the small subset of the most reproducible origins proximal to ORC or MCM binding sites?
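
      The kind of permutation test referred to here can be sketched as follows. This is a simplified illustration, not the analysis used in the manuscript: it assumes a single linear chromosome, represents each origin and binding site by its midpoint, and places sites uniformly at random. The function names, the 50 bp window, and the coordinates are all hypothetical. A genome-wide analysis would also preserve site lengths and chromosome assignments and exclude unmappable regions.

```python
import random

def count_near(origins, sites, max_dist):
    """Number of origins with at least one site within max_dist bp."""
    return sum(any(abs(o - s) <= max_dist for s in sites) for o in origins)

def permutation_p_value(origins, sites, genome_length, max_dist,
                        n_perm=1000, seed=0):
    """One-sided empirical p-value for observed proximity versus
    the same number of sites placed uniformly at random."""
    rng = random.Random(seed)
    observed = count_near(origins, sites, max_dist)
    at_least = 0
    for _ in range(n_perm):
        shuffled = [rng.randrange(genome_length) for _ in sites]
        if count_near(origins, shuffled, max_dist) >= observed:
            at_least += 1
    # Add-one correction so the p-value is never exactly zero
    return observed, (at_least + 1) / (n_perm + 1)

# Toy data (hypothetical midpoints on a 1 Mb chromosome)
origins = [1000, 50000, 90000]
sites = [1010, 50020, 90005]
observed, p = permutation_p_value(origins, sites, 1_000_000, max_dist=50)
print(observed, p)
```

      The empirical p-value is the fraction of random placements that match or exceed the observed count, which is what separates genuine proximity from the overlap expected by chance given the number of sites.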

      • Line 402-404: given the lack of agreement between ORC binding sites and origins the authors suggest as an explanation that "MCM2-7 loaded at the ORC binding sites move much further away to initiate origins far from the ORC binding sites, or that there are as yet unexplored mechanisms of origin specification in human cancer cells". The first part of this statement has been shown to be true (Mcm2-7 movement) and should be cited. But what do the authors mean by the second suggestion of "unexplored mechanisms"? Please expand.

      We have addressed this point in the revised manuscript.

      • The authors should better reference and discuss the previous literature that relates to their work, some of these include Gros et al., 2015 Mol Cell, Powell et al., 2015 EMBO J, Miotto et al., 2016 PNAS, but likely there are many others.

      We have addressed this point in the revised manuscript.

      Note for authors:

      Line 107: The introduction discusses the mechanism by which yeast ORC recognizes specific origins and discusses the Orc4 contribution, but it is known that Orc2 also binds DNA in a base-specific manner (see PMID 33056978). Thus Lee et al. did not "humanize ORC" as stated.

      Done

      Lines 117-119: Two of the cited papers are on endo-reduplication and not on initiation in a normal cell cycle and this should be pointed out. Second, there is contradictory evidence that ORC is essential in human cells and this should be cited (PMID 33522487)

      Done

    1. Author Response

      The following is the authors’ response to the original reviews.

      Based on the reviewer comments (see below) and subsequent discussion between the reviewers and the Reviewing Editor, I would like to invite the authors to make major revisions, including new experiments. However, if major new experiments are not feasible, as may be the case, then at a minimum, I would urge the authors to:

      1. Tone down the language regarding a causative role for changes in GH/IGF-I signaling in mediating the effects of Tmem263 on the skeleton, and also be very open in acknowledging the lack of mechanistic insight into how Tmem263 regulates GH signaling.

      Response: We toned down the language as suggested and also acknowledged the lack of mechanistic insights into how Tmem263 regulates GH signaling.

      1. Revise/redo or if not possible, then delete the problematic experiment in Fig. 5E.

      Response: We have included additional Western blot data in Figure 5 from control WT and KO male mice without exogenous GH injection. In the absence of GH injection, we could not detect Jak2 and Stat5 phosphorylation in the liver of male WT and KO mice.

      1. Address the comments about liver feminization.

      Response: We have performed additional analysis as suggested by reviewer # 3. We have now included additional data to address the issue of liver feminization (new Fig. 6G-I and Figure 6-figure supplement 1). We plan to expand on this very topic in future studies as this is an interesting transcriptional phenomenon.

      1. Revise the manuscript to address as many of the recommendations for the authors as possible, many of which can be addressed by textual edits.

      Response: We have addressed as many of the textual changes as suggested in the revised manuscript.

      Reviewer #2 (Recommendations for The Authors):

      TMEM263 has been suggested to be associated with bone mineral density and growth in humans and mice, but the functional role of this transmembrane protein in the regulation of bone metabolism is unknown. With the knockout mouse approach, this manuscript demonstrates that Tmem263 is essential for longitudinal bone growth in the mouse, as Tmem263 knockout (KO) mice developed severe postnatal growth impairment and proportional dwarfism. It is determined that the dwarfism was caused by a substantial reduction in liver expression of growth hormone receptor (GHR), a slight increase in serum GH, and a reduction in serum IGF-I, which resulted in disruption of the GH/IGF-I regulatory axis of endochondral bone formation.

      The study was relatively well designed, and the results in general are supportive of the conclusions. While this study discloses new and intriguing functional information about a novel cytoplasmic membrane gene, there are a few minor issues that the authors may wish to address. These issues are listed in the following:

      1. One of the intriguing findings of this manuscript is that deletion of a gene encoding a small cytoplasmic membrane protein could cause a substantial reduction in the expression and protein levels of GHR. Although a couple of potential explanations were offered in the Discussion section (first complete paragraph of page 10), there has been no attempt to test any of the suggested causes, even though many of these potential mechanisms can readily be tested experimentally. Accordingly, the lack of mechanistic investigation into this intriguing effect renders the manuscript largely descriptive in nature.

      Response: The point made by the reviewer is well taken. We do plan to have follow up studies to establish which among the mechanisms we highlighted in the discussion is contributing to the reduction in GHR transcript and protein level. Our present study is the first functional characterization of this enigmatic novel membrane protein. We anticipate that multiple follow-up studies are needed to gain a deeper understanding of the biology of Tmem263. We believe that our present study represents an important first step.

      1. Because a major conclusion is that the bone phenotype of Tmem263 KO mice was caused by deficient hepatic expression and/or action of GHR, it would strengthen the conclusion if a brief comparison of the bone phenotype between GHR KO mice and Tmem263 KO mice were included in the Discussion section.

      Response: We have now included this information in the revised manuscript.

      1. In Figure 3, the cortical bone parameters (i.e., Tt.Ar, Ct.Ar, and Ct.Th), but none of the trabecular bone parameters (i.e., BV/TV, Tb.N, Tb.Th), were normalized against femur length. The authors did not provide a rationale for this differential treatment with the cortical bone parameters from the trabecular bone parameters. If the reason to normalize the cortical bone parameters against bone length was to demonstrate that the reduced cortical bone mass in mutants was related to the impaired longitudinal bone growth, then why did the authors not also assess whether the observed reduction in these trabecular bone parameters in KO mutants was proportional to reduced longitudinal bone growth?

      Response: We actually made the exact adjustments that the reviewer refers to, as stated in the methods section. Please see page 14. The regions of interest (ROIs) of both the trabecular bone analysis and the cortical analysis in the mutants were reduced in proportion to the length of the bone (40% smaller). The normalization of Tt.Ar to femur length in Figure 3I was only meant to show that the reduction in Tt.Ar in the mutants was proportional. We have modified the text in our result section for clarity.

      1. Elements described in Fig. 5A have been well documented. Therefore, Fig. 5A is unnecessary and can be deleted.

      Response: We felt that Figure 5A should remain. It helps orient readers that are not familiar with the literature to be aware that both liver- and bone-derived IGF-1 contribute to longitudinal bone growth.

      1. Figure 6 was performed with male KO mice. Were the altered gene expression profiles in female KO mice any different from male KO mice?

      Response: We plan to perform RNA-seq in female mouse liver in our follow-up studies. We do not know, at present, whether and to what extent the liver transcriptomic profile would be different between male and female KO mice. As far as dwarfism and deficiency in skeletal acquisition, both male and female KO mice showed the same phenotypes.

      1. The number of animals (or samples) per group in some of the Figures (i.e., Fig. 2G, 2I, 2J, 3A to J, the entire Fig. 4, 5D, 5F, and Suppl Fig. 1) needs to be provided in the legends.

      Response: We have included this information in the figure legends.

      Reviewer #3 (Recommendations for The Authors):

      1. Explain the discrepancy between the impact of KO on serum Igfbp3 (= decreased) vs. hepatic Igfbp3 (= unchanged).

      Response: We do not have a plausible mechanism, at present, that can explain the reduction in circulating serum Igfbp3 level without an apparent reduction in Igfbp3 transcript level in the liver. In human studies, typically only serum IGFBP3 levels are measured but not the hepatic IGFBP3 transcript level. Therefore, it is unclear whether the circulating levels of IGFBP3 are being regulated at the posttranscriptional level, an issue that can be explored in future studies.

      1. Line 215, 221, and elsewhere - Foxa1 does not show significant male-biased expression in mouse liver.

      Response: We have removed Foxa1 from the text.

      1. Line 225- According to the abstract of Ref. #45, Cux2 regulates a subset of sex-biased genes in the liver. The authors should compare the genes dysregulated by TMEM263-KO (Fig. 6) to those altered by Cux2 loss (Ref. #45) to ascertain whether the results of Fig. 6 are partially or entirely explained by Cux2 overexpression.

      Response: We agree that this is a great area of future study. We do feel this, however, would be better explored in a more in-depth follow-up article. We felt, given the current direction of the paper it made more sense to include differential expression comparisons of male vs female, hypophysectomized vs sham control, and Stat5b-KO vs WT mouse liver gene expression data. Our future work will explore the transcriptomes of male and female WT and Tmem263-KO liver gene expression in the context of the observed physiology.

      1. Line 262- "lower transcription of Ghr gene". A decrease in mRNA levels does NOT equate with a decrease in transcription per se. Altered mRNA splicing, poly A, export, cytoplasmic stability, etc. are all potential contributors.

      Response: We have included these possibilities highlighted by the reviewer in our revised Discussion section.

      1. Line 273, "TMEM263... most highly expressed in liver" Not correct - see Fig. 1C for TMEM263 RNA levels in mouse tissues.

      Response: We have corrected the text on page 11.

      1. Line 425 - Include GEO accession number.

      Response: We have already uploaded our RNA-seq data to the NCBI Sequence Read Archive (SRA), and the data can be accessed under accession number PRJNA938158.

      1. Fig. 6 - Line 796 - Specify the age and sex of mice analyzed.

      Response: We have included the information in the revised figure 6 legend.

      1. Fig.2 - Suppl 1- Specify age of mice.

      Response: We have included the information in the revised Figure 2-figure supplement 2.

      1. Fig.2G -Specify the sex of the mice.

      Response: For the P1 to P21 pups’ data, we did not separate by sex, as sex determination of pups at P1 and P7 can be challenging. We have now indicated this in the figure legend.

      1. Fig. 6A and 6C-6F: Which of these genes shows sex-dependent expression in wild-type liver? Use color to highlight gene names for genes that show male-biased or female-biased expression.

      Response: We agree with the reviewer that additional labels on Figure 6A and 6C-F would be helpful to show sex-biased genes. However, this is not the primary point of the paper. This topic deserves a much more in-depth analysis in follow-up studies focused on defining the exact type and degree of transcript feminization in the liver of Tmem263-KO mice, as well as its physiologic consequences. For readers interested in this topic, we have included subfigures G-I in Figure 6 and, for greater transcript-level detail, Figure 6-figure supplement 1.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1

      Recommendation 1: The authors reasoned upon the presence of a differential basal hydraulic stress in waves' valleys vs hills at first from the observation of "domes" formation upon 48h cultivation. I suggest performing a quantification to support the statement as a good scientific practice. Furthermore, it would strengthen the concept when the formation of domes was compared between the waves' dimensions as a different grade of cell extrusion was quantified. i.e., 50, 100, and 200 µm.

      Response 1: Upon seeing the phenomenon (Author response image 1 A), we performed a count of domes on the 100 µm waves and saw a significant effect. We refrained from including the results as it is the subject of ongoing research in our lab. In response to the reviewer’s suggestion, we have included a graph (Author response image 1 B) showing the increasing number of domes over 48 hours from three 100 µm wave samples.

      We have updated Figure 2A and B in the manuscript to include the new graph.

      Author response image 1.

      (A) shows dome (white arrows) over a 100 µm wave substrate. (B) is the number of accumulated domes in valley and hill regions, for 3 independent samples, over 48 hours.

      Recommendation 2: Using RICM microscopy to quantify the cell basal separation with the substrate and hydraulic stress is very clever. Nevertheless, I am in doubt if the different intensity reported for the hills vs valley (Fig. 2G and H) is a result of the signal reduction at deeper Z levels. Since there is no difference in extrusion and forces between valleys and hills in the 200 µm waves but only in 50µm and 100µm, I would add this to the quantification. I would expect no intensity difference from RICM for the 200 µm sample if this is not an artefact of imaging.

      Response 2: We performed additional experiments on blank wave substrates (both 100 and 200 µm) to ascertain the extent of reflection intensity drop (Author response image 2A). And, as correctly pointed out by Reviewer #1, there was a drop in intensity even without cells. On the 100 µm waves, hill reflections are on average ~27 % dimmer than valley reflections. Whereas, on the 200 µm waves, hill reflections are on average ~39 % dimmer.

      Using this information, we performed a calibration on the RICM results obtained from both the 100 and 200 µm waves (Author response image 3B). The calibrated 100 µm data showed residual signatures of difference, whereas the calibrated 200 µm distributions appeared very similar. We noticed large cross-sample variations in the registered intensities, which will negatively impact effect size if not accounted for. To account for this, we subsequently normalized both hill and valley intensities against planar region intensities for each sample. As shown by the final output (Author response image 3C), we were able to remove the skewness in the distributions. Moreover, 1-way ANOVA followed by a post hoc analysis with BH correction revealed a significant reduction in the 100 µm hill/flat intensity ratio compared to the 100 µm valley/flat intensity ratio (Δ~-23 %). Conversely, no significant difference was observed for the same comparison on the 200 µm waves.
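
      The two-step correction described above can be sketched as follows. This is only an illustration under stated assumptions: the intensity loss on hills is taken to be multiplicative (so hill intensities are divided by one minus the blank-substrate drop), the per-sample reference is the mean flat-region intensity, and the function name `calibrate_and_normalize` and the numbers in the usage lines are hypothetical. The response does not spell out the exact procedure.

```python
import numpy as np

# Blank-substrate dimming of hills relative to valleys, as quoted in Response 2
# (keys are the wave dimensions in µm).
HILL_DROP = {100: 0.27, 200: 0.39}

def calibrate_and_normalize(hill, valley, flat, wave_um):
    """Return (hill/flat, valley/flat) intensity ratios for one sample.

    Assumption: the geometric reflection loss is multiplicative, so hill
    intensities are divided by (1 - drop) before being normalized against
    the mean intensity of the planar (flat) region of the same sample.
    """
    hill_cal = np.asarray(hill, dtype=float) / (1.0 - HILL_DROP[wave_um])
    flat_mean = float(np.mean(flat))
    valley = np.asarray(valley, dtype=float)
    return hill_cal / flat_mean, valley / flat_mean

# Hypothetical intensities: a hill that is dimmer only because of geometry
# ends up with the same ratio as the valley after calibration.
hill_ratio, valley_ratio = calibrate_and_normalize(
    hill=[73.0, 73.0], valley=[100.0, 100.0], flat=[100.0, 100.0], wave_um=100
)
```

      Dividing by the per-sample flat intensity is what removes the cross-sample offsets; the resulting ratios are dimensionless and can be pooled across samples before the ANOVA.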

      Author response image 2.

      (A). RICM from blank wave samples reveal a reduction in reflection intensity in hill regions compared to flat and valley regions.

      Author response image 3.

      (B) shows the RICM intensities after adjusting for the inherent reflection intensity drop shown in (A). (C) shows the RICM intensities after normalization against planar region signals; this removes cross-sample variations and improves the effect size of differences.

      We have updated the manuscript Figure 2I and text accordingly. The blank wave results are included in Figure 2-figure supplement 1 along with updated text and summary data table in Supplementary File 4.

      Recommendation 3: To measure 3D forces on top of the hills and valleys, the use of PAA gels is necessary. Since in Fig 3B, the authors show a difference in cell extrusion number between substrates and stiffnesses, I think it is necessary to confirm the presence of more extrusion in valleys vs hills on PAA gels. This would ensure the conclusion between normal forces and extrusion.

      Response 3: We do have time-lapse data with monolayers on the PAA waves. However, we felt results from the flat regions were sufficient in supporting the point being made in the text. Specifically, our original intention with PAA gels was to show that the extrusion reductions seen in osmotic perturbations were by virtue of removing basal stress and not some cryptic osmotic response. Hydrogels were chosen because they can effectively dilute basal solute concentration and thereby reduce the osmotically induced water transport. Moreover, as fluid could freely move within the gel, the fluid stress can quickly equilibrate across the basal surface. In contrast, poorly water/solute permeable substrates could lead to localized spikes in solute concentration and transient basal regions with high fluid stress.

      To get a sense of the potential difference in basal solute concentration between the two materials, we can do a quick hand-waving estimation. For monolayers on non-water/solute permeable PDMS of 20x20 mm and using the laser wavelength (640 nm) for RICM as an extreme estimate of basal separation, we should expect ~0.25 µl of total basal water content. On the other hand, we typically produce our PAM gel slabs using ~150 µl of precursor solutions. This means that, given similar amounts of solute, PAM gels will lead to monolayer basal osmolarity that is around 3 orders of magnitude lower than monolayers on PDMS, producing significantly lower osmotic potential. This implies from the outset that we should expect high survivability of cells on these substrates irrespective of curvature domains. Indeed, later immunoblotting experiments showed MDCKs exhibiting hyper activated FAK and Akt on PAM gels.
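
      The arithmetic behind this estimate can be checked in a few lines. The sketch below uses only the figures quoted above (a 20x20 mm footprint, 640 nm as an upper bound on basal separation, 150 µl of gel precursor) and assumes, as the text does, that the same amount of solute is diluted into either volume; the variable names are illustrative.

```python
import math

# Values quoted in the response
side_m = 20e-3         # monolayer footprint: 20 mm x 20 mm
gap_m = 640e-9         # RICM laser wavelength, used as an upper bound on basal separation
gel_volume_ul = 150.0  # precursor volume used for one PAM gel slab

# Basal fluid volume under a monolayer on impermeable PDMS
basal_volume_m3 = side_m * side_m * gap_m
basal_volume_ul = basal_volume_m3 * 1e9  # 1 m^3 = 1e9 µl

# Assumption: equal solute amounts, so concentration scales inversely with volume
dilution = gel_volume_ul / basal_volume_ul
orders = math.log10(dilution)

print(f"basal volume on PDMS: {basal_volume_ul:.3f} µl")
print(f"dilution factor:      {dilution:.0f}")
print(f"orders of magnitude:  {orders:.2f}")
```

      This gives a basal volume of about 0.256 µl and a dilution factor of roughly 590, a little under three orders of magnitude, consistent with the rounded figures in the text.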

      In response to Reviewer #1’s suggestion then, we have added another supporting time-lapse (Video 19) showing typical response of MDCK monolayers on 100 µm PAA waves (Author response image 4). Evident from the time-lapses, like the planar regions, cell extrusions were very rare. This supports the idea that on PAM gels the effects of basal hydraulic stress and asymmetric forces are marginal against the strong survival signals. And the response is similar to hyper-osmotic perturbations; there, we did not see a significant difference between valley and hill extrusions.

      Author response image 4.

      Time-lapse snapshot showing negligible MDCK extrusions 24 hours after confluency over PAM gel wave substrates.

      Recommendation 4: Before proceeding with the FAK inhibitor experiment, the authors should better justify why the 4.1 wt % sucrose vs DMSO or NaCl is the most inert treatment. This can be done by citing relevant papers or showing time-lapses (as it is done for the higher FAKI14 dose).

      Response 4: Although some cells have recently been shown to be able to transport and utilize sucrose, mammalian cells generally cannot directly take up disaccharides such as sucrose for metabolism and this is frequently mentioned in literature: see (Ref. R1) for example. Without special enzymes to break sucrose down into monosaccharides, such as sucrase found in the gut, the sugars should remain spectators in the culture medium, contributing only to osmotic effects.

      DMSO on the other hand, besides changing osmolarity, can also be integrated into cell membrane and pass through cells over time. It has been reported to chronically affect cell membrane properties and gene expressions (Ref. R2).

      Finally, it is well known that both sodium and chloride ions are readily taken up and transported by cells (Ref R3). They help to regulate the transmembrane potential, which in turn can affect membrane bound proteins and biochemical reactions within a cell.

      Hence, comparing the 3 hyper-osmotic perturbations, adding sucrose should have the least off-target effects on both the inhibitor study and the subsequent immunoblotting. And, in response to the reviewer’s recommendation, we have updated the text accordingly and included new references to support our statement.

      Ref R1. H. Meyer, O. Vitavska, H. Wieczorek; Identification of an animal sucrose transporter. Journal of Cell Science 124, 1984–1991 (2011). Doi: 10.1242/jcs.082024

      Ref R2. B. Gironi, Z. Kahveci, B. McGill, B.-D. Lechner, S. Pagliara, J. Metz, A. Morresi, F. Palombo, P. Sassi, P. G. Petrov; Effect of DMSO on the Mechanical and Structural Properties of Model and Biological Membranes. Biophysical Journal 119, 274-286 (2020). Doi: 10.1016/j.bpj.2020.05.037

      Ref R3. X. Zhang, H. Li; Interplay between the electrostatic membrane potential and conformational changes in membrane proteins. Protein Science 28, 502-512 (2019). Doi: 10.1002/pro.3563

      Recommendation 5: The data showing a FAK-dependent phosphorylation of AKT responsible for a higher cell survival rate in the hills is not yet completely convincing. Please show a reduced AKT phosphorylation level after FAK inhibition in high osmolarity levels. Furthermore, the levels of AKT activation seem to increase slightly upon substrate softening independently of FAK activation or osmotic pressure (i.e., Fig. 4E, Soft PDMS). The authors should comment on this in connection with the results shown for PAA gels.

      Response 5: For the additional immunoblotting experiments, work is currently underway. We could not, however, complete these experiments in time for this revision, as both Cheng-Kuang and Xianbin will shortly be taking on new jobs elsewhere. David will continue with the immunoblotting studies and should be able to include the results in an update in the coming months. As for the apparent elevated levels of AKT seen on soft silicones, we speculate that it is because we cannot immunoblot cells that have died and were inevitably washed out at the start of the procedure. Inferring from the higher extrusion rates on these soft substrates, we could be missing a significant portion of the statistics. Specifically, we are missing all the cells that would have had lowered AKT activation but died, and had we been able to collect those statistics, perhaps both FAK and AKT would have shown lower levels. We risk introducing survivorship bias into the results if we read too much into the data as is.

      Alternatively, another explanation could be that, by virtue of survival of the fittest, we might have effectively selected a subpopulation of cells that were able to survive on lower FAK signals, or completely irrespective of them.

      At any rate, to prove our foregoing hypothesis would require us to perform comprehensive immunoblotting and total transcriptome analysis over different duration conditions. Unfortunately, we do not have the time to do that for the current article, but it could be developed into a stand-alone molecular biology investigation in future. We have included similar discussion in the main text.

      Recommendation 6: In the discussion, the authors suggest the reported findings be especially relevant for epithelia that significantly separate compartments and regulate water and soluble transport. These are for example kidney epithelia (i.e., MDCK is the best experimental choice), retinal epithelium or intestinal epithelium. I would suggest that some proof-of-concept experiments could be done to support this concept. For example, I would expect keratinocytes (i.e., HaCaT) not to show a strong difference in extrusion rate between valleys and hills since the monolayer is not so sealed as kidney epithelium. In general, this kind of experiment would significantly strengthen the finding of this work.

      Response 6: As recommended, we tracked the behavior of retina pigment epithelial cells (hTERT RPE-1 from ATCC) which do not form tight monolayers like MDCKs (Ref. R4). We did not detect extrusion events occurring from monolayers of these cells (Author response image 5). This is true even for portions of monolayers over waved regions.

      Author response image 5.

      Time-lapse snapshot showing non-existent cell extrusions from RPE monolayers confluent for over 21 hours.

      We have updated these findings in the main text discussions and included a new supporting time-lapse (Video 15) in our article.

      Ref R4. F. Liu, T. Xu, S. Peng, R. A. Adelman, L. I. Rizzolo; Claudins regulate gene and protein expression of the retinal pigment epithelium independent of their association with tight junctions. Experimental Eye Research 198, 108157 (2020). Doi: 10.1016/j.exer.2020.108157

      Recommendation 7 (minor point): Figure S1 needs to have clear notes indicating in each step what is what. i.e., where is glass, PDMS, NOA73, etc? A more detailed caption will help the figure's comprehension. Also "Cy52" should be changed to "soft silicone" to be consistent with the text (or Cy52 should be mentioned in the text).

      Response 7 (minor point): Changes were made to Figure 1-figure supplement 1 to improve comprehension accordingly. CY52 was added to the main-text, next to the first appearance of the word soft silicone, to be consistent with the figures.

      Recommendation 8 (minor point): The authors often mentioned that epithelial monolayers are denser on PAA gels. Please add a reference(s) to this statement.

      Response 8 (minor point): The statement is an inference from visually comparing monolayers on PAM gels and PDMS. The difference is quite evident (Author response image 6). The density difference is in spite of the fact that the substrates share similar starting cell numbers.

      To address the reviewer’s comment, we have combined time-lapses of monolayers on silicones and PAM gels side-by-side in Video 17 to facilitate convenient comparisons.

      Author response image 6.

      Time-lapse snapshot at 24 hours after confluence, showing conspicuously higher density of MDCK monolayers on PAM gel compared to those on silicone elastomer.

      Reviewer #2

      Recommendation 1: The sinusoidal wavy substrate that the authors use in their investigation is interesting and relevant, but it is important to realize that this is a single-curved surface (also known as a developable surface). This means that the Gaussian curvature is zero and that monolayers need to undergo (almost) no stretching to conform to the curvature. The authors should at least discuss other curved surfaces as an option for future research, and highlight how the observations might change. Convex and concave hemispherical surfaces, for example, might induce stronger differences than observed on the sinusoidal substrates, due to potentially higher vertical resultant forces that the monolayer would experience. The authors could discuss this geometry aspect more in their manuscript and potentially link it to some other papers exploring cell-curvature interactions in more complex environments (e.g. non-zero Gaussian curvature).

      Response 1: In response to reviewer #2’s recommendation we have highlighted in the discussion of our text that our waves constitute a developable surface and that cells will experience little stretching for the most part. Based on our knowledge of how curvature can modulate forces and thus osmotic effects, we included some rudimentary analysis of what one would expect on hemispherical surfaces of two types: one that is periodic and contiguous (Ref. R5), and another with delineating flat regions (Ref. R6).

      For epithelial monolayers in the first scenario, and on poorly solute/water permeable substrates, we should also expect to see a relatively higher likelihood of extrusions from concave regions compared to convex ones. Moreover, as the surfaces are now curved in both principal directions (producing larger out-of-plane forces), we should see the onset of differential extrusions seen in this study, but at larger length scales. For example, the effects seen on 100 µm hemicylindrical waves might now happen at larger feature size for hemispherical waves. Furthermore, as this kind of surface would invariably contain hyperbolic regions (saddle points), we might expect an intermediate response from these locations. If the forces in both principal directions offset each other, the extrusion response may parallel planar regions. On the other hand, if one dominates over the other, we may see extrusion responses tending to the dominating curvature (concave or convex).

      On the other hand, on curved landscapes with discrete convex or concave regions, we should expect, within the curved surface, extrusion behaviors paralleling findings in this study. What would be interesting would be to see what happens at the rims (or skirt regions) of the features. At these locations we effectively have hyperbolically curved surfaces, and like before, we should expect some sort of competing effect between the forces generated from the principal directions. So, for dome skirts, we should see fewer extrusions when the domes are small, and vice versa, when they are larger. Meanwhile, for pit rims, we should see a reversed behavior. It should also be noted that the transitioning curvature between convex/concave and planar regions would also modulate the effect.

      These effects might have interesting developmental implications. For instance, in developing pillar-like tissue structures (e.g., villi), the strong curvatures of nascent lumps would favor accumulation of cell numbers. However, once the size of the lumps reaches some critical value, epithelial cell extrusions might begin to appear at the roots of the developing structures, offsetting cell division, and eventually halting growth.

      Ref R5. L. Pieuchot, J. Marteau, A. Guignandon, T. Dos Santos, I. Brigaud, P. Chauvy, T. Cloatre, A. Ponche, T. Petithory, P. Rougerie, M. Vassaux, J. Milan, N. T. Wakhloo, A. Spangenberg, M. Bigerelle, K. Anselme, Curvotaxis directs cell migration through cell-scale curvature landscapes. Nature Communications 9, 3995 (2018). Doi: 10.1038/s41467-018-06494-6

      Ref R6. M. Werner, S. B.G. Blanquer, S. P. Haimi, G. Korus, J. W. C. Dunlop, G. N. Duda, D. W. Grijpma, A. Petersen, Surface curvature differentially regulates stem cell migration and differentiation via altered attachment morphology and nuclear deformation. Advanced Science 4, 1–11 (2017). Doi: 10.1002/advs.201600347

      Recommendation 2: The discussion of the experiments on PAM gels is rather limited. The authors describe that cells on the PAM gels experience fewer extrusions than on the PDMS substrates, but this is not discussed in sufficient detail (e.g. why is this the case). Additionally, the description of the 3D traction force microscopy and its validation is quite limited and should be extended to provide more convincing evidence that the measured force differences are not an artefact of the undulations of the surface.

      Response 2: We first saw a significant reduction in cell extrusions when we performed hyper-osmotic perturbations, and to eliminate possible off-target effects of the compounds used to increase osmolarity, we used three different compounds to be sure. In spite of this, we felt it would further support our argument, that basal accumulation of fluid stress was responsible for the extrusions, if we had some other independent means of removing fluid stress without directly tuning osmolarity through addition of extraneous solutes. We hence thought of culturing MDCK monolayers on hydrogels.

      Hydrogels were chosen because they can effectively dilute basal solute concentration (for reference ions (Na+) are continuously pumped out basally by the monolayer) and thereby reduce the associated osmotically induced water transport. Moreover, as fluid could freely move within the gel, the fluid stress can quickly equilibrate across the basal surface. In contrast, poorly water/solute permeable substrates will lead to localized spikes in solute concentration and transient basal regions with high fluid stress.

      To get a sense of the extent of difference in basal solute concentration between the two materials, we can do a quick hand-waving estimation. For monolayers on non-water-permeable PDMS of 20x20 mm, and using the laser wavelength (640 nm) for RICM as an extreme estimate of basal separation, we should expect ~0.25 µl of total basal water content. On the other hand, we typically produce our PAM gel slabs using ~150 µl of precursor solutions. This means that, given similar amounts of solute, PAM gels will lead to monolayer basal osmolarity that is around 3 orders of magnitude lower than monolayers on PDMS, producing significantly lower osmotic potential. This implies from the outset that we should expect high survivability of cells on these substrates. Indeed, later immunoblotting experiments showed MDCKs exhibiting hyper activated FAK and Akt on PAM gels.

      As for the 3D TFM used in this study, it is actually implemented from a well-established finite element method to solve inverse problems in engineering and has been repeatedly validated in larger scale engineering contexts (Ref. R7). The novelty and contribution of our article is in its adaptation to reconstruct cellular forces at microscopic scales.

      In brief, soft materials, such as hydrogels used in our case, are doped with fluorescent particles, coated with ECM, and then seeded with cells. The cells would exert forces that deform the soft substrate, thereby displacing the fluorescent particles from their equilibrium positions. This particle displacement can be extracted by producing an image pair with microscopy; first one with the cells, and subsequent one of relaxed gel after removal of cells with acutely cytotoxic reagents, such as SDS. There are several ways in which the displacement field can be extracted from the image pair. These include particle tracking velocimetry, particle image velocimetry, digital volume correlation, and optical flow.

      We employed 3D Farneback optical flow in our study for its superior computational performance. The method was validated using synthetically generated images from Sample 14 of the Society for Experimental Mechanics DIC challenge. The accuracy of the calculated displacements using the 3D Farneback optical flow was then compared to the provided ground truth displacements. For the highest frequency displacement image pairs, an x-component root-mean-square-error (RMSE) value of 0.0113 was observed. This was lower than the 0.0141 RMSE value for the Augmented Lagrangian Digital Volume Correlation method. This suggested that the 3D Farneback optical flow is capable of accurately calculating the displacement between two bead images.

      The displacement fields are then fed into a finite element suite (ANSYS in our case) along with the model and mesh of the underlying substrate structure to obtain node specific displacements. This is required because mesh nodes do not typically align with voxel positions of displacements. With these node specific displacements, we subsequently solve the inverse problem for the forces using Tikhonov regularization (Ref. R8). The outcome is a vector of node specific forces.
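
      For readers unfamiliar with the regularization step, the following is a generic sketch of Tikhonov regularization for a linear inverse problem, not the ANSYS-based pipeline itself. It assumes a linear forward model u = K f, where the matrix `K` is a hypothetical stand-in for the finite-element operator mapping nodal forces to nodal displacements, and it minimizes ||K f - u||² + λ²||f||². The function name and all test data are illustrative.

```python
import numpy as np

def tikhonov_solve(K, u, lam):
    """Return f minimizing ||K f - u||^2 + lam^2 ||f||^2."""
    n = K.shape[1]
    # Regularized normal equations; solve() avoids forming an explicit inverse.
    return np.linalg.solve(K.T @ K + lam**2 * np.eye(n), K.T @ u)

rng = np.random.default_rng(0)

# Hypothetical, mildly ill-conditioned forward operator (60 displacement
# components, 20 force components) and a synthetic "true" force vector.
K = rng.normal(size=(60, 20)) @ np.diag(np.logspace(0, -3, 20))
f_true = rng.normal(size=20)
u_noisy = K @ f_true + 1e-3 * rng.normal(size=60)

# A larger lam damps noise-driven components at the cost of a larger residual.
f_small = tikhonov_solve(K, u_noisy, lam=1e-4)
f_large = tikhonov_solve(K, u_noisy, lam=1e-1)
```

      The regularization parameter trades data fidelity against solution size: increasing `lam` never increases the norm of the recovered force vector and never decreases the residual, which is why its choice matters for noisy displacement data.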

      In light of the above, to physically validate the method in our context would require the generation of a known ground truth force on the scale of pico- to nano-newtons and subsequently image the particle displacements from this force using confocal microscopy. The force must then be released in situ in order for the relaxed gel to be imaged again. This is not a straightforward feat at this scale, and a method that immediately springs to mind is magnetic tweezers. Unfortunately, this is a tool that we cannot develop within reasonable timeframes, as the method will have to be seamlessly integrated with our spinning-disk confocal. However, as a compromise, we have included an in-silico validation with our revised manuscript.

      Specifically, given a finite element model with a predefined curvature, a known force was applied to the surface of the model (Author response image 7A). The resulting displacements were then calculated from the finite element solution, and 10% random noise was added to them. The traction force recovery (Author response image 7B) was then performed using these noisy in-silico displacements. To evaluate the accuracy of the recovery, we calculated the cosine similarity between the recovered and applied force vectors, along with the norm of the recovered force vectors as a proportion of the applied force. A value closer to 1 for both metrics indicates a more accurate reconstruction of the simulated traction force. The cosine similarity of the recovered traction forces to the original applied force was 0.977±0.056, while the norm of the recovered traction forces as a proportion of the original applied force was 1.016±0.165. As both values are close to 1 (i.e., identical), this suggests that the traction forces can be satisfactorily recovered using the finite-element based method.
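
      The two evaluation metrics can be sketched as follows. The function names and the example vectors are hypothetical; the sketch only shows how a per-node cosine similarity and norm ratio are formed and then averaged over nodes.

```python
import math


def norm(v):
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(x * x for x in v))


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1 means identical direction)."""
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))


def norm_ratio(recovered, applied):
    """Magnitude of the recovered force as a proportion of the applied force."""
    return norm(recovered) / norm(applied)


def mean(values):
    return sum(values) / len(values)


# Hypothetical applied and recovered force vectors at three nodes.
applied = [(0.0, 0.0, -1.0), (0.0, 0.0, -1.0), (0.0, 0.0, -1.0)]
recovered = [(0.1, 0.0, -1.0), (0.0, -0.1, -0.9), (0.0, 0.0, -1.1)]

mean_cosine = mean([cosine_similarity(r, a) for r, a in zip(recovered, applied)])
mean_ratio = mean([norm_ratio(r, a) for r, a in zip(recovered, applied)])
```

      The cosine similarity is insensitive to magnitude and the norm ratio is insensitive to direction, so the two metrics are complementary.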

      In response to the reviewer’s recommendations then, additional content has been included in the main text to explain the use of PAM gels and the workings of our 3D TFM pipeline.

      Ref R7. James F. Doyle, Modern Experimental Stress Analysis: Completing the Solution of Partially Specified Problems (John Wiley & Sons, Chichester, 2004).

      Ref R8. Per Christian Hansen, Discrete Inverse Problems: Insight and Algorithms (SIAM, Philadelphia, 2010).

      Author response image 7.

      (A) shows simulated force field to generate simulated displacements. (B) shows force field reconstructed from simulated displacements with noise.

      Recommendation 3: The authors show nuclear deformation on the hills and use this as evidence for a resultant downward-pointing force vector. This has, indeed, also been observed in other works referenced by the authors (e.g. Werner et al.), and could be interesting evidence to support the current observations, provided the authors also show a nuclear shape on the concave and flat regions. The authors could potentially also characterize this shape change better using higher-resolution data.

      Response 3: We characterized nucleus deformation using Hoechst-stained samples as recommended. The deformation is estimated by dividing the segmented nuclear volume by the volume of the best-fit ellipsoid of the same object. In this way, objects exhibiting minimal bending yield values close to 1.0. The resulting graph is shown in Author response image 8B (and manuscript Figure 3D).
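
      One way to compute such a ratio is sketched below. The fitting procedure is an assumption made for illustration: the "best-fit" ellipsoid is taken to be the solid ellipsoid with the same second moments as the segmented voxels, and voxels are assumed to be isotropic with unit volume. The fitting method used in the actual analysis may differ, and the function names are hypothetical.

```python
import math


def moment_matched_ellipsoid_volume(voxels):
    """Volume of the solid ellipsoid whose second moments match the voxel cloud."""
    n = len(voxels)
    centroid = [sum(v[i] for v in voxels) / n for i in range(3)]
    cov = [
        [sum((v[i] - centroid[i]) * (v[j] - centroid[j]) for v in voxels) / n
         for j in range(3)]
        for i in range(3)
    ]
    det = (
        cov[0][0] * (cov[1][1] * cov[2][2] - cov[1][2] * cov[2][1])
        - cov[0][1] * (cov[1][0] * cov[2][2] - cov[1][2] * cov[2][0])
        + cov[0][2] * (cov[1][0] * cov[2][1] - cov[1][1] * cov[2][0])
    )
    # A solid ellipsoid with semi-axes a, b, c has covariance eigenvalues
    # a^2/5, b^2/5, c^2/5, so a*b*c = sqrt(det(5 * cov)) = 5**1.5 * sqrt(det(cov)).
    return 4.0 / 3.0 * math.pi * 5.0 ** 1.5 * math.sqrt(det)


def deformation_measure(voxels):
    """Segmented volume (voxel count) divided by the moment-matched ellipsoid volume."""
    return len(voxels) / moment_matched_ellipsoid_volume(voxels)


# Synthetic check object: a voxelised sphere of radius 10 voxels.
sphere = [
    (x, y, z)
    for x in range(-10, 11)
    for y in range(-10, 11)
    for z in range(-10, 11)
    if x * x + y * y + z * z <= 100
]
sphere_measure = deformation_measure(sphere)
```

      Under these assumptions a perfectly ellipsoidal object gives a value close to 1, and shapes that depart from an ellipsoid give other values.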

      Author response image 8.

      (A) An example of deformed nuclei on a 50 µm wave hill region. (B) A violin plot of calculated nuclear deformations across dimensions and features, using segmented volume normalized against best-fit ellipsoid volume.

      Our quantifications show a statistically significant difference in nuclei deformation measure medians between hill and valley cells on the 50 µm (0.973 vs 0.982) and 100 µm (0.971 vs 0.979) waves; this indicates that cells on the hills tend to have more deformed nuclei compared to cells in the valleys. Meanwhile, no significant difference was found for a similar comparison on 200 µm (0.978 vs 0.978) samples. For reference, the median found for cells pooled from planar regions was 0.975.

      In response to the reviewer’s suggestions Figure 3 of our manuscript has been updated to include the new results on nuclei deformation. The text has also been updated to account for the new information to support our claims. The statistics are included in a new summary data table in Supplementary File 6.

      Recommendation 4: The U-net for extrusion detection is a central tool used within this study, though the explanation and particularly validation of the tool are somewhat lacking. More clarity in the explanation and more examples of good (or bad) detections would help establish this tool as a more robust component of the data collection (on all geometries).

      Response 4: The architecture of the neural network used in this study is outlined in supplementary figure S5a. To validate the performance of the model, a test dataset consisting of 200 positive examples and 100 negative examples was fed into the network and the resulting predictions were recorded. The confusion matrix of the model is shown in supplementary figure S5c. The weighted precision and recall of the model are 0.958 and 0.953, respectively.
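
      For clarity, support-weighted precision and recall can be derived from a confusion matrix as sketched below. The counts in the example are hypothetical placeholders chosen to match the class sizes of the test set (100 negatives, 200 positives); they are not the study's confusion matrix, and the function name is illustrative.

```python
def weighted_precision_recall(confusion):
    """Support-weighted precision and recall.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    support = [sum(row) for row in confusion]
    total = sum(support)
    precision = 0.0
    recall = 0.0
    for c in range(n_classes):
        true_pos = confusion[c][c]
        predicted = sum(confusion[r][c] for r in range(n_classes))
        class_precision = true_pos / predicted if predicted else 0.0
        class_recall = true_pos / support[c] if support[c] else 0.0
        weight = support[c] / total
        precision += weight * class_precision
        recall += weight * class_recall
    return precision, recall


# Hypothetical counts: row 0 = true negatives class, row 1 = true positives class.
example_confusion = [[90, 10], [8, 192]]
precision, recall = weighted_precision_recall(example_confusion)
```

      Weighting by class support means the larger positive class contributes more to both figures, and the weighted recall coincides with overall accuracy.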

      Additionally, we have included examples of false positive and false negative detections in Figure 1-figure supplement 5 (Author response image 9). False positive detections were typically extrusions that had been labelled as occurring in the frame prior to the frame of interest (Author response image 9, bottom sequence). However, because the extrusion process is still incomplete in the prior frame, the extruded cell body continues to change and the network falsely predicts this as a detection.

      Author response image 9.

      Examples of false negative and false positive extrusion registration.

      Recommendation 5: The authors study the involvement of FAK in the observed curvature-dependent and hydraulic stress-dependent spatial regulation of cell extrusion. In one of the experiments, the authors supplement the cell medium with FAK inhibitors, though only in a hyper-osmotic medium. They show that FAK inhibition counteracts the extrusion-suppressing effect of a hyper-osmotic medium. However, no data is shown on the effect of FAK inhibitors within the control medium. Would the extrusion rates be even higher then?

      Response 5: We proceeded, as suggested by the reviewer, to explore the effects of the FAK inhibitor on MDCK monolayers in our control medium. The results revealed that, at the 3 µM FAK inhibitor concentration, where cells in sucrose media showed an elevated extrusion rate, monolayers in control medium quickly suffered massive cell death (Author response image 10), similar to what was seen when 6 µM FAK inhibitor was introduced to sucrose medium.

      This finding suggests that osmolarity protects against FAK inhibitors in a dose-dependent manner. Moreover, as cell extrusions require an intact monolayer, extrusion rates cannot increase indefinitely: a point will be reached where an intact monolayer can no longer be maintained.

      We have updated the main text of our article to mention this observation, and also included a new time-lapse (Video 22) to demonstrate the effect.

      Author response image 10.

      Timelapse snapshot of MDCK monolayers over waves 4 hours after inclusion of focal adhesion kinase inhibitor.

      Recommendation 6: The supplementary videos show two fields of view next to each other, which is not immediately clear to the viewer. I strongly advise the authors to add a clear border between the two panels, so that it is clear that the cells from one panel are not migrating into the next panel.

      Response 6: A distinctive border has been added to the movies to separate panels showing different focal planes of the same stack.

      Recommendation 7: The general quality and layout of the figures could be improved. Some figures would benefit from higher-resolution or larger cell images (e.g. Figure 2A, C, D), and the organisation of subpanels could be improved (e.g. especially in Figure 2). The box plots and bar graphs are also not consistent throughout the manuscript in terms of colouring and style, which should be improved.

      Response 7: We have enlarged the figures in question accordingly, at the cost of reducing some information. However, the full scope of the sub-figures remains accessible in the supplementary movies. We have also tried to change the placement of the panels to improve readability. We have also adjusted the valley, hill, and flat coloring scheme for the extrusion boxplots in Figures 1 and 2 to make them consistent.

      Recommendation 8: The graphs in Figures 3E and F are confusing and difficult to interpret. The x-axis states "Position along curve in radians" but it is unclear how to relate this to the position on the wavy substrate. The graphs also have a second vertical axis on the right ("valley-interface-hill"), which adds to the confusion. I would recommend the authors provide more explanation and consider a different approach of plotting this.

      Response 8: We have removed the confusing plot of cross-sectional profile from the force graphs. To indicate positions on the waves, we have augmented radian values with Hill, Interface, and Valley accordingly.

      Recommendation 9: Specify which silicone was used for the low-stiffness silicone substrates in the methods and in the main text.

      Response 9: CY52 has been added to the main text, next to the first appearance of the term “soft silicone”, to be consistent with the figures.

      Recommendation 10: The flow lines that are plotted over the RICM data make it difficult to see the underlying RICM images. I would advise to also show the RICM images without the flow lines.

      Response 10: The original movie S15 (now Video 16) showing the RICM overlapped with optical flow paths has now been replaced by a movie showing the same, but with the flow paths and RICM in separate panels.

      Recommendation 11: In the first paragraph of the discussion, the authors write: "And this difference was both dependent on the sense (positive or negative)...". This is superfluous since the authors already mentioned earlier in the paragraph that the convex and concave regions (i.e. different signs of curvature) show differences in extrusion rates.

      Response 11: The sentence has been changed to “And this difference was also dependent on the degree of curvature.”

      Recommendation 12: In the second paragraph of the discussion, the authors mention that "basal fluid spaces under monolayers in hill regions were found consistently smaller than those in valley regions". Is this data shown in the figures of the manuscript? If so, a reference should be made because it was unclear to me.

      Response 12: This statement is an inference from the comparison of the hill and valley RICM grey values. Specifically, by virtue of the physics underlying the effect, RICM intensities are direct surrogates for basal separations, i.e., fluid spaces, since there cannot be a vacuum. To be more precise, “inferred from RICM intensity differences (Figure 2I)” has been added to support the statement.

      Recommendation 13: On page 7 of the discussion, the authors talk about positively and negatively curved surfaces. This type of description should be avoided, as this depends on the definition of the surface normal (i.e. is positive convex or concave?). Rather use convex and concave in this context.

      Response 13: The wording has been changed accordingly.

      Recommendation 14: The label of Table 8 reads "Table 2".

      Response 14: The error has been corrected.

      Reviewer #3

      Recommendation 1: The central finding seems to be opposite to an earlier report (J Cell Sci (2019) 132, jcs222372), where MDCK cells in curved alginate tubes exhibit increased extrusion on a convex surface. I suggest that you comment on possible explanations for the different behaviors.

      Response 1: The article in question primarily reported the phenomenon of MDCK and J3B1A monolayers detaching from the concave alginate tube walls coated with Matrigel. The authors attributed this to the curvature induced out-of-plane forces towards the center of the tubes. Up to this point, the findings and interpretation are consistent with our current study where we also find a similar force trend in concave regions.

      To further lend support to the importance of curvature in inducing detachment, the authors cleverly bent the tubes to introduce asymmetry in curvature between outer and inner surfaces. Specifically, the outside bend is concave in both principal directions, whereas the inside bend is convex in one of its principal directions. As expected, the authors found that detachment rates from the outer surface were much larger compared to the inner one. Again, the observations and interpretations are consistent with our own findings; the convex direction will generate out-of-plane forces pointing into the surface, serving to stabilize the monolayer against the substrate. It should be noted however, since the inner-side tube is characterized by both convex and concave curvatures in its two principal directions, the resulting behavior of overlaying monolayers will depend on which of the two resulting forces become dominant. So, for gradual bends, one should expect the monolayers to still be able to detach from the inner tube surface. This is what was reported in their findings.

      Their extrusion observations surprised us. Because their whole material (hydrogel) is presumably both solute and water permeable, we would be more inclined to expect very few extrusions irrespective of curvature. This is indeed the case in our study of MDCKs on PAM hydrogels, where the hydrogel substrate effectively buffers against the quick build-up of solute concentration and basal hydraulic stress. Without the latter, concave monolayer forces alone are unlikely to be able to disrupt cell focal adhesions. Indeed, the detachments seen in their study are more likely due to exfoliation of the Matrigel than to cells being pulled off the Matrigel matrix entirely.

      Our guess is that the extrusions seen in their study are due solely to the canonical crowding effect. If this were the case, then the detached monolayer on the outside bend could buffer against crowding pressure by buckling. Meanwhile, the monolayer on the inside bend, being attached to the surface, can only regulate crowding pressure by removing cells through extrusions. This phenomenon should be particular to soft matrices such as Matrigel. Using stiffer and covalently bonded ECM should be sufficient to prevent monolayers from detaching, leading to similar extrusion behaviors. In response to the reviewer’s recommendation, we have included a short paragraph to state the points discussed in this response.

      Recommendation 2: Fig 3E, F: The quantities displayed on the panels are not forces, but have units of pressure (or stress).

      Response 2: We have changed “force” to “stress” according to the reviewer’s suggestion. We used “force” in the original text because we were reconstructing forces. Due to discretization, the resulting forces are inevitably assigned to element nodes. Between the nodes, on the faces, there is no information. So, in order to have some form of continuity to plot, the face forces are obtained by averaging the forces at the 4 nodes around the element face. Unfortunately, element face areas are not typically of the same size, so the average forces obtained need to be further normalized against the face area, leading to a quantity that has units of stress.
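
      The node-to-face conversion described above can be sketched as follows. The sketch assumes quadrilateral element faces whose area is computed by splitting the face into two triangles; the function names and example values are hypothetical.

```python
import math


def subtract(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def triangle_area(p0, p1, p2):
    n = cross(subtract(p1, p0), subtract(p2, p0))
    return 0.5 * math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)


def quad_area(p0, p1, p2, p3):
    """Area of a quadrilateral face split into triangles p0-p1-p2 and p0-p2-p3."""
    return triangle_area(p0, p1, p2) + triangle_area(p0, p2, p3)


def face_stress(node_positions, node_forces):
    """Average the 4 nodal force vectors of a face and divide by the face area."""
    area = quad_area(*node_positions)
    mean_force = [sum(f[i] for f in node_forces) / 4.0 for i in range(3)]
    return [component / area for component in mean_force]


# Hypothetical 2 x 1 rectangular face with the same normal force on every node.
positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
forces = [(0.0, 0.0, 4.0)] * 4
stress = face_stress(positions, forces)
```

      Dividing by the face area is what makes values comparable between faces of different size.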

      Recommendation 3: Fig 2D: Asterisks are hard to see.

      Response 3: The color of the asterisks has been changed to green for better clarity against a B&W background.

      Recommendation 4: p 19, l 7: Word missing in "the of molding"

      Response 4: The typo has been amended to “the molding of”.

    1. Author response

      Reviewer #1 (Public Review):

      Loss of skeletal muscle tissue from traumatic injury is debilitating. Restoring muscle mass and function remains a challenge. Using a mouse model, the authors performed punch biopsy injuries of the tibialis anterior in which the volume of muscle loss was varied to result in either successful muscle regeneration with a smaller injury or the unsuccessful outcome of fibrosis with a larger injury. For both conditions, a novel lipidomic profiling approach was used to evaluate pro-inflammatory and anti-inflammatory lipids at key time points post-injury with respect to collagen deposition, macrophage infiltration, muscle fiber regeneration, and force produced during isometric contractions. A key finding was that while all lipids increased at 3 days post-injury (dpi) and then declined through 14 dpi, pro-inflammatory lipids remained elevated during recovery from greater muscle loss which led to fibrosis. Maresin 1 was identified as an anti-inflammatory lipid that, when injected into injured muscle, reduced fibrosis, improved muscle regeneration, and partially restored the strength of contraction.

      Strengths: The metabolipidomic profiling demonstrated here represents a novel approach to identifying pro-inflammatory and anti-inflammatory mediators of successful vs unsuccessful skeletal muscle regeneration. These findings may translate into a new therapeutic approach for promoting successful regeneration following volumetric muscle loss.

      Weaknesses: Certain aspects of the data are overinterpreted; while some measures appear to have an adequate sample size to make sound conclusions, other measures are likely to lack sufficient statistical power given their variability. Presentation of the results would be strengthened by adhering to consistent terminology and labeling of figures throughout; specific examples are identified in recommendations to the authors. Several of the images used to illustrate differences between treatments are unconvincing because differences are not readily.

      We agree with the reviewer and have scaled back some of the interpretation as well as clarified the sample sizes. We have also amended the text to maintain a consistent terminology.

      Reviewer #2 (Public Review):

      The study is novel and valuable to the field and provides new and important insights into the role of lipid mediators in VML injuries. By expanding our understanding of the mechanisms that regulate muscle regeneration following VML injuries, the study has the potential to guide the development of novel therapeutic interventions that promote tissue repair and recovery. The data presented in the manuscript is of good quality. The findings and conclusions are supported by a variety of different analyses (e.g., gene expression, histology, flow cytometry).

      Despite the strengths of the study, some limitations are identified. Specifically, the impact of maresin 1 on macrophage phenotypes (M1/M2) could have been explored in more detail using histological or protein expression analysis. Moreover, additional data are needed to substantiate the claims about increased muscle regeneration. Lastly, the study does not address myofiber innervation, myofiber-type transitions, or motor unit remodeling.

      We thank the reviewer for the suggestions and have performed a more in-depth exploration of macrophage phenotypes through additional scRNA-sequencing analysis. We have also included additional data describing how Maresin 1 impacts muscle stem cells through cyclic AMP. Respectfully, profiling myofiber innervation, motor unit remodeling and myofiber-type transitions is beyond the scope of this manuscript.

    1. Author Response

      Reviewer #1 (Public Review):

      In this work George et al. describe RatInABox, a software system for generating surrogate locomotion trajectories and neural data to simulate the effects of a rodent moving about an arena. This work is aimed at researchers that study rodent navigation and its neural machinery.

      Strengths:

      • The software contains several helpful features. It has the ability to import existing movement traces and interpolate data with lower sampling rates. It allows varying the degree to which rodents stay near the walls of the arena. It appears to be able to simulate place cells, grid cells, and some other features.

      • The architecture seems fine and the code is in a language that will be accessible to many labs.

      • There is convincing validation of velocity statistics. There are examples shown of position data, which seem to generally match between data and simulation.

      Weaknesses:

      • There is little analysis of position statistics. I am not sure this is needed, but the software might end up more powerful and the paper higher impact if some position analysis was done. Based on the traces shown, it seems possible that some additional parameters might be needed to simulate position/occupancy traces whose statistics match the data.

      Thank you for this suggestion. We have added a new panel to figure 2 showing a histogram of the time the agent spends at positions of increasing distance from the nearest wall. As you can see, RatInABox is a good fit to the real locomotion data: positions very near the wall are under-explored (in the real data this is probably because whiskers and physical body size block positions very close to the wall) and positions just away from but close to the wall are slightly over explored (an effect known as thigmotaxis, already discussed in the manuscript).

      As you correctly suspected, fitting this warranted a new parameter which controls the strength of the wall repulsion; we call this “wall_repel_strength”. The motion model hasn’t changed mathematically: all we did was take a parameter which was originally a fixed constant equal to 1, unavailable to the user, and make it a variable which can be changed (see methods section 6.1.3 for the maths). The curves fit best when wall_repel_strength is approximately 2. The methods and parameters table have been updated accordingly. See Fig. 2e.
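
      The occupancy statistic behind this comparison can be sketched in a few lines. The sketch assumes a rectangular arena with walls only on its perimeter and uses uniformly random positions as stand-in data; the function names are hypothetical, and the normalisation used for the manuscript figure may differ.

```python
import random


def distance_to_nearest_wall(x, y, width=1.0, height=1.0):
    """Distance from (x, y) to the closest of the four walls of a rectangular arena."""
    return min(x, width - x, y, height - y)


def wall_distance_histogram(positions, n_bins=10, width=1.0, height=1.0):
    """Fraction of samples falling in each equal-width bin of distance to the nearest wall."""
    max_distance = min(width, height) / 2.0
    counts = [0] * n_bins
    for x, y in positions:
        d = distance_to_nearest_wall(x, y, width, height)
        index = min(int(d / max_distance * n_bins), n_bins - 1)
        counts[index] += 1
    return [c / len(positions) for c in counts]


# Stand-in trajectory: uniformly random positions in a 1 m x 1 m box.
rng = random.Random(0)
uniform_positions = [(rng.random(), rng.random()) for _ in range(10000)]
occupancy = wall_distance_histogram(uniform_positions)
```

      Note that even uniform occupancy yields more samples in the bins nearest the wall, because more of the arena lies at small wall distances; real and simulated trajectories should therefore be compared with each other under the same binning.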

      • The overall impact of this work is somewhat limited. It is not completely clear how many labs might use this, or have a need for it. The introduction could have provided more specificity about examples of past work that would have been better done with this tool.

      At the point of publication we, like yourself, didn’t know to what extent there would be a market for this toolkit; however, we were pleased to find that there was. In its initial 11 months RatInABox has accumulated a growing, global user base, over 120 stars on GitHub and north of 17,000 downloads through PyPI. We have accumulated a list of testimonials[5] from users of the package vouching for its utility and ease of use, four of which are abridged below. These testimonials come from a diverse group of 9 researchers spanning 6 countries across 4 continents and varying career stages, from pre-doctoral researchers with little computational exposure to tenured PIs. Finally, not only does the community use RatInABox, they are also building it: at the time of writing RatInABox has received 20 GitHub “Issues” and 28 “pull requests” from external users (i.e. those who aren’t authors on this manuscript), ranging from small discussions and bug-fixes to significant new features, demos and wrappers.

      Abridged testimonials:

      ● “As a medical graduate from Pakistan with little computational background…I found RatInABox to be a great learning and teaching tool, particularly for those who are underprivileged and new to computational neuroscience.” - Muhammad Kaleem, King Edward Medical University, Pakistan

      ● “RatInABox has been critical to the progress of my postdoctoral work. I believe it has the strong potential to become a cornerstone tool for realistic behavioural and neuronal modelling” - Dr. Colleen Gillon, Imperial College London, UK

      ● “As a student studying mathematics at the University of Ghana, I would recommend RatInABox to anyone looking to learn or teach concepts in computational neuroscience.” - Kojo Nketia, University of Ghana, Ghana

      ● “RatInABox has established a new foundation and common space for advances in cognitive mapping research.” - Dr. Quinn Lee, McGill, Canada

      The introduction continues to include the following sentence highlighting examples of past work which relied on generating artificial movement and/or neural data and which, by implication, could have been done better (or at least accelerated and standardised) using our toolbox.

      “Indeed, many past[13, 14, 15] and recent[16, 17, 18, 19, 6, 20, 21] models have relied on artificially generated movement trajectories and neural data.”

      • Presentation: Some discussion of case studies in Introduction might address the above point on impact. It would be useful to have more discussion of how general the software is, and why the current feature set was chosen. For example, how well does RatInABox deal with environments of arbitrary shape? T-mazes? It might help illustrate the tool's generality to move some of the examples in supplementary figure to main text - or just summarize them in a main text figure/panel.

      Thank you for this question. Since the initial submission of this manuscript RatInABox has been upgraded and environments have become substantially more “general”. Environments can now be of arbitrary shape (including T-mazes), boundaries can be curved, they can contain holes and can also contain objects (0-dimensional points which act as visual cues). A few examples are showcased in the updated figure 1 panel e.

      To further illustrate the tool’s generality beyond the structure of the environment, we continue to summarise the reinforcement learning example (Fig. 3e) and the neural decoding example in section 3.1. In addition, we have added three new panels to figure 3 highlighting new features which, we hope you will agree, make RatInABox significantly more powerful and general, and which satisfy your suggestion of clarifying utility and generality in the manuscript directly.

      On the topic of generality, we wrote the manuscript so as to demonstrate the rich variety of ways RatInABox can be used without providing an exhaustive list of potential applications. For example, RatInABox can be used to study neural decoding and it can be used to study reinforcement learning, not because it was purpose-built with these use-cases in mind but because it contains a set of core tools designed to support spatial navigation and neural representations in general. For this reason we would rather keep the demonstrative examples as supplements and implement your suggestion of drawing further attention to the large array of tutorials and demos provided on the GitHub repository by modifying the final paragraph of section 3.1 to read:

      “Additional tutorials, not described here but available online, demonstrate how RatInABox can be used to model splitter cells, conjunctive grid cells, biologically plausible path integration, successor features, deep actor-critic RL, whisker cells and more. Despite including these examples we stress that they are not exhaustive. RatInABox provides the framework and primitive classes/functions from which highly advanced simulations such as these can be built.”

      Reviewer #3 (Public Review):

      George et al. present a convincing new Python toolbox that allows researchers to generate synthetic behavior and neural data specifically focusing on hippocampal functional cell types (place cells, grid cells, boundary vector cells, head direction cells). This is highly useful for theory-driven research where synthetic benchmarks should be used. Beyond just navigation, it can be highly useful for novel tool development that requires jointly modeling behavior and neural data. The code is well organized and written and it was easy for us to test.

      We have a few constructive points that they might want to consider.

      • Right now the code only supports X,Y movements, but Z is also critical and opens new questions in 3D coding of space (such as grid cells in bats, etc). Many animals effectively navigate in 2D, as a whole, but they certainly make a large number of 3D head movements, and modeling this will become increasingly important and the authors should consider how to support this.

      Agents now have a dedicated head direction variable (before head direction was just assumed to be the normalised velocity vector). By default this just smoothes and normalises the velocity but, in theory, could be accessed and used to model more complex head direction dynamics. This is described in the updated methods section.

      In general, we try to tread a careful line. For example, we embrace certain aspects of physical and biological realism (e.g. modelling environments as continuous, or fitting motion to real behaviour) and avoid others (such as the biophysics/biochemistry of individual neurons, or the mechanical complexities of joint/muscle modelling). It is hard to decide where to draw the line, but we have a few guiding principles:

      1. RatInABox is most well suited for normative modelling and neuroAI-style probing questions at the level of behaviour and representations. We consciously avoid unnecessary complexities that do not directly contribute to these domains.

      2. Compute: To best accelerate research we think the package should remain fast and lightweight. Certain features are ignored if computational cost outweighs their benefit.

      3. Users: If, and as, users require complexities e.g. 3D head movements, we will consider adding them to the code base.

      For now we believe proper 3D motion is out of scope for RatInABox. Calculating motion near walls is already surprisingly complex and to do this in 3D would be challenging. Furthermore, all cell classes would need to be rewritten too. This would be a large undertaking, probably requiring rewriting the package from scratch or making a new package, RatInABox3D (BatInABox?), altogether, something which we don’t intend to undertake right now. As one option, users who really need 3D trajectory data could quite straightforwardly simulate a 2D Environment (X,Y) and a 1D Environment (Z) independently. With this method (X,Y) and (Z) motion would be entirely independent, which is of course unrealistic but, depending on the use case, may well be sufficient.

      Alternatively, as you said that many agents effectively navigate in 2D but show complex 3D head and other body movements, RatInABox could interface with and feed data downstream to other software (for example Mujoco[11]) which specialises in joint/muscle modelling. This would be a very legitimate use-case for RatInABox.

      We’ve flagged all of these assumptions and limitations in a new body of text added to the discussion:

      “Our package is not the first to model neural data[37, 38, 39] or spatial behaviour[40, 41], yet it distinguishes itself by integrating these two aspects within a unified, lightweight framework. The modelling approach employed by RatInABox involves certain assumptions:

      1. It does not engage in the detailed exploration of biophysical[37, 39] or biochemical[38] aspects of neural modelling, nor does it delve into the mechanical intricacies of joint and muscle modelling[40, 41]. While these elements are crucial in specific scenarios, they demand substantial computational resources and become less pertinent in studies focused on higher-level questions about behaviour and neural representations.

      2. A focus of our package is modelling experimental paradigms commonly used to study spatially modulated neural activity and behaviour in rodents. Consequently, environments are currently restricted to being two-dimensional and planar, precluding the exploration of three-dimensional settings. However, in principle, these limitations can be relaxed in the future.

      3. RatInABox avoids the oversimplifications commonly found in discrete modelling, predominant in reinforcement learning[22, 23], which we believe impede its relevance to neuroscience.

      4. Currently, inputs from different sensory modalities, such as vision or olfaction, are not explicitly considered. Instead, sensory input is represented implicitly through efficient allocentric or egocentric representations. If necessary, one could use the RatInABox API in conjunction with a third-party computer graphics engine to circumvent this limitation.

      5. Finally, focus has been given to generating synthetic data from steady-state systems. Hence, by default, agents and neurons do not explicitly include learning, plasticity or adaptation. Nevertheless, we have shown that a minimal set of features such as parameterised function-approximator neurons and policy control enable a variety of experience-driven changes in behaviour and cell responses[42, 43] to be modelled within the framework.

      • What about other environments that are not "Boxes" as in the name - can the environment only be a Box, what about a circular environment? Or Bat flight? This also has implications for the velocity of the agent, etc. What are the parameters for the motion model to simulate a bat, which likely has a higher velocity than a rat?

      Thank you for this question. Since the initial submission of this manuscript RatInABox has been upgraded and environments have become substantially more “general”. Environments can now be of arbitrary shape (including circular), boundaries can be curved, they can contain holes and can also contain objects (0-dimensional points which act as visual cues). A few examples are showcased in the updated figure 1 panel e.

      Whilst we don’t know the exact parameters for bat flight, users could fairly straightforwardly work these out themselves and set them using the motion parameters shown in the table below. We would guess that bats have a higher average speed (speed_mean) and a longer decoherence time due to increased inertia (speed_coherence_time), so the following code might roughly simulate a bat flying around in a 10 x 10 m environment. Author response image 1 shows all Agent parameters which can be set to vary the random motion model.

      Author response image 1.

      • Semi-related, the name suggests limitations: why Rat? Why not Agent? (But it's a personal choice)

      We came up with the name “RatInABox” when we developed this software to study hippocampal representations of an artificial rat moving around a closed 2D world (a box). We also fitted the random motion model to open-field exploration data from rats. You’re right that it is not limited to rodents but for better or for worse it’s probably too late for a rebrand!

      • A future extension (or now) could be the ability to interface with common trajectory estimation tools; for example, taking in the (X, Y, (Z), time) outputs of animal pose estimation tools (like DeepLabCut or such) would also allow experimentalists to generate neural synthetic data from other sources of real-behavior.

      This is actually already possible via our “Agent.import_trajectory()” method. Users can pass an array of time stamps and an array of positions into the Agent class, which will be loaded and smoothly interpolated along, as shown in Fig. 3a and demonstrated in two new papers[9,10] which used RatInABox by loading in behavioural trajectories.

      • What if a place cell is not encoding place but is influenced by reward or encodes a more abstract concept? Should a PlaceCell class inherit from an AbstractPlaceCell class, which could be used for encoding more conceptual spaces? How could their tool support this?

      In fact PlaceCells already inherit from a more abstract class (Neurons) which contains basic infrastructure for initialisation, saving data, and plotting data etc. We prefer the solution that users can write their own cell classes which inherit from Neurons (or PlaceCells if they wish). Then, users need only write a new get_state() method which can be as simple or as complicated as they like. Here are two examples we’ve already made which can be found on the GitHub:

      Author response image 2.

      Phase precession: PhasePrecessingPlaceCells(PlaceCells)[12] inherit from PlaceCells and modulate their firing rate by multiplying it by a phase dependent factor causing them to “phase precess”.

      Splitter cells: Perhaps users wish to model PlaceCells that are modulated by the recent history of the Agent, for example which arm of a figure-8 maze it just came down. This is observed in hippocampal “splitter cells”. In this demo[1] SplitterCells(PlaceCells) inherit from PlaceCells and modulate their firing rate according to which arm was last travelled along.
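
      The subclass-and-override pattern described above can be sketched in a few lines. This is a minimal, self-contained illustration only: ToyNeurons, ToyPlaceCells and ToySplitterCells are hypothetical stand-ins written for this sketch, not the RatInABox classes, and their constructor and get_state() signatures are assumptions rather than the package's API.

```python
import math


class ToyNeurons:
    """Hypothetical stand-in for a generic base class (NOT RatInABox's Neurons)."""

    def __init__(self, n):
        self.n = n
        self.history = []  # firing rates saved at every update

    def get_state(self, pos, **kwargs):
        # Subclasses only need to implement this one method.
        raise NotImplementedError

    def update(self, pos, **kwargs):
        rates = self.get_state(pos, **kwargs)
        self.history.append(rates)
        return rates


class ToyPlaceCells(ToyNeurons):
    """Gaussian place fields centred on fixed 2D locations."""

    def __init__(self, centres, width=0.2):
        super().__init__(len(centres))
        self.centres = centres
        self.width = width

    def get_state(self, pos, **kwargs):
        return [
            math.exp(-((pos[0] - cx) ** 2 + (pos[1] - cy) ** 2) / (2 * self.width ** 2))
            for cx, cy in self.centres
        ]


class ToySplitterCells(ToyPlaceCells):
    """Place cells gated by which maze arm was travelled last."""

    def __init__(self, centres, preferred_arms, width=0.2):
        super().__init__(centres, width)
        self.preferred_arms = preferred_arms

    def get_state(self, pos, last_arm=None, **kwargs):
        rates = super().get_state(pos)
        # A cell fires only if the last arm matches its preferred arm.
        return [r if arm == last_arm else 0.0 for r, arm in zip(rates, self.preferred_arms)]


# Two cells with the same place field but opposite arm preferences.
cells = ToySplitterCells(centres=[(0.5, 0.5), (0.5, 0.5)], preferred_arms=["left", "right"])
print(cells.update((0.5, 0.5), last_arm="left"))
```

      Only get_state() changes between the classes; saving and bookkeeping stay in the base class, which is the design choice the response argues for.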

      • This is a bit odd in the Discussion: "If there is a small contribution you would like to make, please open a pull request. If there is a larger contribution you are considering, please contact the corresponding author" This should be left to the repo contribution guide, which ideally shows people how to contribute and your expectations (code formatting guide, how to use git, etc). Also this can be very off-putting to new contributors: what is small? What is big? We suggest using more inclusive language.

      We’ve removed this line and left it to the GitHub repository to describe how contributions can be made.

      • Could you expand on the run time for BoundaryVectorCells, namely, for how long of an exploration period? We found it was on the order of 1 min to simulate 30 min of exploration (which is of course fast, but mentioning relative times would be useful).

      Absolutely. How long it takes to simulate BoundaryVectorCells will depend on the discretisation timestep and how many neurons you simulate. Assuming you used the default values (dt = 0.1, n = 10) then the motion model should dominate compute time. This is evident from our analysis in Figure 3f which shows that the update time for n = 100 BVCs is on par with the update time for the random motion model, therefore for only n = 10 BVCs, the motion model should dominate compute time.

      So how long should this take? Fig. 3f shows the motion model takes ~10⁻³ s per update. One hour of simulation is 3600/dt = 36,000 updates, which would therefore take about 36,000 × 10⁻³ s = 36 seconds. So your estimate of 1 minute seems to be in the right ballpark and consistent with the data we show in the paper.
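
      The back-of-envelope estimate above can be reproduced in a few lines. This is a rough sketch: the function name is ours, and the per-update cost (about 10⁻³ s) and dt = 0.1 s are the values quoted in the text, not new measurements.

```python
def estimated_runtime_seconds(sim_duration_s, dt, seconds_per_update):
    """Wall-clock estimate: number of updates multiplied by the cost of one update."""
    n_updates = sim_duration_s / dt
    return n_updates * seconds_per_update


# One hour of simulated time, dt = 0.1 s, ~1e-3 s per motion-model update.
one_hour = estimated_runtime_seconds(3600, dt=0.1, seconds_per_update=1e-3)
print(f"Estimated wall-clock time: {one_hour:.0f} s")
```

      The estimate covers only the motion-model update, so it is a lower bound once cell updates start to contribute.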

      Interestingly this corroborates the results in a new inset panel where we calculated the total time for cell and motion model updates for a PlaceCell population of increasing size (from n = 10 to 1,000,000 cells). It shows that the motion model dominates compute time up to approximately n = 1000 PlaceCells (for BoundaryVectorCells it’s probably closer to n = 100) beyond which cell updates dominate and the time scales linearly.

      These are useful and non-trivial insights as they tell us that the RatInABox neuron models are quite efficient relative to the RatInABox random motion model (something we hope to optimise further down the line). We’ve added the following sentence to the results:

      “Our testing (Fig. 3f, inset) reveals that the combined time for updating the motion model and a population of PlaceCells scales sublinearly O(1) for small populations n < 1000 where updating the random motion model dominates compute time, and linearly for large populations n > 1000. PlaceCells, BoundaryVectorCells and the Agent motion model update times will be additionally affected by the number of walls/barriers in the Environment. 1D simulations are significantly quicker than 2D simulations due to the reduced computational load of the 1D geometry.”

      And this sentence to section 2:

      “RatInABox is fundamentally continuous in space and time. Position and velocity are never discretised but are instead stored as continuous values and used to determine cell activity online, as exploration occurs. This differs from other models which are either discrete (e.g. “gridworld” or Markov decision processes) or approximate continuous rate maps using a cached list of rates precalculated on a discretised grid of locations. Modelling time and space continuously more accurately reflects real-world physics, making simulations smooth and amenable to fast or dynamic neural processes which are not well accommodated by discretised motion simulators. Despite this, RatInABox is still fast; to simulate 100 PlaceCells for 10 minutes of random 2D motion (dt = 0.1 s) it takes about 2 seconds on a consumer grade CPU laptop (or 7 seconds for BoundaryVectorCells).”

      Whilst this would be very interesting it would likely represent quite a significant edit, requiring rewriting of almost all the geometry-handling code. We’re happy to consider changes like these according to (i) how simple they will be to implement, (ii) how disruptive they will be to the existing API, (iii) how many users would benefit from the change. If many users of the package request this we will consider ways to support it.

      • In general, the set of default parameters might want to be included in the main text (vs in the supplement).

      We also considered this but decided to leave them in the methods for now. The exact values of these parameters are subject to change in future versions of the software. Also, we’d prefer for the main text to provide a low-detail high-level description of the software and the methods to provide a place for keen readers to dive into the mathematical and coding specifics.

      • It still says you can only simulate 4 velocity or head directions, which might be limiting.

      Thanks for catching this. This constraint has been relaxed. Users can now simulate an arbitrary number of head direction cells with arbitrary tuning directions and tuning widths. The methods have been adjusted to reflect this (see section 6.3.4).

      • The code license should be mentioned in the Methods.

      We have added the following section to the methods:

      6.6 License RatInABox is currently distributed under an MIT License, meaning users are permitted to use, copy, modify, merge, publish, distribute, sublicense and sell copies of the software.

    2. Reviewer #2 (Public Review):

      George and colleagues present a novel open-source toolbox to model rodent locomotor patterns and electrophysiological responses of spatially modulated neurons, such as hippocampal "place cells". The present manuscript describes a comprehensive Python package ("RatInABox") with powerful capabilities to simulate a variety of environments, exploratory behaviors and concurrent responses of a variety of cell types. In addition, they provide the tools to expand these basic functions and potentially build multiple different model designs, new cell types or more complex neural network architectures. The manuscript also illustrates several simple application cases. The authors have also created a comprehensive GitHub repository with more detailed explanations, tutorials and example scripts. Overall, I found both the manuscript and associated repository very clear and well written, and the scripts easy to follow and implement, to a level superior to that of many commercial software packages. RatInABox fills several existing gaps in the literature and features important improvements over previous approaches; for example, the implementation of continuous 2D environments instead of tabularized state spaces. I believe this toolbox will be of great interest for many researchers in the field of spatial navigation and beyond and provide them with a remarkably powerful and flexible tool. I don't have any major issues with the manuscript. However, the manuscript can be further improved by clarifying some aspects of the toolbox and discussing its limitations and biological plausibility.

    1. Author Response

      LD Score regression (LDSC) is a software tool widely used in the field of genome-wide association studies (GWAS) for estimating heritabilities, genetic correlations, the extent of confounding, and biological enrichment. LDSC is for the most part not regarded as an accurate estimator of \emph{absolute} heritability (although useful for relative comparisons). It is relied on primarily for its other uses (e.g., estimating genetic correlations). The authors propose a new method called \texttt{i-LDSC}, extending the original LDSC in order to estimate a component of genetic variance in addition to the narrow-sense heritability---epistatic genetic variance, although not necessarily all of it. Epistasis in quantitative genetics refers to the component of genetic variance that cannot be captured by a linear model regressing total genetic values on single-SNP genotypes. \texttt{i-LDSC} seems aimed at estimating that part of the epistatic variance residing in statistical interactions between pairs of SNPs. To simplify, the basic model of \texttt{i-LDSC} for two SNPs $X_1$ and $X_2$ is

      \begin{equation}\label{eq:twoX} Y = X_1 \beta_1 + X_2 \beta_2 + X_1 X_2 \theta + E, \end{equation}

      and estimation of the epistatic variance associated with the product term proceeds through a variant of the original LD Score that measures the extent to which a SNP tags products of genotypes (rather than genotypes themselves). The authors conducted simulations to test their method and then applied it to a number of traits in the UK Biobank and Biobank Japan. They found that for all traits the additive genetic variance was larger than the epistatic, but for height the absolute size of the epistatic component was estimated to be non-negligible. An interpretation of the authors' results that perhaps cannot be ruled out, however, is that pairwise epistasis overall does not make a detectable contribution to the variance of quantitative traits.

      We thank the reviewer for carefully reading our manuscript and we appreciate the constructive comments. Our responses and edits to the specific major comments and minor issues are given below.

      Major Comments

      This paper has a lot of strong points, and I commend the authors for the effort and ingenuity expended in tackling the difficult problem of estimating epistatic (non-additive) genetic variance from GWAS summary statistics. The mere possibility of the estimated univariate regression coefficient containing a contribution from epistasis, as represented in the manuscript's Equation~3 and elsewhere, is intriguing in and of itself.

      Is \texttt{i-LDSC} Estimating Epistasis?

      Perhaps the issue that has given me the most pause is uncertainty over whether the paper's method is really estimating the non-additive genetic variance, as this has been traditionally defined in quantitative genetics with great consequences for the correlations between relatives and evolutionary theory (Fisher, 1930, 1941; Lynch & Walsh, 1998; Burger, 2000; Ewens, 2004).

      Let us call the expected phenotypic value of a given multiple-SNP genotype the \emph{total genetic value}. If we apply least-squares regression to obtain the coefficients of the SNPs in a simple linear model predicting the total genetic values, then the partial regression coefficients are the \emph{average effects of gene substitution} and the variance in the predicted values resulting from the model is called the \emph{additive genetic variance}. (This is all theoretical and definitional, not empirical. We do not actually perform this regression.) The variance in the residuals---the differences between the total genetic values and the additive predicted values---is the \emph{non-additive genetic variance}. Notice that this is an orthogonal decomposition of the variance in total genetic values. Thus, in order for the variance in $\mathbf{W}\bm{\theta}$ to qualify as the non-additive genetic variance, it must be orthogonal to $\mathbf{X} \bm{\beta}$.

      At first, I very much doubted whether this is generally true. And I was not reassured by the authors' reply to Reviewer~1 on this point, which did not seem to show any grasp of the issue at all. But to my surprise I discovered in elementary simulations of Equation~\ref{eq:twoX} above that for mean-centered $X_1$ and $X_2$, $(X_1 \beta_1 + X_2 \beta_2)$ is uncorrelated with $X_1 X_2 \theta$ for seemingly arbitrary correlation between $X_1$ and $X_2$. A partition of the outcome's variance between these two components is thus an orthogonal decomposition after all. Furthermore, the result seems general for any number of independent variables and their pairwise products. I am also encouraged by the report that standard and interaction LD Scores are ``lowly correlated'' (line~179), meaning that the standard LDSC slope is scarcely affected by the inclusion of interaction LD Scores in the regression; this behavior is what we should expect from an orthogonal decomposition.
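
      An elementary simulation of this kind might look like the sketch below. It is not the reviewer's actual code: the function names, effect sizes and sample size are illustrative choices, and it assumes jointly Gaussian, mean-centered predictors (whose odd moments vanish), so it does not cover other distributions.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def additive_vs_interaction_corr(rho, beta1, beta2, theta, n=100_000, seed=0):
    """Correlation between X1*beta1 + X2*beta2 and X1*X2*theta.

    X1 and X2 are standard Gaussian with correlation rho (an assumption
    of this sketch, not of the method under review).
    """
    rng = random.Random(seed)
    additive, interaction = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = z1
        x2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        additive.append(beta1 * x1 + beta2 * x2)
        interaction.append(theta * x1 * x2)
    return pearson(additive, interaction)


for rho in (0.0, 0.5, 0.9):
    corr = additive_vs_interaction_corr(rho, beta1=0.7, beta2=-0.3, theta=0.5)
    print(f"rho = {rho}: corr(additive, interaction) = {corr:.4f}")
```

      Under these assumptions the sample correlation stays near zero, up to sampling noise, whatever the value of rho.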

      I have therefore come to the view that the additional variance component estimated by \texttt{i-LDSC} has a close correspondence with the epistatic (non-additive) genetic variance after all.

      In order to make this point transparent to all readers, however, I think that the authors should put much more effort into placing their work into the traditional framework of the field. It was certainly not intuitive to multiple reviewers that $\mathbf{X}\bm{\beta}$ is orthogonal to $\mathbf{W}\bm{\theta}$. There are even contrary suggestions. For if $(\mathbf{X}\bm{\beta})^\intercal \mathbf{W} \bm{\theta} = \bm{\beta}^\intercal \mathbf{X}^\intercal \mathbf{W} \bm{\theta} $ is to equal zero, we know that we can't get there by $\mathbf{X}^\intercal \mathbf{W}$ equaling zero because then the method has nothing to go on (e.g., line~139). We thus have a quadratic form---each term being the weighted product of an average (additive) effect and an interaction coefficient---needing to cancel out to equal zero. I wonder if the authors can put forth a rigorous argument or compelling intuition for why this should be the case.

      In the case of two polymorphic sites, quantitative genetics has traditionally partitioned the total genetic variance into the following orthogonal components:

      \begin{itemize}

      \item additive genetic variance, $\sigma^2_A$, the numerator of the narrow-sense heritability;

      \item dominance genetic variance, $\sigma^2_D$;

      \item additive-by-additive genetic variance, $\sigma^2_{AA}$;

      \item additive-by-dominance genetic variance, $\sigma^2_{AD}$; and

      \item dominance-by-dominance genetic variance, $\sigma^2_{DD}$.

      \end{itemize}

      See Lynch and Walsh (1998, pp. 88-92) for a thorough numerical example. This decomposition is not arbitrary or trivial, since each component has a distinct coefficient in the correlations between relatives. Is it possible for the authors to relate the variance associated with their $\mathbf{W}\bm{\theta}$ to this traditional decomposition? Besides justifying the work in this paper, the establishment of a relationship can have the possible practical benefit of allowing \texttt{i-LDSC} estimates of non-additive genetic variance to be checked against empirical correlations between relatives. For example, if we know from other methods that $\sigma^2_D$ is negligible but that \texttt{i-LDSC} returns a sizable $\sigma^2_{AA}$, we might predict that the parent-offspring correlation should be equal to the sibling correlation; a sizable $\sigma^2_D$ would make the sibling correlation higher. Admittedly, however, such an exercise can get rather complicated for the variance contributed by pairs of SNPs that are close together (Lynch & Walsh, 1998, pp. 146-152).

      I would also like the authors to clarify whether LDSC consistently overestimates the narrow-sense heritability in the case that pairwise epistasis is present. The figures seem to show this. I have conflicting intuitions here. On the one hand, if GWAS summary statistics can be inflated by the tagging of epistasis, then it seems that LDSC should overestimate heritability (or at least this should be an upwardly biasing factor; other factors may lead the net bias to be different). On the other hand, if standard and interaction LD Scores are lowly correlated, then I feel that the inclusion of interaction LD Score in the regression should not strongly affect the coefficient of the standard LD Score. Relatedly, I find it rather curious that \texttt{i-LDSC} seems increasingly biased as the proportion of genetic variance that is non-additive goes up---but perhaps this is not too important, since such a high ratio of narrow-sense to broad-sense heritability is not realistic.

      We thank the reviewer for taking the time to thoughtfully offer more context on how we might situate the i-LDSC framework within the greater context of traditional quantitative genetics. We now formalize the interaction component used in the i-LDSC model as an estimate of the phenotypic variance explained by additive-by-additive interactions between genetic variants (which we denote by $\sigma^2_{AA}$ to follow the conventional notation). In the newly revised Material and Methods, we also show how the i-LDSC model can be formulated to include dominance effects in a more general framework. Our updated derivations provide two key takeaways.

      First, we assume that the additive and interaction effect sizes in the general model $(\bm{\beta}, \bm{\theta})$ are each normally distributed with variances proportional to their individual contributions to trait heritability: $\beta_j \sim \mathcal{N}(0, \sigma^2_A)$, $\theta_m \sim \mathcal{N}(0, \sigma^2_{AA})$. This independence assumption implies that the additive and non-additive components $\mathbf{X}\bm{\beta}$ and $\mathbf{W}\bm{\theta}$ are orthogonal, where $\mathbb{E}[\bm{\beta}^\intercal \mathbf{X}^\intercal \mathbf{W} \bm{\theta}] = \mathbb{E}[\bm{\beta}^\intercal] \mathbf{X}^\intercal \mathbf{W} \mathbb{E}[\bm{\theta}] = \mathbf{0}$. This is important because, as the reviewer points out, it means that there is a unique partitioning of genetic variance when studying a trait of interest. In the revised version of the manuscript, we show this derivation in the main text (see lines 129-143). We also extend this derivation in the Materials and Methods where we show the same result even after we include the presence of dominance effects in the generative model (see lines 415-417 and 438-457).

      Second, we show that the genotype matrix $\mathbf{X}$ and the matrix of genetic interactions $\mathbf{W}$ are not linearly dependent because the additive-by-additive effects between two SNPs are encoded as the Hadamard product of two genotypic vectors in the form $\mathbf{w}_m = \mathbf{x}_j \circ \mathbf{x}_k$ (which is a nonlinear function of the genotypes). Linear dependence would have implied that one could find a transformation between a SNP and an interaction term in the form $\mathbf{w}_m = c \times \mathbf{x}_j$ for some constant $c$. However, despite their linear independence, $\mathbf{X}$ and $\mathbf{W}$ are themselves not orthogonal and still have a nonzero correlation. This implies that the inner product between genotypes and their interactions is nonzero, $\mathbf{X}^\intercal \mathbf{W} \neq \mathbf{0}$. To see this, we focus on a focal SNP $\mathbf{x}_j$ and consider three different types of interactions:

      • Scenario I: Interaction between a focal SNP with itself ($\mathbf{x}_j \circ \mathbf{x}_j$).
      • Scenario II: Interaction between a focal SNP with a different SNP ($\mathbf{x}_j \circ \mathbf{x}_k$).
      • Scenario III: Interaction between a focal SNP with a pair of different SNPs ($\mathbf{x}_k \circ \mathbf{x}_l$).

      In the Materials and Methods of the revised manuscript, we now provide derivations showing when we would expect nonzero correlation between $\mathbf{X}$ and $\mathbf{W}$, which rely on the fact that: (1) we assume that genotypes have been mean-centered and scaled to have unit variance, and (2) under Hardy-Weinberg equilibrium, SNPs marginally follow a binomial distribution $\mathbf{x}_j \sim \mathrm{Bin}(2, p)$ where $p$ represents the minor allele frequency (MAF) (Wray et al. 2007, Genome Res; Lippert et al. 2013, Sci Rep). These new additions are given in new lines 460-485.
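
      Scenario I can be checked numerically under the two stated assumptions (standardized genotypes, Hardy-Weinberg equilibrium). The sketch below is our own illustration, not the derivation in the revised manuscript, and the function name is hypothetical. It enumerates the three genotype classes exactly and compares the covariance between a standardized SNP and its own square with the standard skewness of a Bin(2, p) variable.

```python
import math


def standardized_genotype_moments(p):
    """Exact E[x], E[x^2], E[x^3] for x = (g - 2p) / sqrt(2p(1-p)), g ~ Bin(2, p)."""
    probs = {0: (1 - p) ** 2, 1: 2 * p * (1 - p), 2: p ** 2}
    mean, sd = 2 * p, math.sqrt(2 * p * (1 - p))
    return [
        sum(prob * ((g - mean) / sd) ** k for g, prob in probs.items())
        for k in (1, 2, 3)
    ]


for p in (0.05, 0.25, 0.5):
    m1, m2, m3 = standardized_genotype_moments(p)
    # Cov(x, x*x) = E[x^3] - E[x] E[x^2], which reduces to E[x^3] since E[x] = 0.
    cov_x_xx = m3 - m1 * m2
    # Skewness of a Bin(2, p) variable, for comparison.
    closed_form = (1 - 2 * p) / math.sqrt(2 * p * (1 - p))
    print(f"MAF = {p}: Cov(x, x*x) = {cov_x_xx:.4f}, closed form = {closed_form:.4f}")
```

      In this sketch the covariance vanishes only at p = 0.5 and grows as the minor allele becomes rarer, which is one concrete way a SNP can tag an interaction term.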

      Lastly, we agree with the reviewer that our results indicate that LDSC inflates estimates of SNP-based narrow-sense heritability. Our intuition for why this happens is largely consistent with the reviewer’s first point: since GWAS summary statistics can be inflated by the tagging of non-additive genetic variance, it makes sense that LDSC should overestimate heritability. LDSC uses a univariate regression without the inclusion of cis-interaction scores. This is likely a simple consequence of “omitted variable bias”: since LDSC does not explicitly account for the tagged non-additive components, which also contribute to the variance in the GWAS summary statistics, the estimate for the coefficient $\sigma^2_A$ becomes slightly inflated.

      How Much Epistasis Is \texttt{i-LDSC} Detecting?

      I think the proper conclusion to be drawn from the authors' analyses is that statistically significant epistatic (non-additive) genetic variance was not detected. Specifically, I think that the analysis presented in Supplementary Table~S6 should be treated as a main analysis rather than a supplementary one, and the results here show no statistically significant epistasis. Let me explain.

      Most serious researchers, I think, treat LDSC as an unreliable estimator of narrow-sense heritability; it typically returns estimates that are too low. Not even the original LDSC paper pressed strongly to use the method for estimating $h^2$ (Bulik-Sullivan et al., 2015). As a practical matter, when researchers are focused on estimating absolute heritability with high accuracy, they usually turn to GCTA/GREML (Evans et al., 2018; Wainschtein et al., 2022).

      One reason for low estimates with LDSC is that if SNPs with higher LD Scores are less likely to be causal or to have large effect sizes, then the slope of univariate LDSC will not rise as much as it ``should'' with increasing LD Score. This was a scenario actually simulated by the authors and displayed in their Supplementary Figure~S15. [Incidentally, the authors might have acknowledged earlier work in this vein. A simulation inducing a negative correlation between LD Scores and $\chi^2$ statistics was presented by Bulik-Sullivan et al. (2015, Supplementary Figure 7), and the potentially biasing effect of a correlation over SNPs between LD Scores and contributed genetic variance was a major theme of Lee et al. (2018).] A negative correlation between LD Score and contributed variance does seem to hold for a number of reasons, including the fact that regions of the genome with higher recombination rates tend to be more functional. In short, the authors did very well to carry out this simulation and to show in their Supplementary Figure~S15 that this flaw of LDSC in estimating narrow-sense heritability is also a flaw of \texttt{i-LDSC} in estimating broad-sense heritability. But they should have carried the investigation at least one step further, as I will explain below.

      Another reason for LDSC being a downwardly biased estimator of heritability is that it is often applied to meta-analyses of different cohorts, where heterogeneity (and possibly major but undetected errors by individual cohorts) lead to attenuation of the overall heritability (de Vlaming et al., 2017).

      The optimal case for using LDSC to estimate heritability, then, is incorporating the LD-related annotation introduced by Gazal et al. (2017) into a stratified-LDSC (s-LDSC) analysis of a single large cohort. This is analogous to the calculation of multiple GRMs defined by MAF and LD in the GCTA/GREML papers cited above. When this was done by Gazal et al. (2017, Supplementary Table 8b), the joint impact of the improvements was to increase the estimated narrow-sense heritability of height from 0.216 to 0.534.

      All of this has at least a few ramifications for \texttt{i-LDSC}. First, the authors do not consider whether a relationship between their interaction LD Scores and interaction effect sizes might bias their estimates. (This would be on top of any biasing relationship between standard LD Scores and linear effect sizes, as displayed in Supplementary Figure~S15.) I find some kind of statistical relationship over the whole genome, induced perhaps by evolutionary forces, between \emph{cis}-acting epistasis and interaction LD Scores to be plausible, albeit without intuition regarding the sign of any resulting bias. The authors should investigate this issue or at least mention it as a matter for future study. Second, it might be that the authors are comparing the estimates of broad-sense heritability in Table~1 to the wrong estimates of narrow-sense heritability. Although the estimates did come from single large cohorts, they seem to have been obtained with simple univariate LDSC rather than s-LDSC. When the estimate of $h^2$ obtained with LDSC is too low, some will suspect that the additional variance detected by \texttt{i-LDSC} is simply additive genetic variance missed by the downward bias of LDSC. Consider that the authors' own Supplementary Table~S6 gives s-LDSC heritability estimates that are consistently higher than the LDSC estimates in Table~1. E.g., the estimated $h^2$ of height goes from 0.37 to 0.43. The latter figure cuts quite a bit into the estimated broad-sense heritability of 0.48 obtained with \texttt{i-LDSC}.

      Here we come to a critical point. Lines 282--286 are not entirely clear, but I interpret them to mean that the manuscript's Equation~5 was expanded by stratifying $\ell$ into the components of s-LDSC and this was how the estimates in Supplementary Table~S6 were obtained. If that interpretation is correct, then the scenario of \texttt{i-LDSC} picking up missed additive genetic variance seems rather plausible. At the very least, the increases in broad-sense heritability reported in Supplementary Table~S6 are smaller in magnitude and \emph{not statistically significant}. Perhaps what this means is that the headline should be a \emph{negligible} contribution of pairwise epistasis revealed by this novel and ingenious method, analogous to what has been discovered with respect to dominance (Hivert et al., 2021; Pazokitoroudi et al., 2021; Okbay et al., 2022; Palmer et al., 2023).

      This is an excellent question raised by the reviewer and, again, we really appreciate such a thoughtful and thorough response. First, we completely agree with the reviewer that the s-LDSC estimates previously included in the Supplementary Material should instead be discussed in the main text of the manuscript. In the revision, we have now moved the old Supplemental Table S6 to be the new Table 2. Second, we also agree that the conclusions about the magnitude of additive-by-additive effects should be based upon variance explained when using the cis-interaction score in addition to scores specific to different biological annotations when available, per s-LDSC.

      However, we want to respectfully disagree that the results indicate a negligible contribution of additive-by-additive genetic variance to all the traits we analyzed (see Figure 4D). Although the additive-by-additive genetic variance component is not significant in any trait in the UK Biobank, there is little reason to expect that it would be, given the inclusion of 97 other biological annotations from the s-LDSC model. Indeed, in the s-LDSC paper itself the authors look only for enrichment of heritability for a given annotation, not a statistically significant test statistic. It is also worth noting that jackknife approaches tend to be conservative and yield slightly larger standard errors for hypothesis testing. Taking all the great points that the reviewer mentioned into account, we believe that a moderate stance to the interpretation of our results is one that: (i) emphasizes the importance of using s-LDSC with the cis-interaction score to better assess the variance explained by additive-by-additive interaction effects and (ii) allows for the significance of the additive-by-additive component to not be the only factor when determining the importance of the role of non-additive effects in shaping trait architecture.

      In the revision, we now write the following in lines 331-343:

      Lastly, we performed an additional analysis in the UK Biobank where the cis-interaction scores are included as an annotation alongside 97 other functional categories in the stratified-LD score regression framework and its software s-LDSC (Materials and Methods). Here, s-LDSC heritability estimates still showed an increase with the interaction scores versus when the publicly available functional categories were analyzed alone, albeit at a much smaller magnitude (Table 2). The contributions from the additive-by-additive component to the overall estimate of genetic variance ranged from 0.005 for MCHC (P = 0.373) to 0.055 for HDL (P = 0.575) (Figures 4C and 4D). Furthermore, in this analysis, the estimates of the additive-by-additive components were no longer statistically significant for any of the traits in the UK Biobank (Table 2). Despite this, these results highlight the ability of the i-LDSC framework to identify sources of “missing” phenotypic variance explained in heritability estimation. Importantly, moving forward, we suggest using the cis-interaction scores with additional annotations whenever they are available, as this provides more conservative estimates of the role of additive-by-additive effects on trait architecture.

      Lastly, in the Discussion, we now mention that an area of future work would be to explore how the relationship between cis-interaction LD scores and interaction effect sizes might bias heritability estimates from i-LDSC (e.g., similar to the relationship explored between standard LD scores and linear effect sizes in Figure 3 – figure supplement 8). See new lines 364-367.

    1. Author Response

      The following is the authors’ response to the current reviews.

      We agree with Reviewer #1 that it is not typical to include primary data in a review, but this seems to be a very unusual situation and it is not unprecedented. We seriously believe that it would significantly dilute the impact of the message if we were to separate this into two papers. We intended initially to do a comprehensive review of the αC-β4 motif as we think it is an extremely important element of secondary structure that has been rather overlooked in the protein kinase field. It is the site where the nucleotide and peptide/protein binding sites converge in the C:PKI complex and also in the RIα holoenzyme, which is also a pseudo-substrate inhibitor. This stable element is highly conserved in all protein kinases, and we think it is an extremely important allosteric site where the kinases differ. Thus, it is highly relevant for this set of eLife papers on kinase allostery. In parallel, we have developed the Local Spatial Pattern (LSP) alignment method for identifying Protein Residue Networks (PRNs) into a robust tool. When the Veglia team, our long-time collaborators, did their NMR analysis of the F100A mutant, which is in the αC-β4 loop, we thus decided to do the LSP analysis. The LSP results were so interesting and striking that we decided immediately to explore the motif further and to specifically compare the various crystal structures that we had solved in the past to see if indeed we had missed some changes. In addition to looking at the backbone, we decided to also look at the side chains and to compare the structures with the simulations. The results proved to be extremely informative and defined a multi-pronged approach that could be used to screen any disease mutation or alternatively as an Ala scan for any residue in any protein. I consider this to be one of the most important papers that I have published in many years. It describes a process for exploring the potential dynamic impact of any disease mutation or any point mutation.

      We emphasize repeatedly that the hypotheses generated from the computational screen will need to be validated experimentally, but our LSP analysis is a rapid and relatively inexpensive way to screen a set of mutations and predict which will have the greatest impact on dynamics. It is an especially powerful and robust way to identify allosteric sites as the LSP approach maps global changes of a single mutation across the entire protein. These mutants would then be prioritized for experimental follow-up. We are indeed now implementing this more comprehensive strategy in two ways. We are specifically exploring three disease mutations in the αC-β4 loop and, in parallel, are also doing a computational Ala scan of the entire loop (L95-L106); however, this is part of a separate and more comprehensive study that will take much longer. It will be the “proof of principle” of the hypotheses that we propose in our eLife paper. In addition to the LSP method, the MD simulations provide new and complementary insights into side chain dynamics in contrast to the static crystal structures. We will also begin to compare the αC-β4 loop in other kinases, specifically PKCβ2 and LRRK2, but once again this is part of a separate study and is clearly beyond the scope of this eLife paper. This focus on the αC-β4 loop is an excellent strategy that can be applied to any protein kinase. The LSP approach, however, can obviously also be applied to any protein or any motif, so it is potentially a very powerful tool. We think that the impact and potential importance of this paper will be lost if it is split into two papers.

      I went back to look at a recent review that we did for the Biochemical Journal on the PKA Cβ isoform, and there we also included some new primary data in the review. It was never questioned. We believe that our manuscript is perfectly appropriate for this eLife series that is focused on allostery in kinases, and having our paper back-to-back with the Veglia NMR paper is especially important and relevant. We thus ask that you seriously consider keeping this as a single paper as part of this series on allostery.


      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this work Wu, J., et al., highlight the importance of a previously overlooked region on kinases: the αC-β4 loop. Using PKA as a model system, the authors extensively describe the conserved regulatory elements within a kinase and how the αC-β4 loop region integrates with these important regulatory elements. Previous biochemical work on a mutation within the αC-β4 loop region, F100A showed that this region is important for the synergistic high affinity binding of ATP and the pseudo substrate inhibitor PKI. In the current manuscript, the authors assess the importance of the αC-β4 loop region using computational methods such as Local Spatial Pattern Alignment (LSP) and MD simulations. LSP analysis of the F100A mutant showed decreased values for degree centrality and betweenness centrality for several key regulatory elements within the kinase which suggests a loss in stability/connectivity in the mutant protein as compared to the WT. Additionally, based on MD simulation data, the side chain of K105, another residue within the αC-β4 loop region had altered dynamics in the F100A mutant as compared to the WT protein. While these changes in the αC-β4 loop region seem to be consistent with the previous biochemical data, the results are preliminary and the manuscript can be strengthened (as the authors themselves acknowledge) with additional experiments. Specific comments/concerns are listed below.

      1. MD simulations were carried out using a binary complex of the catalytic subunit of PKA and ATP/Mg and not the ternary complex of PKA, ATP/Mg and PKI. MD simulations carried out using the ternary complex instead of the binary complex would be more informative, especially on the role of the αC-β4 loop region in the synergistic binding of ATP/Mg and PKI.

      Response 1. Thank you for your suggestion. We have included the data for the MD simulations of the ternary complex in the revised manuscript. This includes a new figure and was indeed informative (Figure 11). Text describing this simulation is also added on pages 15-17. All the changes in the revised manuscript are highlighted in red.

      2. The LSP analysis shows a decrease in degree centrality for the αC-β4 loop region in the F100A mutant compared to the WT protein which suggests a gain in stability in this region for the F100A mutant (Fig. 8A). These results seem to be contradictory to the MD simulation data which shows the side chain dynamics of K105 destabilizes the αC-β4 loop region in the F100A mutant (Fig. 10B). It would be helpful if the authors could clarify this apparent discrepancy.

      Response 2. In Figure 8A, the negative values of degree centrality for the αC-β4 loop region show that the value of DC is less in the WT compared to the mutant, suggesting that those regions are more stable in the mutant. This says that the mutation in the αC-β4 loop region both rigidifies the motif and alters the communication signaling networks between the two lobes.

      The betweenness centrality plots (Figure 8B) also show how the connectivity between the two lobes is altered upon mutation. In the mutant the major connectors become V104 and I150 in the C-lobe, whereas connectivity was primarily governed by K72 (N-lobe) and D184 (C-lobe) in the wt C-subunit. Overall, the mutation causes rigidification of the αC-β4 loop and this leads to loss of allosteric communication between the two lobes.

      The MD simulation results as shown in Figure 10B are not contradictory. This figure shows the overall dynamic profile of the protein, based on principal component analysis (PCA) using the parameter of the residual flexibility. It does not reflect a particular motif's stability or flexibility. Instead it shows that overall the protein upon mutation becomes more dynamic and can sample different conformational states, while, in contrast, the WT protein preferred a single global state of conformation. However, the LSP results showed that, compared to the other parts, the αC-β4 loop, especially V104 at the tip, becomes more stable following mutation, and this has an impact on the allosteric communication between the two lobes. We have added this information into the revised manuscript on page 14, also highlighted in red.
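
      To make the sign convention in Response 2 concrete, the following minimal sketch computes normalized degree centrality for two toy contact graphs and takes the WT minus mutant difference, so that a negative value marks a node that is better connected in the mutant. This is not the LSP method itself: the node labels, the edge lists, the unweighted graph, and the function name `degree_centrality` are hypothetical choices for illustration only.

```python
def degree_centrality(nodes, edges):
    """Normalized degree centrality: number of neighbours divided by (n - 1)."""
    neighbours = {node: set() for node in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    scale = len(nodes) - 1
    return {node: len(neighbours[node]) / scale for node in nodes}


# Hypothetical four-node networks; these are not residues or contacts from the paper.
nodes = ["a", "b", "c", "d"]
wt_edges = [("a", "b"), ("b", "c")]
mutant_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]

dc_wt = degree_centrality(nodes, wt_edges)
dc_mutant = degree_centrality(nodes, mutant_edges)

# WT minus mutant: negative means the node has higher degree centrality in the mutant.
dc_difference = {node: dc_wt[node] - dc_mutant[node] for node in nodes}
for node in nodes:
    print(node, round(dc_difference[node], 3))
```

      Betweenness centrality would be compared in the same way, but requires a shortest-path computation that is omitted here for brevity.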

      3. The experiments carried out in this paper are based on previous NMR and computational data for the F100A mutant. However, the specific results and conclusions from these previous experiments are not clearly described.

      Response 3. The NMR paper has already been accepted by eLife, and here we attach the bioRxiv link, “https://www.biorxiv.org/content/10.1101/2023.09.12.557419v1”.

      Reviewer #1 (Recommendations For The Authors):

      In this work Wu, J., et al., draw attention to the αC-β4 loop, a previously neglected region within kinases. A comprehensive review of the important regulatory elements within the kinase, along with how the αC-β4 loop (and the αE helix) integrates with these different regulatory elements, is presented well. As the authors themselves acknowledge, the data presented here, while promising, are preliminary. Additional biochemical, NMR and computational experiments need to be carried out to assess the importance of F100, K105 and other residues in this region.

      1. The authors indicate that previous computational studies predict a flip in the αC-β4 loop in the apo state. It would be helpful to have a figure showing the predicted flip as well as an explanation for the significance of this predicted flip.

      Response 1. The NMR paper has already been accepted by eLife, and here we attach the bioRxiv link, “https://www.biorxiv.org/content/10.1101/2023.09.12.557419v1”. Figures 3 and 6 in that paper describe the predicted flip in the αC-β4 loop in the apo state. We did not see a flip in any of our crystal structures, and the LSP analysis, which is based on 200 ns simulations, is not sufficient to capture this major conformational change.

      2. The authors cite previous NMR and biochemical experiments (reference 62), work that has just been submitted to eLife. Access to this work was difficult as this manuscript could not be found on the eLife website.

      Response 2. The NMR paper has already been accepted by eLife, and here we attach the bioRxiv link, “https://www.biorxiv.org/content/10.1101/2023.09.12.557419v1”.

    2. Reviewer #1 (Public Review):

      In this work Wu, J., et al., highlight the importance of a previously overlooked region on kinases: the αC-β4 loop. Using PKA as a model system, the authors extensively describe the conserved regulatory elements within a kinase and how the αC-β4 loop region integrates with these important regulatory elements. Previous biochemical work on a mutation within the αC-β4 loop region, F100A showed that this region is important for the synergistic high affinity binding of ATP and the pseudo substrate inhibitor PKI. In the current manuscript, the authors assess the importance of the αC-β4 loop region using computational methods such as Local Spatial Pattern Alignment (LSP) and MD simulations. LSP analysis of the F100A mutant showed decreased values for degree centrality and betweenness centrality for several key regulatory elements within the kinase which suggests a loss in stability/connectivity in the mutant protein as compared to the WT. Additionally, based on MD simulation data, the side chain of K105, another residue within the αC-β4 loop region had altered dynamics in the F100A mutant as compared to the WT protein. While these changes in the αC-β4 loop region seem to be consistent with the previous biochemical data, the manuscript can be strengthened with additional experiments.

      Comments on the revised version:

      Additional experiments (both computational and experimental) assessing the role of the αC-β4 loop region (especially residues such as K105) are needed to bolster their hypothesis. My initial assessment therefore remains unchanged. While this manuscript falls short of expectations when it comes to experimental findings, it is an excellent review on the structural elements of kinases and how the newly identified αC-β4 loop region integrates with these important regions. Perhaps the experimental section (LSP analysis and MD simulation data) could be removed and this manuscript could be converted into a Review Article?

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      Despite the importance of T follicular helper cells (Tfh cells) in vaccine-induced humoral responses, it is still unclear which type of Tfh cells (Tfh1, Tfh2, and Tfh17) is critical for generating protective humoral immunity. By using the rhesus macaque model (most similar to human), the authors have addressed this potentially important question and obtained suggestive data that Tfh1 is critical. Although suggestive, the evidence for the importance of Tfh1 is incomplete.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Developing vaccines that induce persistent antibody responses capable of broadly neutralizing HIV strains is of high importance. However, our ability to design vaccines to achieve this is limited by our relative lack of understanding of the role of T-follicular helper (Tfh) subtypes in the responses. In this report Verma et al investigate the effects of different prime and boost vaccination strategies to induce skewed Tfh responses and its relationship to antibody levels. They initially find that live-attenuated measles vaccine, known to be effective at inducing prolonged antibody responses, has a significant minority of germinal center Tfh (GC-Tfh) with a Th1 phenotype (GC-Tfh1) and then explore whether a prime and boost vaccination strategy designed to induce GC-Tfh1 is effective in the context of anti-HIV vaccination. They test a vaccine formulation referred to as MPLA before concluding that this is the case.

      Clarification: MPLA serves as the adjuvant, and the vaccine formulation is characterized as a Th1 formulation based on the properties of the adjuvant.

      Strengths:

      While there is a lot of literature on Tfh subtypes in blood, how this relates to the germinal centers is not always clear. The strength of this paper is that they use a relevant model to allow some longitudinal insight into the detailed events of the germinal center Tfh (GC-Tfh) compartment across time and how this relates to antibody production.

      Weaknesses:

      The authors focus strongly on the numbers of GC-Tfh1 as a proportion of memory cells and their comparison to GC-Tfh17. There seems to be little consideration of the large proportion of GC-Tfh which express neither CCR6 nor CXCR3, and currently no clear reasoning for excluding the majority of GC-Tfh from most analyses. There seems to be an assumption that since the MPLA vaccine has a higher number of GC-Tfh1, this explains the higher levels of antibodies. There is not sufficient information to make it clear if the primary difference in vaccine efficacy is due to a greater proportion of GC-Tfh1 or an overall increase in GC-Tfh of which the percentage of GC-Tfh1 is relatively fixed.

      Response: We appreciate the reviewer's comment. Indeed, while there is substantial literature on Tfh subtypes in blood, the strength of our study lies in utilizing a relevant model to provide longitudinal insights into the dynamics of the germinal center Tfh (GC-Tfh) compartment over time and its relationship to antibody production. Regarding the concern about the comprehensive analysis of GC Tfh subsets, including GC-Tfh1, GC-Tfh17, and others not expressing CCR6 and/or CXCR3, we fully acknowledge its importance. To address this, we will conduct a detailed analysis of GC Tfh and GC Tfh1 frequencies, encompassing subsets without CCR6 and CXCR3 expression, to provide a more comprehensive view of the GC-Tfh population in our analysis.

      Reviewer #2 (Public Review):

      Summary:

      Anil Verma et al. have performed prime-boost HIV vaccination to enhance HIV-1 Env antibodies in the rhesus macaque model. The authors used two different adjuvants, a cationic liposome-based adjuvant (CAF01) and a monophosphoryl lipid A (MPLA)+QS-21 adjuvant. They demonstrated that these two adjuvants promote different transcriptomes in the GC-TFH subsets. The MPLA+QS-21 adjuvant induces abundant GC TFH1 cells expressing CXCR3 at first priming, while the CAF01 adjuvant predominantly induced GC TFH1/17 cells co-expressing CXCR3 and CCR6. Both adjuvants initiate comparable Env antibody responses. However, MPLA+QS-21 shows more significant IgG1 antibodies binding to gp140 even after 30 weeks.

      The enhancement of memory responses by MPLA+QS-21 consistently associates with the emergence of GC TFH1 cells that preferentially produce IFN-γ.

      Strengths:

      The strength of this manuscript is that all experiments have been done in the rhesus macaque model with great care. This manuscript beautifully indicated that MPLA+QS-21 would be a promising adjuvant to induce the memory B cell response in the HIV vaccine.

      Weaknesses:

      The authors did not provide clear evidence to indicate the functional relevance of GC TFH1 in IgG1 class-switch and B cell memory responses.

      Response. We appreciate the recognition of our meticulous work in the rhesus macaque model and the potential of MPLA+QS-21 as an adjuvant for HIV vaccine-induced humoral immunity. We acknowledge the need to provide clearer evidence of the functional relevance of GC Tfh1 in IgG1 class-switching and B cell memory responses. We will attempt to address this concern in our revisions.

      Recommendations for Authors:

      Reviewer #1:

      1. Is the proportion of GC-Tfh1 within GC-Tfh significantly increased in MPLA vs CAF01? The balance between Tfh1 and Tfh17 data is shown in 4C but appears quite a modest difference. Additionally, it excludes the majority of GC-Tfh since it only considers CCR6 and CXCR3 expressing cells.

      Response. We have now included a comparison of the relative proportions of GC Tfh cells expressing CCR6 and CXCR3, as well as those lacking these markers. Our data now demonstrate an increased presence of Tfh1 within the GC-Tfh population when MPLA is employed at P1w2, as depicted in Figure 4D.

      1. Is there any relationship between GC-Tfh17, 1/17 and non Th1/17 GC-Tfh and antibody levels? In Figure 5C only GC Tfh1 is examined making it impossible to judge if this is specific to GC-Tfh1 or a general relationship between higher total GC-Tfh and antibodies.

      Response. In our revised description of the results, we have mentioned that GC Tfh frequencies correlated with antibody levels (r = 0.6, p < 0.05). However, it is important to note that this correlation was specific to the GC Tfh1 subset and was not observed with other subsets.

      Other points:

      1. The authors make a number of statements that rather exaggerate differences such as stating in the abstract that CAF01 induces Tfh1/17 while MPLA predominantly induces Tfh1. As shown in Figure 4C the majority of CCR6-CXCR3- GC-Tfh induced by CAF01 are GC-Tfh1 i.e. both formulations predominantly induce GC-Tfh1. Also, it is difficult to judge since the data is never provided but the predominant group of GC-Tfh appears to be CCR6-CXCR3- in both cases.

      Response. We acknowledge the need for greater precision in our descriptions. In response, we have addressed this concern by providing the frequencies of CCR6-CXCR3- GC Tfh cells in Figure 4D. We have also included a comparison of the relative frequencies across the adjuvant groups in the Results section (Lines 331-338).

      1. The authors use the term peripheral Tfh (pTfh), it may be better to use the more common term circulating Tfh (cTfh) to avoid confusion with T peripheral helper cells (Tph).

      Response. We appreciate the reviewer's suggestion to use the more commonly accepted term "circulating Tfh (cTfh)" instead of "peripheral Tfh (pTfh)." We have incorporated this change into our manuscript to ensure clarity and avoid potential confusion with "T peripheral helper cells (Tph)."

      1. Some further labelling of the pie chart in Figure 1G to at least specify larger groups such as Tfh2, Tfh17, Tfh1/17 would be helpful.

      Response. We have incorporated the suggestion and identified cTfh2, cTfh17, and cTfh2/17 cells. We additionally now state in the legend that overlapping pie arcs correspond to specific polarized Tfh subsets denoted by arc color.

      1. A gating example of the CXCR3, CCR6, CCR4 patterns in the GC Tfh would be helpful. "up to 25% of GC Tfh cells expressed CCR6" I think it is better to state the average here since 25% appears an outlier.

      Response. We have now included a gating example of chemokine receptor expression patterns in the GC Tfh. Additionally, we have revised the statement to mention the median (7%) of GC Tfh cells expressing CCR6 instead of specifying the upper limit.

      1. Figure 1I, does this graph exclude triple negative cells? It's not clear from the figure legend but the numbers do not seem to add up with the graphical proportions shown in figure 1H.

      Response. We have made the necessary clarification in the results section, the figure, and the figure legend to state that the Boolean analysis is based on cells expressing either CXCR3 or CCR6, thus explaining the exclusion of triple negative cells.

      1. Figure 3C. Some label should be added to make clear which violins are from the CD95- and CD95+ groups. There may be too much data in this panel for p values to be legible. Either less graphs or more space may be needed.

      Response. We have updated the Y axis labels in the figure to state that the violin plots show the differences in gene expression between CD95+ CD4 T cells and CD95- CD4 T cells (naive).

      1. Figure 4B. Numbers attached to the gates (1, 17 etc) should be more clearly labeled Tfh1, Tfh17 etc since normally they might be expected to be gate percentages in this format. Gate percentages should also be added.

      Response. We have clearly labeled the subsets as "Tfh1" and "Tfh17," making it easier for readers to interpret the figure. Additionally, we have included gate percentages in the flow plot. Furthermore, the percentages of GC Tfh subsets are now depicted in Figure 4D.

      1. Overlarge and indistinct datapoint symbols are often a problem e.g. Figure 4G most of the CAF01 datapoints are merged into a single blob with no indication of where one point ends or begins. Supplementary figure 5E. Datapoint sizes are large to the extent that the lines are difficult to see. Lines indicating central tendency are often lost.

      Response. We have reworked the graphs (including 4G, now 4I) to ensure clarity.

      1. Generally greater care is needed with graph layout e.g. the B indicating figure 6B is on the graph of figure 6A.

      Response. We have made the necessary adjustment to ensure that the letter "B" correctly corresponds to the graph in Figure 6B.

      1. Figure 6J, the text seems to indicate "higher avidity with MPLA against autologous Env including V1V2 loops." However, the graph seems to indicate lower avidity for V1V2 loops?

      Response. We appreciate the careful observation. We have rectified this by updating the description in the results section to accurately reflect the graph, which shows higher avidity for V1V2 loops with CAF01.

      2. Figure 6A. The authors state that significantly higher IgG1 was induced but Figure 6A seems to be the only graph lacking an indication of statistical significance.

      Response. We have made the necessary adjustment to ensure that the significance symbol is depicted in Figure 6A.

      1. Brackets indicating significance are often unclear. e.g. in Figure 4B MPLA graph there are three groups and a single multipoint bracket with a single result making it unclear which groups are being compared.

      Response. We have added clarification to the legend. It now states that the temporal comparisons in GC Tfh subsets for each vaccine group are made in relation to frequencies at baseline. This revision provides a clear reference point for the significance comparisons and ensures that readers can easily understand which groups are being compared.

      Reviewer #2:

      Overall, the manuscript is well-written and addresses an important issue. However, further investigation is warranted to understand how the GC TFH1 cells induced by MPLA+QS-21 influence the memory B cell response. This manuscript only shows the correlation between GC TFH1 and antibody responses. If the authors could explain the adjuvant preference in memory B cell responses, this manuscript would be a stronger candidate for publication.

      1. This reviewer recommends that the authors provide more evidence to indicate the functional relevance of GC TFH1 in IgG1 class-switch and B cell memory responses. Some evidence supports that IFN-γ controls the antigen-specific IgG1 responses in humans, but it is still controversial. The authors also suggest the involvement of IL-21, but this is also an open question even in the human system. This is also the case in the memory responses. There is no direct link between IFN-γ and memory B cell responses in the human system. The authors need more evidence of how GC TFH1 cell development has more advantages in IgG1 and memory responses than GC TFH1/17 cells. I believe an antibody blockade of cytokines would be a possible strategy to address these questions.

      Response. We appreciate the reviewer's valuable suggestion to provide more evidence regarding the functional relevance of GC Tfh1 cells in IgG1 class-switch and B cell memory responses. It is indeed important to establish a direct link between GC Tfh1 cells and these responses, particularly in the context of cytokine skewing. The suggestion of antibody blockade studies to mechanistically link the modulation of the inflammatory milieu to Tfh differentiation and subsequent antibody functions is important. However, we must acknowledge that these studies are currently beyond the scope of our work. We have included this as a limitation in our study, recognizing the need for further studies to address these important questions.

      1. In Fig.5, the authors use different scales to indicate the IgG antibody titer. A shows the log scale, while B shows the linear scale. Moreover, the differences are minimal, even though the authors indicated a significant difference. I am not sure this difference is meaningful.

      Response. To clarify, we used a log scale in Figure 5A to demonstrate temporal changes over the course of vaccination. In Figure 5B, where we are comparing differences across vaccine regimens at week 30, a linear scale was deemed more appropriate, as it allows for a clear representation of the approximately two-fold difference observed. We fully acknowledge that to establish the biological significance of the observed difference, challenge studies will be essential.

    1. Author Response

      Reviewer #1 (Public Review):

      This article proposes a new statistical approach to identify which of several experimenter-defined strategies best describes a biological agent's decisions when such strategies are not fully observable by choices made in a given trial. The statistical approach is described as Bayesian but can be understood instead as computing a smoothed running average (with decay) of the strategies' success at matching choices, with a winner-take-all inference across the rules. The article tests the validity of this statistical approach by applying it to both simulated agents and real data sets in mice and humans. It focuses on dynamically changing environments, where the strategy best describing a biological agent may change rapidly.

      The paper asks an important question, and the analysis is well conducted; the paper is well-written and easy to follow. However, there are several concerns that limit the strength of the contribution. Major concerns include the framing of the method, considerations around the strategy space, limitations in how useful the technique may be, and missing details in analyses.

      Reviewer #2 (Public Review):

      In this study, the goal is to leverage the power of Bayesian inference to estimate online the probability that any given arbitrarily chosen strategy is being used by the decision-maker. By computing the trial-by-trial MAP and variance of the posterior distribution for each candidate strategy, the authors can not only see which strategy is primarily being used at every given time during the task and when strategy changes occur but also detect when the target rule of a learning task becomes the front-running strategy, i.e., when successful learning occurs.

      Strengths:

      1) The proposed approach adds to recent methods for capturing the dynamics of decision-making at finer temporal resolution (trials) (Roy et al., 2021; Ashwood et al., 2022) but it is novel and differs from these in that it is suited especially well for analyzing when learning occurs, or when a rule switches and learning must recommence, and it does not necessitate large numbers of trials.

      2) The manuscript starts with a validation of the approach using synthetic data and then is applied to datasets of trial-based two-alternative forced choice tasks ranging from rodent to non-human primate to human, providing solid evidence of its utility.

      3) Compared to classic procedures for identifying when an animal has learned a contingency which typically needs to be conservative in favor of better accuracy, this method retrieves signs of learning happening earlier (~30 trials earlier on average). This is achieved by identifying the moment (trial) when the posterior probability of the correct "target" rule surpasses the probability of all other strategies. Having greater temporal precision in detecting when learning happens may have a very significant impact on studies of the neural mechanisms of learning.

      4) This approach seems amenable to testing many different strategies depending on the purpose of the analysis. In the manuscript, the authors test target versus non-target strategies (correct versus incorrect) and also in another version of the analysis, they test what they call "exploratory" strategies.

      5) One of the main appeals of this method is its apparent computational simplicity. It necessitates only updating on every trial the parameters of a beta distribution (prior distribution for a given strategy) with the evidence that the behavior on that trial was either consistent or inconsistent with the strategy. Two scalars, the mode of the posterior (MAP) and the inverse of the variance, are all that are required for identifying the decision criterion (highest MAP and if tied lowest variance) and the learning criterion (first trial where MAP for target strategy is higher than chance).
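
      The update described in point 5 can be sketched in a few lines. The following is a minimal illustration rather than the authors' code: the decayed-count update rule, the uniform Beta(1, 1) prior, the decay value of 0.9, the chance level of 0.5, and all names (`StrategyTracker`, `first_trial_above_chance`) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class StrategyTracker:
    """Decayed Beta posterior for one candidate strategy (illustrative assumptions)."""
    gamma: float = 0.9      # decay rate applied to past evidence (assumed value)
    alpha0: float = 1.0     # Beta prior parameters; Beta(1, 1) is uniform (assumed)
    beta0: float = 1.0
    successes: float = 0.0  # decayed count of trials consistent with the strategy
    failures: float = 0.0   # decayed count of trials inconsistent with the strategy

    def update(self, consistent: bool) -> None:
        # Old evidence is down-weighted by gamma, then the new trial is added.
        self.successes = self.gamma * self.successes + (1.0 if consistent else 0.0)
        self.failures = self.gamma * self.failures + (0.0 if consistent else 1.0)

    def map_estimate(self) -> float:
        a = self.alpha0 + self.successes
        b = self.beta0 + self.failures
        if a + b == 2.0:
            return 0.5  # flat posterior: no unique mode, report the midpoint
        return (a - 1.0) / (a + b - 2.0)  # mode of a Beta(a, b) distribution

    def precision(self) -> float:
        a = self.alpha0 + self.successes
        b = self.beta0 + self.failures
        variance = a * b / ((a + b) ** 2 * (a + b + 1.0))
        return 1.0 / variance


def first_trial_above_chance(consistency, gamma=0.9, chance=0.5):
    """Return the first 1-indexed trial where the MAP estimate exceeds chance, else None."""
    tracker = StrategyTracker(gamma=gamma)
    for trial, consistent in enumerate(consistency, start=1):
        tracker.update(consistent)
        if tracker.map_estimate() > chance:
            return trial
    return None


# Toy session: 5 trials inconsistent with the target rule, then 20 consistent trials.
session = [False] * 5 + [True] * 20
print(first_trial_above_chance(session))
```

      Under these assumptions, a smaller `gamma` forgets old trials faster and crosses chance sooner after a switch, at the cost of a noisier estimate, which is the trade-off raised in weakness 3 below.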

      Weaknesses:

      1) It seems like a limitation of this approach is that the candidate strategies to arbitrate between must be known ex-ante. It is not clear how this approach could be applied to uncover latent strategies that are not mixtures of the strategies selected.

      2) Different strategies may be indistinguishable from each other and thus it may not be possible to distinguish between them. Similarly, the fact that two strategies seem to be competing for the highest MAP doesn't necessarily mean that those are correct strategies and perhaps interchangeable as the manuscript seems to suggest.

      3) The decay parameter is a necessary component to make the strategy selection non-stationary and accommodate data sets where the rules are changing throughout the task. However, the choice of the decay parameter value bounds does not seem very principled. Having this parameter as a free-parameter adds a flexibility that seems to have significant effects on when the strategy switch is detected and how stable the detected switch is.

      4) This method is a useful approach for arbitrating between strategies and describing the behavior with a temporal precision that may prove important for studies attempting to tie these precise events to changes in neural activity. However, it seems limited in its explanatory power. In its current form, this method does not provide a prediction of the probability to transition from one strategy to another. And, because the MAP of different strategies may be close at any given moment, it is hard to imagine using this approach to tease out the different "mental states" that represent each strategy being at play.

      The reviewers’ detailed comments, not shared here, helped us considerably to improve the paper, and we thank the reviewers for their time here. We are unsure of the merits of sharing public reviews of a paper that has now changed considerably from the version that these reviews address. Nonetheless we shall address some key points of potential misunderstanding here.

      “The statistical approach is described as Bayesian but can be understood instead as computing a smoothed running average (with decay) of the strategies' success at matching choices, with a winner-take-all inference across the rules.”

      This is inaccurate. The algorithm performs sequential Bayesian updates on the evidence for and against the use of each strategy considered; for a given strategy i, its output at each trial is a fully parameterised posterior distribution over the probability of that strategy being used by the subject.

      We are careful in the paper to separate the algorithm’s output from our further use of that output. To plot and analyse the output we often make use of the maximum a posteriori (MAP) estimate from each posterior. Other choices are of course possible, and we discuss them in the text. In one set of simulations we quantify the results using a decision rule that chooses the strategy with the highest MAP; this is presumably the “winner-takes-all inference” in the quoted text. We do not use this anywhere else in the paper, including the analyses of the 4 datasets, and so we consider it not as part of our method but as one possible use of the output of the algorithm.

      “Major concerns include the framing of the method, considerations around the strategy space, limitations in how useful the technique may be, and missing details in analyses”

      Our goal for this paper was to develop a computationally lightweight, trial-resolution, Bayesian approach to tracking the probability of user-specified strategies, so that we can capture the observer’s evidence for learning or for the features driving exploratory choice (e.g. whether subjects are responding to losses or wins; are they responding to cues or choice etc). The above quote reflects their detailed review comments, where we felt this reviewer wanted a solution to a different problem, that of a parameterised latent model of strategy use: while a perfectly valid research goal, this was not what we addressed here.

      “1) It seems like a limitation of this approach is that the candidate strategies to arbitrate between must be known ex-ante. It is not clear how this approach could be applied to uncover latent strategies that are not mixtures of the strategies selected.”

      The problem of knowing which strategies to analyse in advance only applies when running our algorithm in real-time. The fact that it could be run in real-time on modest computing hardware is to us one of its strengths, so we consider this a good problem to have.

      As noted above, rather than determine latent strategies, our goal was to build an observer model that allows users to specify whatever strategy they want in order to answer their scientific question(s) about their data: for example, to define when a particular rule has been learnt, or to look for changes in response to particular features of the environment, such as a cue, or to a drug treatment or other intervention.

      2) Different strategies may be indistinguishable from each other and thus it may not be possible to distinguish between them. Similarly, the fact that two strategies seem to be competing for the highest MAP doesn't necessarily mean that those are correct strategies and perhaps interchangeable as the manuscript seems to suggest.

      As noted above, this is an observer model, and it is thus necessarily true that there are strategies for which the observer does not have sufficient evidence to distinguish. For example, a subject who continually chooses the rewarded left-hand lever will be doing both a strategy of “go left” and of “win-stay” in response to their choice. The inability to distinguish strategies is a property of the data, not of the algorithm. Also as noted above, we do not here consider the competition between strategies.

      3) The decay parameter is a necessary component to make the strategy selection non-stationary and accommodate data sets where the rules are changing throughout the task. However, the choice of the decay parameter value bounds does not seem very principled. Having this parameter as a free-parameter adds a flexibility that seems to have significant effects on when the strategy switch is detected and how stable the detected switch is.

      The revised manuscript draws together the existing simulations and analysis of the method to directly address this point, showing that there is a principled range of the decay parameter in which the algorithm should operate. The Discussion also points out that this free parameter is no different from that of any frequentist approach to strategy analysis, which must choose some time window over which to compute the frequentist probability.
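      The analogy between a decay parameter and a time window can be made concrete with a short sketch. It assumes, purely for illustration, that evidence from a trial k steps in the past is weighted by the decay raised to the power k; the function names are hypothetical, and the values are not the bounds recommended in the manuscript.

```python
import math


def effective_window(decay):
    """Sum of the geometric weights 1 + decay + decay**2 + ..., in trials."""
    if not 0.0 < decay < 1.0:
        raise ValueError("decay must lie strictly between 0 and 1")
    return 1.0 / (1.0 - decay)


def half_life(decay):
    """Number of trials after which a trial's weight has fallen to one half."""
    if not 0.0 < decay < 1.0:
        raise ValueError("decay must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(decay)


# Print the window implied by a few illustrative decay values.
for decay in (0.5, 0.9, 0.99):
    print(decay, round(effective_window(decay), 1), round(half_life(decay), 1))
```

      Under this assumption, choosing a decay value amounts to choosing how many recent trials dominate the estimate, which is the same kind of choice a sliding-window frequentist analysis makes explicitly.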

      4) This method is a useful approach for arbitrating between strategies and describing the behavior with a temporal precision that may prove important for studies attempting to tie these precise events to changes in neural activity. However, it seems limited in its explanatory power. In its current form, this method does not provide a prediction of the probability to transition from one strategy to another. And, because the MAP of different strategies may be close at any given moment, it is hard to imagine using this approach to tease out the different "mental states" that represent each strategy being at play.

      As noted above, this is an observer model and does not intend to infer mental states. The goal is to make accurate statements about observable behaviour. We agree that an interesting extension to this approach would be to model the transitions between strategies, and had already outlined this in the Discussion.

    1. Author Response

      Our responses to the reviewers, to accompany the published preprint. We thank the reviewers for their encouraging and thoughtful comments. These are good points that we would like to comment on as follows:

      Reviewer 1:

      Some important and interesting data are missing. For example, can the gene therapy extend the life span of these mutants? The overall in vivo voiding function is missing. AAV9/HPSE2 expression in the bladder wall is not shown.

      A. Our study was not designed to determine whether gene therapy can improve life span of the Hpse2 mutant mice. We know that the mutant mice usually become ill after the first month of life and can die. However, we wanted to study the mice when they were generally well so that there would be no confounding effects on the bladder physiology caused by general ill health. Indeed, a recent study of Hpse2 inducible deletion in adult mice has shown evidence of exocrine pancreatic insufficiency (Kayal et al., PMID 37491420). We are currently exploring the status of the pancreas in our non-conditional juvenile Hpse2 mice, and whether gene transfer into the pancreas is possible.

      B. We strongly agree that in vivo voiding studies will be important in the future. We suggest that in vivo cystometry is the gold standard for this, but it is currently beyond the remit of this study.

      C. It is correct that in this paper we have focussed on gene transduction into the pelvic ganglia, because the evidence is mounting that this is a neurogenic disease. Our ex vivo physiological studies show predominantly neurogenic defects that are corrected by the gene therapy. A detailed study of the bladder body is an interesting idea, in terms of possible transgene expression and detailed histology, and is something we will pursue in future studies.

      Reviewer 2:

      Weaknesses include a lack of discussion of the basis for differences in carbachol sensitivity in Hpse2 mutant mice, limited discussion of bladder tissue morphology in Hpse2 mutant mice, some questions over the variability of the functional data, and a need for clarification on the presentation of statistical significance of functional data.

      A. Yes, it is interesting that untreated male mutant mice have an increased bladder body contraction to carbachol compared with WT males. In a previous paper (Manak et al., 2020) we performed quantitative western blots for the M2 and M3 receptors and found that levels in mutants were similar to those in WTs; thus the increased sensitivity probably lies post-receptor.

      B. A detailed study of the bladder body is an interesting idea, in terms of possible transgene expression and detailed histology, and is something we will pursue in future studies.

      C. We have reported in our physiology graphs what we find. We do find some variability, particularly at lower frequencies, but our conclusions depend on analyses of the whole curve, which depend on multiple frequencies and show the expected overall pattern of frequency-dependent relaxation.

      D. Thank you, the stats for Figure 8 will be corrected in the final version.

      Reviewer 3:

      Single-cell analysis of mutants versus control bladder, urethra including sphincter. This would be great also for the community.

      A. Yes, in future we are very interested in using a single-cell sequencing approach to look at the mutant, WT and rescued pelvic ganglia. In relation to this, there is a recent proof-of-principle preprint on WT mouse pelvic ganglia, which suggests this may be feasible (Sivori et al., 2023).

      Detailed tables showing data from each mouse examined.

      B. In theory, it would be very interesting to correlate the strength of human gene transduction into the pelvic ganglia with, for example, the effect on a physiological parameter. However, in general we used different sets of mice for these techniques, so at present we don’t have this information.

      Use of measurements that are done in vivo (spot assay for example). This sounds relatively simple.

      C. We strongly agree that in vivo voiding studies will be important in the future. We suggest that in vivo cystometry is the gold standard for this, but it is currently beyond the remit of this study.

      Assessment of viral integration in tissues besides the liver (could be done by QPCR).

      D. This is an important point, and we suggest the pancreas is a particularly interesting target for future studies. A recent study of Hpse2 inducible deletion in adult mice has shown evidence of exocrine pancreatic insufficiency (Kayal et al., PMID 37491420). We are currently exploring the status of the pancreas in our non-conditional juvenile Hpse2 mice, and whether gene transfer into the pancreas is possible.

      Discuss subtypes of neurons that are present and targeted in the context of mutants and controls.

      E. The make-up of the pelvic ganglia in Hpse2 mutant mice is a fascinating question. Future analysis using scRNA-Seq may be the most effective way to answer this question and is a molecular approach we are looking to pursue in the future.

    2. eLife assessment

      Urofacial syndrome is a rare early-onset lower urinary tract disorder characterized by variants in HPSE2, the gene encoding heparanase-2. This valuable study demonstrates that AAV9-based gene therapy for urofacial syndrome is feasible and safe, at least over the time frame evaluated, with restoration of HPSE2 expression leading to re-establishment of evoked contraction and relaxation of bladder and outflow tract tissue, respectively, in organ bath studies. The evidence supporting these findings is solid, although the analysis would benefit from evaluation of additional replicates for several endpoints, quantitative assessment of HPSE2 expression, inclusion of in vivo analyses such as void spot assays or cystometry, more rigorous assessment of viral integration, and single-cell analysis of the urinary tract in mutants versus controls, all of which make the analysis of the data currently incomplete.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors try to use a gene therapy approach to cure urofacial symptoms in an HPSE2 mutant mouse model.

      Strengths:

      The authors have convincingly shown the expression of AAV9/HPSE2 in pelvic ganglion and liver tissues. They have also shown the defects in urethra relaxation and bladder muscle contraction in response to EFS in mutant mice, which were reversed in treated mice.

      Weaknesses:

      Some important and interesting data are missing. For example, can the gene therapy extend the life span of these mutants? The overall in vivo voiding function is missing. AAV9/HPSE2 expression in the bladder wall is not shown.

    4. Reviewer #2 (Public Review):

      In this study, Lopes and colleagues provide convincing evidence to support the potential for gene therapy to restore expression of heparanase-2 (Hpse2) in mice mutant for this gene, as occurs in urofacial syndrome. Beyond symptomatic relief for the consequences of outlet obstruction that results from Hpse2 mutation, no treatments exist. Building on prior studies describing the nature of urinary tract dysfunction in Hpse2 mutant mice, the authors applied a gene therapy approach to determine whether gene replacement could be achieved, and if so, whether restoration of HPSE2 expression could mitigate the urinary tract dysfunction and present a potential cure. Using an AAV9 viral vector encoding HPSE2, the authors performed gene replacement in neonatal wild-type or Hpse2 mutant mice and determined gene and protein expression as well as the impact on bladder outflow tract and bladder body physiology in juvenile mice. In addition to dose-dependent transduction of liver and pelvic ganglia (that innervate the bladder) with HPSE2, and demonstration of increased HPSE2 protein in Hpse2 mutant mice, the authors showed restoration of nerve-evoked outflow tract relaxation and bladder body contraction, both of which were deficient in mutant mice. They also showed that the viral vector-based approach was not deleterious to weight gain or to liver morphology. Based on these findings the authors concluded that AAV9-based HPSE2 replacement is feasible and safe, mitigates the physiological deficits in outflow tract and bladder tissue from Hpse2 mutant mice, and provides a foundation for gene replacement approaches for other genes implicated in lower urinary tract disorders.

      Strengths include a rigorous experimental design, solid data in support of the conclusions, and a discussion of the limitations of the approach.

      Weaknesses include a lack of discussion of the basis for differences in carbachol sensitivity in Hpse2 mutant mice, limited discussion of bladder tissue morphology in Hpse2 mutant mice, some questions over the variability of the functional data, and a need for clarification on the presentation of statistical significance of functional data.

    5. Reviewer #3 (Public Review):

      Summary:

      This is a really interesting study, looking at the efficacy of AAV-mediated delivery of the wt HPSE2 gene into mouse mutants with the goal of rescuing lower urinary tract defects.

      Strengths: Nice analysis of muscle physiology ex vivo, interesting approach.

      Weaknesses: lack of rigor (see below). This is an awesome opportunity to learn much more about the disease, its effects on neurons, muscle, etc.

      * Single-cell analysis of mutants versus control bladder, urethra including sphincter. This would be great also for the community.

      * Detailed tables showing data from each mouse examined.

      * Survival curves.

      * Use of measurements that are done in vivo (spot assay for example). This sounds relatively simple.

      * Assessment of viral integration in tissues besides the liver (could be done by QPCR).

      * Discuss subtypes of neurons that are present and targeted in the context of mutants and controls.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This paper reports the development of SCA-seq, a new method derived from PORE-C for simultaneously measuring chromatin accessibility, genome 3D and CpG DNA methylation. Most of the conclusions are supported by convincing data. SCA-seq has the potential to become a useful tool to the scientific communities to interrogate genome structure-function relationships.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this work, Xie et al. developed SCA-seq, which is a multi-omic mapping method that can obtain chromatin accessibility, methylation, and 3D genome information at the same time. SCA-seq first uses M.CviPI DNA methyltransferase to treat chromatin, then performs proximity ligation followed by long-read sequencing. This method is highly relevant to a few previously reported long-read sequencing technologies. Specifically, NanoNome, SMAC-seq, and Fiber-seq have been reported to use m6A or GpC methyltransferase accessibility to map open chromatin, or open chromatin together with CpG methylation; Pore-C and MC-3C have been reported to use long-read sequencing to map multiplex chromatin interactions, or together with CpG methylation. Therefore, as a combination of NanoNome/SMAC-seq/Fiber-seq and Pore-C/MC-3C, SCA-seq is one step forward. The authors tested SCA-seq in 293T cells and performed benchmark analyses testing the performance of SCA-seq in generating each data module (open chromatin and 3D genome). The QC metrics appear to be good and I am convinced that this is a valuable addition to the toolsets of multi-omic long-read sequencing mapping.

      The revised manuscript addressed most of my questions except my concern about Fig. S9. This figure is about a theory that a chromatin region can become open due to interaction with other regions, and the authors propose a mathematical model to compute such effects. I was concerned about the errors in the model of Fig. S9a, and I was also concerned about the lack of evidence or validation. In their responses, the authors admitted that they cannot provide biological evidence or validations but still chose to keep the figure and the text.

      The revised Fig. S9a now uses a symmetric genome interaction matrix as I suggested. But Fig. S9a still has a lot of problems. Firstly, the diagonal of the matrix in Fig. S9a still has many 0's, which I asked about in my previous comments without receiving an answer. The legend mentions that the contacts were defined as 2, 0 or -2, but the revised Fig. S9a only shows values of 1, 0, or -1. Furthermore, Fig. S9b, S9c and S9d all added a panel of CTCF+/-, but there is no explanation in the text or figure legend about these newly added panels. Given the many unaddressed problems, I would still suggest deleting this figure.

      In my opinion, this paper does not need Fig. S9 to support its major story. The model in this figure is independent of SCA-seq. I think it should be spun off as an independent paper if the authors can provide more convincing analysis or experiments. I understand eLife lets authors decide what to include in their paper. If the authors insist on including Fig. S9, I strongly suggest they at least provide adequate explanation of all the figure panels. At this point, Fig. S9 is not solid and clearly has many errors. The readers should ignore this part.

      We appreciate the reviewer for raising these concerns regarding Fig. S9. After careful consideration, we have decided to address your concerns by deleting Fig. S9 and the corresponding text from the manuscript. We understand your point that the model presented in Fig. S9 is independent of SCA-seq and may require additional evidence and validation to be presented in a separate paper.

      We agree that it is important to maintain the integrity and accuracy of the manuscript, and we appreciate your feedback in helping us make this decision.

      Reviewer #2 (Public Review):

      In this manuscript, Xie et al presented a new method derived from PORE-C, SCA-seq, for simultaneously measuring chromatin accessibility, genome 3D and CpG DNA methylation. SCA-seq provides a useful tool to the scientific communities to interrogate the genome structure-function relationship.

      The revised manuscript has clarified almost all of the concerns raised in the previous round of review, though I still have two minor concerns:

      1. In fig 2a, there is no number presented in the Venn diagram (although the left panel indeed showed the numbers of the different categories, including the numbers in the right panel would be more straightforward).

      We thank the reviewer for pointing out the need for clarification in the Venn diagram in Fig. 2a. We have added the numbers to the Venn diagram.

      1. The authors clarified the discrepancy between sfig 7a and sfig 7g. However, the remaining question is, why is there a big difference in the percentage of the cardinality count of concatemers of the different groups between the chr7 and the whole genome?

      We apologize for the confusion regarding the difference in the percentage of the cardinality count of concatemers between chr7 and the whole genome in figures S7a and S7g. The difference arises because the chr7 cardinality count only considers the intra-chromosome segments that are adjacent to each other on a SCA-seq concatemer, while the whole genome cardinality count includes both intra-chromosome and inter-chromosome segments.

      In the case of a SCA-seq concatemer that contains both intra-chromosome junctions and inter-chromosome junctions, the whole genome cardinality count will be greater than the intra-chromosome cardinality count. This explains the difference in the percentages between chr7 and the whole genome in figures S7a and S7g.

      To better clarify the definition of intra-chromosome cardinality, we have added an illustrative graph in figure S7a. In the updated figure S7a, the given exemplary SCA-seq concatemer has a whole genome cardinality of 4 and a chr7 intra-chromosome cardinality of 3.
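      The distinction between the two counts can be illustrated with a small sketch. It is a simplification under stated assumptions, not the SCA-seq pipeline: a concatemer is represented only by the chromosome label of each aligned segment, cardinality is taken as the number of segments, and the adjacency condition mentioned above is not modelled. The function name and the example concatemer are hypothetical.

```python
def cardinalities(segment_chroms, chrom="chr7"):
    """Return (whole-genome cardinality, intra-chromosome cardinality).

    `segment_chroms` lists the chromosome of each segment on one concatemer.
    Whole-genome cardinality counts every segment; the intra-chromosome
    count keeps only segments that map to `chrom`.
    """
    whole_genome = len(segment_chroms)
    intra = sum(1 for c in segment_chroms if c == chrom)
    return whole_genome, intra


# A hypothetical concatemer with three chr7 segments and one chr2 segment.
example = ["chr7", "chr7", "chr2", "chr7"]
whole_genome, intra_chr7 = cardinalities(example)
```

      Because any inter-chromosome segment raises the whole-genome count without raising the chr7 count, a concatemer can fall into a higher cardinality bin genome-wide than it does for chr7 alone, which shifts the percentage distributions in the two panels.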

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study reports an investigation of the dynamics of PKA at the single-cell level in vitro and in epithelia in vivo. Using different fluorescent biosensors and optogenetic actuators, the authors dissect the signaling pathway responsible for PKA waves, finding that PKA activation is a consequence of PGE2 release, which in turn is triggered by calcium pulses, requiring high ERK activity. The evidence supporting the claims is solid. At this stage the work is still partly descriptive in nature, and additional measurements would increase the strength of mechanistic insights and physiological relevance.

      We deeply appreciate Dr. Alejandro San Martín and Dr. Jonathan Cooper and the reviewers. Each comment is valuable and reasonable. We will revise our paper as much as possible.

      We describe what we will do in response to each of the reviewers’ comments in the section below.

      Reviewer #1 (Recommendations For The Authors):

      1. Even though the phenomenon of PGE2 signal propagation is elegantly demonstrated and well described, the whole paper is mostly of a descriptive nature: the PGE2 signal is propagated via intercellular communication and requires Ca transients as well as MAPK activity; however, the function of these RSPAs in dense epithelium is not taken into consideration. What is the function of these RSPAs in cellular crowding? Do they promote cell survival or initiate apoptosis? Do they feed into epithelial reorganization during cellular crowding? Still something else? The authors discuss possible roles of this phenomenon in the context of cell competition, but show no experimental or statistical efforts to answer this question. I believe some additional analysis or a simple experiment would help to shed some light on the functional aspect of RSPAs and increase the importance of all the elegant demonstrations and precise experimental setups that the manuscript is rich in. Monolayer experiments using perturbations that challenge the steady state of epithelial homeostasis (drug treatments, serum deprivation, osmotic stress), combined with live cell imaging and statistical methods that take into account local cell density, might provide important answers to these questions. The authors could consider following some of these ideas to improve the overall value of the manuscript.

      We thank the reviewer for this comment. Although we have tried intensively to identify the physiological relevance of RSPA, we have not been able to detect its function at present.

      In the case of MDCK cells, treatment with NSAIDs, which abolishes RSPA, did not affect cell growth, ERK wave propagation during collective migration, migration velocity, cell survival, or apoptosis. In mouse epidermis, the frequency of RSPA was not affected by inflammation or collective cell migration, evoked by TPA treatment and wounding, respectively.

      Notably, RSPA also occurs in the normal epidermis, implying its relevance in homeostasis. However, at the current stage, we believe that the PGE2 dynamics and its regulation mechanism in the normal epidermis would be worth reporting to researchers in the field.

      1. In the line 82-84 the authors claim: "We found that the pattern of cAMP concentration change is very similar to the activity change of PKA, indicating that a Gs protein-coupled receptor (GsPCR) mediates RSPA". In our opinion, this conclusion is not well-supported by the results. The authors should at least show that some measurements of the two patterns show correlation. Are the patterns of cAMP of the same size as the pattern of PKA? Do they have the same size depending on cell density? Do they occur at the same frequency as the PKA patterns, depending on the cell density? Do they have an all or nothing activation as PKA or their activation is shading with the distance from the source?

      We have modified the text (line 85):

      “Although the increment of the FRET ratio was not so remarkable as that of Booster-PKA, we found that the pattern of cAMP concentration change is very similar to the activity change of PKA, indicating that a Gs protein-coupled receptor (GsPCR) mediates RSPA. This discrepancy may be partially explained by the difference in the dynamic ranges for cAMP signaling in each FRET biosensor (Watabe2020).”

      1. In general, the absolute radius of the waves is not a good measurement for single-cell biology studies, especially when comparing different densities or in vivo vs in vitro experiments. We suggest the authors add the measurement of the number of the cells involved in the waves (or the radius expressed in number of cells).

      We appreciate the reviewer’s comment. We have reanalyzed our results to show the number of cells, as in Fig. 2E, which should be easier for readers to understand.

      1. In Fig. 6D, the authors should also show the single-cell trajectories to better understand the correlation between PKA and ERK peaks. Is the larger variability in the ERK activity ratio due to different peak times or different ERK activity levels in different cells? The authors should show the variability in both time and intensity.

      We have added a few representative results as Fig. S4.

      1. In lines 130-132, the authors write, "This observation indicates that the amount of PGE2 secretion is predetermined and that there is a threshold of the cytoplasmic calcium concentration for the triggered PGE2 secretion". How can the authors exclude that the amount of PGE2 is regulated in its intensity as well? For sure, there is a threshold effect regarding calcium, but this doesn't mean that PGE2 secretion cannot be further regulated, e.g. by further increasing calcium concentration or by other mechanisms.

      We agree with the reviewer’s comment. We have modified the text.

      1. The manuscript shows that not all calcium transients are followed by RSPAs. Does the local cell density/crowding increase the probability of overlap between calcium transients and RSPAs?

      We appreciate the reviewer’s comment. We had also hypothesized this model. However, we did not see the correlation that the reviewer pointed out. Currently, we consider that the increase in RSPA frequency at high density is partially caused by the increase in calcium transients.

      Reviewer #2 (Recommendations For The Authors):

      1. The work is hardly conclusive as to the actual biological significance of the phenomenon. It would be interesting to know more under which physiological and pathological conditions PGE2 triggers such radial PKA activity changes. It is not well explained in which tissues and organs and under what conditions this type of cell-to-cell communication could be particularly important.

      The greatest weakness of the study seems to be that the biological significance of the phenomenon is not clearly clarified. Although it can be deduced that PKA activation has many implications for cell signaling and metabolism, the work lacks the actual link to physiological or pathological significance.

      We deeply appreciate the reviewer’s comment. As in our response to Reviewer #1, although we have tried intensively to identify the physiological relevance of RSPA, we have not been able to detect its function.

      On the other hand, we believe that the PGE2 dynamics and its regulation mechanism in the normal epidermis would be worth reporting to researchers in the field.

      1. The authors do not explain further why in certain cells of the cell clusters Ca2+ signals occur spontaneously and thus trigger the phenomenon. What triggers these Ca2+ changes? And why could this be linked to certain cell functions and functional changes?

      At this moment, we do not have a clear answer or model for the comment although the calcium transients have been reported in the epidermis (https://doi.org/10.1038/s41598-018-24899-7). Further studies are needed and we will pursue this issue as a next project.

      1. What explains the radius and the time span of the radial signal continuation? To what extent are these factors also related to the degradation of PGE2? The work could be stronger if such questions and their answers would be experimentally integrated and discussed.

      We agree with the reviewer’s comment. Although we have studied this point intensively, we omitted the results because of their complexity. In HeLa cells, but not MDCK cells, we have demonstrated the meaning of the radius of RSPA (https://pubmed.ncbi.nlm.nih.gov/37813623/).

      1. The authors could consider whether they could investigate the subcellular translocation of cPLA2 in correlation with cytosolic Ca2+ signals using GFP technology and high-resolution fluorescence microscopy with their cell model.

      Actually, we tried to monitor cPLA2 translocation using GFP-tagged cPLA2. However, translocation of GFP-cPLA2 was detected only when the cells were stimulated with a calcium ionophore. At this point, we have concluded that quantitative analysis of cPLA2 translocation would be difficult.

      Reviewer #3 (Recommendations For The Authors):

      1. "The cell density in the basal layer is approximately 2×10⁶ cells cm⁻², which is markedly higher than that in MDCK cells (Fig. 2D). It is not clear whether this may be related to the lower frequency (~300 cm⁻² h⁻¹) and smaller radius of RSPA in the basal layer cells compared to MDCK cells (Fig. 2E)." Wasn't the relationship with cell density the opposite, higher density higher frequency? Isn't then this result contradicting the "cell density rule" that the authors argue is there in the in vitro system? The authors need to revise their interpretation of the data obtained.

      We agree with the reviewer’s comment. Currently, we do not find the "cell density rule" in mouse epidermis. It would be difficult to identify common rules between mouse epidermis and MDCK cells. However, although it is descriptive, we believe it is worth comparing the MDCK results at this moment.

      1. Similarly, the authors over conclude on the explanation of lack of change in the size of RSPA size when the change in fluorescence for the calcium reporter surpasses a threshold by saying that "This observation indicates that the amount of PGE2 secretion is predetermined and that there is a threshold of the cytoplasmic calcium concentration for the triggered PGE2 secretion." First, the study does not really measure directly PGE2 secretion. Hence, there is no way that they can argue that the level of PGE2 secreted is "predetermined". Instead, there could be an inhibitory mechanism that is triggered to limit further activation of PGE2 signaling/PKA in neighboring cells.

      We agree with the reviewer’s comment. We have omitted the context.

      1. To rule out a transcription-dependent mechanism in the apparent cell density-regulated sensitivity to PGE2, the authors need to inhibit transcription.

      We agree that our RNA-seq analysis would not 100% rule out the transcription-dependent mechanism. However, we believe that shutting down all transcription will show a severe off-target effect that indirectly affects the calcium transients and the PGE2-synthetase pathway. Therefore, our conclusion is limited.

      4) EGF is reported to increase the frequency of RSPA but the change shown in Fig. 6F is not statistically significant, hence, EGF does not increase RSPA frequency in their experiments.

      We have toned down the claim that EGF treatment increases the frequency (line 172).

      “Accordingly, the addition of EGF faintly increased the frequency of RSPA in our experiments, while the MEK and EGFR inhibitors almost completely abrogated RSPA (Fig. 6F), representing that ERK activation or basal ERK activity is essential for RSPA.”

      1. The Discussion section is at times redundant with the results section. References to figures should be kept in the Results section.

      We would like to argue in opposition to this comment. For readers, we believe that the reference to figures would be helpful and kind. However, if eLife recommends removing the reference from the Discussion section, we will follow the publication policy.

      1. "Notably, the propagation of PKA activation, ~100 μm/min (Fig. 1H), is markedly faster than that of ERK activation, 2-4 μm/min (Hiratsuka et al., 2015)." The 2 kinase reporters are based on different molecular designs. Thus, it does not seem appropriate to compare the kinetics of both reporters as a proxy of the comparison of the kinetics of propagation of both kinases.

      We think that we should discuss the comparison of the activity propagation between ERK and PKA. First, among many protein kinases, only ERK and PKA activities have been shown to spread in the epithelial cells. Second, both pathways are considered to be intercellular communication. Finally, crosstalk between these two pathways has been reported in several cells and organs.

      1. In Figure 1E it is unclear what is significantly different from what. Statistical analysis should be added and reporting of the results should reflect the results from that analysis.

      2. In Figure 3F and G the color coding is confusing. In F pink is radius and black is GCaMP6 and in G is RSPA+ and - cells. The authors should change the color to avoid ambiguity in the code.

      We have amended the panels.

      1. In Fig. 5C, how do they normalize per cell density if they are measuring radius of the response?

      In Fig. 5C, we simply measured the increase in the FRET ratio across the fields of view.

      1. In Fig. 5D, what is the point of having a label for PTGER3 if data were not determined (ND)?

      We have added what N.D. means.

      “N.D. represents Not Detected.”

      1. It is important to assess whether ERK activation depends of PGE2 signaling to better place ERK in the proposed signaling pathway. In fact, the authors argue that "ERK had a direct effect on the production of PGE2." But it could be that ERK is downstream PGE2 signaling instead.

      It could be possible in other experimental conditions via EP1 and/or EP3 pathways. However, we never detected an effect of RSPA on ERK activity by analyzing our imaging system. In addition, treatment with NSAIDs or COX-2 depletion, which completely abolishes RSPA, did not affect ERK wave propagation. Thus, in our context, we concluded that ERK is not downstream of PGE2. This notion is also supported by the NGS results in Fig. 5D.

      We have refrained from discussing the pathway of PGE2-dependent ERK activation because it would be redundant.

      1. The authors need to explain better what they mean by "AND gate" if they want to reach a broad readership like that of eLife

      We have modified the legend to explain the “AND gate” as much as possible (line 639).

      “Figure 7: Models for PGE2 secretion.

      The frequency of calcium transients is cell density-dependent manner. While the ERK activation wave is there in both conditions. Because both calcium transient and ERK activation are required for RSPA, the probability for PGE2 secretion is regulated as “AND gate”. ”

      1. In Fig. 5D, "The average intensity of the whole view field of mKate2 or mKOκ, at 20 to 30 min after the addition of PGE2, was applied to calculate the mKate2/mKOκ ratio." But this means that overlapping/densely plated cells in high density will show stronger changes in fluorescence. This should be done per cell not per field of view. It is obvious that the higher density will have more dense/brighter signal in a given field of view.

      We are sorry for the confusion. Cell density does not affect the FRET ratio, although the brightness may change. A typical example is Fig. 1D. Thus, we are confident that our procedure reports the PKA activity in plated cells.

      1. In Fig. 6B the authors need to explain how were the "randomly set positions" determined.

      We have modified the legend section as below (line 618).

      “The ERK activities within 10 µm from the center of RSPA and within 10 µm from randomly set positions with a random number table generated by Python are plotted in the left panel. Each colored dot represents an average value of an independent experiment.”
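      The comparison described in this legend can be sketched as follows. This is a minimal illustration, not the authors' script: the function names, the data layout (cell centroids with one ERK value each), the fixed seed and the use of Python's standard `random` module are all assumptions made for the example.

```python
import math
import random

def mean_within(cells, center, radius_um=10.0):
    """Mean ERK value of cells whose centroid lies within radius_um of center.

    cells: list of (x_um, y_um, erk_value). Returns None if no cell qualifies.
    """
    cx, cy = center
    vals = [v for x, y, v in cells if math.hypot(x - cx, y - cy) <= radius_um]
    return sum(vals) / len(vals) if vals else None

def random_positions(n, width_um, height_um, seed=0):
    """Draw n uniformly distributed control positions inside the field of view."""
    rng = random.Random(seed)  # fixed seed (assumed) so the control set is reproducible
    return [(rng.uniform(0, width_um), rng.uniform(0, height_um)) for _ in range(n)]
```

      Calling `mean_within` once with an RSPA center and once with each position from `random_positions` would give the two groups of values plotted against each other.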

      1. Sentences 314-318 are repeated in 318-322.

      We deeply appreciate the reviewer’s comment and have amended the text.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #2 (Recommendations For The Authors):

      The evidence provided in this study reflects important discoveries on language lateralisation and most of the conclusions of this paper are supported by evidence. However, there are several areas regarding the characteristics of participants tested, hypotheses/predictions and the type of analysis, that need to be clarified and/or corrected.

      1. There is a substantial disconnection between the introduction and the methods/results section.

      One reason is because of lack of consistency. One example refers to the fact that, in the introduction, only IFC is mentioned. However, the analyses carried out to examine neural activity in different groups focused on IFC as well as other brain regions related to inhibitory control. However, these areas were not mentioned at all in the introduction. Second and related to the above, the rationale for conducting certain types of analyses is not specified. Some brain analyses focus on IFC only. Instead, other analyses focus on several areas.

      Another weakness is that there is not sufficient detail regarding the hypotheses/predictions and the specific types of analyses chosen to test these hypotheses/predictions. For example, there is no mention of resting state fMRI data in the introduction, but later we discover that this type of data was collected and analyzed. Even a brief mention of the inclusion of resting state data in the introduction would be beneficial. Along the same lines, by reading the methods section we find out that VBM analyses were conducted. But it is unclear why. What was the purpose of this data analysis? This should be clarified briefly in the introduction and then in the methods section. It remains unclear why resting state results would be particularly informative for addressing the research question of this study. Task-related brain connectivity seems a more appropriate choice. Additionally, it is not explained what comparisons and outcomes would be informative/expected to distinguish between the two mentioned competing hypotheses. This should be made clear.

      Another aspect that lacks clarity is the authors' predictions when investigating the relationship "between the lateralization of both functions and inter-hemispheric structural-functional connectivity, as well as with behavioural markers of certain clinical conditions that have been related with atypical lateralization". The hypotheses are completely omitted in this section.

      Thank you for bringing this to our attention. We concur with Reviewer #2 that our introduction was somewhat lacking in detail and assumed too much prior knowledge on the part of the reader. This, together with a lack of a clear presentation of our tested hypotheses, made the introduction have a poor connection with both the results and discussion sections, which hindered the understanding of the paper.

      As a result, we have made some additions to enhance the exposition of the following areas: (1) the causal and statistical hypotheses of lateralization (Lines 55-65); and (2) the hypotheses regarding subclinical markers of neurological disorders and the corpus callosum (Lines 90-104).

      Furthermore, we have extensively revised the final paragraph of the introduction (Lines 105-121) to provide a clearer and more coherent linkage between the drivers presented during the introduction, our hypotheses, and the subsequent analyses.

      1. It is important to provide more information on the language background of the participants. Were the participants in this study Catalan-Spanish bilinguals? If so, it is crucial for the authors to mention this.

      Language background of the participants has been added to the corresponding section (Lines 138-145).

      In fact, previous studies, including several publications from the authors themselves (Garbin et al., 2010; Rodríguez-Pujadas et al., 2013; Anderson et al., 2018), have shown that there are qualitative differences between bilinguals and monolinguals in the neural circuitry underlying executive control. Across all these studies, it was consistently reported that bilingual individuals, when engaged in non-linguistic inhibitory control tasks, recruited a broader network of left-brain regions associated with language control, including the left IFC, in comparison to monolingual individuals. If the participants in this study were indeed bilinguals, it raises concern if the aim of the study is to generalize the conclusions on lateralization effects beyond the bilingual population.

      Rodríguez-Pujadas, A., Sanjuán, A., Ventura-Campos, N., Román, P., Martin, C., Barceló, F., … & Ávila, C. (2013). Bilinguals use language-control brain areas more than monolinguals to perform non-linguistic switching tasks. PLoS One, 8(9), e73028.

      Garbin, G., Sanjuan, A., Forn, C., Bustamante, J. C., Rodríguez-Pujadas, A., Belloch, V., ... & Ávila, C. (2010). Bridging language and attention: Brain basis of the impact of bilingualism on cognitive control. NeuroImage, 53(4), 1272-1278.

      Anderson, J. A., Chung-Fat-Yim, A., Bellana, B., Luk, G., & Bialystok, E. (2018). Language and cognitive control networks in bilinguals and monolinguals. Neuropsychologia, 117, 352-363.

      Indeed, we have thoroughly reported that, when compared to monolinguals, bilinguals show significant involvement of left-brain regions during switching and inhibition tasks. So, this is a legitimate concern. Unfortunately, the society from which our participants were drawn is primarily bilingual, encompassing both active and passive bilinguals. The monolingual sample in those previous studies consisted of university students originating from predominantly monolingual regions of Spain. Given this context, it is unsurprising that the current study has a rather limited number of monolinguals (n=8), with only 2 displaying atypical language lateralization. Thus, we cannot provide a reliable answer to the role of bilingualism status in our data. Consequently, we have included a comment on this limitation in the discussion (Lines 504-512).

      1. Regarding the methods section, I have the following specific queries. The first is about the control condition in the verb generation task. I find it puzzling that the 'task' and 'control' conditions differ in terms of the number of words uttered. Could the authors please provide further clarification on this?

      Thank you for raising this question. Regarding the control condition, it is important to note that the design of this task drew inspiration from previously published verb generation tasks for fMRI (Benson et al., 1999; Fitzgerald et al., 1997) and PET (Petersen et al., 1988). In the fMRI tasks, a fixation cross served as the control condition, while the PET study used word repetition as the control. We acknowledged that a mere fixation cross might not adequately control for the movement and visual-related activations inherent in the verb generation task. Conversely, word repetition could potentially engage the default mode network due to the repetition of the same simple task, which might not be suitable for a control condition, and it could be overly linguistic because it involves a word. Consequently, we aimed to strike a balance by employing a control condition that consisted of reading letters. This approach allowed us to control for movement and vision factors without invoking semantics. Thus, after careful consideration, we ultimately opted for reading two letters, to match the response to the vocalization length of generating a verb.

      Although we understand the concern of single vs. two vocalizations, it is worth emphasizing that this version of the verb generation task had undergone prior testing to assess its suitability for determining language lateralization in both healthy and clinical populations (Sanjuan et al., 2010). In fact, this task has been an integral component of our lab’s standard presurgical assessment protocol, which has been used for nearly two decades in individually evaluating language function in over 500 patients with central nervous system lesions.

      Benson, R. R., Fitzgerald, D. B., Lesueur, L. L., Kennedy, D. N., Kwong, K. K., Buchbinder, B. R., Davis, T. L., Weisskoff, R. M., Talavage, T. M., Logan, W. J., Cosgrove, G. R., Belliveau, J. W., & Rosen, B. R. (1999). Language dominance determined by whole brain functional MRI in patients with brain lesions. Neurology, 4(52), 798–809.

      Fitzgerald, D. B., Cosgrove, G. R., Ronner, S., Jiang, H., Buchbinder, B. R., Belliveau, J. W., Rosen, B. R., & Benson, R. R. (1997). Location of Language in the Cortex: A Comparison between Functional MR Imaging and Electrocortical Stimulation. AJNR Am J Neuroradiol, 18, 1529–1539.

      Petersen, S. E., Fox, P. T., Posner, M. I., Mintun, M., & Raichle, M. E. (1988). Positron emission tomographic studies of the cortical anatomy of single-word processing. Nature, 331(18), 585–589.

      Sanjuán, A., Bustamante, J. C., Forn, C., Ventura-Campos, N., Barrós-Loscertales, A., Martínez, J. C., Villanueva, V., & Ávila, C. (2010). Comparison of two fMRI tasks for the evaluation of the expressive language function. Neuroradiology, 52(5), 407–415. https://doi.org/10.1007/s00234-010-0667-8

      Second, it is mentioned that some participants were excluded from different tasks due to technical issues or time constraints. It is important to ensure that all the results can be attributed to the exact same sample of participants across all tasks.

      We absolutely agree that excluding participants can be problematic when presenting the results of multiple sets of analyses. Therefore, we repeated all analyses while excluding the 7 participants that lacked resting-state data. All results remained virtually identical, with a few minor exceptions:

      1) Region-wise analysis of the stop-signal task: Hemisphere × Group effect in the preSMA region is significant (uncorrected P = 0.019), but it does not survive Bonferroni correction (corrected P = 0.076)

      2) Voxel-wise analysis of the stop-signal task: The Thalamus + STN and Caudate clusters are significant at the voxel level, but do not survive the cluster-based FWE correction. They do survive FDR correction, though.

      3) Correlation between SPQ score and LI of the stop-signal task: This correlation falls just short of statistical significance, with a P value of 0.053.

      4) Correlation between reading variables and LIs of both tasks: Significance is lost for the correlations between both LIs and reading length accuracy (P = .111 and .133), as well as between verb generation LI and reading familiarity accuracy (P = .111). However, the association between the stop-signal LI and the reading length time is now significant (r = −.229, P = .042).
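      The preSMA values in item 1 are consistent with a Bonferroni correction over four comparisons, since 0.019 × 4 = 0.076. A minimal sketch of that arithmetic follows; the count of four is inferred from the reported numbers and from the four regions mentioned elsewhere in this exchange, so it is an assumption, and the function name is hypothetical.

```python
def bonferroni(p_uncorrected, n_comparisons):
    """Bonferroni-adjusted P value: multiply by the number of tests, cap at 1."""
    return min(1.0, p_uncorrected * n_comparisons)

# preSMA Hemisphere x Group effect, assuming four region-wise comparisons
p_corrected = bonferroni(0.019, 4)
print(round(p_corrected, 3))  # 0.076
```

      Under this assumption the effect passes the 0.05 threshold before correction but not after it, which matches the description in item 1.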

      In light of this, we have included the following statement in the Methods section (Lines 218-220): “It is important to highlight that the exclusion of these seven participants across all analyses does not notably impact the overall results.”

      It is unclear how the authors have estimated the RTs results from the practice trials. This requires more explanation. Also, why was the median used for the Go Reaction Time instead of the mean, when calculating the individual SSRT?

      We adapted the procedure used by Xue et al. (2008), implementing their approach to calculate SSRT. This has been elaborated further (Lines 227-230), together with the use of practice trials (Lines 233-236).

      Xue, G., Aron, A.R., and Poldrack, R.A. (2008). Common Neural Substrates for Inhibition of Spoken and Manual Responses. Cerebral Cortex 18, 1923–1932. 10.1093/CERCOR/BHM220.
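      For readers unfamiliar with this estimate, the sketch below assumes the commonly used form in which SSRT is the median Go reaction time minus the mean stop-signal delay (SSD). The function name and the example numbers are hypothetical, and the authors' exact procedure is the one described in their Methods, not this code.

```python
from statistics import mean, median

def estimate_ssrt(go_rts_ms, stop_signal_delays_ms):
    """Assumed form: SSRT = median Go RT - mean stop-signal delay (both in ms)."""
    return median(go_rts_ms) - mean(stop_signal_delays_ms)

# Hypothetical participant with one very slow Go trial (900 ms)
go_rts = [420, 450, 480, 500, 900]
ssds = [200, 250, 300]
ssrt = estimate_ssrt(go_rts, ssds)  # median 480 - mean 250
```

      In this example the median Go RT is 480 ms whereas the mean is 550 ms, which illustrates one usual reason for preferring the median: a single slow trial shifts it far less than it shifts the mean.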

      On a final note, information about the different types of pre-processing and data analysis is all reported in the same paragraph. I think using subsections would increase the intelligibility of the section.

      Thank you for this suggestion. We have added subsections in both the ‘image processing’ and ‘statistical analyses’ sections.

      1. Data analysis and Interpretation of the results. It is unclear how the mean BOLD signal was extracted to conduct ROI analysis (Marsbar?).

      Thank you for pointing this out. Indeed, we were not very accurate in the description of this procedure. We extracted the first eigenvariate via the VOI function within SPM12. This has been included in Lines 291-293.

      I feel uneasy about the way results are corrected for multiple comparisons. For instance, it is mentioned that in the ROI analysis, all p-values were FDR-corrected for four comparisons, but it is unclear why. The correct procedure for supporting conclusions about the effect of specific brain would be to have 'brain region' (n=4) as another within-subject factor. Furthermore, the one-tailed correlation is appropriate but only when testing for the possibility of a relationship in one direction and completely disregarding the possibility of a relationship in the other direction. However, this does not seem to be the case here (see Introduction), so a two-tailed correlation would be more appropriate.

      We agree with Reviewer #2 that presenting this analysis as a single MANOVA that includes a ‘Region’ factor is a more accurate approach. Consequently, we have made the aforementioned correction in the methods section (Lines 357-364) and the results section (Lines 395-406). The LI-LI one-tailed correlation was also changed to a two-tailed correlation in the methods section (Line 383), the results section (Line 417), and Figure 2 (Line 886).

      I am quite confused about using the term interhemispheric connectivity to refer to the volume of the genu, body and splenium of the corpus callosum. In fact, the volumes of genu, body and splenium of the corpus callosum do not reflect a measure of how strongly RH and LH IFC are connected to each other.

      We agree that using the term ‘interhemispheric connectivity’ when referring to callosal volume may be somewhat misleading. We have replaced every instance of this terminology throughout the paper.

      Furthermore, it is unclear why in a set of analyses (ROI and whole brain analyses) the authors focus on brain responses in different ROIs but instead, in connectivity measures the focus is only on IFC.

      Our initial rationale was to focus on regions that are prominently involved in language, particularly the IFC, for examining inter-hemispheric connectivity at rest.

      However, upon more careful consideration, it is true that the preSMA is also implicated in the language network (Labache et al., 2018), and certain studies have reported an impact of STN stimulation on specific language skills (for a review, see Vos et al., 2021). Consequently, we have incorporated these two regions into the resting-state analysis, along with subsequent correlations with LIs (Table 1 and Lines 118, 321-322 & 449-452).

      Labache, L., Joliot, M., Saracco, J., Jobard, G., Hesling, I., Zago, L., Mellet, E., Petit, L., Crivello, F., Mazoyer, B., & Tzourio-Mazoyer, N. (2018). A SENtence Supramodal Areas AtlaS (SENSAAS) based on multiple task-induced activation mapping and graph analysis of intrinsic connectivity in 144 healthy right-handers. Brain Structure and Function 2018 224:2, 224(2), 859–882. https://doi.org/10.1007/S00429-018-1810-2

      Vos, S. H., Kessels, R. P. C., Vinke, R. S., Esselink, R. A. J., & Piai, V. (2021). The Effect of Deep Brain Stimulation of the Subthalamic Nucleus on Language Function in Parkinson’s Disease: A Systematic Review. Journal of Speech, Language, and Hearing Research, 64(7), 2794–2810. https://doi.org/10.1044/2021_JSLHR-20-00515

      Minor corrections/comments:

      It is unclear why in figure caption 1, the conjunction maps are mentioned even if formal conjunction analysis was not conducted.

      This poor choice of words has been replaced with ‘overlapping maps’.

      Line 382. VHMC should be VMHC.

      Fixed. Thank you.

      Line 334. This sentence and especially its relationship with the results is not clear at all. What do you mean by 'This finding is consistent with previous reports showing that cognitive deficits appear only in specific cognitive domains'?

      This has been clarified (Lines 521-525).

    1. Reviewer #3 (Public Review):

      Summary of the findings:

      The authors explore an important question concerning the underlying mechanism of representational drift, which despite intense recent interest remains obscure. The paper explores the intriguing hypothesis that drift may reflect changes in the intrinsic excitability of neurons. The authors set out to provide theoretical insight into this potential mechanism.

      They construct a rate model with all-to-all recurrent connectivity, in which recurrent synapses are governed by a standard Hebbian plasticity rule. This network receives a global input, constant across all neurons, which can be varied with time. Each neuron also is driven by an "intrinsic excitability" bias term, which does vary across cells. The authors study how activity in the network evolves as this intrinsic excitability term is changed.

      They find that after initial stimulation of the network, those neurons where the excitability term is set high become more strongly connected and are in turn more responsive to the input. Each day the subset of neurons with high intrinsic excitability is changed, and the network's recurrent synaptic connectivity and responsiveness gradually shift, such that the new high intrinsic excitability subset becomes both more strongly activated by the global input and also more strongly recurrently connected. These changes result in drift, reflected by a gradual decrease across time in the correlation of the neuronal population vector response to the stimulus.

      The authors are able to build a classifier that decodes the "day" (i.e. which subset of neurons had high intrinsic excitability) with perfect accuracy. This is despite the fact that the excitability bias during decoding is set to 0 for all neurons, and so the decoder is really detecting those neurons with strong recurrent connectivity, and in turn strong responses to the input. The authors show that it is also possible to decode the order in which different subsets of neurons were given high intrinsic excitability on previous "days". This second result depends on the extent by which intrinsic excitability was increased: if the increase in intrinsic excitability was either too high or too low, it was not possible to read out any information about the past ordering of excitability changes.

      Finally, using another Hebbian learning rule, the authors show that an output neuron, whose activity is a weighted sum of the activity of all neurons in the network, is able to read out the activity of the network. What this means specifically, is that although the set of neurons most active in the network changes, the output neuron always maintains a higher firing rate than a neuron with randomly shuffled synaptic weights, because the output neuron continuously updates its weights to sample from the highly active population at any given moment. Thus, the output neuron can read out a stable memory despite drift.

      Strengths:

      The authors are clear in their description of the network they construct and in their results. They convincingly show that when they change their "intrinsic excitability term", upon stimulation, the Hebbian synapses in their network gradually evolve, and the combined synaptic connectivity and altered excitability result in drifting patterns of activity in response to an unchanging input (Fig. 1, Fig. 2a). Furthermore, their classification analyses (Fig. 2) show that information is preserved in the network, and their readout neuron successfully tracks the active cells (Fig. 3). Finally, the observation that only a specific range of excitability bias values permits decoding of the temporal structure of the history of intrinsic excitability (Fig. 2f and Figure S1) is interesting, and as the authors point out, not trivial.

      Weaknesses:

      1) The way the network is constructed, there is no formal difference between what the authors call "input", Δ(t), and what they call "intrinsic excitability" Ɛ_i(t) (see Equation 3). These are two separate terms that are summed (Eq. 3) to define the rate dynamics of the network. The authors could have switched the names of these terms: Δ(t) could have been considered a global "intrinsic excitability term" that varied with time and Ɛ_i(t) could have been the external input received by each neuron in the network. In that case, the paper would have considered the consequence of "slow fluctuations of external input" rather than "slow fluctuations of intrinsic excitability", but the results would have been the same. The difference is therefore semantic. The consequence is that this paper is not necessarily about "intrinsic excitability", rather it considers how a Hebbian network responds to changes in excitatory drive, regardless of whether those drives are labeled "input" or "intrinsic excitability".
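      The point that only the sum of the two terms enters the dynamics can be illustrated with a small sketch. The manuscript's Equation 3 is not reproduced in this review, so the form used here, tau * dr_i/dt = -r_i + f(sum_j W_ij r_j + Δ + Ɛ_i) with a logistic f, is an assumption, as are the function name and all parameter values.

```python
import math

def euler_step(rates, weights, delta, eps, dt=0.1, tau=1.0):
    """One Euler step of tau * dr_i/dt = -r_i + f(sum_j W_ij r_j + delta + eps_i).

    f is a logistic function (assumed). delta is the global term and eps the
    per-neuron term; only their sum enters the update.
    """
    new_rates = []
    for i, r_i in enumerate(rates):
        recurrent = sum(w * r for w, r in zip(weights[i], rates))
        drive = recurrent + delta + eps[i]
        target = 1.0 / (1.0 + math.exp(-drive))
        new_rates.append(r_i + dt / tau * (target - r_i))
    return new_rates

rates = [0.2, 0.4]
weights = [[0.0, 0.5], [0.5, 0.0]]
# Same total drive per neuron, labelled two different ways
a = euler_step(rates, weights, delta=1.0, eps=[0.0, 0.5])
b = euler_step(rates, weights, delta=0.0, eps=[1.0, 1.5])
```

      Up to floating-point rounding, `a` and `b` are identical, so in a model of this form the labels "input" and "intrinsic excitability" cannot be told apart from the dynamics alone.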

      A revised version of the manuscript models "slope-based" excitability changes in addition to "threshold-based" changes. This serves to address the above concern that as constructed here changes in excitability threshold are not distinguishable from changes in input. However, it remains unclear what the model would do should only a subset of neurons receive a given, fixed input. In that case, are excitability changes sufficient to induce drift? This remains an important question that is not addressed by the paper in its current form.

      2) Given how the learning rule that defines the input to the readout neuron is constructed, it is trivial that this unit responds to the most active neurons in the network, more so than a neuron assigned random weights. What would happen if the network included more than one "memory"? Would it be possible to construct a readout neuron that could classify two distinct patterns? Along these lines, what if there were multiple, distinct stimuli used to drive this network, rather than the global input the authors employ here? Does the system, as constructed, have the capacity to provide two distinct patterns of activity in response to two distinct inputs?

      A revised version of the manuscript addresses this question, demonstrating that the network is capable of maintaining two distinct memories.

      Impact:

      Defining the potential role of changes in intrinsic excitability in drift is fundamental. Thus, this paper represents an important contribution. What we see here is that changes in intrinsic excitability are sufficient to induce drift. This raises the question for future work of the specific contributions of changing excitability from changing input to representational drift.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Throughout the study, there is insufficient information about how experiments were performed and how often (imaging, pull-downs etc), how data was acquired, modified and analysed (especially imaging data, see below), how statistical analyses were done and what is presented in the figures (single planes or maximum intensity projections etc). This makes it difficult to evaluate the data and results.

      We have incorporated additional experimental details to the Materials and Methods section: "Recent advancements in optical and camera technologies permit the acquisition of Z-stacks without perturbing Q cell division or overall animal development. Z-stack images were acquired over a range of -1.6 to +1.6 μm from the focal plane, at intervals of 0.8 μm. The field-of-view spanned 160 μm × 160 μm, and the laser power, as measured at the optical fiber, was approximately 1 mW. ImageJ software (http://rsbweb.nih.gov/ij/) was used to perform image analysis and measurement. Image stacks were z-projected using the average projection for quantification and using the maximum projection for visual display. "
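      The two projections mentioned in this passage can be sketched in a few lines. The authors used ImageJ; the plain-Python function below is only an illustration of what an average and a maximum z-projection compute, and its name and the list-of-lists image layout are assumptions.

```python
def z_project(stack, mode="average"):
    """Collapse a Z-stack (list of equally sized 2-D planes) into one 2-D image."""
    if mode not in ("average", "max"):
        raise ValueError("mode must be 'average' or 'max'")
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            column = [plane[r][c] for plane in stack]  # same pixel across planes
            row.append(max(column) if mode == "max" else sum(column) / len(column))
        out.append(row)
    return out

# Focal offsets stated in the text: -1.6 to +1.6 um at 0.8 um intervals (5 planes)
z_offsets_um = [round(-1.6 + 0.8 * k, 1) for k in range(5)]
```

      The average projection keeps intensity proportional to the signal summed over planes, which suits quantification, whereas the maximum projection keeps only the brightest plane per pixel, which suits display.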

      The majority of our experimental procedures adhere to methodologies delineated in our prior publications and other scientific literature. We were pioneers in the development of fluorescence time-lapse live microscopy techniques for capturing Q cell migration and asymmetric division (Ou and Vale, Journal of Cell Biology, 2009; Ou et al., Science, 2010; Chai et al., Nature Protocols, 2012). Our innovative imaging protocol uncovered a novel mode of polarized, non-muscle myosin-II-dependent asymmetric cell division (Ou et al., Science, 2010). Subsequently, we unveiled another previously uncharacterized mechanism of asymmetric cell division dependent on polarized actin polymerization (Chai et al., Cell Discovery, 2022). In the present study, we have significantly refined our imaging and quantification protocols. Different from the single-focal-plane imaging employed in our earlier study by Ou et al. 2009, advancements in optical technologies and camera resolution now enable us to undertake time-lapse imaging across multiple focal planes and track signal differences between the anterior and posterior segments of dividing cells.

      There is insufficient information about tools and reporters used. This is misleading and impacts the conclusions that can be made from the results presented. To give an example, in Figure 1D-F, the authors present data that HDA-1::GFP and LIN-53::mNeonGreen (both components of the nucleosome remodeling and deacetylation complex) but not the histone acetyltransferase MYS-1::GFP are 'asymmetrically segregated' during QR.a division. However, the authors do not mention that HDA-1::GFP and LIN-53::mNeonGreen are expressed at endogenous levels (they are CRISPR alleles) whereas MYS-1::GFP is overexpressed (integration of a multi-copy extrachromosomal array). The difference in 'segregation' could therefore be a consequence of different levels of expression rather than different modes of segregation ('asymmetric' versus 'symmetric').

      Figure S2 shows overexpressed HDA-1, LIN-53 and CHD-3 are also asymmetrically segregated during ACD of QR.a, which indicates that different levels of expression do not affect the modes of segregation, at least for the NuRD subunits. In the main text, however, we presented the asymmetric segregation of HDA-1::GFP and LIN-53::mNeonGreen using their CRISPR KI alleles.

      There is insufficient information about the phenotypes of the animals used (RNAi knock-downs of hda-1, lin-53 RNAi, pig-1 etc). Again this is misleading and impacts the conclusions that can be made. To give some examples,

      1. In Figure 3A-G, control RNAi embryos are compared to hda-1 RNAi and lin-53 RNAi embryos. What the authors do not mention is that hda-1 RNAi and lin-53 RNAi embryos have severe developmental defects and essentially cannot be compared to control RNAi embryos. The differences between the embryos can be seen in Figure S7B where bright-field images of control RNAi, hda-1 RNAi and lin-53 RNAi embryos are shown. At the 350 min time point, a normal embryo is visible for the control, a 'ball of cells' embryo for hda-1 RNAi and an embryo that seems to have arrested at an earlier developmental stage (and therefore have much larger cells) for lin-53 RNAi. Because of these pleiotropic phenotypes, it is unclear whether differences seen for example in sAnxV::GFP positive cells (Figure 3A) are the result of a direct effect of hda-1(RNAi) on cell death or whether they are the result of global changes in development and cell fate induced by hda-1(RNAi). hda-1(RNAi) and lin-53(RNAi) embryos are also used for the data shown in Figures S6 and S7, raising the same concerns;

      In the submitted manuscript, we mentioned that hda-1 RNAi and lin-53 RNAi caused embryonic lethality and that we could track some of the apoptotic events in hda-1 RNAi embryos arrested between the late gastrulation stage and bean stage. We agree with the reviewers that because of the pleiotropic phenotypes, we cannot distinguish whether sAnxV::GFP positive cells (Figure 3A) are the result of a direct effect of hda-1 (RNAi) on cell death or whether they are the result of global changes in development and cell fate induced by hda-1 (RNAi). We added the sentence to page 9 line 26: “Considering the pleiotropic phenotypes caused by loss of HDA-1, we cannot exclude the possibility that ectopic cell death might result from global changes in development, even though HDA-1 may directly contribute to the life-versus-death fate determination.”

      1. The authors do not mention what the impact of Baf A1 treatment is on animals; however, the images provided in Figure 5E indicate that Baf A1 treatment causes pleiotropic effects in L1 larvae.

      We have carefully checked the BafA1 treated animals, but have not been able to detect any visible defect in Baf A1 treated animals under a 25× dissection microscope at the given dosage and duration of treatment. We also searched for the published images or literature and did not find pleiotropic effects on the animal level at this dosage and duration; however, we agree with the reviewers that perturbation of pH homeostasis in lysosomes by BafA1 will certainly generate pleiotropic cellular defects. We discussed the issue below:

      "Although BafA1-mediated disruption of lysosomal pH homeostasis is recognized to elicit a wide array of intracellular abnormalities, we found no evidence of such pleiotropic effects at the organismal level with the dosage and duration of treatment employed in this study."

      There is a lack of adequate controls. Because of this, some of the data presented must be considered as preliminary. To give some examples:

      1. Controls are lacking for the data shown in Figure 3D-G (i.e. genes other than egl-1). Since hda-1 RNAi has a pleiotropic effect and most likely affects H3K27 acetylation genome-wide, this is critical. Based on what is shown, it is unclear whether the results presented are specific to egl-1 or not;

      In Figure 3F, we added F23B12.1 and sru-43 as controls for egl-1. We added “while the H3K27ac level of genes adjacent to egl-1 showed no significant changes” to Page 10 line 22 in the revised text.

      1. The co-IP and mass spec data shown in Figure 4A, C and Figure S8 also lack a critical control, which is GFP only. Because of this, it is unclear whether subunits of the V-ATPase bind to HDA-1 or GFP. The co-IP and mass spec data forms the basis of Figures 5 and 6 as well as Figure S9. Data presented in these figures therefore has to be considered preliminary as well.

      In the co-IP and mass spec shown in Figure 4A, we used ACT-4::GFP as the negative control, which can preclude V-ATPase subunits that bind to GFP. In Figure 4C, we used anti-V1A (V-ATPase V1 domain A subunit) antibody to confirm the interaction between V1A and HDA-1. In Figure S8B, we also used ACT-4::GFP as a control, showing other NuRD subunits bind to HDA-1 rather than GFP.

      Inappropriate methods are used. For this reason, some of the data again must be considered preliminary. To give some examples:

      1. In Figure 5A, B, the authors used super-ecliptic pHluorin to look at changes in pH in the daughter cells. However, the authors used quenching of super-ecliptic pHluorin fluorescence rather than a ratio-metric method to 'measure' changes in pH. Because of this, it is unclear whether the changes in fluorescence observed are due to changes in pH or changes in the amount of pHluorin protein. Figure 5A, B forms the basis for the experiments presented in the remaining parts of Figure 5 as well as in Figure 6 and Figure S9;

      Bafilomycin A1 inhibits the activity of V-ATPase, presumably preventing the pumping of protons into the apoptotic daughter cell. It is more likely that the apoptotic daughter cell becomes less acidic and more neutral after treatment with Baf A1, although we cannot exclude the possibility that the changes in fluorescence could be due to changes in the amount of pHluorin protein. A ratio-metric method to measure changes in pH will be used in further work to distinguish the two possibilities.

      We added “although we cannot exclude the possibility that the changes in fluorescence could be due to changes in the amount of pHluorin protein.” to Page 12 line 12 in the revised text.
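      The distinction between an intensity-only readout and a ratio-metric readout can be sketched with a toy model. Everything here is an assumption made for illustration: the pH-insensitive reference channel, the function names, and the numbers are hypothetical and do not describe the reporter used in the study.

```python
def simulated_channels(protein_amount, quench_fraction):
    """Toy model: both channels scale with protein amount, but only the
    pH-sensitive channel is quenched (by `quench_fraction`, 0 to 1)."""
    sensitive = protein_amount * (1.0 - quench_fraction)
    reference = protein_amount  # hypothetical pH-insensitive reference
    return sensitive, reference


def ratiometric_signal(sensitive, reference):
    """Ratio of pH-sensitive to reference signal; cancels protein amount."""
    if reference <= 0:
        raise ValueError("reference signal must be positive")
    return sensitive / reference


# Cell A: plenty of reporter, half of it quenched.
a_sensitive, a_reference = simulated_channels(100.0, 0.5)
# Cell B: half as much reporter, none of it quenched.
b_sensitive, b_reference = simulated_channels(50.0, 0.0)

ratio_a = ratiometric_signal(a_sensitive, a_reference)
ratio_b = ratiometric_signal(b_sensitive, b_reference)
```

      In this toy case the two cells have identical pH-sensitive intensities, so intensity alone cannot separate a quenching difference from an abundance difference, while the ratio does.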

      1. The authors' description of how some images were modified before quantitative analysis raises concerns. The figures of concern are particularly Figure 1 and Figure S4, where background subtraction with denoising and deconvolution was used. Background subtraction, with denoising and deconvolution is an image manipulation that enhances the contrast between background and what looks like foreground. Therefore, background subtraction should be applied primarily in experiments involving image segmentation not fluorescence intensity measurement. Not being provided any information by the authors about the kind of subtraction that was made, this processing could lead to an uneven subtraction across the image, which can easily lead to artefacts. Since the fluorescence intensity in the smaller daughter cell is lower, and thus closer to background, the algorithm the authors used may have misinterpreted the grey value information in the smaller daughter cell pixels. This could have led to an asymmetric subtraction of background in the two daughter cells, leading to a stronger subtraction in the smaller daughter cell. Ultimately, their processing could have artificially increased the intensity asymmetry between the two daughter cells in all their results.

      As mentioned earlier, the imaging and quantification methods of this manuscript have been routinely used in our previous publications or studies from many other labs (Gräbnitz F, et al., Cell Rep. 2023; Herrero E, et al., Genetics. 2020; Roubinet C, et al., Curr Biol. 2021). Background subtraction is a standard procedure to quantify cellular fluorescence intensities. The fluorescence intensity of the slide background was measured from a region without worm bodies, of the same size as the region of interest. We have added how we measured the background to page 19 Line 24: “The fluorescence intensity of the slide background was measured from a region without worm bodies, of the same size as the region of interest.”
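      How strongly a sister-cell intensity ratio depends on the background estimate can be shown with a short worked example. This is a sketch under stated assumptions: the function name and the intensity values are hypothetical, and a single background value is subtracted from both cells, as in the slide-background procedure described above.

```python
def corrected_ratio(large_cell, small_cell, background):
    """Background-corrected intensity ratio between two sister cells."""
    large = large_cell - background
    small = small_cell - background
    if small <= 0:
        raise ValueError("background is at or above the smaller cell's signal")
    return large / small


# Hypothetical mean intensities (arbitrary units).
large_cell, small_cell = 300.0, 200.0

ratio_no_background = corrected_ratio(large_cell, small_cell, 0.0)
ratio_background_100 = corrected_ratio(large_cell, small_cell, 100.0)
ratio_background_150 = corrected_ratio(large_cell, small_cell, 150.0)
```

      With these numbers the ratio moves from 1.5 to 2.0 to 3.0 as the background estimate rises, so the measured asymmetry is only as reliable as the background value, particularly when the dimmer cell sits close to background.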

      The imaging data is of low quality (for example Figures 1, 2, 5, 6; Figures S2, S3, S5, S6, S9). Since much of the study and the findings are based on imaging, this is a major concern. Critical parameters are not mentioned (number of sections in z-stack, size of the field-of-view, laser power used etc), which makes it difficult to understand what was done and what one is looking at.

      Fluorescence imaging of neuroblast asymmetric cell division in developing C. elegans larvae has historically presented considerable challenges. Our recent methodological advancements have facilitated live imaging in this intricate system with improved resolution. In the revised manuscript, we have elucidated the specific z-stack parameters, field-of-view dimensions, and laser power settings employed: "Z-stack images were acquired over a range of -1.6 to +1.6 μm from the focal plane, at intervals of 0.8 μm. The field-of-view spanned 160 μm × 160 μm, and the laser power, as measured at the optical fiber, was approximately 1 mW."

      To give some specific examples,

      1. The images shown in Figure 2B are of very low quality with severe background from neighbouring cells. In addition, the outline of the cells (plasma membrane) or the nuclei of the daughter cells is unknown. Based on this it is not clear how the authors could have measured 'Fluorescence intensity ratio between sister nuclei' in an accurate and unbiased way (what is clear from these images is that there is an increase in HDA-1::GFP signal in ALL surviving daughters (asymmetric and symmetric divisions) post cytokinesis but not in the daughter cell that is about to die (asymmetric and unequal division));

      We employed live-cell imaging in conjunction with automated cell lineage tracing algorithms (Du et al., Cell, 2014) to scrutinize NuRD asymmetry in embryos from the two- or four-cell stage up to the 350-cell stage. This sophisticated approach was initially pioneered by Dr. Zhirong Bao at Sloan Kettering and subsequently refined by Dr. Zhuo Du during Dr. Du's postdoctoral training in Dr. Bao's laboratory. This advanced imaging pipeline enables the scientific community to quantify cellular fluorescence intensity in an automated fashion, thereby substantially mitigating manual intervention and bias.

      1. The images in Figure 6A and Figure S9A on VHA-17 segregation and its colocalization to ER and lysosome segregation during QR.a division are of very low quality and it is unclear to the reviewer how such images were used to obtain the quantitative data shown.

      In some cases, there is a discrepancy between what is shown in figures and what the authors state in the text. To give some examples:

      1. On page 7, the authors state "..., we found that nuclear HDA-1 or LIN-53 asymmetry gradually increased from 1.1-fold at the onset of anaphase to 1.5 or 1.8-fold at cytokinesis, respectively (Figure 1D-E)." Looking at the images for HDA-1 and LIN-53 in Figure 1D, the increase in the ratio mainly occurs between 4 min and 6 min, which is post cytokinesis and NOT prior to cytokinesis;

      We thank the reviewer for pointing this out. The nuclear HDA-1 or LIN-53 asymmetry increased to 1.5 or 1.8-fold 6 min after the onset of anaphase, when QR.a just completes cytokinesis. Therefore, we changed the sentence “we found that nuclear HDA-1 or LIN-53 asymmetry gradually increased from 1.1-fold at the onset of anaphase to 1.5 or 1.8-fold at cytokinesis, respectively (Figure 1D-E).” to “we found that nuclear HDA-1 or LIN-53 asymmetry gradually increased from 1.1-fold at the onset of anaphase to 1.5 or 1.8-fold upon the completion of cytokinesis, respectively (Figure 1D-E).”

      However, nuclear HDA-1 or LIN-53 asymmetry initiates prior to cytokinesis. We started to see the nuclear HDA-1 or LIN-53 asymmetry (1.4-fold for HDA-1 and 1.2-fold for LIN-53) 2 min after the onset of anaphase (Figure 1D).

      1. These images (Figure 1D) also show that there is an increase in the HDA-1 and LIN-53 signals in the larger daughter cells (QR.ap), which suggests that the increase in ratios (Figure 1E) is the result of increased HDA-1 and LIN-53 synthesis post cytokinesis. However, on top of page 8, the authors state "The total fluorescence of HDA-1, LIN-53 and MYS-1 remained constant during ACDs, suggesting that protein redistribution may establish NuRD asymmetry (Figure S4C)." In Figure S4C, the authors present straight lines for 'relative total fluorescence' for imaging (probably z-stacks) that was done every min over the course of 7 min. If there was no increase in material as the authors claim, they should have seen significant photobleaching over the course of the 7 min and therefore reduced level of 'relative total fluorescence' over time. How the data presented in Figure S4C was generated is therefore unclear. (Despite the fact that the authors claim that the asymmetry seen is not due to new synthesis in the larger daughter cell post cytokinesis, it would be more consistent with the first experiment presented in this study (Figure S1) that shows that there is more hda-1 mRNA in egl-1(-) cells compared to egl-1(+) cells);

      Regarding the concern of photo-bleaching, we have meticulously calibrated our imaging system over the past several years. Rigorous controls, qualification, and analyses were scrupulously undertaken during the development of our fluorescence time-lapse imaging system for the investigation of Q cell dynamics, initially established by Dr. Guangshuo Ou in Ron Vale's laboratory—a renowned hub for avant-garde imaging techniques (Ou & Vale, Journal of Cell Biology, 2009; Ou et al., Science, 2010). Remarkably, no discernible photobleaching was observed even during two to three-hour imaging.

      We agree that protein turnover, involving both degradation and synthesis, may occur. However, NuRD asymmetric distribution occurred within several minutes after metaphase and QR.a completes cytokinesis ~6 min after the onset of anaphase, while GFP protein translation and maturation require ~30 min in Q neuroblasts (Ou & Vale, Journal of Cell Biology, 2009). Even if hda-1::gfp mRNA is translated during cell division, the nascent GFP-tagged protein will mature long after the completion of cytokinesis. Consequently, we postulate that the influence of newly synthesized GFP-tagged protein during Q cell division is negligible for quantification purposes. It is plausible that the asymmetry in HDA-1 protein distribution is independent of hda-1 mRNA asymmetry.

      1. On page 12, the authors state "..., in Baf A1-treated animals, QRaa inherited similar levels of HDA-1::GFP as its sister cell,...". However, looking at the image provided in Figure 5E (0 min), there seems to be a similar ratio of HDA-1::GFP between the daughter cells in DMSO and Baf A1-treated animals.

      We have adjusted the images in Figure 5E to show the asymmetry in DMSO-treated control animals. We acknowledge variations among animals. Our quantifications from more than 10 animals show the HDA-1 asymmetry in DMSO-treated animals in Figure 5B.

      Recommendations for the authors:

      Conclusion 1

      "Here, we demonstrate that the nucleosome remodeling and deacetylase (NuRD) complex is asymmetrically segregated into the surviving daughter cell rather than the apoptotic one during ACDs in Caenorhabditis elegans" (Abstract)

      Results described on pages 6-9 ("NuRD asymmetric segregation during neuroblast ACDs" and "NuRD asymmetric segregation in embryonic cell lineages") and data shown in Figure S1, Figure 1, Figures S2, S3, S4, S5, Figure 2.

      Conclusion 1 is not supported by the results as numerous concerns exist about the data in many of these figures (see above, major weaknesses). A more likely explanation for the authors' observations is that there is synthesis of NuRD post cytokinesis and that asymmetries in the amounts of NuRD observed in the two daughter cells are a consequence of their different cell sizes (QR.ap is 3x as large as QR.aa). This is supported by the finding that the loss of pig-1, which causes 'equal' division resulting in two daughter cells of similar sizes, abolishes the differences in NuRD seen between the daughter cells.

      As discussed earlier, GFP protein translation and maturation require ~30 min in Q neuroblasts (Ou & Vale, Journal of Cell Biology, 2009). Even if there is synthesis of NuRD post cytokinesis, the nascent GFP-tagged protein will not mature within our imaging timeframe. Therefore, NuRD asymmetry is unlikely to be a result of the synthesis of NuRD post cytokinesis. In addition, we found that MYS-1::GFP was symmetrically segregated into the small apoptotic daughter cells and big surviving daughter cells, suggesting NuRD asymmetry may be irrelevant to cell size asymmetry.

      Interestingly, despite the fact that the loss of pig-1 causes 100% of the divisions to be equal by size and symmetric with respect to NuRD amounts, it only causes about 30% of QR.aa cells to inappropriately survive. This demonstrates that there is a correlation between NuRD asymmetry and daughter cell size asymmetry but NOT between NuRD asymmetry and cell death. This also demonstrates that loss of 'NuRD asymmetry' and presence of NuRD in the daughter that should die is NOT sufficient to block its death.

      Cordes et al. 2006 (DOI: 10.1242/dev.02447) reported that in pig-1 loss-of-function mutants, <40% of Q.p lineages produce extra neurons because Q.pp cells inappropriately survive. Notably, only 30% and 5% of Q.p lineages produce extra neurons in ced-3 and egl-1 loss-of-function single mutants, respectively. pig-1 ced-3 and pig-1 egl-1 double mutants show a dramatically stronger phenotype than either single mutant, with about 80% of Q.p lineages producing extra neurons. These results suggest that pig-1 functions in parallel to the EGL-1-CED-9-CED-4-CED-3 cell death pathway to promote Q cell apoptosis.

      We agree with the reviewer that “loss of 'NuRD asymmetry' and presence of NuRD in the daughter that should die is NOT sufficient to block its death” in pig-1 loss-of-function mutants. However, these results do not rule out the correlation between NuRD asymmetry and cell death. In the pig-1 mutant, the concentration of NuRD in Q.pp might not be high enough to completely block the death pathway. Alternatively, NuRD may be one but not the only factor blocking the cell death pathway.

      Lastly, it is imperative to underscore that cellular aberrations observed during early developmental stages frequently undergo compensatory correction during subsequent developmental stages or even initial aging stages. For example, in certain cell migration mutants exhibiting early migration defects, the initial penetrance exceeds 80%; however, the penetrance is mitigated to a mere 30% in adults. Such observations have been corroborated in our prior publications focusing on cell migration dynamics (Wang et al., PNAS, 2013; Zhu et al., Dev Cell, 2016). This appears to be a pervasive phenomenon, echoed by several laboratories specializing in neural development. The Sengupta and Blacque labs have reported that early aging can ameliorate ciliary phenotypes in C. elegans mutants with compromised intraflagellar transport mechanisms. Accordingly, late developmental stages may act as a compensatory buffer for antecedent developmental abnormalities.

      Conclusion 2

      "The absence of NuRD triggers apoptosis via the EGL-1-CED-9-CED-4-CED-3 pathway, while an ectopic gain of NuRD enables apoptotic cells to survive." (Abstract) Results described on pages 8-10 ("Loss of the deacetylation activity of NuRD causes ectopic apoptosis" and "NuRD RNAi upregulates the egl-1 expression by increasing its H3K27 aceylation") and data shown in Figure S6, Figure 3, Figure S7 and data shown in Figure 5.

      Because of the various concerns raised above (major weaknesses) about the data presented in Figure S6, Figure 3, Figure S7 (pleiotropic phenotypes of hda-1 and lin-53 RNAi animals, lack of controls etc), there is no evidence that NuRD has a specific and/or direct effect on egl-1 expression in cells programmed to die or that loss of NuRD causes ectopic egl-1-dependent cell death. The claim that "ectopic gain of NuRD enables apoptotic cells to survive." is based on Figure 5E, where the authors show that Baf A1 treatment causes symmetric NuRD segregation in 11/12 animals and that QR.aa survives in 11/12 animals. However, those data are unconvincing. As mentioned above (major weaknesses), from the low-quality images provided, it is not clear whether there is 'symmetric NuRD segregation' in Baf A1 treated animals, and the conditions of the animals are a concern because of pleiotropic effects of blocking V-ATPase. (I am not convinced I am actually looking at the same region of an L1 larva in the three animals because the HDA-1::GFP signal seems inconsistent across them.) One process that is affected by a block of V-ATPase is engulfment. The fact that the authors observe that 130 min post-cytokinesis, QR.aa still persists in Baf A1 treated animals could therefore be the result of a delay in engulfment rather than a block in cell death. In addition, the claim that ectopic gain of NuRD enables apoptotic cells to survive contradicts their findings on loss of pig-1 described above ('Conclusion 1').

      We acknowledge the limitations of our imaging system; however, as we pointed out earlier, we developed these imaging methods and have continued to improve them. We have tried our best to obtain images from developing C. elegans larvae. On the other hand, we showed that hda-1 RNAi and lin-53 RNAi increase the expression of a subset of genes, including egl-1, either directly or indirectly (Fig. 3C). Figure 3B shows that the ectopic cell death caused by loss of NuRD is dependent on the EGL-1-CED-9-CED-4-CED-3 pathway. While we cannot exclude several other possibilities when addressing such a complex problem in such a challenging model system, these results provide some evidence that our claim is one of the possibilities.

      Conclusion(s) 3

      "We identified the vacuolar H+-adenosine triphosphatase (V-ATPase) complex as a crucial regulator of NuRD's asymmetric segregation. V-ATPase interacts with NuRD and is asymmetrically segregated into the surviving daughter cell. Inhibition of V-ATPase disrupts cytosolic pH asymmetry and NuRD asymmetry" (Abstract)

      Results described on pages 10-13 ("V-ATPase regulates asymmetric segregation of NuRD during somatic ACDs") and data shown in Figures 4, 5, 6, Figures S8, S9.

      As outlined above (major weaknesses), the evidence that HDA-1 interacts with the V-ATPase complex is preliminary (no GFP control), and I consider the imaging data showing that V-ATPase asymmetrically segregates very low quality and unconvincing (Figure 6). The data on pH changes are also preliminary as the experiment was not done the way it should have (quenching rather than ratiometric). Finally, there are concerns about the results that apparently demonstrate that inhibiting V-ATPase activity disrupts pH asymmetry and NuRD asymmetry (impact of Baf A1 treatment).

      As discussed earlier, Bafilomycin A1 inhibits the activity of V-ATPase, presumably preventing the pumping of protons into apoptotic daughter cells. It is more likely that the apoptotic daughter cell becomes less acidic and more neutral after treatment with Baf A1, although we cannot exclude the possibility that the changes in fluorescence could be due to changes in the amount of pHluorin protein. A ratio-metric method to measure changes in pH will be used in further work to distinguish the two possibilities.

      We added “although we cannot exclude the possibility that the changes in fluorescence could be due to changes in the amount of pHluorin protein.” to Page 12 line 12 in the revised text.

      Conclusion 4

      "We suggest that asymmetric segregation of V-ATPase may cause distinct acidification levels in the two daughter cells, enabling asymmetric epigenetic inheritance that specifies their respective life-versus-death fates." (Abstract) Discussion and model Figure 6C.

      I consider the model premature and not based on any convincing data. In addition, the role of V-ATPase and acidification does not make sense. V-ATPase is involved in the acidification of the lysosomal system (lumen), and it is thought that cytosolic acidification in apoptotic cells is caused by lysosomal leakage. This is not consistent with the authors' model.

      This manuscript lacks a section describing details of statistical analyses and the rationale for the chosen test, sample sizes, exclusion criteria, and replication details. Although the sample sizes are relatively small (fewer than 30), the authors used an "unpaired t-test" for most of the tests. They should describe which type of t-test they used (parametric or non-parametric test). They also should provide replication details for non-statistical data sets, for example Fig 3F and Fig 4C.

      We used an unpaired two-tailed parametric t-test. We have now added the information on statistical tests to the revised methods and figure legends.
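      The difference between a parametric and a non-parametric comparison of two small groups can be sketched as follows. This is an illustration only: the sample values are made up, the function names are hypothetical, and the exact permutation test stands in for a generic non-parametric alternative rather than any test used in the manuscript. Converting the t statistic to a p-value requires the t distribution, which is outside the Python standard library and is therefore omitted.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean, variance


def student_t(a, b):
    """Unpaired Student's t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (fmean(a) - fmean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


def permutation_p(a, b):
    """Exact two-sided permutation p-value for the difference in means."""
    pooled = list(a) + list(b)
    observed = abs(fmean(a) - fmean(b))
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        g1 = [pooled[i] for i in idx]
        g2 = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(fmean(g1) - fmean(g2)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical intensity ratios from two groups of five animals each.
group_a = [1.4, 1.5, 1.6, 1.5, 1.7]
group_b = [1.0, 1.1, 0.9, 1.0, 1.2]

t_stat, dof = student_t(group_a, group_b)
p_perm = permutation_p(group_a, group_b)
```

      The permutation test makes no normality assumption, which is one reason a reader might ask for it, or for a rank-based test, when group sizes are small.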

    1. eLife assessment

      Efforts to increase the representation of women in academia have focussed on efforts to recruit more women and to reduce the attrition of women. This study - which is based on analyses of data on more than 250,000 tenured and tenure-track faculty from the period 2011-2020, and the predictions of counterfactual models - shows that hiring more women has a bigger impact than reducing attrition. The study is an important contribution to work on gender representation in academia, and while the evidence in support of the findings is solid, the description of the methods used is in need of improvement.

    2. Reviewer #1 (Public Review):

      Summary and strengths
      This is an interesting paper that concludes that hiring more women will do more to improve the gender balance of (US) academia than improving the attrition rates of women (which are usually higher than men's). Other groups have reported similar findings but this study uses a larger than usual dataset that spans many fields and institutions, so it is a good contribution to the field.

      Weaknesses
      The paper uses a mixture of mathematical models (basically Leslie matrices, though that term isn't mentioned here) parameterised using statistical models fitted to data. However, the description of the methods needs to be improved significantly. The author should consider citing Matrix Population Models by Caswell (Second Edition; 2006; OUP) as a general introduction to these methods, and consider citing some or all of the following as examples of similar studies performed with these models:
      Shaw and Stanton. 2012. Proc Roy Soc B 279:3736-3741
      Brower and James. 2020. PLOS One 15:e0226392
      James and Brower. 2022. Royal Society Open Science 9:220785
      Lawrence and Chen. 2015. [http://128.97.186.17/index.php/pwp/article/view/PWP-CCPR-2015-008]
      Danell and Hjerm. 2013. Scientometrics 94:999-1006
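      For readers unfamiliar with the Leslie matrices mentioned above, a minimal stage-structured projection might look like the sketch below. It is an assumption-laden illustration, not the study's model: the first row of the matrix is zero because new faculty enter by external hiring rather than reproduction, people in the final stage leave after one year, and all names and numbers are hypothetical.

```python
def leslie_style_matrix(retention):
    """Square matrix with retention[i] at row i+1, column i; first row is zero."""
    n = len(retention) + 1
    matrix = [[0.0] * n for _ in range(n)]
    for i, r in enumerate(retention):
        matrix[i + 1][i] = r
    return matrix


def step(matrix, counts, hires):
    """Advance one year: apply the matrix, then add new hires to stage 0."""
    nxt = [sum(m * c for m, c in zip(row, counts)) for row in matrix]
    nxt[0] += hires
    return nxt


def project(matrix, counts, hires, years):
    """Apply `step` repeatedly and return the final stage counts."""
    for _ in range(years):
        counts = step(matrix, counts, hires)
    return counts


# Hypothetical values: three career stages, 10 hires per year,
# 90% retained from stage 0 to 1 and 80% from stage 1 to 2.
matrix = leslie_style_matrix([0.9, 0.8])
after_three_years = project(matrix, [0.0, 0.0, 0.0], 10, 3)
```

      Running separate projections for women and men, with group-specific hiring and retention values, is one way such a model can compare the effect of changing hiring against the effect of changing attrition.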

      The analysis also runs the risk of conflating the fraction of women in a field with gender diversity! In female-dominated fields (e.g. Nursing, Education) increasing the proportion of women in the field will lead to reduced gender diversity. This does not seem to be accounted for in the analysis. It would also be helpful to state the number of men and women in each of the 111 fields in the study.
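      The point about female-dominated fields can be made concrete with a simple two-group diversity index. The Gini-Simpson index used here is an illustrative choice, not a measure taken from the study, and the proportions are hypothetical.

```python
def gini_simpson(p_women):
    """Two-group Gini-Simpson diversity: 1 - p^2 - (1 - p)^2 = 2p(1 - p).

    Peaks at 0.5 when the two groups are equally represented.
    """
    if not 0.0 <= p_women <= 1.0:
        raise ValueError("p_women must be between 0 and 1")
    return 2.0 * p_women * (1.0 - p_women)


# Male-dominated field: raising the share of women raises diversity.
gain = gini_simpson(0.4) - gini_simpson(0.3)

# Female-dominated field: raising the share of women lowers diversity.
loss = gini_simpson(0.9) - gini_simpson(0.8)
```

      Under this index, the same ten-point rise in the proportion of women increases diversity in one field and decreases it in the other, which is why the proportion of women and gender diversity should be reported as separate quantities.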

    3. Reviewer #2 (Public Review):

      Summary:
      This important study by LaBerge and co-authors seeks to understand the causal drivers of faculty gender demographics by quantifying the relative importance of faculty hiring and attrition across fields. They leverage historical data to describe past trends and develop models that project future scenarios that test the efficacy of targeted interventions. Overall, I found this study to be a compelling and important analysis of gendered hiring and attrition in US institutions, and one that has wide-reaching policy implications for the academy. The authors have also suggested a number of fruitful future avenues for research that will allow for additional clarity in understanding the gendered, racial, and socioeconomic disparities present in US hiring and attrition, and potential strategies for mitigating or eliminating these disparities.

      Strengths:
      In this study, LaBerge et al. use data from over 268,000 tenured and tenure-track faculty from over 100 fields at more than 12,000 PhD-granting institutions in the US. The period they examine covers 2011-2020. Their analysis provides a large-scale overview of demographics across fields, a unique strength that allows the authors to find statistically significant effects for gendered attrition and hiring across broad areas (STEM, non-STEM, and topical domains).

      LaBerge et al. find gendered disparities in attrition, using both empirical data and their counterfactual model, that account for the loss of 1378 women faculty across all fields between 2011 and 2020. It is true that "this number is both a small portion of academia... and a staggering number of individual careers," as this loss of women faculty is comparable to losing more than 70 entire departments. I appreciate the authors' discussion about these losses: they note that each of these is likely unnecessary, as women often report feeling that they were pushed out of academic jobs.

      LaBerge et al. also find, by developing a number of model scenarios testing the impacts of hiring, attrition, or both, that hiring has a greater impact on women's representation in the majority of academic fields in spite of higher attrition rates for women faculty relative to men at every career stage. Unlike many other studies of historical trends in gender diversity, which have often been limited to institution-specific analyses, they provide an analysis that spans over 100 fields and includes nearly all US PhD-granting institutions. They are able to project the impacts of strategies focusing on hiring or retention using models that project the impact of altering attrition risk or hiring success for women. With this approach, they show that even relatively modest annual changes in hiring accumulate over time to help improve the diversity of a given field. They also demonstrate that, across the model scenarios they employ, changes to hiring drive the largest improvement in the long-term gender diversity of a field.

      Future work will hopefully, as the authors point out, include intersectional analyses to determine whether a disproportionate share of lost gender diversity is due to the loss of women of color from the professoriate. I appreciate the authors' discussion of the racial demographics of women in the professoriate, and their note that "the majority of women faculty in the US are white" and thus that the patterns observed in this study are predominantly driven by this demographic. I also highly appreciate their final note that "equal representation is not equivalent to equal or fair treatment," and that diversifying hiring without mitigating the underlying cause of inequity will continue to contribute to higher losses of women faculty.

      Weaknesses: First, and perhaps most importantly, it would be beneficial to include a distinct methods section. While the authors have woven the methods into the results section, I found that I needed to dig to find the answers to my questions about methods. I would also have appreciated additional information within the main text on the source of the data, specifics about its collection, inclusion and exclusion criteria for the present study, and other information on how the final dataset was produced. This - and additional information as the authors and editor see fit - would be helpful to readers hoping to understand some of the nuance behind the collection, curation, and analysis of this important dataset.

      I would also encourage the authors to include a note about binary gender classifications in the discussion section. In particular, I encourage them to include an explicit acknowledgement that the trends assessed in the present study are focused solely on two binary genders - and do not include an analysis of nonbinary, genderqueer, or other "third gender" individuals. While this is likely because of the limitations of the dataset utilized, the focus of this study on binary genders means that it does not reflect the true diversity of gender identities represented within the professoriate.

      In a similar vein, additional context on how gender was assigned on the basis of names should be added to the methods section.

      I do think that some care might be warranted regarding the statement that "eliminating gendered attrition leads to only modest changes in field-level diversity" (Page 6). While I do not think that this is untrue, I do think that the model scenarios where hiring is "radical" and attrition is unchanged from present (equal representation of women and men among hires (ER) + observed attrition (OA)) show that a sole focus on hiring dampens the gains that can otherwise be addressed via even modest interventions (see, e.g., gender-neutral attrition (GNA) + increasing representation of women among hires (IR)). I am curious as to why the authors did not include an additional scenario where hiring rates are equal and attrition is equalized (i.e., GNA + ER). The importance of including this additional model is highlighted in the discussion, where, on Page 7, the authors write: "In our forecasting analysis, we find that eliminating the gendered attrition gap, in isolation, would not substantially increase representation of women faculty in academia. Rather, progress towards gender parity depends far more heavily on increasing women's representation among new faculty hires, with the greatest change occurring if hiring is close to gender parity." I believe that this statement would be greatly strengthened if the authors can also include a comparison to a scenario where both hiring and attrition are addressed with "radical" interventions.
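
      To make the requested comparison concrete, the following is a minimal sketch of how the four scenario combinations could be set side by side. It is not the authors' model: the function `project_share`, the fixed-size faculty assumption (every departure is replaced by one hire), and all numeric values are hypothetical and chosen only for illustration.

```python
def project_share(women, men, years, hire_share_women, attrition_women, attrition_men):
    """Project the fraction of women faculty, assuming total faculty size stays fixed.

    Each year a fixed fraction of women and of men leave, and every opening
    is refilled, with `hire_share_women` of the hires being women.
    """
    for _ in range(years):
        left_w = women * attrition_women
        left_m = men * attrition_men
        openings = left_w + left_m
        women += openings * hire_share_women - left_w
        men += openings * (1 - hire_share_women) - left_m
    return women / (women + men)


# Made-up inputs for illustration only; these are NOT estimates from the study.
START_W, START_M = 350.0, 650.0          # initial head counts
OBS_ATTR_W, OBS_ATTR_M = 0.055, 0.050    # "observed" annual attrition (OA)
OBS_HIRE = 0.40                          # "observed" share of women among hires

# (share of women among hires, women's attrition, men's attrition)
scenarios = {
    "OA + observed hiring": (OBS_HIRE, OBS_ATTR_W, OBS_ATTR_M),
    "GNA + observed hiring": (OBS_HIRE, OBS_ATTR_M, OBS_ATTR_M),
    "OA + ER": (0.50, OBS_ATTR_W, OBS_ATTR_M),
    "GNA + ER": (0.50, OBS_ATTR_M, OBS_ATTR_M),
}

results = {
    name: project_share(START_W, START_M, 40, *params)
    for name, params in scenarios.items()
}

for name, share in results.items():
    print(f"{name}: {share:.3f}")
```

      Under these assumed inputs, the GNA + ER scenario gives an upper reference point against which the hiring-only (OA + ER) and attrition-only (GNA + observed hiring) scenarios can be read; whether the same ordering holds for the real data is exactly what the additional scenario would show.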

    4. Reviewer #3 (Public Review):

      This manuscript investigates the roles of faculty hiring and attrition in influencing gender representation in US academia. It uses a comprehensive dataset covering tenured and tenure-track faculty across various fields from 2011 to 2020. The study employs a counterfactual model to assess the impact of hypothetical gender-neutral attrition and projects future gender representation under different policy scenarios. The analysis reveals that hiring has a more significant impact on women's representation than attrition in most fields and highlights the need for sustained changes in hiring practices to achieve gender parity.

      Strengths: Overall, the manuscript offers significant contributions to understanding gender diversity in academia through its rigorous data analysis and innovative methodology.

      The methodology is robust, employing extensive data covering a wide range of academic fields and institutions.

      Weaknesses: The primary weakness of the study lies in its focus on US academia, which may limit the generalizability of its findings to other cultural and academic contexts. Additionally, the counterfactual model's reliance on specific assumptions about gender-neutral attrition could affect the accuracy of its projections.

      Additionally, the study assumes that anyone who disappears from the dataset has left academia. In reality, some of these apparent departures could be researchers who moved to another country or to an institution that is not included in the AARC (Academic Analytics Research Centre) dataset.

    1. eLife assessment

      This study defines a fundamental aspect of protein kinase signalling in the protist parasite Toxoplasma gondii that is required for acute and chronic infections. The authors provide compelling evidence for the role of SPARK/SPARKEL kinases in regulating cAMP/cGMP signalling, although evidence linking the loss of these kinases to changes in the phosphoproteome is incomplete. Overall, this study will be of great interest to those who study Toxoplasma and related apicomplexan parasites.

    2. Reviewer #1 (Public Review):

      Summary: Herneisen et al. characterise the Toxoplasma PDK1 orthologue SPARK and an associated protein SPARKEL in controlling important fate decisions in Toxoplasma. Over recent years this group and others have characterised the role of cAMP and cGMP signalling in negatively and positively regulating egress, motility, and invasion, respectively. This manuscript furthers this work by showing that SPARK and SPARKEL likely act upstream, or at least control the levels of the cAMP and cGMP-dependent kinases PKA and PKG, respectively, thus controlling the transition of intracellular replicating parasites into extracellular motile forms (and back again).

      The authors use quantitative (phospho)proteomic techniques to elegantly demonstrate the upstream role of SPARK in controlling cAMP and cGMP pathways. They use sophisticated analysis techniques (at least for parasitology) to show the functional association between cGMP and cAMP signalling pathways. They therefore begin to unify our understanding of the complicated signalling pathways used by Toxoplasma to control key regulatory processes that control the activation and suppression of motility. The authors then back up their quantitative proteomics observations with molecular and cellular assays, clear in design and approach, on a range of generated transgenic lines.

      The authors then extend their work by showing that SPARK/SPARKEL also control PKAc3 function. PKAc3 has previously been shown to negatively regulate differentiation into bradyzoite forms and this work backs up and extends this finding to show that SPARK also controls this. The authors conclude that SPARK could act as a central node of regulation of the asexual stage, keeping parasites in their lytic cell growth and preventing differentiation. Whether this is true is beyond the scope of this paper and will have to be determined at a later date.

      Strengths: This is an exceptional body of work. It is elegantly performed, with state-of-the-art proteomic methodologies carefully being applied to Toxoplasma. Observations from the proteomic datasets are masterfully backed up with validation using quantitative molecular and cellular biology assays.

      The paper is carefully and concisely written and is not overreaching in its conclusions. This work and its analysis set a new benchmark for the use of proteomics and molecular genetics in apicomplexan parasites.

      Weaknesses: This reviewer did not identify any weaknesses.

    3. Reviewer #2 (Public Review):

      Summary: The manuscript by Herneisen et al. examines the Toxoplasma SPARK kinase, which is orthologous to the mammalian PDK1 kinase. Extracellular signals trigger second-messenger cascades that play a central role in the survival of apicomplexan parasites. In Toxoplasma, these cascades regulate active replication of the tachyzoites, which manifests as acute toxoplasmosis, or the development into drug-resilient bradyzoites characteristic of the chronic stage of the disease. This study focuses on the poorly understood signaling mechanisms acting upstream of such second messenger kinases as PKA and PKG. The authors showed that, similar to PDK1, Toxoplasma SPARK appears to regulate several AGC kinases.

      Strengths: The study demonstrated a strong association of the SPARK kinase with an elongin-like SPARKEL factor and an uncharacterized AGC kinase. Using a set of standard assays, the authors determined the SPARK/SPARKEL role in parasite egress and invasion. Finally, the study presented evidence of SPARK/SPARKEL involvement in bradyzoite differentiation.

      Weaknesses: Although the study can potentially uncover essential sensing mechanisms operating in Toxoplasma, the evidence for the SPARK/SPARKEL mechanisms is weak. Specifically, due to incomplete data analysis, the SPARK/SPARKEL-dependent phosphoregulation of AGC kinases cannot be evaluated. The manuscript requires better organization and lacks guidance on the described experiments. Although the study is built on advanced genetics, at times, it is unnecessarily complicated, raising doubts rather than benefiting the study.

    4. Reviewer #3 (Public Review):

      Summary: This paper focuses on the roles of a Toxoplasma protein (SPARKEL) with homology to elongin C and the kinase SPARK that it interacts with. The authors demonstrate that the two proteins regulate the abundance of PKA and PKG, that depletion of SPARKEL reduces invasion and egress (previously shown with SPARK), and that their loss also triggers spontaneous bradyzoite differentiation. The data are overall very convincing and will be of high interest to those who study Toxoplasma and related apicomplexan parasites.

      Strengths: The study is very well executed with appropriate controls. The manuscript is also very well and clearly written. Overall, the work clearly demonstrates that SPARK/SPARKEL regulate invasion and egress and that their loss triggers differentiation.

      Weaknesses:

      1. The authors fail to discriminate between SPARK/SPARKEL acting as negative regulators of differentiation as a result of an active role in regulating stage-specific transcription/translation or as a consequence of a stress response activated when either is depleted.

      2. The function of SPARKEL has not been addressed. In mammalian cells, Elongin C is part of an E3 ubiquitin ligase complex that regulates transcription and other processes. From what I can tell from the proteomic data, homologs of the Elongin B/C complex were not identified. This is an important issue as the authors find that PKG and PKA protein levels are reduced in the knockdown strains.

    1. eLife assessment

      This study presents a valuable finding on how lentiviral infection has driven the diversification of the HIV/SIV entry receptor CD4. Using a combination of molecular evolution approaches coupled with functional testing of extant and ancestral reconstructions of great ape CD4, the authors provide solid evidence to support the idea that endemic simian immunodeficiency virus infection in gorillas has selected for gorilla CD4 alleles that are more resistant to SIV infection. However, this conclusion would be supported more strongly with additional functional testing of other great ape CD4 relative to human and ancestral sequences. Additionally, given the difficulty in definitively proving drivers of selection, the current title of the study is considered an overstatement relative to the data presented.

    2. Reviewer #1 (Public Review):

      Summary: The authors previously demonstrated that species-specific variation in primate CD4 impacts its ability to serve as a functional receptor for diverse SIVs. Here, Warren and Barbachano-Guerrero et al. perform population genetics analyses and functional characterization of great ape CD4 with a particular focus on gorillas, which are natural hosts of SIVgor. They first used ancestral reconstruction to derive the ancestral hominin and hominid CD4. Using pseudotyped viruses representing a panel of envelopes from SIVcpz and HIV strains, they find that these ancestral reconstructions of CD4 are more similar to human CD4 in terms of being a broadly susceptible entry receptor (in the context of mediating entry into Cf2Th cells stably expressing human CCR5). In contrast, extant gorilla and chimpanzee CD4 are functional entry receptors for a narrower range of HIV and SIVcpz isolates. Based on these differences, the authors next surveyed gorilla sequences and identified several CD4 haplotypes, specifically in the region encoding the CD4 D1 domain, which directly contacts the viral glycoprotein and thus may impact the interaction. Consistent with this possibility, the authors demonstrated that gorilla CD4 haplotypes are, on average, less capable of supporting entry than human CD4, and that some are largely unable to function as SIV entry receptors. Interestingly, individual residues found at key positions in the gorilla CD4 D1, when tested in the context of human CD4, reduce entry of some virions pseudotyped with diverse SIVcpz envelopes, suggesting that individual amino acids can in part explain the observed differences across gorilla CD4 haplotypes. Finally, the authors perform statistical tests to infer that CD4 from great apes with endemic SIV (i.e., chimpanzees and gorillas), but not non-reservoirs (i.e., orangutans, bonobos) or recent spillover hosts (i.e., humans), has been subject to selection as a result of pressure from endemic SIV.

      The conclusions of this paper are mostly well supported by data.

      Strengths: The functional assays are appropriate to test the stated hypothesis, and the authors use a broad diversity of envelopes from HIV and SIVcpz strains. The authors also partially characterize one potential mechanism of gorilla CD4 resistance - receptor glycosylation at the derived N15 found in 5/6 gorilla haplotypes.

      Ancestral reconstruction provides a particularly interesting aspect of the study, allowing authors to infer the ancestral state of hominid CD4 relative to modern CD4 from gorillas and chimpanzees. This, coupled with evidence supporting SIV-driven selection of gorilla CD4 diversity and the characterization of functional diversity of extant haplotypes provides several interesting findings.

      Weaknesses: The major inference of the work is that SIV infection of gorillas drove the observed diversity in gorilla CD4. This is supported by the majority of SNPs being localized to the CD4 D1, which directly interacts with the envelope, and the demonstrated functional consequences of that diversity for viral entry. However, SIVgor (to the best of my knowledge) only infects Western lowland gorillas (Gorilla gorilla gorilla), and one Gorilla gorilla diehli and three Gorilla beringei graueri individuals were included in the haplotype and allele frequency analyses. The presence of these haplotypes or the presence of similar allele frequencies in Eastern lowland and mountain gorillas would impact this conclusion. It would be helpful for the authors to clarify this point.

      The authors appear to use a somewhat atypical approach to assess intra-population selection to compensate for relatively small numbers of NHP sequences (Fig. 6). However, they do not cite precedent for the robustness of the approach or for the practice of grouping sequences from multiple species for the endemic vs other comparison. They also state in the methods that some genes encoded in the locus were removed from the analysis "because they have previously been shown to directly interact with a viral protein." This seems to undercut the analysis and prevents the evaluation of alternative explanations for the observed diversity in CD4 (e.g., passenger mutations from selection at a neighboring locus).

      Data in Figure 5 are graphed as % infected cells instead of virus titer (TDU/mL). It is unclear why this is the case, and it prevents a comparison to data in Figure 2 and Figure 4.

      The lack of pseudotyping with SIVgor envelope is a surprising omission from this study; including it would help to contextualize the findings. Similarly, building gorilla CD4 haplotype SNPs onto the hominin ancestor (as opposed to extant human CD4) may provide additional insights that are meaningful toward understanding the evolutionary trajectory of gorilla CD4.

    3. Reviewer #2 (Public Review):

      Lentiviral infection of primate species has been linked to the rapid mutational evolution of numerous primate genes that interact with these viruses, including genes that inhibit lentiviruses as well as genes required for viral infection. In this manuscript, Warren et al. provide further support for the diversification of CD4, the lentiviral entry receptor, to resist lentiviral infection in great ape populations. This work builds on their prior publication (Warren et al. 2019, PMCID: PMC6561292) and that of other groups (e.g., Russell et al. 2021, PMCID: PMC8020793; Bibollet-Ruche et al. 2019, PMCID: PMC6386711) documenting both sequence and functional diversity in CD4, specifically within (1) the CD4 domain that binds to the lentiviral envelope and (2) great ape populations with endemic lentiviruses. Thus, the paper's finding that gorilla populations exhibit diverse CD4 alleles that differ in their susceptibility to lentiviral infection is well demonstrated both here and in a prior publication.

      To bolster the argument that lentiviruses are indeed the causative driver of this diversification, which seems likely from a logical perspective but is difficult to prove, Warren et al. pursue two novel lines of evidence. First, the authors reconstruct ancestral CD4 genes that predate lentiviral infection of hominid populations. They then demonstrate that resistance to lentiviral infection is a derived trait in chimpanzees and gorillas, which have been co-evolving with endemic lentiviruses, but not in humans, which only recently acquired HIV. Nevertheless, the derived resistance could be stochastic or due to drift. This argument would be strengthened by demonstrating that bonobo and orangutan CD4, which also do not have endemic lentiviruses, resemble the ancestral and human susceptibility to great-ape-infecting lentiviruses.

      Second, Warren et al. provide a population genetic argument that only endemically infected primates exhibit diversifying selection, again arguing for endemic lentiviruses being the evolutionary driver. The authors compare SNP occurrence in CD4 to neighboring genes, demonstrating that non-synonymous SNP frequency is only elevated in endemically infected species. Moreover, these amino-acid-coding changes are significantly concentrated in the CD4 domain that binds the lentiviral envelope. This is a creative analysis to overcome the problem of very small sample sizes, with very few great ape individuals sequenced. The additional small number of species compared (2-3 in each group) also limits the power of the analysis; the authors could consider expanding their analysis to Old World Monkey species that do or do not have endemic lentiviruses, as well as great apes.

      Overall, this manuscript lends additional support to a well-documented example of a host-virus arms race: that of lentiviruses and the viral entry receptor.

    1. eLife assessment

      This study presents evidence that suggests that the coalescence of sister chromatids induced by global double-strand DNA breaks (DSBs) during late mitosis is mediated by cohesin SMC3. These findings are valuable for studying the mechanism of eukaryotic cells to repair DNA during late mitosis. Although the discrete DSB induction system in budding yeast is sound, the strength of evidence is incomplete and could be buttressed to better support the major claims and to represent a clear advance with respect to the authors' previous contributions to this field.

    2. Reviewer #1 (Public Review):

      Summary: The cohesin complex maintains sister chromatid cohesion from S phase to anaphase. Beyond that, DSBs trigger cohesin recruitment and post-replication cohesion both at damage sites and globally, as originally reported in 2004. In their recent study, Ayra-Plasencia et al. reported that, in telophase, DSBs are repaired via HR with re-coalesced sister chromatids (Ayra-Plasencia & Machín, 2019). In this study, they show that HR occurs in an Smc3-dependent way in late mitosis.

      Strengths: The authors take great advantage of the yeast system: they follow the processing and repair of a single DSB generated by the HO endonuclease, which cuts the MAT locus on chromosome III. In combination with cell synchronization, they detect HR repair during G2/M or late mitosis and show that the cohesin subunit SMC3 is critical for this repair. Beyond that, full-length Scc1 protein can be recovered upon DSBs.

      Weaknesses: These new results basically support their proposal, although with very limited progress on the molecular mechanism, especially compared with their recent work.

    3. Reviewer #2 (Public Review):

      Summary: The manuscript "Cohesin still drives homologous recombination repair of DNA double-strand breaks in late mitosis" by Ayra-Plasencia et al. investigates the regulation of HR repair in conditional cdc15 mutants, which arrest the cell cycle in late anaphase/telophase. Using a non-competitive MAT switching system of S. cerevisiae, they show that a DSB in telophase-arrested cells elicits a delayed DNA damage checkpoint response and resection. Using a degron allele of SMC3, they show that MATa-to-alpha switching requires cohesin in this context. The presence of a DSB in telophase-arrested cells leads to an increase in the kleisin subunit Scc1 and a partial rejoining of sister chromatids after they have separated in a subset of cells.

      Strengths: The experiments presented are well-controlled. The induction systems are clean and well thought-out.

      Weaknesses: The manuscript is very preliminary, and I have reservations about its physiological relevance. I also have reservations regarding the usage of MAT to make the point that inter-sister repair can occur in late mitosis.

    1. eLife assessment

      This manuscript offers valuable information on the effect of a combination of two small molecules (2C), CHIR99021 and A-485, during the reprogramming of mature cardiomyocytes into regenerative cardiac cells. The manuscript is incomplete, as the mechanistic insights derived from transcriptomic and genomic datasets lack experimental validation. It also needs additional experimental support to confirm the regenerative potential of 2C, as well as improvements in the data analysis and presentation. Overall, this interesting work provides insights into the development of therapeutic targets for cardiac regeneration in infarcted hearts.