    1. Author Response

      We appreciate the thoughtful comments from the reviewers. All reviewers express common support for the study’s meaningful contribution to understanding interoceptive neurocircuitry in health and in psychiatric disorders. Specifically, the reviewers highlight the strong theoretical backing and the novel combination of tasks and analytical methods. In turn, the reviewers identify several areas for improvement that we plan to address in our resubmission. These include a more detailed demographic characterization of the study participants, increased clarity when describing the statistics that support each conclusion, and additional discussion when interpreting the resting state findings, as we did not include a separate control condition for the effect of time. One reviewer commented that we largely cite our previous work with the isoproterenol paradigm; while we will provide an updated and broader view of the literature in our resubmission, there remain only a limited number of comparable interoceptive perturbation studies. Finally, one comment referred to our reliance on ratings of interoceptive intensity without including additional behavioral measures. While our measures of interest were chosen for their relevance to our hypotheses, we will consider adding measures such as interoceptive accuracy (correspondence between heart rate and dial ratings) that were collected during the perturbation task, should they provide additional insight into the insular responses of the participants.
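
      As an illustration only, the accuracy measure mentioned above (correspondence between heart rate and dial ratings) could be sketched as a lagged Pearson correlation. This is a minimal sketch under stated assumptions, not the analysis used in the study: the function names, the use of Pearson correlation, and the `max_lag` parameter are all hypothetical choices made for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def interoceptive_accuracy(heart_rate, dial_rating, max_lag=0):
    """Hypothetical accuracy index: the largest correlation between heart rate
    and dial ratings when the rating is allowed to trail heart rate by
    0..max_lag samples. Both series are assumed to share one sampling rate."""
    if len(heart_rate) != len(dial_rating):
        raise ValueError("series must have the same length")
    best = float("nan")
    for lag in range(max_lag + 1):
        # Pair heart rate at sample i with the rating given `lag` samples later.
        r = pearson(heart_rate[: len(heart_rate) - lag], dial_rating[lag:])
        if not math.isnan(r) and (math.isnan(best) or r > best):
            best = r
    return best
```

      Allowing a lag reflects the assumption that a participant's dial movement trails the cardiac change; with `max_lag=0` the function reduces to a plain correlation.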

    2. Reviewer #1 (Public Review):

      Summary: The aim of this study is to explore the neurocircuitry of top-down and bottom-up interoception, and how this differs in psychiatric disorders. Using functional neuroimaging, the research focuses on individuals with anxiety, depression, and/or eating disorders compared to healthy individuals. The findings highlight the dysgranular mid-insula as a key cortical area where attention and real-time bodily inputs converge, potentially serving as a disruption point in psychiatric disorders.

      Strengths: The authors used robust and validated methods to answer their research question efficiently. They illustrate a complete picture of the theoretical impact of the study and their own strengths and weaknesses.

      Weaknesses: One concern is regarding the experimental task design. Currently, only subjective reports of interoceptive intensity are taken into account; the addition of objective behavioural measures would have given additional value to the study and its impact.

      This brings me to my second concern. The authors mostly refer to their own previous work, without highlighting other methods used in the field. Some tasks measure interoceptive accuracy or other behavioural outcomes, instead of merely subjective intensity. Expanding the scientific context would aid the understanding and integration of this study with the rest of the field.

      Lastly, the suggestions for future research lack substance compared to the richness of the discussion.

    3. Reviewer #2 (Public Review):

      Summary: The authors have conducted an exceptionally informative series of studies investigating the neural basis of interoception in transdiagnostic psychiatric symptoms. By comparing differential and overlapping neural activation during 'top-down' and 'bottom-up' interoceptive tasks, they reveal convergent activation largely localised to the ventral dysgranular subregion ('mid-insula'), which differs in extent between patients and controls, replicating and extending previous suggestions of this region as a central locus of disruption in psychiatric disorders. Their work also reveals different extents of divergent activation in the anterior insula during anticipation of interoceptive disruption. This substantially advances our previous knowledge of the anatomy of interoception and confirms theoretical predictions of the roles of different cytoarchitectural subregions of the insula in interoceptive dysfunction in mental health conditions.

      Strengths: The work is exceptional in terms of breadth and depth, making use of multiple imaging and analysis techniques which are non-standard and go well beyond what is known today. The study is statistically well-powered and the tasks are well-validated in the literature. To my knowledge, these functions of the insula in interoception and mental health have never been compared directly before, so the results are novel and informative for both basic science and psychiatry. The work is strongly theory-driven, building on and directly testing results from influential theories and previous studies. It is likely that the results will strengthen our theoretical models of interoception and advance psychiatric studies of the insula.

      Weaknesses: The study has three current limitations. (1) The interpretation of the resting-state data is not quite as clear-cut as the task-based data - as presented currently, changes could potentially represent fluctuations over time rather than following interoception specifically. In contrast, much stronger conclusions can be drawn from the authors' task-based data. (2) The transdiagnostic sample could be better characterised in terms of diagnostic information, and was almost entirely female; it is also unclear what the effect of psychotropic medications may have been on the results given the effects of (e.g.) serotonergic medication on the BOLD signal. (3) As the authors point out, there may have been task-specific preprocessing/analysis differences that influenced results, for example, due to physiological correction in one but not both tasks.

    4. Reviewer #3 (Public Review):

      Summary: Adamic and colleagues present fMRI data from ADE patients and a healthy control group acquired during two interoceptive tasks (attention and perturbation) from the same session. They report convergent activity within the granular and dysgranular insular cortex during both tasks, with a patient group-specific lateralisation effect. Furthermore, insular functional connectivity was found to be linked to disease severity.

      Strengths: The study is well-designed and - despite some limitations noted by the authors - provides much-needed insight into the functional pathways of interoceptive processing in health and disease. The manuscript is clear, concise, and well-written so that I only have a few comments I would mostly regard as minor points.

      Weaknesses: There are a few instances where it is not entirely clear whether the authors' claims are fully supported by the underlying statistics.

    1. Author Response

      Reviewer #1 (Public Review):

      This manuscript presents the first evidence for a plastic enhancement in the response of pial cortical arterioles to external stimulation. Specifically, they show (p8; Figure 3A-C) that repeated application of a visual stimulus at 0.25 Hz, at the upper edge of the vasomotor response, leads to a greater change in the diameter of pial arterioles at that frequency. This adds to the earlier, referenced work of Mateo et al (2017) that showed locking - or entrainment of pial arteriole vasomotion - by stimuli at different (0.0 to 0.3 Hz) frequencies.

      We thank the reviewer for positively identifying the value of our manuscript.

      The manuscript has a major flaw. Much as there is plasticity that leads to an increase in the amplitude of vasomotion at the drive frequency, the authors need to show reversibility. This could possibly be accomplished by driving the visual system at a different frequency, say 0.15 Hz, and observing if the 0.25 Hz response is then diminished. The authors could then test if their observation is repeatable by again driving at 0.25 Hz. Unless I missed the presentation on this point, there is no evidence for reversibility.

      The reviewer has raised a very important point. In our experiments, the visually induced vasomotion (or visual stimulus-triggered vasomotion) was always entrained by repeated trials of the 0.25 Hz temporal frequency stimuli. When the visual stimulation stops, the vasomotion frequency lock to 0.25 Hz quickly dissipates. After saturated training with this stimulus, the parameters of the visual stimulus were switched, for example to 0.15 Hz. The animal quickly adapted to this new stimulus paradigm and the vasomotion became frequency-locked to 0.15 Hz. The adaptation to this new paradigm occurred well within 5 minutes. In Fig. 5, various paradigms were tested in random order. In some of the trials, the 0.25 Hz stimulus was tested after 0.15 Hz, and the vasomotion also quickly adapted back to 0.25 Hz. We agree with the reviewer that this reversibility could have been explicitly documented in the manuscript.

      Drew, P. J., A. Y. Shih, J. D. Driscoll, P. M. Knutsen, D. Davalos, P. Blinder, K. Akassoglou, P. S. Tsai, and D. Kleinfeld. 2010. 'Chronic optical access through a polished and reinforced thinned skull', Nature Methods, 7: 981-84.

      Morii, S., A. C. Ngai, and H. R. Winn. 1986. 'Reactivity of rat pial arterioles and venules to adenosine and carbon dioxide: With detailed description of the closed cranial window technique in rats', Journal of Cerebral Blood Flow & Metabolism, 6: 34-41.

      Reviewer #2 (Public Review):

      Sasaki et al. investigated methods to entrain vasomotion in awake wild-type mice across multiple regions of the brain using a horizontally oscillating visual pattern which induces an optokinetic response (HOKR) eye movement. They found that spontaneous vasomotion could be detected in individual vessels of their wild-type mice through either a thinned cranial window or intact skull preparation using a widefield macro-zoom microscope. They showed that low-resolution autofluorescence signals coming from the brain parenchyma could be used to capture vasomotion activity using a macro-zoom microscope or optical fibre, as this signal correlates well with the intensity profile of fluorescently-labelled single vessels. They show that vasomotion can also be entrained across the cortical surface using an oscillating visual stimulus with a range of parameters (with varying temporal frequencies, amplitudes, or spatial cycles), and that the amplitude spectrum of the detected vasomotion frequency increases with repeated training sessions. The authors include some control experiments to rule out fluorescence fluctuations being due to artifacts of eye movement or screen luminance and attempt to demonstrate some functional benefit of vasomotion entraining as HOKR performance improves after repeat training. These data add in an interesting way to the current knowledge base on vasomotion, as the authors demonstrate the ability to entrain vasomotion across multiple brain areas and show some functional significance to vasomotion with regards to information processing as HOKR task performance correlates well with vascular oscillation amplitudes.

      We thank the reviewer for summarizing the value of our study and recognizing its significance.

      The aims of the paper are mostly well supported by the data, but some streamlining of the data presentation would improve overall clarity. The third aim to establish the functional significance of vasomotion in relation to plasticity in information processing could be better supported by the inclusion of some additional control experiments.

      We thank the reviewer for recognizing that our data support our findings. We agree that better data presentation could have improved the clarity of the manuscript.


      1) The clarity and comprehensibility of the paper could be significantly enhanced by incorporating additional details in both the introduction and discussion sections. In the introduction, a succinct definition of the frequency range of vasomotion should be provided, as well as a better description of the horizontal optokinetic response (i.e. as they have in the results section in the first paragraph below the 'Entrainment of vasomotion with visual stimuli presentation' sub-heading). The discussion would benefit from the inclusion of a clear summary of the results presented at the start, and the inclusion of stronger justification (i.e. more citations) with regards to the speculation about vasomotion and neuronal plasticity (e.g. paragraph 5 includes no citations).

      We agree that a better description of vasomotion and horizontal optokinetic response could have been provided in the introduction. As the reviewer suggests, the discussion could also have started with the following summary of the results.

      “We show that visually induced vasomotion can be frequency-locked to the visual stimulus and can be entrained with repeated trials. The initial drive for the vasomotion, or the sensory-evoked hyperemia, must be coming from the neuronal activity in the visual system. The vasomotion is likely triggered by activation of the neurovascular interaction (Kayser, 2004; van Veluw et al., 2020). Surprisingly, the entrained vasomotion was observed not only in the visual cortex but also widely throughout the surface of the brain and deep in the cerebellar flocculus. The global entrainment could be realized through separate mechanisms from the local neurovascular coupling. What is also unknown is where the plasticity occurs. The neuronal visual response in the primary visual cortex could potentially decrease with repeated visual stimulation presentation as the adaptive movement of the eye should decrease the retinal slip. With repeated training sessions, a more static projection of the presented image will likely be shown to the retina. The neurovascular coupling could be enhanced with increased responsiveness of the vessels and vascular-to-vascular coupling could also be potentiated.”

      2) The novel methods for detecting vasomotion using low-resolution imaging techniques are discussed across the first four figures, but this gets a little bit confusing to follow as the authors jump back and forth between the different imaging and analysis techniques they have employed to capture vasomotion. The data presentation could be better streamlined - for instance by presenting only the methods most relevant for the functional dataset (in Figures 5-7), with the additional information regarding the various controls to establish the use of autofluorescence intensity imaging as a valid method for capturing vasomotion reduced to fewer figure panels, or moved to supplementary figures so as to not detract from the main novel findings contributed in this study.

      We apologize for the confusing presentation of the data. Many of the initial figures were technical; however, we feel that following these steps was necessary to logically conclude that shadow imaging of the autofluorescence could be used as an indicator of vasomotion. We do agree with the reviewer that going back and forth between different techniques can be confusing. We could have added separate supplementary figures to introduce the various methods used upfront before going into the findings.

      3) The authors heavily rely on representative traces from individual vessels to illustrate their findings, particularly evident in Figures 1-4. While these traces offer a valuable visualization, augmenting their approach by presenting individual data points across the entire dataset, encompassing all animals and vessels, would significantly enhance the robustness of their claims. For instance, in Figures 1 and 2, where average basal and dilated traces are depicted for a representative vessel, supplementing these with graphs showcasing peak values across all measured vessels would enable the authors to convey a more holistic representation of their data. Or in Figure 3, where the amplitude spectrum is presented for individual Texas red fluorescence intensity changes in V1 across novice, trained, and expert mice, incorporating a summary graph featuring the amplitude spectrum value at 0.25Hz for each individual trace (across animals/imaging sessions), followed by statistical analysis, would fortify the strength of their assertions. Moreover, providing explicit details on sample sizes for each individual figure panel (where not a representative trace), including the number of animals or vessels/imaging sessions, would contribute to transparency and aid readers in assessing the generalisability of the findings.

      We agree with the reviewer that summarization of the data across a number of vessels/imaging sessions would lead to more generalization of the findings. However, contrary to what the reviewer described, we did summarize the vessel diameter expansion events across multiple vessel observations in Fig. 1F, G. The vasomotion parameters were not summarized for observation in intact skull shown in Fig. 2. However, this figure was intended just to show that vessel boundary cannot be well defined in intact skull imaging and Texas Red intensity or autofluorescence intensity fluctuation would give a better indication of vessel diameter fluctuation. In Fig. 3G, the peak ratio of 0.25 Hz was calculated for individual animals at Novice, Trained, and Expert levels and summarized for n = 5 animals. Statistical analysis was also done. The variability between imaging sessions within individual animals was not analyzed; thus, this could have been indicated.
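
      For readers unfamiliar with this kind of summary, the idea of a per-animal peak ratio at 0.25 Hz can be sketched as follows. This is a minimal sketch, not the computation used in the manuscript: the response does not state how the ratio in Fig. 3G was defined, so the normalisation by flanking frequencies, the function names, and the `flank_offsets` values are assumptions made for this example.

```python
import cmath
import math

def amplitude_at(trace, fs, freq):
    """Amplitude of `trace` (sampled at `fs` Hz) at a single frequency,
    obtained by projecting the mean-subtracted trace onto a complex sinusoid."""
    n = len(trace)
    mean = sum(trace) / n
    acc = sum((x - mean) * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(trace))
    return 2 * abs(acc) / n

def peak_ratio(trace, fs, target=0.25, flank_offsets=(-0.10, -0.05, 0.05, 0.10)):
    """Hypothetical peak ratio: amplitude at the stimulus frequency divided by
    the mean amplitude at nearby flanking frequencies (offsets in Hz)."""
    peak = amplitude_at(trace, fs, target)
    flanks = [amplitude_at(trace, fs, target + d) for d in flank_offsets]
    flank_mean = sum(flanks) / len(flanks)
    # No flanking energy at all: the ratio is unbounded.
    return math.inf if flank_mean == 0 else peak / flank_mean
```

      Under these assumptions, one value per animal and training level (Novice, Trained, Expert) would be computed from its fluorescence trace and then compared across levels.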

      4) In the experiments where mice are classed as "novice", "trained" or "expert", the inclusion of the specific range of the number of training sessions for each category would improve replicability.

      We agree with the reviewer that classification on the level of training should have been explicitly indicated. Mice experiencing the first visual training session were defined as “Novice”. The mice that have experienced 3 training sessions are the “Trained” mice and the performance of the “Trained” mice during the 4th training session was evaluated. Mice that experienced 8 to 11 rounds of visual training sessions are the “Expert” mice.

      5) The authors don't state whether mice were habituated to the imaging set-up prior to the first data collection, as head-fixation and restraint can be stress-inducing for animals, especially upon first exposure, which could impact their neurovascular coupling responses differentially in "novice" versus "trained" imaging sessions (e.g. see Han et al., 2020, DOI: https://doi.org/10.1523/JNEUROSCI.1553-20.2020). The stress associated with a tail vein injection prior to imaging could also partially explain why mice didn't learn very well if Texas Red was injected before the training session. If no habituation was conducted in these experiments, the study would benefit from the inclusion of some control experiments where "novice" responses were compared between habituated and non-habituated animals.

      We agree with the reviewer that stress could well affect spontaneous vasomotion as well as visually induced vasomotion (or visual stimulus-triggered vasomotion). As the reviewer suggested, we could have compared the habituated and non-habituated mice to the initial visually induced vasomotion response. In addition, whether the experimentally induced increase in stress would interfere with the vasomotion or not could also be studied. With the Texas Red experiments, we observed that tail-vein injection stress appeared to interfere with the HOKR learning process. In the experiments presented in Fig. 3, Texas Red was injected before session 1. Vasomotion entrainment likely progressed with sessions 2 and 3 training. Before session 4, Texas Red was injected again to visualize the vasomotion. The vasomotion was clearly observed in session 4, indicating that the stress induced by tail-vein injection could not interfere with the generation of visually induced vasomotion.

      6) The experiments regarding the brain-wide vasomotion entrainment across the cortical surface would benefit from some additional information about how brain regions were identified (e.g. particularly how V1 and V2 were distinguished given how close together they are).

      The brain regions were identified by referring to the Mouse Brain Atlas. As the skull was intact, the location of bregma, lambda, and midline was clearly visible. We agree with the reviewer that strict separation of V1 and V2 could be difficult if we rely on the brain atlas alone. However, what we wanted to emphasize was that there was no specific localization of the vasomotion entrainment effect.

      7) Whilst the authors show that HOKR task performance and vasomotion amplitude are increased with repeated training to provide some support to their aim of investigating the functional significance of vasomotion with regards to information processing plasticity, the inclusion of some additional control experiments would provide stronger evidence to address this aim. For instance, if vasomotion signalling is blocked or reduced (e.g. using optogenetics or in an AD mouse model where arteriole amyloid load restricts vasomotion capacity), does flocculus-dependent task performance (e.g. HOKR eye movements) still improve with repeated exposure to the external stimulus.

      We agree that experimental intervention to vasomotion is ideal to test the functional significance of vasomotion. As pharmacological intervention lacks specificity, we are currently exploring the optogenetic approach. We have never thought of using the AD mouse as a model of restricted vasomotion by amyloid, and we agree this would be an interesting model to study. However, the AD mouse model would also have deficits other than the restricted vasomotion. On the other hand, we could test whether the repeated presentation of slowly oscillating visual stimuli can have beneficial effects in improving the cognitive abilities of AD model mice.

      Reviewer #3 (Public Review):


      Here the authors show global synchronization of cerebral blood flow (CBF) induced by oscillating visual stimuli in the mouse brain. The study validates the use of endogenous autofluorescence to quantify the vessel "shadow" to assess the magnitude of frequency-locked cerebral blood flow changes. This approach enables straightforward estimation of artery diameter fluctuations in wild-type mice, employing either low magnification wide-field microscopy or deep-brain fibre photometry. For the visual stimuli, awake mice were exposed to vertically oscillating stripes at a low temporal frequency (0.25 Hz), resulting in oscillatory changes in artery diameter synchronized to the visual stimulation frequency. This phenomenon occurred not only in the primary visual cortex but also across a broad cortical and cerebellar surface. The induced CBF changes adapted to various stimulation parameters, and interestingly, repeated trials led to plastic entrainment. The authors control for different artefacts that may have confounded the measurements such as light contamination and eye movements but found no influence of these variables. The study also tested horizontally oscillating visual stimuli, which induce the horizontal optokinetic response (HOKR). The amplitude of eye movement, known to increase with repeated training sessions, showed a strong correlation with CBF entrainment magnitude in the cerebellar flocculus. The authors suggest that parallel plasticity in CBF and neuronal circuits is occurring. Overall, the study proposes that entrained "vasomotion" contributes to meeting the increased energy demand associated with coordinated neuronal activity and subsequent neuronal circuit reorganization.

      We thank the reviewer for providing a thorough summarization of our manuscript.


      • The paper describes a simple and useful method for tracking vasomotion in awake mice through an intact skull.

      • The work controls for artefacts in their primary measurements.

      • There are some interesting observations, including the nearly brain-wide synchronization of cerebral blood flow oscillations to visual stimuli and that this process only occurs after mice are trained in a visual task.

      • This topic is interesting to many in the CBF, functional imaging, and dementia fields.

      We thank the reviewer for positively recognizing the strength of the paper.


      • I have concerns with the main concepts put forward, regarding whether the authors are actually studying vasomotion as they state, as opposed to functional hyperemia which is sensory-induced changes in blood flow, which is what they are actually doing. I recommend several additional experiments/analyses for them to explore. This is mostly further characterizing their effect which will benefit the interpretations.

      We recognize that the terminology used in our paper was not explicitly explained. Traditionally, “vasomotion” is defined as the dilation and constriction of the blood vessels that occurs spontaneously at low frequencies in the 0.1 Hz range without any apparent external stimuli. Sensory-induced changes in the blood flow are usually called “hyperemia”. However, in our paper, we used the term vasomotion literally, to indicate both forms of “vascular” “motion”. Therefore, the traditional vasomotion was called “spontaneous vasomotion” and the hyperemia induced with slow oscillating visual stimuli was called “visually induced vasomotion”.

      Using our newly devised methods, we show the presence of “spontaneous vasomotion”. However, this spontaneous vasomotion was often fragmented and did not last long at a specific frequency. With visual stimuli that slowly oscillated at temporal frequencies close to the frequency of spontaneous vasomotion, oscillating hyperemia, or “visually induced vasomotion” was observed.

      • Neuronal calcium imaging would also benefit the study and improve the interpretations.

      In our paper, we mainly studied the visually induced vasomotion (or visual stimulus-triggered vasomotion). Therefore, visual stimulation must first activate the neurons and, through neurovascular coupling, the initial drive for vasomotion is likely triggered. However, visually induced vasomotion is not observed in novice animals. Therefore, the visually induced vasomotion is not a simple sensory reaction of the vasculature in response to neuronal activity in the primary visual cortex. We also do not know how the synchronized vasomotion can spread throughout the whole brain. Where the plasticity for vasomotion entrainment occurs is also unknown. To identify the extent of the neuronal contribution to the vasomotion triggering, whole brain synchronization, and vasomotion entrainment, simultaneous neuronal calcium imaging would be ideal. However, because fluorescent Ca2+ indicators expressed in neurons would also be distorted by the “shadow” effect from the vasomotion, exquisite imaging techniques would be required.

      • The plastic effects in vasomotion synchronization that occur with training are interesting but they could use an additional control for stress. Is this really a plastic effect, or is it caused by progressively decreasing stress as trials progress? I recommend a habituation control experiment.

      As also pointed out by reviewer #2, we agree that whether stress affects visually induced vasomotion could be studied. Studying the visually induced vasomotion in mice well-habituated to the experimental apparatus would give an idea of whether stress could truly be a confounding factor affecting vasomotion. On the other hand, whether acutely induced stress can interfere with the already entrained vasomotion could also be studied. In the experiments presented in Fig. 3, Texas Red was injected via the tail vein, which would be quite stressful for the mouse. However, in the trained mouse, visually induced vasomotion could be observed regardless of the stress. It is likely that stress can interfere with the acquisition of vasomotion entrainment, but the already acquired entrainment will not be canceled by acute stress induced by tail-vein injection. We agree that the relationship between stress, vasomotion, and the plasticity related to vasomotion entrainment could be investigated further.


      I think the authors have an interesting effect that requires further characterization and controls. Their interpretations are likely sound and additional experiments will continue to support the main hypothesis. If brain-wide synchrony of blood flow can be trained and entrained by external stimuli, this may have interesting therapeutic potential to help clear out toxic proteins from the brain as seen in several neurodegenerative diseases.

      We thank the reviewer for the positive evaluation of our manuscript. Strong entrainment of visually induced vasomotion was observed with a simple presentation of slowly oscillating visual stimuli for several days. This is a totally non-invasive method to train the vasomotion capacity. As the reviewer recognizes, potential benefits for the treatment of dementia and neurodegenerative diseases could be evaluated with further studies.

    2. eLife assessment

      This manuscript presents potentially valuable results indicating a plastic enhancement in the vasomotion response of pial cortical arterioles to external stimulation in awake mice using a wide range of external visual stimulation paradigms. The evidence for this interesting effect, with broad potential applications, is, however, incomplete because it is unclear whether the effect is a modulation of vasomotion rather than stimulus-driven hyperaemia and whether it is reversible. These results are relevant for scientists and clinicians interested in the regulation of blood flow in the brain.

    3. Reviewer #1 (Public Review):

      This manuscript presents the first evidence for a plastic enhancement in the response of pial cortical arterioles to external stimulation. Specifically, they show (p8; Figure 3A-C) that repeated application of a visual stimulus at 0.25 Hz, at the upper edge of the vasomotor response, leads to a greater change in the diameter of pial arterioles at that frequency. This adds to the earlier, referenced work of Mateo et al (2017) that showed locking - or entrainment of pial arteriole vasomotion - by stimuli at different (0.0 to 0.3 Hz) frequencies.

      The manuscript has a major flaw. Much as there is plasticity that leads to an increase in the amplitude of vasomotion at the drive frequency, the authors need to show reversibility. This could possibly be accomplished by driving the visual system at a different frequency, say 0.15 Hz, and observing if the 0.25 Hz response is then diminished. The authors could then test if their observation is repeatable by again driving at 0.25 Hz. Unless I missed the presentation on this point, there is no evidence for reversibility.

      Drew, P. J., A. Y. Shih, J. D. Driscoll, P. M. Knutsen, D. Davalos, P. Blinder, K. Akassoglou, P. S. Tsai, and D. Kleinfeld. 2010. 'Chronic optical access through a polished and reinforced thinned skull', Nature Methods, 7: 981-84.

      Morii, S., A. C. Ngai, and H. R. Winn. 1986. 'Reactivity of rat pial arterioles and venules to adenosine and carbon dioxide: With detailed description of the closed cranial window technique in rats', Journal of Cerebral Blood Flow & Metabolism, 6: 34-41.

    4. Reviewer #2 (Public Review):

      Sasaki et al. investigated methods to entrain vasomotion in awake wild-type mice across multiple regions of the brain using a horizontally oscillating visual pattern which induces an optokinetic response (HOKR) eye movement. They found that spontaneous vasomotion could be detected in individual vessels of their wild-type mice through either a thinned cranial window or intact skull preparation using a widefield macro-zoom microscope. They showed that low-resolution autofluorescence signals coming from the brain parenchyma could be used to capture vasomotion activity using a macro-zoom microscope or optical fibre, as this signal correlates well with the intensity profile of fluorescently-labelled single vessels. They show that vasomotion can also be entrained across the cortical surface using an oscillating visual stimulus with a range of parameters (with varying temporal frequencies, amplitudes, or spatial cycles), and that the amplitude spectrum of the detected vasomotion frequency increases with repeated training sessions. The authors include some control experiments to rule out fluorescence fluctuations being due to artifacts of eye movement or screen luminance and attempt to demonstrate some functional benefit of vasomotion entraining as HOKR performance improves after repeat training. These data add in an interesting way to the current knowledge base on vasomotion, as the authors demonstrate the ability to entrain vasomotion across multiple brain areas and show some functional significance to vasomotion with regards to information processing as HOKR task performance correlates well with vascular oscillation amplitudes.

      The aims of the paper are mostly well supported by the data, but some streamlining of the data presentation would improve overall clarity. The third aim to establish the functional significance of vasomotion in relation to plasticity in information processing could be better supported by the inclusion of some additional control experiments. Specifically:

      1) The clarity and comprehensibility of the paper could be significantly enhanced by incorporating additional details in both the introduction and discussion sections. In the introduction, a succinct definition of the frequency range of vasomotion should be provided, as well as a better description of the horizontal optokinetic response (i.e. as they have in the results section in the first paragraph below the 'Entrainment of vasomotion with visual stimuli presentation' sub-heading). The discussion would benefit from opening with a clear summary of the results, and from stronger justification (i.e. more citations) for the speculation about vasomotion and neuronal plasticity (e.g. paragraph 5 includes no citations).

      2) The novel methods for detecting vasomotion using low-resolution imaging techniques are discussed across the first four figures, but this gets a little bit confusing to follow as the authors jump back and forth between the different imaging and analysis techniques they have employed to capture vasomotion. The data presentation could be better streamlined - for instance by presenting only the methods most relevant for the functional dataset (in Figures 5-7), with the additional information regarding the various controls to establish the use of autofluorescence intensity imaging as a valid method for capturing vasomotion reduced to fewer figure panels, or moved to supplementary figures so as to not detract from the main novel findings contributed in this study.

      3) The authors rely heavily on representative traces from individual vessels to illustrate their findings, particularly in Figures 1-4. While these traces offer a valuable visualization, presenting individual data points across the entire dataset, encompassing all animals and vessels, would significantly enhance the robustness of their claims. For instance, in Figures 1 and 2, where average basal and dilated traces are depicted for a representative vessel, supplementing these with graphs showing peak values across all measured vessels would give a more complete representation of the data. Similarly, in Figure 3, where the amplitude spectrum is presented for individual Texas Red fluorescence intensity changes in V1 across novice, trained, and expert mice, a summary graph featuring the amplitude spectrum value at 0.25 Hz for each individual trace (across animals/imaging sessions), followed by statistical analysis, would strengthen their assertions. Moreover, providing explicit details on sample sizes for each individual figure panel (where not a representative trace), including the number of animals or vessels/imaging sessions, would contribute to transparency and aid readers in assessing the generalisability of the findings.

      4) In the experiments where mice are classed as "novice", "trained" or "expert", the inclusion of the specific range of the number of training sessions for each category would improve replicability.

      5) The authors don't state whether mice were habituated to the imaging set-up prior to the first data collection. This matters because head-fixation and restraint can be stress-inducing for animals, especially upon first exposure, which could impact their neurovascular coupling responses differentially in "novice" versus "trained" imaging sessions (e.g. see Han et al., 2020, DOI: https://doi.org/10.1523/JNEUROSCI.1553-20.2020). The stress associated with a tail vein injection prior to imaging could also partially explain why mice didn't learn very well if Texas Red was injected before the training session. If no habituation was conducted in these experiments, the study would benefit from the inclusion of some control experiments where "novice" responses were compared between habituated and non-habituated animals.

      6) The experiments regarding the brain-wide vasomotion entrainment across the cortical surface would benefit from some additional information about how brain regions were identified (e.g. particularly how V1 and V2 were distinguished given how close together they are).

      7) Whilst the authors show that HOKR task performance and vasomotion amplitude are increased with repeated training, which provides some support for their aim of investigating the functional significance of vasomotion with regard to information processing plasticity, the inclusion of some additional control experiments would provide stronger evidence to address this aim. For instance, if vasomotion signalling is blocked or reduced (e.g. using optogenetics or in an AD mouse model where arteriole amyloid load restricts vasomotion capacity), does flocculus-dependent task performance (e.g. HOKR eye movements) still improve with repeated exposure to the external stimulus?
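
      As a rough illustration of the per-trace summary suggested in point 3, the sketch below extracts the spectral amplitude at 0.25 Hz from each trace and reports the sample size, mean, and standard deviation. It is not the authors' analysis pipeline: the function names, the 10 Hz sampling rate, and the synthetic sine-wave traces are all assumptions made for this example.

```python
import cmath
import math
from statistics import mean, stdev


def amplitude_at(trace, fs_hz, target_hz):
    """Amplitude of one frequency component of a trace (single-bin DFT)."""
    n = len(trace)
    baseline = sum(trace) / n  # remove the DC offset before projecting
    acc = 0j
    for k, value in enumerate(trace):
        acc += (value - baseline) * cmath.exp(-2j * math.pi * target_hz * k / fs_hz)
    # 2/n rescales the DFT coefficient to the amplitude of a sine wave.
    return 2.0 * abs(acc) / n


def summarize(traces, fs_hz, target_hz=0.25):
    """One amplitude per trace, plus n, mean and SD across traces."""
    amps = [amplitude_at(t, fs_hz, target_hz) for t in traces]
    sd = stdev(amps) if len(amps) > 1 else 0.0
    return {"n": len(amps), "mean": mean(amps), "sd": sd, "values": amps}


def synthetic_trace(amplitude, fs_hz=10.0, seconds=40.0, freq_hz=0.25):
    """Hypothetical stand-in for a fluorescence trace: offset plus a sine."""
    n = int(fs_hz * seconds)
    return [5.0 + amplitude * math.sin(2 * math.pi * freq_hz * k / fs_hz)
            for k in range(n)]


# Three made-up traces standing in for different animals or sessions.
traces = [synthetic_trace(a) for a in (0.5, 1.0, 1.5)]
summary = summarize(traces, fs_hz=10.0)
print(summary["n"], round(summary["mean"], 3), round(summary["sd"], 3))
```

      Each entry of `summary["values"]` would correspond to one point on the suggested summary graph, and group comparisons (novice, trained, expert) could then be run on those values with whatever statistical test suits the design.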

    5. Reviewer #3 (Public Review):

      Summary:<br /> Here the authors show global synchronization of cerebral blood flow (CBF) induced by oscillating visual stimuli in the mouse brain. The study validates the use of endogenous autofluorescence to quantify the vessel "shadow" to assess the magnitude of frequency-locked cerebral blood flow changes. This approach enables straightforward estimation of artery diameter fluctuations in wild-type mice, employing either low magnification wide-field microscopy or deep-brain fibre photometry. For the visual stimuli, awake mice were exposed to vertically oscillating stripes at a low temporal frequency (0.25 Hz), resulting in oscillatory changes in artery diameter synchronized to the visual stimulation frequency. This phenomenon occurred not only in the primary visual cortex but also across a broad cortical and cerebellar surface. The induced CBF changes adapted to various stimulation parameters, and interestingly, repeated trials led to plastic entrainment. The authors control for different artefacts that may have confounded the measurements such as light contamination and eye movements but found no influence of these variables. The study also tested horizontally oscillating visual stimuli, which induce the horizontal optokinetic response (HOKR). The amplitude of eye movement, known to increase with repeated training sessions, showed a strong correlation with CBF entrainment magnitude in the cerebellar flocculus. The authors suggest that parallel plasticity in CBF and neuronal circuits is occurring. Overall, the study proposes that entrained "vasomotion" contributes to meeting the increased energy demand associated with coordinated neuronal activity and subsequent neuronal circuit reorganization.

      Strengths:<br /> -The paper describes a simple and useful method for tracking vasomotion in awake mice through an intact skull.<br /> -The work controls for artefacts in their primary measurements.<br /> -There are some interesting observations, including the nearly brain-wide synchronization of cerebral blood flow oscillations to visual stimuli and that this process only occurs after mice are trained in a visual task.<br /> -This topic is interesting to many in the CBF, functional imaging, and dementia fields.

      Weaknesses:<br /> -I have concerns with the main concepts put forward, specifically whether the authors are actually studying vasomotion, as they state, or functional hyperemia (sensory-induced changes in blood flow), which is what they are actually measuring. I recommend several additional experiments/analyses for them to explore. These mostly serve to further characterize their effect, which will benefit the interpretations.

      -Neuronal calcium imaging would also benefit the study and improve the interpretations.

      -The plastic effects in vasomotion synchronization that occur with training are interesting but they could use an additional control for stress. Is this really a plastic effect, or is it caused by progressively decreasing stress as trials progress? I recommend a habituation control experiment.

      Appraisal<br /> I think the authors have an interesting effect that requires further characterization and controls. Their interpretations are likely sound and additional experiments will continue to support the main hypothesis. If brain-wide synchrony of blood flow can be trained and entrained by external stimuli, this may have interesting therapeutic potential to help clear out toxic proteins from the brain as seen in several neurodegenerative diseases.

    1. Reviewer #2 (Public Review):

      Summary:<br /> Pulfer A. et al. developed a deep learning-based apoptosis detection system named ADeS, which outperforms the currently available computational tools for in vitro automatic detection. Furthermore, ADeS can automatically identify apoptotic cells in vivo in intravital microscopy time-lapses, avoiding manual labeling and its potential biases. The authors trained and successfully evaluated ADeS in packed epithelial monolayers and T cells distributed in 3D collagen hydrogels. Moreover, in vivo training and evaluation were performed on polymorphonucleated leukocytes in lymph nodes and spleen.

      Strengths:<br /> Pulfer A. and colleagues convincingly presented their results, thoroughly evaluated ADeS for potential use in toxicity assays, and compared its performance with available state-of-the-art tools.

      Weaknesses:<br /> The use of ADeS is still restricted to samples where cells are fluorescently labeled either in the cytoplasm or in the nucleus, which limits its use for in vitro toxicity assays that are performed on primary cells or organoids (e.g., iPSC-derived systems) that are normally harder to transfect.

      In conclusion, ADeS will be a useful tool to improve output quality and accelerate the evaluation of assays in several research areas with basic and applied aims.

    2. eLife assessment

      This valuable study advances our understanding of spatial-temporal cell dynamics both in vivo and in vitro. The authors provide solid evidence for their innovative deep learning-based apoptosis detection system, ADeS, which utilizes the principle of activity recognition. This work will be of broad interest to cell biologists and neuroscientists.

    3. Reviewer #1 (Public Review):

      Summary:<br /> Pulfer et al. describe the development and testing of a transformer-based deep learning architecture called ADeS, which the authors use to identify apoptotic events in cultured cells and live animals. The classifier is trained on large datasets and provides robust classification accuracies in test sets that are comparable to, and in some cases exceed, those of existing deep learning architectures for apoptosis detection. Following this validation, the authors also design use cases for their technique both in vitro and in vivo, demonstrating the value of ADeS to the apoptosis research space.


      ADeS is a powerful tool in the arsenal of cell biologists interested in the spatio-temporal co-ordinates of apoptotic events in vitro, since live cell imaging typically generates densely packed fields of view that are challenging to parse by manual inspection. The authors also integrate ADeS into the analysis of data generated using different types of fluorescent markers in a variety of cell types and imaging modalities, which makes it adaptable for use by a larger number of researchers. ADeS is an example of successful deployment of activity recognition (AR) in the automated bioimage analysis space, highlighting the potential benefits of AR for quantifying other intra- and intercellular processes observable using live cell imaging.


      A major drawback was the lack of access to the ADeS platform for the reviewers; the authors state that the code is available in the code availability section, which is missing from the current version of the manuscript. This prevented an evaluation of the usability of ADeS as a resource for other researchers. The authors also emphasize the need for label-free apoptotic cell detection in both their abstract and their introduction but have not demonstrated the performance of ADeS in a true label-free environment where the cells do not express any fluorescent markers. While Pulfer et al. provide a wealth of information about the generation and validation of their DL classifier for in vitro movies, and the utility of ADeS is obvious in identifying apoptotic events among FOVs containing ~1700 cells, the evidence is not as strong for in vivo use cases. They mention the technical challenges involved in identifying apoptotic events in vivo, and use 3D rotation to generate a larger dataset from their original acquisitions. However, it is not clear how this strategy would provide a suitable training dataset for understanding the duration of apoptotic events in vivo since the temporal information remains the same. The authors also provide examples of in vivo acquisitions in their paper, where the cell density appears to be quite low, calling into question the need for automated apoptotic detection in those situations. In the use cases for in vivo apoptotic detection using ADeS (Fig 8), it appears that the location of the apoptotic event itself was obvious and did not need ADeS, as in the case of laser ablation in the spleen and the sparse distribution of GFP labeled neutrophils in the lymph nodes. Finally, the authors also mention that video quality altered the sensitivity of ADeS in vivo (Fig 6L) but fail to provide an example of ADeS implementation on a video of poor quality, which would be useful for end users to assess whether to adopt ADeS for their own live cell movies.
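
      The concern about rotation-based augmentation can be made concrete with a toy example. The sketch below is not the ADeS code: it uses 2D frames and 90-degree rotations instead of the 3D rotations described in the manuscript, and every name and value in it is hypothetical. It shows only that spatial rotation multiplies the number of training movies while leaving the number of frames and the timing of the event untouched.

```python
def rotate_frame_90(frame):
    """Rotate one 2D frame (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*frame[::-1])]


def augment(movie):
    """Return the original movie plus its 90, 180 and 270 degree rotations."""
    variants = [movie]
    current = movie
    for _ in range(3):
        current = [rotate_frame_90(frame) for frame in current]
        variants.append(current)
    return variants


def brightest_frame(movie):
    """Index of the frame with the highest total intensity."""
    totals = [sum(sum(row) for row in frame) for frame in movie]
    return totals.index(max(totals))


# Toy movie: 4 frames of 2 x 3 pixels, with a signal that peaks in frame 2.
movie = [
    [[0, 0, 0], [0, 1, 0]],
    [[0, 0, 0], [0, 3, 0]],
    [[0, 0, 0], [0, 9, 0]],
    [[0, 0, 0], [0, 2, 0]],
]
variants = augment(movie)

# Four movies instead of one, but identical temporal structure in each.
print(len(variants), [len(v) for v in variants], [brightest_frame(v) for v in variants])
```

      Spatial diversity increases fourfold here, yet every variant has the same duration and the same peak frame, which is the sense in which the temporal information remains the same.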

    1. Author Response:

      We thank the reviewers and editor for their careful analysis of our manuscript and their appreciation of its strengths. Our plans to address the reviewers’ concerns regarding the weaknesses of the study are outlined below.

      Reviewing Editor (Public Review):

      “Weaknesses mainly concern the experiments and arguments leading to the authors' notion that Cav3 channels may partially compensate for the loss of Cav1.4 calcium currents in cone synapses. It is possible that the non-conducting Cav1.4 variant supports synapse development and the Cav3 channel then provides the calcium influx. However, in its current state, the study does not unequivocally assess Cav3 expression in wild-type cones, it lacks direct evidence of Cav3 expression and upregulation, e.g. via single cell transcriptomics, immunolabeling, or an elaboration on electrophysiology, and it does not test the authors' earlier idea that Cav1.4 might couple to intracellular calcium stores at photoreceptor synapses.”

      Current transcriptomic studies indicate that Cav3 transcripts are present at extremely low levels compared to that for Cav1.4 in cones of young mice (PMID 26000488, summarized in PMID 35650675), adult mice (PMID: 36807640), macaque (PMID 30712875), and human (PMID 31075224). Thus, it was somewhat surprising that Davison et al reported the presence of low voltage activated (LVA) Cav3-like currents with amplitudes that were ~50% of that for the Cav1 current in mouse cones at -40 mV (PMID 35803735). Using similar pharmacological criteria as Davison et al, we did not find functional evidence for a LVA current in cones of wild-type (WT) mouse retina: the Ca2+ current in our recordings was suppressed by the Cav1 antagonist isradipine (Fig 3a) but minimally affected in the expected voltage range by the Cav3 antagonist ML218 (Fig 3b). In WT mouse, voltage clamp steps from -90 mV to more depolarized voltages failed to show a transient inward current at onset (Fig 2e), which is a hallmark of LVA calcium currents. In addition, by standard physiological and pharmacological criteria, we could not identify LVA currents in cones of ground squirrel (Fig.3c,d) and macaque retina (Supp. Fig.S3). Our results argue against a significant role for LVA currents in mammalian cones.

      A problem that we discovered (as did Davison et al, their Fig.2C) was that Cav3 blockers (e.g., ML218 and Z944) have non-specific actions on the high voltage activated (HVA) Ca2+ current (presumably mediated by Cav1.4) in WT mouse cones. This is clearly shown in our Supp. figure S1a-b where ML218 causes a dose-dependent negative shift in the I-V relationship but also inhibition of current density in HEK293T cells transfected with Cav1.4. We are planning a second study to thoroughly characterize these actions of ML218 and Z944 on Cav1 channels as the results are important for understanding the actions of these drugs in cell-types with mixed populations of Cav1 and Cav3 channels.

      A second problem is that dihydropyridines (DHP) used in both our study and that of Davison et al (e.g., isradipine, nifedipine) incompletely and slowly block Cav1 channels at negative membrane potentials (PMID: 12853422). Due to the slow kinetics of DHP block, Cav1 currents in the presence of such blockers can appear to inactivate rapidly (see Fig.6A in PMID 11487617). Thus, the Cav current recorded in the presence of DHP blockers in WT mouse cones may represent unblocked Cav1.4-mediated currents that appear rapidly inactivating, and may therefore be misconstrued as being mediated by Cav3 channels.

      Given the caveats of the pharmacological approach, we agree that stronger evidence is needed to rule out a small contribution of Cav3 channels in WT mouse cones. As mentioned in our text, we have found that currently available Cav3 antibodies produce similar patterns of immunofluorescence in WT and corresponding Cav3 KO retina so analysis at the level of Cav proteins is not possible. Thus, we are planning to compare the relative expression of Cav channel genes in cones using drop-seq experiments of G369i KI and WT mouse retina. We also plan to elaborate on our electrophysiological dissection of the HVA and LVA currents.

      Among the 3 Cav3 subtypes, Cav3.2 was the only one detected in mouse cones by Davison et al using nested RT-PCR (PMID 35803735). Thus, we obtained the Cav3.2 mouse strain from JAX (B6;129-Cacna1htm1Kcam/J) and generated a Cav3.2 KO/G369i KI double mutant mouse strain. If the Cav3 current that appears in the G369i KI cones is mediated by Cav3.2, then it should be undetectable in cones of the double mutant mice. Moreover, if these Cav3.2 channels contribute to the residual cone synaptic responses in G369i KI mice, then the double mutant mice should be deficient in this regard. We will test these predictions in patch clamp recordings and ERGs.

      Finally, we will conduct Ca2+ imaging experiments in cone terminals of the WT vs G369i KI mice to test whether increased coupling of Cav channels to intracellular Ca2+ release may be involved in cone synaptic responses of the G369i KI mice.

      Reviewer #1 (Public Review):


      “The major criticism that I have of the study is that it infers Ca channel molecular composition based solely on pharmacological analysis, which, as the authors note, is confounded by the cross-reactivity of many of the "specific" channel-type antagonists. The authors note that Cav3 mRNAs have been found in cones, but here, they do not perform any analysis to examine Cav3 transcript expression after G369i-KI nor do they examine Ca channel transcript expression in monkey or squirrel cones, which serve as controls of sorts for the G369i-KI (i.e. like WT mouse cones, cones of these other species do not seem to exhibit LVA Ca currents).”

      Actually, we also used non-pharmacological (i.e., electrophysiological) criteria to back up our interpretation that Cav3 channels contribute to the Cav current in cones primarily in the absence of functional Cav1.4 channels. For example, in Fig.2, we show that the Ca2+ current in G369i KI and Cav1.4 KO mice exhibit the hallmarks of the Cav3 channel (negative activation and inactivation voltages and window current, rapid inactivation), which are quite distinct from the Ca2+ currents in WT cones. In recordings of ground squirrel and macaque cones (Supp.Figs.S2-3), negative holding voltages do not unmask a LVA current according to various criteria. In addition to the transcriptomic approaches described above, we plan to elaborate on the electrophysiological evidence for the absence of a LVA current in WT mouse cones as part of the revision.

      “Secondarily, in Maddox et al. 2020, the authors raise the possibility that G369i-KI, by virtue of having a functional voltage-sensing domain, might couple to intracellular Ca2+ stores, and it seems appropriate that this possibility be considered experimentally here.”

      We will conduct Ca2+ imaging experiments in cone terminals of the WT vs G369i KI mice to test whether increased coupling of Cav channels to intracellular Ca2+ release may be involved in cone synaptic responses of the G369i KI mice.

      “As a minor point: the authors might wish to note, in comparison to another retinal ribbon synapse, that Zhang et al. 2022 (in J. Neuroscience), in a study of mouse rod bipolar cells, found a number of LVA and HVA Ca conductances in addition to the typical L-type conductance mediated by Cav1-containing channels.”

      We are aware of the extensive evidence for the expression of Cav3 channels in retinal bipolar cells (PMID 11604141, 22909426, 19275782, 35896423) and our recordings of cone bipolar cells in ground squirrel confirm this (Supp. Fig.S2D). We could add a reference to this work in our revision.

      Reviewer #2 (Public Review):


      “The major critiques are related to the description of the Cav1.4 knock-in mouse as "sparing" function, which can be remedied in part by a simple rewrite, and in certain places, the data may need to be examined more critically. In particular, the authors should address features in the data presented in Figures 6 and 7 that seem to indicate that the retina of the Cav1.4 knock-in is not intact, but the interpretation given by the authors as "intact" is not appropriate and made without rigorous statistical testing.”

      We intended to use “sparing” and “intact” to indicate that cone synapses are present and to some extent functional, in contrast to their complete absence in the Cav1.4 KO mouse. However, we recognize this may be misinterpreted as “normal”. As suggested by the reviewer, we will revise our statistical analyses and text to clarify that cone synaptic responses do indeed differ significantly in G369i KI as compared to WT mice. We feel that this will be a strong addition to the study and will emphasize the key point that Cav3 cannot fully compensate for loss of Cav1.4 with respect to cone synapse structure and function.

      Reviewer #3 (Public Review):


      “The study has been expertly performed but remains descriptive without deciphering the underlying molecular mechanisms of the observed phenomena, including the proposed homeostatic switch of synaptic calcium channels. Furthermore, a relevant part of the data in the present paper (presence of T-type calcium channels in cone photoreceptors) has already been identified/presented by previous studies of different groups (Macosko et al., 2015; pmid 26000488; Davison et al., 2021; pmid 35803735; Williams et al., 2022; pmid 35650675). The degree of novelty of the present paper thus appears limited.”

      We respectfully disagree that our paper lacks novelty. As indicated by Reviewer 2, a major advance of our study is in providing a mechanism that can explain the longstanding conundrum that congenital stationary night blindness type 2 mutations that would be expected to severely compromise Cav1.4 function do not produce complete blindness. We also disagree that the presence of T-type channels in cone photoreceptors has been unequivocally demonstrated, as the non-biased transcriptomic approaches show very little Cav3 transcript expression in mouse cones (PMIDs 26000488, 35650675, 36807640), macaque cones (PMID 30712875), and human cones (PMID 31075224). Transcription may not equate to translation, particularly at low expression levels. We also note that the one study to date that suggests a functional contribution of Cav3 channels in mouse cones (Davison et al., 2021; pmid 35803735) used a DHP to isolate the “LVA” current, which is problematic as described above. Our demonstration of minimal or undetectable Cav3-type currents in mammalian cones using physiological and pharmacological approaches, while a negative result, adds important context to the recent literature. As described in our response to the editor’s review, our planned revisions include testing whether Cav3 transcripts are upregulated in G369i KI cones and whether the Cav3.2 subtype suggested to be present in cones (PMID 35803735) contributes to Cav currents in these cells using Cav3.2 KO and Cav3.2 KO/G369i KI double mutant mice.

    1. eLife assessment

      This useful study tackles the well-established overflow metabolism issue by applying a coarse-grained metabolic flux model to predict how individual cells execute various energy strategies, such as respiration versus fermentation. While the model's population average is convincing enough to align with experimental observations on overflow metabolism, the overall assertions to enhance our comprehension of this biological phenomenon are incomplete.

    2. Reviewer #1 (Public Review):

      Summary:<br /> Cell metabolism exhibits a well-known behavior in fast-growing cells, which employ seemingly wasteful fermentation to generate energy even in the presence of sufficient environmental oxygen. This phenomenon is known as Overflow Metabolism or the Warburg effect in cancer. It is present in a wide range of organisms, from bacteria and fungi to mammalian cells.

      In this work, starting with a metabolic network for Escherichia coli based on sets of carbon sources, and using a corresponding coarse-grained model, the author applies some well-founded approximations from the literature and algebraic manipulations. These are used to successfully explain the origins of Overflow Metabolism, both qualitatively and quantitatively, by comparing the results with E. coli experimental data.

      By modeling the proteome energy efficiencies for respiration and fermentation, the study shows that these parameters are dependent on the carbon source quality constants K_i (p.115 and 116). It is demonstrated that as the environment becomes richer, the optimal solution for proteome energy efficiency shifts from respiration to fermentation. This shift occurs at a critical parameter value K_A(C).

      This counterintuitive result qualitatively explains Overflow Metabolism.

      Quantitative agreement is achieved through the analysis of the heterogeneity of the metabolic status within a cell population. By introducing heterogeneity, the critical growth rate is assumed to follow a Gaussian distribution over the cell population, resulting in accordance with experimental data for E. coli. Overflow metabolism is explained by considering optimal protein allocation and cell heterogeneity.
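
      The population-averaging idea can be sketched in a few lines. This is a simplified reading of the summary above, not the author's equations: it assumes that each cell switches from respiration to fermentation as soon as the growth rate exceeds that cell's own critical growth rate, and the mean and standard deviation used below are illustrative placeholders. Under those assumptions, the fermenting fraction of the population is the Gaussian cumulative distribution evaluated at the growth rate.

```python
import math


def fermenting_fraction(growth_rate, mean_critical, sd_critical):
    """Fraction of cells whose critical growth rate lies below growth_rate,
    assuming critical growth rates are Gaussian across the population."""
    z = (growth_rate - mean_critical) / (sd_critical * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


# Illustrative placeholder parameters (per hour); not fitted to any data.
MEAN_CRITICAL = 0.8
SD_CRITICAL = 0.1

for lam in (0.4, 0.7, 0.8, 0.9, 1.2):
    frac = fermenting_fraction(lam, MEAN_CRITICAL, SD_CRITICAL)
    print(f"growth rate {lam:.1f}: fermenting fraction {frac:.3f}")
```

      Although each cell makes an all-or-none choice in this sketch, the population-level curve rises smoothly from near 0 to near 1 around the mean critical growth rate, which is how heterogeneity can turn a sharp single-cell switch into a gradual onset at the population level.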

      The obtained model is extensively tested through perturbations: 1) Introduction of overexpression of useless proteins; 2) Studying energy dissipation; 3) Analysis of the impact of translation inhibition with different sub-lethal doses of chloramphenicol on Escherichia coli; 4) Alteration of nutrient categories of carbon sources using pyruvate. All model perturbation results are corroborated by E. coli experimental results.

      Strengths:<br /> In this work, the author employs modeling methods typical of Physics to address a problem in Biology, standing at the interface between these two scientific fields. This interdisciplinary approach proves to be highly fruitful and should be further explored in the literature. The use of Escherichia coli as an example ensures that all hypotheses and approximations in this study are well-founded in the literature. Examples include the approximation for the Michaelis-Menten equation (line 82), Eq. S1, proteome partition in Appendix 1.1 (lines 68-69), and a stable nutrient environment in Appendix 1.1 (lines 83-84). The section "Testing the model through perturbation" heavily relies on bacterial data. The construction of the model and its agreement with experimental data are convincingly presented.

      Weaknesses:<br /> In Section Appendix 6.4, the author explores the generalization of results from bacteria to cancer cells, adapting the metabolic network and coarse-grained model accordingly. It is argued that as a consequence, all subsequent steps become immediately valid. However, I remain unconvinced, considering the numerous approximations used to derive the equations, which the literature demonstrates to be valid primarily for bacteria. A more detailed discussion about this generalization is recommended. Additionally, it is crucial to note that the experimental validation of model perturbations heavily relies on E. coli data.

    3. Reviewer #2 (Public Review):

      Summary<br /> This paper has three parts. The first part applied a coarse-grained model with proteome partition to calculate cell growth under respiration and fermentation modes. The second part considered single-cell variability and performed population averaging to obtain an ensemble metabolic profile for acetate fermentation. The third part compared model and simulation results with experimental data from the literature and found substantial consistency.

      Strengths and major contributions<br /> (i) The coarse-grained model considered specific metabolite groups and their inter-relations and obtained an analytical solution for this scenario. The "resolution" of this model lies between Flux Balance Analysis/whole-cell simulation and proteome partition analysis.

      (ii) The author considered single-cell level metabolic heterogeneity and calculated the ensemble average with explicit calculation. The results are consistent with known fermentation and growth phenomena qualitatively and can be quantitatively compared to experimental results.

      Weaknesses<br /> (i) If I am reading this paper correctly, the author's model predicts binary (or "digital") outcomes of single-cell metabolism, that is, after growth rate optimization, each cell will adopt either "respiration mode" or "fermentation mode" (as illustrated in Appendix - Figure 1C, D). Due to variability in enzyme activity k_i^{cat} and critical growth rate λ_C, each cell under the same nutrient condition could have either respiration or fermentation, but the choice is binary.

      The binary choice at the single-cell level is inconsistent with our current understanding of metabolism. If a cell only uses fermentation mode (as shown in Appendix - Figure 1C), it could generate enough energy but not be able to have enough metabolic fluxes to feed into the TCA cycle. That is, under pure fermentation mode, the cell cannot expand the pool of TCA cycle metabolites and hence cannot grow.

      This caveat also appears in the model in Appendix (S25) that assumes J_E = r_E*J_{BM} where r_E is a constant. From my understanding, r_E can be different between respiration and fermentation modes (at least for real cells) and hence it is inappropriate to conclude that cells using fermentation, which generates enough energy, can also generate a balanced biomass.

      (ii) A minor weakness of this model is that it assumes a priori that each cell chooses its metabolic strategy based on energy efficiency. This is an interesting assumption but there is no known biochemical pathway that directly executes this mechanism. In evolution, growth rate is more frequently considered for metabolic optimization. In Flux Balance Analysis, one could have multiple objective functions including biomass synthesis, energy generation, entropy production, etc. Therefore, the author would need to justify this assumption and propose a reasonable biochemical mechanism for cells to sense and regulate their energy efficiency.

      My feeling is that the mathematical structure of this model could be correct, but the single-cell interpretation for the ensemble averaging has issues. Each cell could potentially adopt partial respiration and partial fermentation at the same time and have temporal variability in its metabolic mode as well. With the modification of the optimization scheme, the author could have a revised model that avoids the caveat mentioned above.
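
      The ensemble-averaging picture discussed above can be illustrated with a minimal sketch. It is not the author's model: the function name `fermenting_fraction`, the Gaussian distribution of the critical growth rate λ_C, and the numerical values (mean 0.75, standard deviation 0.1, arbitrary units) are all assumptions chosen only for illustration. Each simulated cell makes a strictly binary choice, yet the population-level fermenting fraction changes smoothly with growth rate.

```python
import random


def fermenting_fraction(growth_rate, n_cells=10000, mean_lc=0.75, sd_lc=0.1, seed=0):
    """Fraction of cells in "fermentation mode" at a given growth rate.

    Hypothetical toy model: each cell draws its own critical growth rate
    lambda_C from a Gaussian (assumed distribution and parameters) and
    ferments only if the growth rate exceeds that threshold.
    """
    rng = random.Random(seed)  # fixed seed so every call uses the same cell population
    fermenting = 0
    for _ in range(n_cells):
        lambda_c = rng.gauss(mean_lc, sd_lc)  # per-cell threshold
        if growth_rate > lambda_c:            # binary ("digital") single-cell choice
            fermenting += 1
    return fermenting / n_cells


if __name__ == "__main__":
    # Sweep the growth rate to see the smooth ensemble-level transition.
    for rate in (0.3, 0.6, 0.75, 0.9, 1.2):
        print(rate, fermenting_fraction(rate))
```

      A model in which each cell mixes respiration and fermentation, as suggested above, would replace the binary test with a continuous per-cell fraction; the ensemble average alone may not distinguish the two interpretations.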

      Discussion and impact for the field<br /> Proteome partition models and Flux Balance Analysis are both commonly used mathematical models that emphasize different parts of cellular physiology. This paper has ingredients of both, and I expect that after revision it will bridge our understanding of the whole cell.

    4. Reviewer #3 (Public Review):

      Summary:<br /> In the manuscript "Overflow metabolism originates from growth optimization and cell heterogeneity" the author Xin Wang investigates the hypothesis that the transition into overflow metabolism at large growth rates actually results from an inhomogeneous cell population, in which every individual cell either performs respiration or fermentation.

      Weaknesses:<br /> The paper has several major flaws. First, and most importantly, it repeatedly and wrongly claims that the origins of overflow metabolism are not known. The paper is written as if it is the first to study overflow metabolism and provide a sound explanation for the experimental observations. This is obviously not true and the author actually cites many papers in which explanations of overflow metabolism are suggested (see e.g. Basan et al. 2015, which even has the title "Overflow metabolism in E. coli results from efficient proteome allocation"). The paper should be rewritten in a more modest and scientific style, not attempting to make claims of novelty that are not supported. In fact, all hypotheses in this paper are old. Also, the possibility that cell heterogeneity explains the observed 'smooth' transition into overflow metabolism has been extensively investigated previously (see de Groot et al. 2023, PNAS, "Effective bet-hedging through growth rate dependent stability") and the random drawing of kcat-values is an established technique (Beg et al., 2007, PNAS, "Intracellular crowding defines the mode and sequence of substrate uptake by Escherichia coli and constrains its metabolic activity"). Thus, in terms of novelty, this paper is very limited. It reinvents the wheel and it is written as if decades of literature debating overflow metabolism did not exist.

      Moreover, the manuscript is not clearly written and is hard to understand. Variables are not properly introduced: the M-pools need to be discussed, and fluxes (J_E), "energy coefficients" (eta_E), etc. need to be more explicitly explained. What is "flux balance at each intermediate node"? How is the "proteome efficiency" of a pathway defined? The paper continues to speak of energy production. This should be avoided. Energy is conserved (1st law of thermodynamics) and can never be produced. A scientific paper should strive for scientific correctness, including precise choice of words.

      The statement that the "energy production rate ... is proportional to the growth rate" is, apart from being incorrect (it should be 'ATP consumption rate' or similar; see above), a non-trivial claim. Why should this be the case? Such statements must be supported by references. The observation that the catabolic power indeed appears to increase linearly with growth rate was made, based on chemostat data for E. coli and yeast, in a recent preprint (Ebenhöh et al, 2023, bioRxiv, "Microbial pathway thermodynamics: structural models unveil anabolic and catabolic processes").

      All this criticism does not preclude the possibility that cell heterogeneity plays a role in overflow metabolism. However, according to Occam's razor, the simpler explanations should first be explored and refuted before coming up with a more complex solution. Here, this means that the author should first argue why simpler explanations (e.g. the 'Membrane Real Estate Hypothesis', Szenk et al., 2017, Cell Systems; maximal Gibbs free energy dissipation, Niebel et al., 2019, Nature Metabolism; Saadat et al., 2020, Entropy) are not considered, or in what way they are in disagreement with observations, and then provide some evidence of the proposed cell heterogeneity (are there single-cell transcriptomic data supporting the claim?).

    1. eLife assessment

      This study presents a valuable finding on the relationship between neuronal dynamics in the thalamus and brain state modulation. The evidence supporting the claims of the authors is incomplete, as additional analyses and better presentation of the data are needed to support the specific role of the mediodorsal nucleus in this phenomenon. The work will be of interest to systems neuroscientists interested in brain dynamics and behavioural states.

    2. Reviewer #1 (Public Review):

      Summary:<br /> This is an interesting and valuable study that uses multiple approaches to understand the role of bursting involving voltage-gated calcium channels within the mediodorsal thalamus in the sedative-hypnotic effects of alcohol. Given its unique functional roles and connectivity pattern, the idea that the mediodorsal thalamus may have a fundamental role in regulating alcohol-induced transitions in consciousness state would be both important for researchers investigating thalamocortical dynamics and more broadly interesting for understanding brain function. In addition, the authors' examination of the role of the voltage-gated calcium channel Cav3.1 provides some evidence that burst-firing mediated by this channel in the thalamus is functionally important for behavioral-state transitions. While many previous studies have suggested an analogous role for sleep-state regulation, the evidence for an analogous role of this type of bursting in sedative-induced transitions is more limited. Despite the importance of these results, however, there is some concern that the manipulations and recording approaches employed by the authors may affect other thalamic nuclei adjacent to the MD, such as the central lateral nucleus, which has also been implicated in controlling state transitions. The evidence for a specific role of the mediodorsal thalamus is therefore somewhat incomplete, and so additional validation is needed.

      Strengths:<br /> This study employs multiple, complementary research approaches including behavioral assays, sh-RNA-based localized knockdown, single-unit recordings, and patterned optogenetic interventions to examine the role of activity in the mediodorsal thalamus in the sedative-hypnotic effects of alcohol. Experiments and analyses included in the manuscript generally appear well conceived and are also generally well executed. Sample sizes are sufficiently large and statistical analysis appears generally appropriate though in some cases additional quantification would be helpful. The findings presented are novel and provide some interesting insight into the role of the thalamus as well as voltage-gated calcium channels within this region in controlling behavioral state transitions induced by alcohol. In particular, the observed effects of selective knockout along with recordings in total knockout of the voltage-gated calcium channel, Cav3.1, which has previously been implicated in bursting dynamics as well as state transitions, particularly in sleep, together suggest that the transition of thalamic neurons to a bursting pattern of firing from a more constant firing is important for transition to the sedated state produced by ethanol intoxication. While previous studies have similarly implicated Cav3.1 bursting in behavioral state transitions, the direct optogenetic interventions and single-unit recordings provide valuable new insight. These findings may also have interesting implications for the relationship between sleep process disruption associated with ethanol dependence, although the authors do not appear to examine this directly or extensively discuss these implications of their findings.

      Weaknesses:<br /> A key claim of the study is that the mediodorsal thalamus is specifically important for the sedative-hypnotic effect of ethanol and that a transition to a bursting pattern of firing in this circuit facilitates these effects due to a loss of a more constant tonic firing pattern. Despite the generally clear observed effects across the included experiments, however, the evidence presented does not fully support that the mediodorsal thalamus, in particular, is involved. This distinction is important because some previous studies have suggested that another thalamic nucleus very close to the mediodorsal thalamus, the central-lateral thalamus, plays a role in preventing sedative-induced transitions. Despite its proximity to the mediodorsal thalamus, the central-lateral thalamus has a substantially different pattern of connectivity, so distinguishing which region is impacted is important for understanding the findings in the manuscript. While sh-RNA knockdown appears to be largely centered in the mediodorsal thalamus in the example shown (Figure 2), this is rather minimal evidence and it is also not well explained (indeed, the relevant panels do not even appear to be referenced in the text of the manuscript), and the consistency of the knockdown targeting is not quantified. Additional evidence should be provided to validate this approach. Similarly, while an example is shown for the expression of ChR2 (Fig. 5), there seems to be some spread of expression outside of the mediodorsal thalamus even in this example, raising a concern about how regionally specific this effect is.

      The recordings targeting the mediodorsal thalamus could provide evidence of a direct association between changes in activity specifically in this part of the thalamus with the behavioral measures but there are currently some issues with making this link. One difficulty is that, although lesions are shown in Figure S5 to validate recording locations, this figure is relatively unclear and the examples appear to be taken from a different anterior/posterior location compared to the reference diagram. A larger image and improved visualization of the overall set of lesion locations that includes multiple anterior/posterior coronal sections would be helpful. Moreover, even for these example images, it is difficult to evaluate whether these are in the mediodorsal thalamus, particularly given the small size of the image shown. Ideally, an example image that is more obviously in the mediodorsal thalamus would also be included. Finally, an assessment of the relationship between the approximate locations of recorded neurons across the tetrode arrays and the behavioral measures would be very helpful in supporting the unique role of the mediodorsal thalamus. The lack of these direct links, in combination with the histological issues, reduces the insight that can be gained from this study.

      In addition to the key experimental issues mentioned above, there are often problems in the text of the manuscript with reasoning or at least explanation, as well as numerous minor issues with editing. The most substantial such issue is the lack of clarity in discussing the mediodorsal thalamus and other adjacent thalamic nuclei, such as the central-lateral nucleus, in the authors' discussion of previous findings. Given that at least one of the manuscripts cited by the authors (Saalman, Front. Sys. Neuro. 2014) has directly claimed that the central-lateral, rather than the mediodorsal, thalamus is important for arousal regulation related to a conscious state, this distinction should be addressed clearly in the discussion rather than papered over by grouping multiple thalamic nuclei as being medial. As part of this discussion, it would be important to consider additional relevant literature including Bastos et al., eLife, 2021 and Redinbaugh et al., Neuron, 2020, which are quite critical but currently do not appear to be cited. Considering additional literature relevant to the function of the mediodorsal thalamus would also be beneficial.<br /> While the methods employed generally seem sound, the description in the methods section is lacking in detail and is often difficult to follow. Analysis methods such as the burst index appear to only be given a brief explanation in the text and appear not to be mentioned in the methods section. Similarly, the staining method used in Figure 2 does not appear to be described in the methods section. The most substantial case is the UMAP approach used in Figure 4-E, which does not appear to be described in the methods or even in the main text. The lack of detailed descriptions makes it difficult to evaluate the applicability and quality of the experimental and analytical approaches. Citations justifying the use of methods such as the approach to separate regular spiking and narrow spiking neuron subtypes are also needed.

      Beyond the problems with content and reasoning discussed above, there are also some relatively minor issues with the clarity of writing throughout the paper. For example, in the abstract the authors refer to "the ethanol resistance behavior in WT mice", but it is difficult to parse what they mean by this statement. Similarly, the next sentence, "These results support that the maintenance...", while clearer, is not well phrased. Though individually minor, issues like this recur throughout the manuscript and sometimes make it difficult to follow, so the text should be revised to correct them. There are also some problems with labels, such as the labels of A1/A2 in Figure 4, which appear to be incorrect. Also, S7 has no label on the B panels. Finally, some references are not included (only a label of [ref]).

    3. Reviewer #2 (Public Review):

      In the current study, Latchoumane and collaborators focus on the Cav3.1 calcium channels in the mediodorsal thalamic nucleus as critical players in the regulation of brain-states and ethanol resistance in mice. By combining behavioural, electrophysiological, and genetic techniques, they report three main findings. First, KO Cav3.1 mice exhibit resistance to ethanol-induced sedation and sustained tonic firing in thalamocortical units. Second, knocked-down Cav3.1 mice reproduce the same behaviour when the mediodorsal, but not the ventrobasal, thalamic nucleus is targeted. Third, either optogenetic or electric stimulation of the mediodorsal thalamus reduces ethanol-induced sedation in control animals.

      Overall, the study is well designed and performed, correctly controlled for confounds, and properly analysed. Nonetheless, it is important to address some aspects of the report. The results support the conclusions of the study. These results are likely to be relevant in the field of systems neuroscience, as they increase the molecular evidence showing how the thalamus regulates brain states.

    1. eLife assessment

      The manuscript by Agha et al. provides a fundamental understanding regarding the participation of V2a interneurons in generating and patterning the locomotor rhythm. The authors provide convincing and solid evidence regarding the heterogeneity of V2a neurons in their intrinsic and synaptic properties and how these shape their outputs. The manuscript could be much improved by the inclusion of statistical analysis of some of the key data currently presented qualitatively.

    2. Reviewer #1 (Public Review):

      Summary:<br /> In this very interesting study, Agha and colleagues show that two types of Chx10-positive neurons (V2a neurons) have different anatomical and electrophysiological properties and receive distinct patterns of excitatory and inhibitory inputs as a function of speed during fictive swimming in the larval zebrafish. Using single-cell fills they show that one cell type has a descending axon ("descending V2as"), while the other cell type has both a descending axon and an ascending axon ("bifurcating V2as"). In the Chx10:GFP line, descending V2as display strong GFP labeling, while bifurcating V2as display weak GFP labeling. The bifurcating V2as are located more laterally in the spinal cord. These two cell types have different electrophysiological properties as revealed by patch-clamp recordings. Positive current steps indicated that descending V2as comprise tonic spiking or bursting neurons. Bifurcating V2as comprise chattering or bursting neurons. The two types of V2a neurons display different recruitment patterns as a function of speed. Descending tonic and bifurcating chattering neurons are recruited at the beginning of the swimming bout, at fast speeds (swimming frequency above 30 Hz). Descending bursting neurons were preferentially recruited at the end of swimming bouts, at low speeds (swimming frequency below 30 Hz), while bifurcating bursting neurons were recruited for a broader swimming frequency range. The two types of V2a neurons receive distinct patterns of excitatory and inhibitory inputs during fictive locomotion. In descending V2as, when speed increases: i) excitatory conductances increase in fast neurons and decrease in slow neurons; ii) inhibitory conductances increase in fast neurons and increase in slow neurons. In bifurcating V2as, when speed increases: i) excitatory conductances increase in fast neurons but do not change in slow neurons; ii) inhibitory conductances increase in fast neurons and do not change in slow neurons. 
The timing of excitatory and inhibitory inputs was then studied. In descending V2as, fast neurons receive excitatory and inhibitory inputs that are in anti-phase with low contrast in amplitude and are both broadly distributed over the phase. The slow neurons receive two peaks of inhibition, one in anti-phase with the excitatory inputs and another just after the excitation. In bifurcating V2as, fast neurons receive two peaks of inhibition, while slow ones receive anti-phase inhibition.

      Strengths:<br /> This study focuses on the diversity of V2a neurons in zebrafish, an interesting cell population playing important roles in locomotor control and beyond, from fish to mammals. The authors provide compelling evidence that two subtypes of V2as show distinct anatomical, electrophysiological, and speed-dependent spiking activity, and receive distinct synaptic inputs as a function of speed. This opens the door to future investigation of the inputs and outputs of these neurons. Finding ways to activate or inhibit specifically these cells would be very helpful in the years to come.

      Weaknesses:<br /> No major weakness was detected. The experiments were carefully done, and the data were of high quality.

    3. Reviewer #2 (Public Review):

      Summary:<br /> Animals exhibit different speeds of locomotion. In vertebrates, this is thought to be implemented by different groups of spinal interneurons and motor neurons. A fundamental assumption in the field has been that neural mechanisms that generate and sustain the rhythm at different locomotor speeds are the same. In this study, the authors challenge this view. Using rigorous in vivo electrophysiology during fictive locomotion combined with genetics, the authors provide a detailed analysis of cellular and synaptic properties of different subtypes of spinal V2a neurons that play a crucial role in rhythm generation. Importantly, they are able to show that speed-related subsets of V2a neurons have distinct cellular and synaptic properties and may utilize different mechanisms to implement different locomotor speeds.

      Strengths:<br /> The authors fully utilize the zebrafish model system and solid electrophysiological analyses to study the active and passive properties of speed-related V2a subsets. Identification of the V2a subtype is based directly on their recruitment at different locomotor speeds and not on indirect markers like soma size, D-V position, etc. Throughout the article, the authors have cleverly used standard electrophysiological tests and analysis to tease out different neuronal properties and link them to natural activity. For example, in Figures 2 and 4, the authors compare V2a spiking during current steps and during fictive swims, showing that spike rates measured with current steps are physiologically relevant and observed during natural recruitment. The experiments done are rigorous and well-controlled.

      Weaknesses:<br /> The authors claim that a primary result of their study is that reciprocal inhibition is important for rhythmogenesis at fast speeds while recurrent inhibition is key at slow speeds. This is shown in Figure 6; however, the authors do not show any statistical tests for this claim. The authors also do not show any conclusive evidence that reciprocal inhibition is required for rhythmogenesis at fast speeds and vice versa for slow speeds. Additional experiments or modeling studies that conclusively show the necessity of these different inhibitory sources to the generation of different rhythms would be needed to strengthen this claim.

      The authors do a great job of teasing out cellular and synaptic properties in the different V2a subsets; however, it is not clear if or how these match the final output. For example, V2aD neurons are tonic or bursting for fast and slow speeds, respectively, but it is not intuitive how these cellular properties would influence the phasic excitation and inhibition these neurons receive.

      It is not clear from the discussion why having different mechanisms of rhythm generation at different speeds could be an important circuit design. The authors use anguilliform and carangiform modes of swimming to denote fast and slow speeds but there are differences in these movements other than speed, like rostrocaudal coordination. The frequency and pattern of these movements are linked and warrant more discussion.

    4. Reviewer #3 (Public Review):

      The manuscript by Agha et al. explores mechanisms of rhythmicity in V2a neurons in larval zebrafish. Two subpopulations of V2a neurons are distinguishable by anatomy, connectivity, level of GFP, and speed-dependent recruitment properties consistent with V2a neurons involved in rhythm generation and pattern formation. The descending neurons proposed to be consistent with rhythm-generating neurons are active during either slow or fast locomotion, and their firing frequencies during current steps are well matched with the swim frequency at which they fire. The bifurcating (patterning) neurons are active during a broader swim frequency range unrelated to their firing during current steps. All of the V2a neurons receive strong inhibitory input, but the phasing of this input is based on neuronal type and swim speed when the neuron is active, with prominent in-phase inhibition in slow descending V2a neurons and bifurcating V2a neurons active during fast swimming. Antiphase inhibition is observed in all V2a neurons, but it is the main source of rhythmic inhibition in fast descending V2a neurons and bifurcating neurons active during slow swimming. The authors suggest that properties supporting rhythmic bursting are not directly related to locomotor speed but rather to functional neuronal subtypes.

      This is a well-written paper with many strengths including the rigorous approach. Many parameters, including projection pattern, intracellular properties, inhibition received, and activity during slow/fast swimming were obtained from the same neuron. This links up very well with prior data from the lab on cell position, birth order, morphology/projections, and control of MN recruitment to provide a comprehensive overview of the functioning of V2a interneuronal populations in the larval zebrafish. The overall conclusions are well supported by the data. Weaknesses are relatively minor and were largely related to terminology for some of the secondary conclusions.

      1. The assumption is made that all in-phase inhibition is recurrent and out-of-phase inhibition is reciprocal. The latter is likely true, but the definition of recurrent may be a bit loose, as it could be multisegmental feed-forward inhibition as well.

      2. In a few places, it is mentioned that the properties of the V2a-D neurons are consistent with pacemakers. This could be true of both the V2a-D and -B neurons that burst in response to depolarizing steps but the properties of the remaining (fast) V2a-D neurons do not seem to be consistent with pacemakers, based on the properties shown. Tonic firing at a frequency related to the locomotor speed the neuron is active during and strong antiphase inhibition may instead suggest a stronger network component driving the rhythmicity.

    1. eLife assessment

      This study presents an important set of results illuminating how movement sequences are planned. Using several different behavioural manipulations and analysis methods, the authors present compelling evidence that multiple future movements are planned simultaneously with execution, and that these future movement plans influence each other. The work will be of great interest to those studying motor control.

    2. Reviewer #1 (Public Review):

      Mehrdad Kashefi et al. investigated whether future reaches can be planned while simultaneously controlling the execution of the current reach. Through a series of experiments employing a novel sequential arm reaching paradigm they developed, the authors made several findings: 1) participants demonstrate the capability to plan future reaches in advance, thereby accelerating the execution of the reaching sequence, 2) planning processes for future movements are not independent of one another, but neither do they form a single chunk, 3) interaction among these planning processes optimizes the current movement for the movement that comes after it.

      The question of this paper is very interesting, and the conclusions of this paper are well supported by data. However, certain aspects require further clarification and expansion.

      1) The question of this study is whether future reach plans are available during an ongoing reach. In the abstract, the authors summarized that "participants plan at least two future reaches simultaneously with an ongoing reach and that the planning processes of the two future reaches are not independent of one another" and showed the evidence in the next sentences. However, the evidence concerns the relationship between the ongoing reach and future plans, not the relationship between future plans (Line 52-55), whereas the last sentence (Line 55-58) mentions interactions between future plans only. There are some discrepancies between sentences. Could you make the abstract clearer by mentioning interference 1) between the ongoing movement and future plans and 2) between future plans?<br /> 2) I understood from the results of the first experiment (Figure 2) that the ongoing reach and future reaches are not independent. In Horizon 1, only a target for the current reach is shown, whereas in Horizon 2, a current and a future target are shown on the screen. The inter-reach interval was significantly reduced from H1 to H2 (Figure 2). The authors state that "these results suggest that participants can plan two targets (I guess +1 and +2) ahead of the current reach (I guess +0)". But I think these results suggest that participants can plan one target (+1) ahead of the current reach (+0), because participants could see the current (+0) and a future target (+1) in H2. Could the authors please clarify this point?<br /> 3) Movement correction for a jump of the +1 target takes longer in H3 compared to H2 (Figure 4). Does this perturbation have any effect on reaching for the +2 target? If the +1 jump doesn't affect reaching for the +2 target, then, combined with the result that a jump of the +2 target didn't affect the movement time to the +1 target (Figure 3C), the perturbation (target jump) only affects the movement directly perturbed. Is this interpretation correct? If so, do these results argue against the idea that future reaches are planned as a motor chunk? I would like to know the authors' thoughts about this.<br /> 4) Is there any discussion of saccade position (Figure 7)?

    3. Reviewer #2 (Public Review):

      Summary:<br /> In this work, Kashefi et al. investigate the planning of sequential reaching movements and how the additional information about future reaches affects planning and execution. This study, carried out with human subjects, extends a body of research in sequential movements to ask important questions: How many future reaches can you plan in advance? And how do those future plans interact with each other?

      The authors designed several experiments to address these questions, finding that information about future targets makes reaches more efficient in both timing and path curvature. Further, with some clever target jump manipulations, the authors show that plans for a distant future reach can influence plans for a near future reach, suggesting that the planning for multiple future reaches is not independent. Lastly, the authors show that information about future targets is acquired parafoveally--that is, subjects tend to fixate mainly on the target they are about to reach to, acquiring future target information by paying attention to targets outside the fixation point.

      The study opens up exciting questions about how this kind of multi-target planning is implemented in the brain. As the authors note in the manuscript, previous work in monkeys showed that preparatory neural activity for a future reaching movement can occur simultaneously with a current reaching movement, but that study was limited to the monkey only knowing about two future targets. It would be quite interesting to see how neural activity partitions preparatory activity for a third future target, given that this study shows that the third target's planning may interact with the second target's planning.

      Strengths:<br /> A major strength of this study is that the experiments and analyses are designed to answer complementary questions, which together form a relatively complete picture of how subjects act on future target information. This complete description of a complex behavior will be a boon to future work in understanding the neural control of sequential, compound movements.

      Weaknesses:<br /> I found no real glaring weaknesses with the paper, though I do wish that there had been some more discussion of what happens to planning with longer dwell times in target. In the later parts of the manuscript, the authors mention that the co-articulation result (where reaches are curved to make future target acquisition more efficient) was less evident for longer dwell times, likely because for longer dwell times, the subject needs to fully stop in target before moving to the next one. This result made me wonder if the future plan interaction effect (tested with the target jumps) would have been affected by dwell time. As far as I can tell, the target jump portion only dealt with the shorter dwell times, but if the authors had longer dwell time data for these experiments, I would appreciate seeing the results and interpretations.

      Beyond this, the authors also mentioned in the results and discussion the idea of "neural resources" being assigned to replan movements, but it's not clear to me what this might actually mean concretely. I wonder if the authors have a toy model in mind for what this kind of resource reassignment could mean. I realize it would likely be quite speculative, but I would greatly appreciate a description or some sort of intuition if possible.

    1. eLife assessment

      This valuable study, of interest for students of the biology of genomes, uses simulations in combination with published data to examine how many TADs remain after cohesin depletion. The authors suggest that a significant subset of chromosome conformations do not require cohesin, and that knowledge of specific epigenetic states can be used to identify regions of the genome that still interact in the absence of cohesin. The theoretical approaches and quantitative analysis are state-of-the-art, and the data quality and strength of the conclusions are convincing, but it is unfortunately still unclear whether physical boundaries (of domains?) in the model appear to be a consequence of preserved TADs, or whether preserved TADs are caused by the physical boundaries.

    2. Reviewer #1 (Public Review):

      The revised manuscript by Jeong et al presents a thorough analysis of the prevalence and epigenetic causes of TAD conservation upon cohesin loss. The authors suggest that TAD preservation could be caused by an epigenetic switch at the TAD boundary, or by enhancer-promoter or promoter-promoter interactions between TAD boundaries. Simulations using the CCM model confirm that epigenetic switching can mechanistically explain TAD boundary preservation. The added analysis of the prevalence of enhancer and promoter interactions at TAD boundaries strengthens the authors' claim that these interactions could play an important role in TAD preservation.

    3. Reviewer #2 (Public Review):

      Summary:<br /> Here, Jeong et al. use a combination of theoretical and experimental approaches to define molecular contexts that support specific chromatin conformations. They seek to define features that are associated with TADs that are retained after cohesin depletion (the authors refer to these TADs as P-TADs). They were motivated by differences between single-cell data, which suggest that some TADs can be maintained in the absence of cohesin, and ensemble HiC data, which suggest complete loss of TADs. By reanalyzing a number of HiC datasets from different cell types, the authors observe that in ensemble methods, a significant subset of TADs is retained. They observe that P-TADs are associated with mismatches in epigenetic state across TAD boundaries. They further observe that "physical boundaries" are associated with P-TAD maintenance. Their structure/simulation-based approach appears to be a powerful means to generate 3D structures from ensemble HiC data, and provides chromosome conformations that mimic the data from single-cell based experiments. Their results also challenge current dogma in the field about epigenetic state being more related to compartment formation than to TAD boundaries. Their analysis is particularly important because limited amounts of imaging data are presently available for defining chromosome structure at the single-molecule level, whereas vast amounts of HiC and ChIP-seq data are available. By using HiC data to generate high-quality simulated structural data, they overcome this limitation. Overall, this manuscript is important for understanding chromosome organization, particularly for contacts that do not require cohesin for their maintenance, and for understanding how different levels of chromosome organization may be interconnected. I cannot comment on the validity of the provided simulation methods and hope that another reviewer is qualified to do this.

      Specific comments<br /> -It is unclear what defines a physical barrier. From reading the text and the methods, it is not entirely clear to me how the authors have designated sites of physical barriers. It may help to define this on pg 7, second to last paragraph, when the authors first describe instances of P-TAD maintenance in the absence of epigenetic mismatch.

      -Figure 7 adds an interesting take to their approach. Here the authors use microC data to analyze promoter-enhancer/promoter-promoter contacts. These data are included as part of the discussion. I think this data could be incorporated into the main text, particularly because it provides a biological context where P-TADs would have a rather critical role.

      -Figure 3a- the numbers here do not match the text (page 6, second to last paragraph). The numbers have been flipped for either chromosome 10 or chromosome 13 in the text or the figures.

      In the revision, the authors have sufficiently addressed my specific concerns from above.

    4. Reviewer #3 (Public Review):

      This manuscript presents a comprehensive investigation into the mechanisms that explain the presence of TADs (P-TADs) in cells where cohesin has been removed. In particular, to study TADs in wildtype and cohesin depleted cells, the authors use a combination of polymer simulations to predict whole chromosome structures de novo and from Hi-C data. Interestingly, they find that those TADs that survive cohesin removal contain a switch in epigenetic marks (from compartment A to B or B to A) at the boundary. Additionally, they find that the P-TADs are needed to retain enhancer-promoter and promoter-promoter interactions.

      Overall, the study is well-executed, and the evidence found provides interesting insights into genome folding and interpretations of conflicting results on the role of cohesin on TAD formation.

    1. eLife assessment

      This useful study uses a microfluidic method to evaluate the ability of single human white blood cells to produce combinations of cytokines and the evidence that this takes place is solid. The paper highlights polyfunctionality using data that are similar to a prior dataset from the same group. The authors comment that, in analysis of larger panels, single cells rarely make more than 2 or 3 cytokines so that investigation of 3 cytokines at a time is sufficient to investigate this phenomenon. Coupling this approach to other modes of single cell analysis may provide greater insight into what limits simultaneous production of multiple cytokines.

    2. Reviewer #1 (Public Review):

      Summary: The authors started by stimulating the PBMCs in bulk, then encapsulated single cells in droplets to monitor the secreted cytokines in each droplet for the next 4 hours. The secreted cytokines are bound by fluorescently labeled detection antibodies. At the same time, the cytokines are captured by capture antibodies that are immobilized on magnetic beads. Under the magnetic field, the magnetic beads line up in the middle of the droplet along with the bound fluorescent antibodies. This effectively concentrates the fluorescent antibody in the middle of the droplet, producing a higher fluorescent signal there than the background signal in the rest of the droplet. They can measure three cytokines in parallel in each droplet.

      Strengths: Observed heterogeneous cytokine secretion dynamics, which they have reported in their previous paper as well.

      Weaknesses:<br /> Since they used PBMCs without another assay to confirm the cell subtypes, I am not sure if any of the heterogeneity they detected in the secretion of the 6 cytokines can be related back to biology. In addition, the two panels were measured on separate cells, so I am not sure it is meaningful to make any comparisons between the two panels.

      Their revision failed to make much improvement.

    3. Reviewer #2 (Public Review):

      The responses to the comments and changes in the manuscript are convincing, especially the secretion patterns of high and low secreting cells are interesting and reassuring. The only criticism I still have is that most observations are already published in the previous paper by the same authors.

      Summary:<br /> In their manuscript titled "Stimulation-induced cytokine polyfunctionality as a dynamic concept," the authors investigate the dynamic nature of polyfunctional cytokine responses to established stimulants. The authors use their previously published single-cell encapsulation droplet-microfluidic platform to analyse the response of peripheral blood mononuclear cells (PBMCs) to different stimulants and measure the secretion dynamics of individual cytokines. This assay shows that polyfunctionality in cytokine responses is a complex but short-lived phenomenon that decreases with prolonged stimulation times. The study finds that polyfunctional cells predominantly display elevated cytokine concentrations with similar secretion patterns but higher secretion levels compared to their monocytokine-secreting counterparts. The method is promising to analyse the correlation between the secretion dynamics of different cytokines in primary samples and heterogeneous cell populations.

      Strengths:<br /> This method provides single-cell-resolved and dynamic cytokine concentration information, which might be used to identify "fingerprints" of secretion patterns for selected cytokines. When extending the available data to more than one donor, this might be the basis for a diagnostic tool. The combination of established droplet microfluidics with an epi-fluorescence microscope-based readout makes it convincing that the method is transferable to other labs. Specifically, the dynamic analysis of cytokine concentrations is interesting, and the differences or similarities in secretion timepoints might be missed with end-point methods. The authors convincingly show that they detect up to three different cytokines in single cells.

      Weaknesses:<br /> The conclusions of the study are based on samples from a single donor, which makes the conclusions on secretion patterns difficult to interpret. The choice of cytokines is explained, but the justification of the groupings of the antibodies into the two panels is missing. It would further be helpful to discuss how the single cell incubation might affect the secretion dynamics vs. the influence of co-culture of all cell types during the 24 h activation. The authors compare average secretion rates and levels. However, the right panel in Fig. 6 looks like there might be two different populations of mono- or polyfunctional cells that have two secretion rates. As the authors have single-cell data, I would find the separation into these populations more meaningful than comparing the mean values. In line with this comment, comparing the mean values for these cytokines instead of the mean of the populations with distinct secretion properties might actually show stronger differences than the authors report here. Is the plateau of the cytokine concentration caused by the fluorescence signal saturating the camera, saturation of the magnetic beads, exhaustion of the fluorescent antibodies, or constant cytokine concentrations? The high number of non-CSCs and the limited number of droplets decrease the statistical power of the method. The authors discuss their choice to use PBMCs and not solely T cells, but this aspect is missing in the discussion.

    1. eLife assessment

      This important study indicates a significant role for individual let-7 miRNA clusters in regulating generation of Tc17 CD8 cells and emphysema severity in a mouse model. The authors provide convincing evidence for let-7-mediated repression of the transcription factor RORgt and consequent modulation of IL-17-producing CD8 T cells, with correlated data from human emphysema material, though the most effective let-7 cluster/s is/are yet to be tested for its/their ability to modulate disease. The findings, which substantially advance the understanding of roles that let-7 miRNA clusters play in modulating both T cell responses and emphysematous lung disease, will be of interest to T cell and lung disease researchers.

    2. Reviewer #1 (Public Review):

      Summary: Inflammatory T cells have been recognized to play an important role in human COPD lung tissue and animal models of emphysema. The authors have previously identified that Th17 cells regulate chronic inflammatory diseases, including in mice exposed to smoke or nanoparticulate carbon black (nCB). Here, the authors interrogate the role of Tc17 cells using similar mouse models. Investigating let-7 miRNA, which induces antigen-presenting cell activation and T cell mediated Th17a inflammation, they show that the master regulator of Tc17/Th17 differentiation, RAR-related orphan receptor gamma t (RORγt), is a direct target of let-7 miRNA in T cells. Because RORγt expression is elevated in COPD patients and in mouse models of COPD, the authors generate a mouse that overexpresses Let-7 in T cells and report reduced RORγt expression and reduced Th17 and Tc17 cell recruitment in nCB-exposed mice.

      Strengths: The authors use a previously published RNA-seq dataset (GSE57148) from the lungs of control and COPD subjects to explore the involvement of Let-7 in emphysema. They further evaluate Let-7a expression by qPCR in lung tissue samples of smokers with emphysema and non-emphysema controls. Moreover, expression of Let-7a, Let-7b, Let-7d, and Let-7f in purified CD4+ T cells was inversely correlated with emphysema severity in the lungs. Similar findings were obtained in their mouse models (CS or nCB) in both lung tissue and isolated lung CD4+ and CD8+ T cells, with reduced let-7afd and let-7bc2 expression.

      Using mice harboring a conditional deletion of the let-7bc2 cluster in all T cells (let-7bc2LOF) derived from the CD4+CD8+ double-positive stage, the authors show enhanced emphysema in nCB- or CS-exposed mice with enhanced recruitment of macrophages and neutrophils to the lung. While CD8+IL17a+ Tc17 cells and CD4+ IL17a+ Th17 cells were increased in nCB-exposed control animals, only let-7bc2LOF mice showed an increase in CD8+IL17a+ Tc17 cells. Further, unexposed let-7bc2LOF and let-7afdLOF mice showed greater RORγt expression in both CD8+ and CD4+ T cells.

      In a let-7 gain-of-function mouse generated by overexpressing let-7g in thymic double-positive-derived T cells, protein levels of RORγt were suppressed in CD8+ and CD4+ T cells of let-7GOF mice relative to controls. Let-7GOF mice treated with nCB showed similar lung alveolar distension as controls, suggesting that increased let-7 expression does not protect the lung from emphysema. However, let-7GOF mice showed reduced lung Tc17 and Th17 cell populations and were resistant to the induction of RORγt after nCB exposure.

      Weaknesses: Limited data are shown on the let-7afdLOF mice. Does this mouse respond to nCB similarly to the let-7bc2LOF mouse?<br /> Because the authors validate their findings from a previously published RNA-seq dataset in subjects with and without emphysema, the authors should include patient demographics from the data presented in Figure 1C-D.<br /> To validate their mouse models, the absence of Let-7 or enhanced Let-7 expression needs to be shown in isolated T cells from exposed mice.<br /> In Figure 3, the authors are missing the unexposed let-7bc2LOF group from all panels. This is again an issue in Figure 6 with the let-7GOF.<br /> Because the GOF mouse enhances Let-7g within T cells, the importance of Let-7g should be determined in human subjects. Why did the authors choose to overexpress Let-7g? The rationale is not clear.<br /> The purity of the CD4+ and CD8+ T cells is not shown and the full gating strategy should be included.<br /> The authors indicate that Tc17 and Th17 T cells were reduced in the GOF mouse, but it remains unclear if macrophage or neutrophil recruitment is altered in GOF mice.

    3. Reviewer #2 (Public Review):

      Summary:<br /> Let-7 family miRNAs are largely redundant in function, and originate from multiple genomic loci ("clusters"). Erice et al demonstrate that two individual clusters (let7afd and let7bc2) in mice regulate the generation of IL-17 producing CD8 T cells in vitro and in vivo in a model of emphysema. These cells also express higher levels of the IL-17-inducing transcription factor RORgt, encoded by Rorc, which the authors demonstrate to be a direct target of let-7. Since multiple let-7 family miRNAs are downregulated in T cells and lung tissue in emphysema, these data support a model in which reduced let-7 allows increased IL-17 production by T cells, contributing to disease pathogenesis.

      Strengths:<br /> The inclusion of miRNA and pri-miRNA expression data from sorted human lung T cells as well as mouse T cells from an emphysema model is a strength.

      The study includes complementary loss of function and gain of function experimental systems to test the effect of altered let-7 function, though it should be noted that these involved different let-7 family members and did not yield simple, complementary results for all experimental outcomes.

      The most important finding is that deletion of just one let-7 cluster ("Let7bc2") is sufficient to exacerbate emphysema in the nCB and CS models.

      Weaknesses:<br /> The functional analyses are unusually focused on IL-17 producing CD8 T cells, but it is not made clear whether these cells are an important player in emphysema pathogenesis in the nCB and CS models. The data shown reveal that they are far less numerous than IL-17-producing CD4 T cells. It is also notable that the Figure 1 expression data from human subjects used sorted CD4+ T cells. And as the authors mentioned, prior work on let-7 showed that it regulated Th17 (CD4) responses.

      Compared with Let7bc2 deletion, Let7afd deletion had a much larger effect on IL17 production by CD8 T cells in vitro, and it also had a larger effect on RORgt expression in untreated mice in vivo, especially in the lung. It would be valuable to more thoroughly characterize the let7afd mice. RORgt expression should be shown in the in vitro assays. In the results, the authors state that let7afdLOF mice "did not exhibit lung histopathology nor inflammatory changes" up to 6 months of age. Similarly, it is stated in the conclusion that "the let-7afdLOF mice ... did not exhibit changes in Tc17/Th17 subpopulations" in vivo. All these data should be shown, and if no baseline changes are apparent, then I also recommend challenging these mice with nCB and/or cigarette smoke.

      This brings up the larger issue of redundancy among the let-7 family members and genomic clusters. This should be discussed, including some explanation of the relative expression of each mature family member in T cells, and how that maps to the clusters studied here (and those that were not investigated). It would also be helpful to explain the relationship between mouse Let7bc2 and human Let7a3b, since Let7bc2 is the primary focus of emphysema experiments in this manuscript.

      This is especially important because the study of individual let-7 clusters is the core novelty of this body of work, as described in the first paragraph of the discussion. The regulation of let-7 expression has been reported before and its functional role has been investigated with a variety of tools.

      Let7g overexpression caused a marked reduction in Rorgt expression in T cells at baseline and in the setting of nCB challenge, and it reduced the frequency of IL17+ producing CD8 T cells in the lung to baseline levels. Yet there was no change in the MLI measurement of histopathology. Is this a robust result? The responses in the experiment shown in Fig. 6C-D are quite muted compared to those shown in Figure 2. The latter also shows a larger number of replicates, and it is unclear whether the data in 6D include measurement from all of the mice tested (e.g. pooled from 2 small experiments) or only mice from one experiment.

      Although RORgt is a great candidate to have direct effects on IL-17 expression, the mechanistic understanding of let-7 action on T cell differentiation and cytokine production is limited to this single target. As noted in the discussion, others have identified cytokine receptor targets that may play a role, but it is also likely others among the many targets of let-7 also contribute.

    4. Reviewer #3 (Public Review):

      Summary: The manuscript by Erice et al describes that let-7 miRNA promotes Tc17 differentiation and emphysema by repressing the transcription factor RORgt. The authors found that overall expression of the let-7 miRNA clusters, let-7b/let-7c2 and let-7a1/let-7f1/let-7d, is reduced in the lungs and T cells of mice with cigarette smoke-induced emphysema. They also found that the loss of the let-7b/let-7c2-cluster in T cells exaggerated cigarette smoke-induced emphysema. It appears that deletion of the let-7b/let-7c2-cluster led to enhancement of IL-17-secreting CD8+ T cells (Tc17) in mice with emphysema. The opposite phenotype was observed when let-7 was overexpressed in T cells. They found a potential let-7 binding site in the 3' UTR of RORgt. They demonstrated a direct effect of let-7 on RORgt expression using a let-7 mimic in a RORgt luciferase reporter assay. They have done an outstanding job of translating the finding of reduced let-7 expression in emphysema patients to a thorough delineation of its mechanism in a mouse model. Together, this study suggests an important role for let-7 miRNA in Tc17 cells in emphysema which appears to be mediated via repression of RORgt.

      Strengths: This well-written manuscript flows logically and the data support the overall claim that let-7 miRNA promotes Tc17 differentiation during emphysema. There are several strengths to this study, including the use of conditional let-7 knockout animals to decipher the role of this miRNA in Tc17 cells in emphysema.

      Weaknesses: There are no major weaknesses in this study. It would be interesting to see if knockdown of RORgt could rescue the enhanced Tc17 differentiation seen in let-7b/let-7c2-cluster-deficient T cells. The authors show no change in frequencies of Treg cells in let-7bc2LOF mice exposed to nCB. Do these Treg cells also express higher levels of RORgt and IL-17? The major question that was not addressed in this study is how let-7 expression is regulated in emphysema. The other recommendation is that the authors include the sequences of the let-7 mimic oligos used in the luciferase assay.

    1. eLife assessment

      In this useful study, the authors analyze droplet size distributions of multiple protein condensates and their fit to a scaling ansatz, highlighting that they exhibit features of first- and second-order phase transitions. The experimental evidence is still incomplete as the measurements were apparently done only at one time point, neglecting the possibility that droplet size distribution can evolve with time. The text would benefit from a connection to and contextualization with the well-understood expectations from the coupling of percolation and phase separation in protein condensates - a phenomenon that is increasingly gaining consensus amongst the community and that emphasizes "liquid-gas" criticality.

    2. Reviewer #1 (Public Review):

      The authors analyse droplet size distributions of multiple protein condensates and fit to a scaling ansatz to highlight that they exhibit features of first-order and second-order phase transitions. While the experimental evidence is solid, the text lacks connection and contextualization to the well-understood expectations from the coupling of percolation and phase separation in protein condensates - a phenomenon that is increasingly gaining consensus amongst the community. The evidence supports the percolation and phase separation model rather than being close to a true critical point in the liquid-gas phase space. Overall, the work is useful to the community.

      Strengths:<br /> The experimental analysis of distinct protein condensates is very well done and the reported exponents/scaling framework provides a clear framework to help the community deconvolve signatures of percolation in condensates.

      Weaknesses:<br /> The principal concern this reviewer has is that the authors adopt a framing in this paper to present a discovery of second-order features and connections to criticality; however, they ignore or miss the connections to percolation (a well-understood second-order transition that is expected to play a major role in protein condensates). I believe this needs to be addressed and the paper suitably revised to help connect with these expectations.

      - Protein condensates have been increasingly understood to be described as fluids whose assembly is driven by a connection of density (phase separation, first-order) and connectivity (percolation, second-order) transitions. This has been long known in the polymer community (Flory, Stockmayer, Tanaka, Rubinstein, Semenov, and others) and recently repopularized in the condensate community (by Pappu and Mittag, in particular, amongst others). The authors make no connections to any of these frameworks - which actually seem to be the essence of what they are describing.

      - Percolation theory, which has been around for more than half a century, has clear-cut scaling laws that have essentially similar forms to the ansatz adopted by the authors, and the commonalities/differences are not discussed by the authors - this is essential since this provides a physical basis for their ansatz rather than an arbitrary mathematical formulation. In particular, percolation models connect size distribution exponents to factors like dimensionality, valence, etc. and if these connections can be made with this data, that would be very powerful.

      - The connections between spinodal decomposition and second-order phase transitions are very confusing. Spinodal decomposition happens when the barriers for first-order phase transitions are zero and systems can phase separate without crossing nucleation barriers. Further, the "criticality" discussed in the paper is confusing since it more likely refers to a percolation threshold and much less likely to a "critical temperature" (Tc, where the spinodal and binodal become identical). I would recommend reframing this argument.
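
      For readers less familiar with the distinction drawn in the last point, the standard mean-field conditions can be written compactly. This is a textbook sketch, not taken from the manuscript under review: it assumes a free-energy density of mixing $f(\phi, T)$ as a function of protein volume fraction $\phi$ at temperature $T$, and the symbols $f$, $\phi$, $\phi_1$, $\phi_2$, $\phi_c$ are introduced here only for illustration.

```latex
% Binodal (coexistence): common-tangent construction between the two phases
\left.\frac{\partial f}{\partial \phi}\right|_{\phi_1}
  = \left.\frac{\partial f}{\partial \phi}\right|_{\phi_2}
  = \frac{f(\phi_2) - f(\phi_1)}{\phi_2 - \phi_1}

% Spinodal: limit of local stability, where the nucleation barrier vanishes
\frac{\partial^2 f}{\partial \phi^2} = 0

% Critical point (\phi_c, T_c): binodal and spinodal meet
\left.\frac{\partial^2 f}{\partial \phi^2}\right|_{\phi_c, T_c} = 0,
\qquad
\left.\frac{\partial^3 f}{\partial \phi^3}\right|_{\phi_c, T_c} = 0
```

      Under these assumptions, spinodal decomposition occurs wherever $\partial^2 f / \partial \phi^2 < 0$, while genuinely critical (second-order) behavior is confined to the single point where both derivative conditions hold. A percolation threshold is a separate connectivity transition and is not fixed by these conditions.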

      It's unlikely, in this reviewer's opinion, that the authors are actually discussing a "first-order" liquid-gas critical point, because saturation concentrations of these proteins can be much higher with temperature and the critical point would thus likely be at much higher concentrations (and, of course, temperature). Further, the scaling exponents don't fall into that class naturally. However, if the authors disagree, I would appreciate clear quantitative reasons (including through the scaling exponents in that universality class) and be happy to be convinced to change my mind. As provided, the data do not support this model.

    3. Reviewer #2 (Public Review):

      This is a potentially interesting study addressing a possible scale-invariant log-normal characteristic of droplet size distribution in the phase separation behavior of biomolecular condensates. Some of the data presented are valuable and intriguing. However, as it stands, the validity and utility of this study are uncertain because there are serious deficiencies in the execution and presentation of the authors' results. Many of these shortcomings are fundamental, including a lack of clarity in the basic conceptual framework of the study, insufficient justification of the experimental setup, less-than-conclusive experimental evidence, and inadequate discussion of implications of the authors' findings to future experimental and theoretical studies of biomolecular condensates. Accordingly, this reviewer considers that the manuscript should undergo a major revision to address the following. In particular, the discussion should be significantly expanded by including references mentioned below as well as other references pertinent to the issues raised.

      1. The theoretical analysis in this study is based on experimental data on condensed droplet size distributions for FUS and α-synuclein. The size data for FUS droplet is indirect as it relies on the assumption that FUS droplet diameter is proportional to fluorescence intensity of labeled FUS (page 10 of manuscript), with fluorescence data adopted from a previously published work by another group (Kar et al. & Pappu, ref.27). Because fluorescence of a droplet is expected to be dependent upon the condensed-phase concentration of FUS, this proportional relationship, even if it holds, must also be modulated by FUS concentration in the droplet. Moreover, why should fluorescence be proportional to diameter but not the cross-sectional area or volume of the FUS droplet, which would be more intuitive? These issues should be clarified. A new measure by microscopy is used to determine the size distribution of condensed α-synuclein; but no microscopy image is shown. It is of critical importance that such raw data (for example microscopy images) be presented for the completeness and reproducibility of the experiment because the entire study relies on the soundness of these experimental measurements.

      2. Despite the authors' claim of a universal scaling relationship, the log-log scatter plots in Figure 1 (page 15 of the manuscript) exhibit significant deviations from linearity at low protein concentrations (ρ→0). Given this fact, is universal scaling really valid? Discussion of this behavior is conspicuously absent (except the statement that these data points are excluded in the fit). In any case, the possible origins of these deviations should be thoroughly discussed so that the regime of universal scaling can be properly delineated.

      3. Droplet size distribution most likely depends on the time duration after the preparation of the sample. For α-synuclein, "liquid droplet size characterisation images were captured 10 minutes post-liquid droplet formation" (page 9 of the manuscript). Why 10 minutes? Have the authors tried imaging at different time points and, if so, do the distributions at different time points remain essentially the same? If they are different, what is the criterion for focusing only on a particular time point? Information related to these questions should be provided.

      4. At least two well-known mechanisms can lead to the time-dependent distribution of liquid droplet sizes: (i) coalescence of droplets in spatial proximity to form a larger droplet, and (ii) Ostwald ripening, i.e., formation of larger droplets concomitant with the dissolution of smaller droplets without fusion of droplets. The implications of these mechanisms on the authors' droplet size distributions should be addressed. Indeed, maintaining a size distribution against these mechanisms in vivo often requires active suppression [Bressloff, Phys Rev E 101, 042804 (2020)] with possible involvement of chemical reactions [Kirschbaum & Zwicker, J R Soc Interface 18, 20210255 (2021)]. These considerations are central to the basic rationale of this study and therefore should be carefully tackled.

      5. If coalescence and/or Ostwald ripening do occur, given sufficient time after sample preparation, the condensed phase may become a single large "droplet" or a single liquid layer. Does this occur in the authors' experiments?

      6. It is unclear whether the authors aim to address the kinetic phenomenon of liquid droplet formation and evolution or equilibrium properties. The two types of phenomena appear to be conflated in the authors' narrative. Clarification is needed. If this work aims to address time-independent (or infinite-time) equilibrium properties, how are they expected to be related to droplet size distribution, which most likely is time-dependent?

      7. The relationship between the potentially time-dependent droplet size distribution and equilibrium properties of ρt and ρc (transition and critical concentrations, respectively) should be better spelled out. An added illustrative figure will be helpful.

      8. The authors comment that their findings appear to be inconsistent with Flory-Huggins theory because Flory-Huggins "characterizes droplet formation as a consequence of nucleation ..." (page 8 of the manuscript). Here, three issues need detailed clarification: (i) In what way does Flory-Huggins mandate nucleation? (ii) Why are the findings of apparent scale invariance inconsistent with nucleation? (iii) If liquid droplet formations do not arise from nucleation, what physical mechanism(s) is (are) envisioned by the authors to be underpinning the formation of condensed liquid droplets in protein phase separation?

      9. Are any of the authors' findings related to finite-system effects of phase separation [see, e.g., Nilsson & Irbäck, Phys Rev E 101, 022413 (2020)]?

      10. Since the authors are using their observation of an apparent scale-invariant droplet size distribution to evaluate phase separation theory, it is important to clarify whether their findings provide any constraint on the shape of coexistence curves (phase diagrams).

      11. More specifically, do the authors' findings suggest that the phase diagrams predicted by Flory-Huggins are invalid? Or, are they suggesting that even if the phase diagrams predicted by Flory-Huggins are empirically correct (if verified by experimental testing), they are underpinned by a free energy function different from that of Flory-Huggins? It is important to answer this question to clarify the implications of the authors' findings on equilibrium phase behaviors and the falsifiability of the implications.

      12. What are the implications of the authors' findings for other theories of protein phase separation that are based on interactions different from the short spatial range interactions treated by Flory-Huggins? For instance, it has been observed that whereas the Flory-Huggins-predicted phase diagrams are always convex upward, phase diagrams for charged intrinsically disordered proteins with long spatial range Coulomb interactions exhibit a region that is concave upward [Das et al., Phys Chem Chem Phys 20, 28558-28574 (2018)]. Can the authors' findings regarding the apparent scale-invariant droplet size distribution provide information on the underlying interaction driving the protein molecules toward phase separation?

      13. Table S1 (page 4) and Table S2 (page 7) are mentioned in the text but these tables are not in the submitted files.

      14. The two systems studied (FUS and α-synuclein) have a single intrinsically disordered protein (IDP) component. It is not clear if the authors expect their claimed scaling relation to be applicable to systems with multiple IDP components and if so, why.
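
      For reference when reading comments 8, 11, and 12, the textbook Flory-Huggins mean-field free energy of mixing can be written as follows. This is a sketch in assumed notation that does not come from the manuscript under review: φ is the polymer volume fraction, N the chain length in lattice units, and χ the effective interaction parameter. No mapping onto the manuscript's ρt and ρc is implied.

```latex
% Flory-Huggins free energy of mixing per lattice site (polymer + solvent)
\frac{f(\phi)}{k_{\mathrm{B}}T}
  = \frac{\phi}{N}\,\ln\phi + (1-\phi)\,\ln(1-\phi) + \chi\,\phi\,(1-\phi)

% Critical point, from \partial^2 f/\partial\phi^2 = \partial^3 f/\partial\phi^3 = 0
\phi_c = \frac{1}{1+\sqrt{N}}, \qquad
\chi_c = \frac{\bigl(1+\sqrt{N}\bigr)^{2}}{2N}
```

      This expression defines an equilibrium free energy only. It yields a coexistence curve but does not by itself specify the kinetics of droplet formation, which is the distinction comments 6 and 8 ask the authors to clarify.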

    1. eLife assessment

      This fundamental study explores the relationship between guanine-quadruplex structures and pathogenicity islands in 89 bacterial strains representing a range of pathogens. Guanine-quadruplex structures were found to be non-randomly distributed within pathogenicity islands and conserved within the same strains. These compelling findings shed light on the molecular mechanisms of Guanine-quadruplex structure-pathogenicity island interactions and will be of interest to all microbiologists.

    2. Reviewer #1 (Public Review):

      Summary:<br /> This study explores the relationship between guanine-quadruplex (G4) structures and pathogenicity islands (PAIs) in 89 pathogenic strains.

      Strengths:<br /> The findings of this study hold significant implications for our understanding of bacterial pathogenicity and the role of guanine-quadruplex (G4) structures:

      Molecular Mechanisms of Pathogenicity: The study highlights that G4 structures are not randomly distributed within pathogenicity islands (PAIs), suggesting a potential role in regulating pathogenicity. This insight into the uneven distribution of G4s within PAIs provides a basis for further research into the molecular mechanisms underlying bacterial pathogenicity.

      Conservation of G4 Structures: The consistent conservation of G4 structures within the same pathogenic strains suggests that these structures might play a vital and possibly conserved role in the pathogenicity of these bacteria. This finding opens doors for exploring how G4s influence virulence across different pathogens.

      Unique Nature of PAIs: The differences in GC content between PAIs and the core genome underscore the unique nature of PAIs. This distinction suggests that factors such as DNA topology and G4 structures might contribute to the specialized functions and characteristics of PAIs, which are often associated with virulence genes.

      Regulatory Role of G4s: The identification of high-confidence G4 structures within regulatory regions of Escherichia coli implies that these structures could influence the efficiency or specificity of DNA integration events within PAIs. This finding provides a potential mechanism by which G4s can impact the pathogenicity of bacteria.

      Weaknesses:<br /> None

      Overall, the study provides fundamental insights into pathogenicity islands and the conservation of G4 motifs.

    3. Reviewer #2 (Public Review):

      Summary: In the manuscript entitled "The Intricate Relationship of G-Quadruplexes and Pathogenicity Islands: A Window into Bacterial Pathogenicity", Bo Lyu explored the interactions between guanine-quadruplex (G4) structures and pathogenicity islands (PAIs) in 89 bacterial genomes through a rigorous computational approach. This paper handles an intriguing and complex topic in the field of pathogenomics and has the potential to contribute significantly to the understanding of G4-PAI interactions and bacterial pathogenicity.

      Strengths: Chosen research area and summarizing the results through neat illustrations

      Weaknesses: I did not find any significant ones.

    1. eLife assessment

      This manuscript provides a useful reconstruction of the structure of the sirtuin-class histone deacetylase Sirt6 bound to a nucleosome based on cryo-EM observations, and additional characterization of the flexibility of the histone tails in the complex based on molecular dynamics simulations. While similar structures have recently been published elsewhere, this solid study supports the conclusions of those papers and also includes new insights into the potential dynamics of Sirt6 bound to a nucleosome, insights that help explain its substrate specificity.

    2. Reviewer #1 (Public Review):

      Smirnova et al. present a cryo-EM structure of a nucleosome-SIRT6 complex to understand how the histone deacetylase SIRT6 deacetylates the N-terminal tail of histone H3. The authors obtained the structure at sub-4 Å resolution and can visualize how interactions between the nucleosome and SIRT6 position SIRT6 to allow for H3 tail deacetylation. Through additional conformational analysis of their cryo-EM data, they reveal that SIRT6 positioning is flexible on the nucleosome surface, and this could accommodate the targeting of certain H3 tail residues. This work is significant as it represents the visualization of a histone deacetylase on its native nucleosomal target and reveals how substrate specificity is achieved. Importantly, it should be noted that recently two additional structures of the nucleosome-SIRT6 complex were already published. Therefore, Smirnova et al. confirm and complement these previous findings. Additionally, Smirnova et al. expand our understanding of the structural flexibility of SIRT6 on the nucleosome and clarify that SIRT6 also shows histone deacetylase activity on H3K27Ac.

    3. Reviewer #3 (Public Review):

      Smirnova et al. present a cryo-EM structure of human SIRT6 bound to a nucleosome as well as the results from molecular dynamics simulations. The results show that the combined conformational flexibilities of SIRT6 and the N-terminal tail of histone H3 limit the residues with access to the active site, partially explaining the substrate specificity of this sirtuin-class histone deacetylase. Two other groups have recently published cryo-EM structures of SIRT6:nucleosome complexes; this manuscript confirms and complements these previous findings, with the addition of some novel insights into the role of structural flexibility in substrate selection.

    1. eLife assessment

      This valuable study from Zaman et al. demonstrates that the cKit-Kit ligand complex is necessary for the formation and/or maintenance of molecular layer interneuron synapses in cerebellar Purkinje cells. The evidence presented is solid; in particular, the use of cell-type specific knockout of cKit in molecular layer interneurons and knockout of Kit ligand in Purkinje cells provides robust evidence. This work will be of particular relevance to those interested in inhibitory synapse formation or the role of inhibition in Purkinje cell behavior.

    2. eLife assessment

      This valuable study from Zaman et al. demonstrates that the cKit-Kit ligand complex is necessary for the formation and/or maintenance of molecular layer interneuron synapses in cerebellar Purkinje cells. The evidence presented is convincing; in particular, the use of cell-type specific knockout of cKit in molecular layer interneurons and knockout of Kit ligand in Purkinje cells provides robust evidence. This work will be of particular relevance to those interested in inhibitory synapse formation or the role of inhibition in Purkinje cell behavior.

    1. eLife assessment

      This manuscript presents a generally convincing set of experiments to address the question of whether the lateral parafacial area (pFL) is active in controlling active expiration, which is particularly relevant in patient populations that rely on active exhalation to maintain breathing (eg, COPD, ALS, muscular dystrophy). This study presents a valuable finding by pharmacologically mapping the core medullary region that contributes to active expiration and addresses the question of where these regions lie anatomically. Results from these experiments will be of value to those interested in the neural control of breathing and other neuroscientists as a framework for how to perform pharmacological mapping experiments in the future.

    1. eLife assessment

      This study presents important findings for understanding cortical processing of color, binocular disparity, and naturalistic textures in the human visual cortex at the spatial scale of cortical layers and columns using state-of-the-art high-resolution fMRI methods at ultra-high magnetic field strength (7 T). Solid evidence supports an interesting layer-specific informational connectivity analysis to infer information flow across early visual areas for processing disparity and color signals. While the question of how the modularity of representation relates to cortical hierarchical processing is interesting, the findings that texture does not map onto previously established columnar architecture in V2 are suggestive but would benefit from further controls. The successful application of high-resolution fMRI methods to study the functional organization along cortical columns and layers is relevant to a broad readership interested in general neuroscience.

    2. Author Response

      eLife assessment

      This study presents important findings for understanding cortical processing of color, binocular disparity, and naturalistic textures in the human visual cortex at the spatial scale of cortical layers and columns using state-of-the-art high-resolution fMRI methods at ultra-high magnetic field strength (7 T). Solid evidence supports an interesting layer-specific informational connectivity analysis to infer information flow across early visual areas for processing disparity and color signals. While the question of how the modularity of representation relates to cortical hierarchical processing is interesting and fundamental, the findings that texture does not map onto previously established columnar architecture in V2 are suggestive but would benefit from further controls. The successful application of high-resolution fMRI methods to study the functional organization along cortical columns and layers is relevant to a broad readership interested in general neuroscience.

      Thank you for your assessment of our manuscript "Mesoscale functional organization and connectivity of color, disparity, and naturalistic texture in human second visual area". We have carefully considered the public reviews and have outlined our revision plans in point-by-point responses to the reviewers’ comments.

      Reviewer #1 (Public Review):

      To support the finding that texture is not represented in a modular fashion, additional possibilities must be considered. These include (a) the effectiveness and specificity of the texture stimulus and control stimuli, (b) further analysis of possible structure in images that may have been missed, and (c) limitations of imaging resolution.

      Thank you for your suggestions. We will provide evidence and additional analyses to show that there was indeed a large difference in high-order statistical information between the texture and control stimuli in our study, and thus the contrast between the two stimuli should be effective in localizing the processing of high-order texture information. Compared to the previous studies, another reason for the weaker texture selectivity in the current study could be the smaller number of images used and the slower rate of image presentation. Although our fMRI result at 1-mm isotropic resolution did not show a modular processing of naturalistic texture in CO-stripe columns, this does not exclude the possibility that smaller modules exist beyond the current fMRI resolution. We will discuss these limitations in the revised manuscript.

      More in-depth analysis of subject data is needed. The apparent structure in the texture images in peripheral fields of some subjects calls for more detailed analysis, e.g., of a possible relationship to eccentricity, and for a 'modularity index' to quantify the degree of modularity.

      We will perform further analysis based on your suggestion, especially regarding the relationship between eccentricity and modulation index. We will discuss this possibility in the revised manuscript.

      Given what is known as a modular organization in V4 and V3 (e.g. for color, orientation, curvature), did images reveal these organizations? If so, connectivity analysis would be improved based on such ROIs. This would further strengthen the hierarchical scheme.

      Thank you for your suggestion. The informational connectivity analyses used highly informative voxels by feature selection, which may already represent information from the modular organizations in these higher visual areas. We will examine the functional maps for possible modular organizations.

      Reviewer #2 (Public Review):

      In lines 162-163, it is stated that no clear columnar organization exists for naturalistic texture processing in V2. In my opinion, this should be rephrased. As far as I understand, Figure 2B refers to the analysis used to support the conclusion. The left and middle bar plots only show a circular analysis since ROIs were based on the color and disparity contrast used to define thin and thick stripes. The interesting graph is the right plot, which shows no statistically significant overlap of texture processing with thin, thick, and pale stripe ROIs. It should be pointed out that this analysis does not dismiss a columnar organization per se but instead only supports the conclusion of no coincidence with the CO-stripe architecture.

      Reviewer #1 also raised a similar concern. We agree that there may be a smaller functional module of textures in area V2 at a finer spatial scale than our fMRI resolution. We will rephrase our conclusions to be more precise.

      In Figure 3, cortical depth-dependent analyses are presented for color, disparity, and texture processing. I acknowledge that the authors took care of venous effects by excluding outlier voxels. However, the GE-BOLD signal at high magnetic fields is still biased to extravascular contributions from around larger veins. Therefore, the highest color selectivity in superficial layers might also result from the bias to draining veins and might not be of neuronal origin. Furthermore, it is interesting that cortical profiles with the highest selectivity in superficial layers show overall higher selectivity across cortical depth. Could the missing increase toward the pial surface in other profiles result from the ROI definition or overall smaller signal changes (effect size) of selected voxels? At least, a more careful interpretation and discussion would be helpful for the reader.

      We will discuss the limitations of cortical depth-dependent analysis using GE-BOLD fMRI. All our stimuli produced robust activations in these visual areas, thus the flat laminar profiles of modulatory indices are unlikely to be caused by smaller signal changes. We will show the original BOLD responses in addition to the modulation index.

      I was slightly surprised that no retinotopy data was acquired. The ROI definition in the manuscript was based on a retinotopy atlas plus manual stripe segmentation of single columns. Both steps have disadvantages because they neglect individual differences and are based on subjective assessment. A few points might be worth discussing: (1) In lines 467-468, the authors state that V2 was defined based on the extent of stripes. This classical definition of area V2 was questioned by a recent publication (Nasr et al., 2016, J Neurosci, 36, 1841-1857), which showed that stripes might extend into V3. Could this have been a problem in the present analysis, e.g., in the connectivity analysis? (2) The manual segmentation depends on the chosen threshold value, which is inevitably arbitrary. Which value was used?

      The retinotopic atlas on the standard surface is usually quite accurate in defining the boundaries of early visual areas. Although some stripes may extend into V3, these patterns should be more robust in V2. In our analysis, we selected only those stripes with clear organization within the retinotopic atlas. Thus, the signal contribution from V3 is likely to be small and would not affect the pattern of results. In addition, because the results in V3 and V2 could be very different, we will compare the pattern of results from these areas in additional analyses. The threshold for segmentation was abs(T) > 2; we will clarify this in the Methods.

      The use of 1-mm isotropic voxels is relatively coarse for cortical depth-dependent analyses, especially in the early visual cortex, which is highly convoluted and has a small cortical thickness. For example, most layer-fMRI studies use a voxel size of around isotropic 0.8 mm, which has half the voxel volume of 1 mm isotropic voxels. With increasing voxel volume, partial volume effects become more pronounced. For example, partial volume with CSF might confound the analysis by introducing pulsatility effects.

      We agree that the 1-mm isotropic voxel is much larger in volume than the 0.8-mm isotropic voxel, but the difference in resolution along the cortical depth is not large. In addition to our study, other studies have shown that fMRI at 1-mm isotropic resolution is capable of resolving cortical depth-dependent signals. Also, because our fMRI slices were oriented perpendicular to the calcarine sulcus, the higher in-plane resolution will also help in resolving depth-dependent signals. We will discuss these issues about fMRI resolution in the revised manuscript.
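
      The arithmetic behind this exchange can be checked with a short sketch. It assumes ideal cubic voxels; the function name is illustrative and nothing here comes from the authors' analysis pipeline.

```python
def voxel_volume_mm3(edge_mm: float) -> float:
    """Volume in cubic millimetres of an isotropic (cubic) voxel."""
    return edge_mm ** 3


# Edge lengths discussed in the review: 0.8 mm versus 1 mm isotropic.
vol_08 = voxel_volume_mm3(0.8)
vol_10 = voxel_volume_mm3(1.0)

# Volume scales with the cube of the edge length...
volume_ratio = vol_08 / vol_10
# ...whereas resolution along any single axis scales only linearly.
edge_ratio = 0.8 / 1.0

print(f"volume ratio (0.8 mm / 1 mm): {volume_ratio:.3f}")
print(f"edge ratio   (0.8 mm / 1 mm): {edge_ratio:.3f}")
```

      The 0.8-mm voxel has roughly half the volume of the 1-mm voxel, as the reviewer states, while the sampling distance along the cortical depth differs by only 20%, which is the point of the response.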

      The SVM analysis included a feature selection step stated in lines 531-533. Although this step is reasonable for the training of a machine learning classifier, it would be interesting to know if the authors think this step could have reintroduced some bias to remaining draining vein contributions.

      Several precautions have been taken in the ROI definition to reduce the influence of large draining veins. The same number of voxels were selected from each cortical depth for the SVM analysis, thus there was no bias from the superficial layers susceptible to draining veins. Also, since both feedforward and feedback connections involved the superficial voxels, the remaining influence of large draining veins should be comparable between the two connections.
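
      One way to read "the same number of voxels were selected from each cortical depth" is a depth-stratified feature selection, sketched below. The scoring values, depth labels, and function name are hypothetical; the manuscript's actual selection criterion is not described in this response.

```python
from collections import defaultdict


def select_top_voxels_per_depth(scores, depths, n_per_depth):
    """Return sorted indices of the n_per_depth highest-scoring voxels
    within each depth bin, so every depth contributes equally."""
    by_depth = defaultdict(list)
    for idx, (score, depth) in enumerate(zip(scores, depths)):
        by_depth[depth].append((score, idx))

    selected = []
    for depth in sorted(by_depth):
        # Rank voxels within this depth bin only, highest score first.
        ranked = sorted(by_depth[depth], key=lambda pair: pair[0], reverse=True)
        selected.extend(idx for _, idx in ranked[:n_per_depth])
    return sorted(selected)


# Hypothetical informativeness scores for six voxels in three depth bins.
scores = [0.9, 0.2, 0.5, 0.7, 0.1, 0.8]
depths = ["deep", "deep", "middle", "middle", "superficial", "superficial"]

chosen = select_top_voxels_per_depth(scores, depths, n_per_depth=1)
print(chosen)
```

      Because ranking happens within each depth bin, a depth with globally larger signal changes cannot crowd out the others. Equal counts do not by themselves remove any draining-vein influence from the superficial voxels that are selected.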

      Reviewer #3 (Public Review):

      The authors tend to overclaim their results.

      Thank you for your comments. We will add more control analyses to strengthen our findings and will discuss the results more cautiously.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Response to reviews

      We would like to extend our thanks to the reviewers who took the time to carefully read our paper and provide thoughtful insights and suggestions on how to strengthen our conclusions. All reviewers agreed that our study presented strong data supporting a role for triglyceride lipase brummer (bmm) in regulating testis lipid droplets and spermatogenesis in Drosophila, and that our findings advance our understanding of lipid biology during sperm development. Reviewers made several helpful suggestions on how to strengthen our manuscript even further. Below, we outline how we revised our manuscript in response to reviewer comments to ensure we clearly communicate our data and conclusions with readers, and properly contextualize our findings.

      REVIEWER 1

      In this study, the authors investigate the role of triglycerides in spermatogenesis. This work is based on their previous study (PMID: 31961851) on triglyceride sex differences in which they showed that somatic testicular cells play a role in whole body triglyceride homeostasis. In the current study, they show that lipid droplets (LDs) are significantly higher in the stem and progenitor cell (pre-meiotic) zone of the adult testis than in the meiotic spermatocyte stages. The distribution of LDs anti-correlates with the expression of the triglyceride lipase Brummer (Bmm), which has higher expression in spermatocytes than early germline stages. Analysis of a bmm mutant (bmm[1]) - a P-element insertion that is likely hypomorphic - and its revertant (bmm[rev]) as a control shows that bmm acts autonomously in the germline to regulate LDs. In particular, the number of LDs is significantly higher in spermatocytes from bmm[1] mutants than from bmm[rev] controls. Testes from males with global loss of bmm (bmm[1]) are shorter than controls and have fewer differentiated spermatids. The zone of bam expression, typically close to the niche/hub in WT, is now many cell diameters away from the hub in bmm[1] mutants. There is an increase in the number of GSCs in bmm[1] homozygotes, but this phenotype is probably due to the enlarged hub. However, clonal analyses of GSCs lacking bmm indicate that a greater percentage of the GSC pool is composed of bmm[1]-mutant clones than of bmm[rev]-clones. This suggests that loss of bmm could impart a competitive advantage to GSCs, but this is not explored in greater detail. Despite the increase in number of GSCs that are bmm[1]-mutant clones, there is a significant reduction in the number of bmm[1]-mutant spermatocyte and post-meiotic clones. This suggests that fewer bmm[1]-mutant germ cells differentiate than controls. To gain insights into triglyceride homeostasis in the absence of bmm, they perform mass spec-based lipidomic profiling. Analyses of these data indicate that triglycerides are the class of lipid most affected by loss of bmm, supporting their model that excess triglycerides are the cause of spermatogenetic defects in bmm[1]. Consistent with their model, a double mutant of bmm[1] and a diacylglycerol O-acyltransferase 1 called midway (mdy) reverts the bmm-mutant germline phenotypes.

      There are numerous strengths of this paper. First, the authors report rigorous measurements and statistical analyses throughout the study. Second, the authors utilize robust genetic analyses with loss-of-function mutants and lineage-specific knockdown. Third, they demonstrate the appropriate use of controls and markers. Fourth, they show rigorous lipidomic profiling. Lastly, their conclusions are appropriate for the results. In other words, they don't overstate the results.

      We thank the Reviewer for their positive assessment of our paper.

      There are a few weaknesses. Although the results support the germline autonomous role of bmm in spermatogenesis, one potential caveat is that the mdy rescue was global, i.e., in both somatic and germline lineages. The authors did not recover somatic bmm clones, suggesting that bmm may be required for somatic stem cell self-renewal and/or niche residency. While this is beyond the scope of this paper, it is possible that somatic bmm does impact germline differentiation in a global bmm mutant.

      In the revised manuscript, we made several changes to address these points.

      1) We now clearly state when we used global versus germline-only loss of mdy to rescue bmm mutant phenotypes in the testis.

      “Notably, at least some of the effects of global loss of mdy on bmm1 males can be attributed to the germline: RNAi-mediated knockdown of mdy in the germline of bmm1 males partially rescued the defects in testis size (Figure 4I; Kruskal-Wallis rank sum test with Dunn’s multiple comparison test) and GSC variance (Figure S5J; p = 4.5 × 10⁻⁵ and 8.2 × 10⁻³ by F-test from the GAL4- and UAS-only crosses, respectively).”

      “Importantly, testes isolated from males with global loss of both bmm and mdy (mdyQX25/k03902;bmm1) had fewer LD than testes dissected from bmm1 males (Figures 5D, S5I; one-way ANOVA with Tukey multiple comparison test).”

      2) We also discuss the possibility that somatic bmm may play a role in germline differentiation in a global bmm mutant, and present phenotypic data on somatic bmm1 clones.

      “We also reveal a potential non-cell-autonomous role for somatic bmm. While there was no difference in the ratio of Zfh-1-positive cells between homozygous clones and heterozygous clones in animals carrying the bmm1 or bmmrev alleles at 14 days post clone induction (Figure S4O; Kruskal-Wallis rank sum test), the distance from the hub at which the Zfh-1-positive clones reside was significantly decreased in bmm1 homozygous clones (Figure S4P; Kruskal-Wallis rank sum test). Together, these data indicate bmm may play a cell-autonomous role in germline cells, and potentially a non-cell-autonomous role in somatic cells, to regulate spermatogenesis.”

      3) Finally, we clarify that we were unable to assess somatic LD. Specifically, this was a technical issue as the dye we use to visualize testis LD is incompatible with staining protocols to identify somatic cells. As a result, we were unable to count LD in somatic clones with confidence.

      “While we were unable to assess LD in bmm1 somatic clones, our data when taken together reveals a previously unrecognized cell-autonomous role for bmm as a regulator of testis LD in germline cells.”

      Regarding data presentation, I have a minor point about Fig. 3L: why aren't all data shown as box plots (only Day 14 bmm[rev] does).

      In our revised manuscript Figure 4L does present a boxplot across all genotypes and times; the appearance of ‘no boxes’ is simply due to the large number of datapoints with a value of zero, which compress the box near the X-axis.

      Finally, the authors provide a detailed pseudotime analysis of snRNA-seq of the testis in Fig. S2A-D, but this analysis is not sufficiently discussed in the text.

      In the revised manuscript we added text to describe our pseudotime analysis of single-cell RNA seq data in more detail.

      “Using pseudotime analysis, we arranged the germline (Figure S2A) and the somatic cells (Figure S2B) based on their annotated developmental trajectory. The expression pattern of bmm in the germline matched our observation with bmm-GFP reporter (Figure S2C). While levels of the bmm-GFP reporter were lower in somatic cells, single-cell RNA sequencing data identified bmm expression in the somatic lineage that was higher in cells at later stages of development (Figure S2D). Additional neutral lipid- and lipid droplet-associated genes such as lipid storage droplet-2, Seipin, Lipin, and midway also showed differential regulation during differentiation (Figure S2C, S2D). Combined with our data on the location of testis LD, these data suggest that bmm upregulation in both somatic and germline cells during differentiation corresponds to the downregulation of testis LD. Supporting this, germline GFP levels were negatively correlated with testis LD in bmm-GFP flies (Figure 2A, 2C), suggesting regions with higher bmm expression had fewer LD.”

      Overall, the many strengths of this paper outweigh the relatively minor weaknesses. The rigorously quantified results support the major aim that appropriate regulation of triglycerides is needed in a germline cell-autonomous manner for spermatogenesis.

      This paper should have a positive impact on the field. First and foremost, there is limited knowledge about the role of lipid metabolism in spermatogenesis. The lipidomic data will be useful to researchers in the field who study various lipid species. Going forward, it will be very interesting to determine what triglycerides regulate in germline biology. In other words, what functions/pathways/processes in germ cells are negatively impacted by elevated triglycerides? And as the authors point out in the discussion, it will be important to determine what regulates bmm expression such that bmm is higher in later stages of germline differentiation.

      We agree with the reviewer about the many interesting future directions for this project. We added a model figure in the revised manuscript to visualize our findings and highlight remaining questions about how bmm and triglycerides support normal spermatogenesis in Drosophila (Fig. 6).

      REVIEWER 2


      Here, the authors show that neutral lipids play a role in spermatogenesis. Neutral lipids are components of lipid droplets, which are known to maintain lipid homeostasis, and to be involved in non-gonadal differentiation, survival, and energy. Lipid droplets are present in the testis in mice and Drosophila, but not much is known about the role of lipid droplets during spermatogenesis. The authors show that lipid droplets are present in early differentiating germ cells, and absent in spermatocytes. They further show a cell autonomous role for the lipase brummer in regulating lipid droplets and, in turn, spermatogenesis in the Drosophila testis. The data presented show that a relationship between lipid metabolism and spermatogenesis is congruous in mammals and flies, supporting Drosophila spermatogenesis as an effective model to uncover the role lipid droplets play in the testis.

      We thank the Reviewer for their positive assessment of our paper.

      Strengths and weaknesses:

      The authors do a commendably thorough characterization of where lipid droplets are detected in normal testes: located in young somatic cells, and early differentiating germ cells. They use multiple control backgrounds in their analysis, including w[1118], Canton S, and Oregon R, which adds rigor to their interpretations. The authors employ markers that identify which lipid droplets are in somatic cells, and which are in germ cells. The authors use these markers to present measured distances of somatic and germ cell-derived lipid droplets from the hub. Because they can also measure the distance of somatic and germ cells with age-specific markers from the hub, these results allow the authors to correlate position of lipid droplets with the age of cells in which they are present. This analysis is clearly shown and well quantified.

      The quantification of lipid droplet distance from the hub is applied well in comparing brummer mutant testes to wild type controls. The authors measure the number of lipid droplets of specific diameters, and the spatial distribution of lipid droplets as a function of distance from the hub. These measurements quantitatively support their findings that lipid droplets are present in an expanded population of cells further from the hub in brummer mutants. The authors further quantify lipid droplets in germline clones of specified ages; the quantitative analysis here is displayed clearly, and supports a cell autonomous role for brummer in regulating lipid droplets in spermatocytes.

      Data examining testis size and number of spermatids in brummer mutants clearly indicate the importance of regulating lipid droplets to spermatogenesis. The authors show beautiful images supported by rigorous quantification supporting their findings that brummer mutants have both smaller testes and fewer spermatids at both 29°C and 25°C. There is also significant data supporting defects in testis size for 14-day-old brummer mutant animals compared to controls. The comparison of number of spermatids at this age is not significant, which does not detract from the story but does not support sperm development defects specifically caused by brummer loss at 14 days. Their analysis clearly shows an expanded region beyond the testis apex that includes younger germ cells, supporting a role for lipid droplets influencing germ cell differentiation during spermatogenesis.

      We thank the reviewer for pointing out this inaccuracy in our manuscript. In the revised manuscript we chose more precise language to describe defects in 14-day-old bmm mutants:

      “Defects in testis size were also observed at 14-day post eclosion; suggesting testis size defects persist later into the life course (Figure S4C; Welch two-sample t-test). In contrast, the number of spermatid bundles per testis was not significantly different between bmm1 and bmmrev males at this age (Figure S4D; Welch two-sample t-test), potentially due to a large decrease in the number of spermatid bundles in 14-day-old bmmrev males (Figure 4C, S4D).”

      The authors present a series of data exploring a cell autonomous role for brummer in the germline, including clonal analysis and tissue specific manipulations. The clonal data indicating increased lipid droplets in spermatocyte clones, and a higher proportion of brummer mutant GSCs at the hub are convincing and supported by quantitation. The authors also show a tissue specific rescue of the brummer testis size phenotype by knocking down mdy specifically in germ cells, which is also supported by statistically significant quantitation. The authors present data examining the number of spermatocyte and post-meiotic clones 14 days after clonal induction. While data they present is significant with a 95% confidence interval and a p value of 0.0496, its significance is not as robust as other values reported in the study, and it is unclear how much information can be gained from that specific result.

      We thank the reviewer for raising this point. In the revised manuscript we displayed the p-value clearly in the text and on the figure to ensure our statistical output is clear for readers to evaluate our conclusions regarding bmm mutant clones 14 days after clone induction. We also state that the finding should be reproduced by others given that the statistical significance of this result was not as strong as our other data.

      “Because we observed significantly fewer bmm1 spermatocyte and spermatid clones at 14 days after clone induction (Figure 4K,4L; p = 0.0496, Kruskal-Wallis rank sum test), these effects on germline development may represent a cell-autonomous role in regulating spermatogenesis for bmm in this cell type. Given that the statistical significance of this finding was not as strong as for our other data, future studies should repeat this experiment with more samples.”

      The authors do a beautiful job of validating where they detect brummer-GFP by presenting their own pseudotime analysis of publicly available single cell RNA sequencing data. Their data is presented very clearly, and supports expression of brummer in older somatic and germline cells of the age when lipid droplets are normally not detected. The authors also present a thorough lipidomic analysis of animals lacking brummer to identify triglycerides as an important lipid droplet component regulating spermatogenesis.


      The authors present data supporting the broad significance of their findings across phyla. This data represents a key strength of this manuscript. The authors show that loss of a conserved triglyceride lipase impacts testis development and spermatogenesis, and that these impacts can be rescued by supplementing diet with medium chain triglycerides. The authors point out that these findings represent a biological similarity between Drosophila and mice, supporting the relevance of the Drosophila testis as a model for understanding the role of lipid droplets in spermatogenesis. The connection buttresses the relevance of these findings and this model to a broad scientific community.

      We thank the Reviewer very much for their positive assessment of our paper!

      REVIEWER 3

      In this manuscript, Chao et al seek to understand the role of brummer, a triglyceride lipase, in the Drosophila testis. They show that Brummer regulates lipid droplet degradation during differentiation of germ and somatic cells, and that this process is essential for normal development to progress. These findings are interesting and novel, and contribute to a growing realisation that lipid biology is important for differentiation.

      We thank the Reviewer for their positive comments about our manuscript.

      Major comments:

      1) The data in Figs 1 and 2, while helpful in setting the scene, do not add much to what was previously shown by the same group, namely that lipid droplets are present in both early germ cells and early somatic cells in the testis, and that Bmm regulates their degradation (PMID: 31961851). Measuring the distance of lipid droplets from the hub, while helpful in quantifying what is apparent, that only stem and early differentiated stages have lipid droplets, is not as informative as the way data are presented later (Fig. 2I), where droplets in specific stages are measured. Much of this could be condensed without much overall loss to the manuscript.

      We thank the reviewer for this comment. In our revised manuscript we edited the first part of the paper while still preserving the detailed characterization that builds upon our previous paper.

      2) It would be important to show images of the clones from which the data in Fig. 2I are generated. The main argument is that Bmm regulates lipid droplets in a cell autonomous manner; these data are the strongest argument in support of this and should be emphasised at the expense of full animal mutants (which could be moved to supplementary data).

      We thank the reviewer for this comment. In the revised manuscript we added a figure showing lipid droplets in control and bmm mutant spermatocyte clones in Fig. 3A, 3B with a quantification of this data in Figure 3C.

      Similarly, the title of Fig. S2 ("brummer regulates lipid droplets in a cell autonomous manner") should be changed as the figure has no experiments with cell (or cell-type)-specific knockdowns/mutants. This figure does show changes in lipid droplets in both lineages in bmm mutants, so an appropriate title could be "brummer regulates lipid droplets in both germ and soma".

      We thank the reviewer for this comment. We adjusted the Figure 2 legend title in the revised manuscript to “brummer regulates lipid droplets in both germline and somatic cells of the testis”.

      3) Interestingly, the clonal data show that bmm is dispensable in germ cells until spermatocyte stages, as no increase in lipid droplet number is seen until then. This should be more clearly stated, as it indicates that the important function of Bmm is to degrade lipid droplets at the transition from spermatogonial to spermatocyte stages. This is consistent with the phenotypes observed in which late stage germ cells are reduced or missing. However, the effect on niche retention of the mutant GSCs at the expense of neighbouring wildtype GSCs is hard to explain. Are lipid droplets in mutant GSCs larger than in control? Is there any discernible effect of bmm mutation on lipids in GSCs? Additionally, bam expression is delayed, suggesting that bmm may have roles on cell fate in earlier stages than its roles that can be detected on lipid droplets.

      We thank the reviewer for this comment. We included more text in the revised manuscript to clarify the key role bmm plays in regulating lipid droplets at the spermatogonia-spermatocyte transition.

      “Because we observed no significant effect of cell-autonomous bmm loss on LD at any other stage of germline development (Figure 3C), this suggests bmm function is not required to regulate LD at early stages of germ cell development. Instead, our data suggests bmm plays a role in regulating LD at the spermatogonia-spermatocyte transition.”

      We also added more detail to our description of how bmm affects lipid droplets in cells at the earliest stages of germline development.

      “Given that we detected no effect of cell-autonomous bmm loss on the number of GSC LD (Fig. 3C), more work will be needed to understand how bmm regulates GSC at a stage prior to its effects on LD number.”

      4) The bmm loss-of-function phenotype could be better described. Some of the data is glossed over with little description in the text (see for example the reference to Fig. 3A-C). For instance, in the discussion, the text states "loss of bmm delays germline differentiation leading to an accumulation of early-stage germ cells" (p13, l.259-60). However, this accumulation has not been clearly shown, or at least described in the manuscript. Most of the data show a reduction (or almost complete absence) of differentiated cell types. This could indeed be due to delayed differentiation, or alternatively to a block in differentiation or to death of the differentiated cells. The clonal data presented show a decrease in the number of cells recovered, but do not allow inferences as to the timing of differentiation, making it hard to distinguish between the various possibilities for the lack of differentiated spermatids. Apart from data showing that GSCs are more likely to remain at the niche, no further data are shown to support the fact that mutant germ cells accumulate in early stages. While additional experiments could help resolve some of these issues, much of this could also be resolved by tempering the conclusions drawn in the text.

      We thank the reviewer for these comments. In the revised manuscript we temper our conclusions regarding bmm’s precise role in spermatogenesis by discussing different mechanisms (e.g. differentiation or death) that could lead to the phenotypes we observe.

      “This regulation is important for sperm development, as our data indicates that loss of bmm causes a decrease in the number of differentiated cell types. This reduction in differentiated cell types may be attributed to a delay in differentiation, a block in differentiation, or to a loss of differentiated cells through cell death. Future studies will therefore be essential to resolve why bmm loss causes a reduction in differentiated cell types.”

      5) In the discussion (p.14, l-273 onwards), the authors suggest that products of triglyceride breakdown are important for spermatogenesis. However, an alternative interpretation of the results presented here (especially those using the midway mutant) could be that triglycerides impede normal differentiation directly. Indeed, preventing the cells' ability to produce triglycerides in the first place can rescue many of the defects observed. A better discussion of these results with a model for the function of triglycerides and their by-products would be a great improvement to this manuscript.

      We thank the reviewer for this comment. To ensure our data is clearly communicated to readers, we added a model to the paper suggesting how triglyceride and its by-products influence spermatogenesis (Fig. 6) and text to clarify that triglyceride could potentially impede differentiation.

      “It will also be important to determine whether it is the loss of metabolites produced by bmm’s enzymatic action, or an increase in triglycerides, that leads to the reduction in differentiated cell types during spermatogenesis. Together, these experiments will provide critical insight into how triglyceride stored within testis LD contributes to overall cellular lipid metabolism during spermatogenesis.”

      Together, these changes will strengthen our overall finding that bmm-mediated regulation of testis triglyceride is important for normal sperm development. Because our findings in flies align with and extend data from rodent models, the developmental mechanisms we uncovered about how triglyceride lipase bmm regulates testis lipid droplets and sperm development will likely operate in other species.  

      Reviewer #1 (Recommendations For The Authors):

      I have a minor concern about methodology: how were spermatocytes identified? I ask because data in Figure 3 indicate that there is a significant delay in germline differentiation in the bmm[1] mutant, with relatively smaller germ cells throughout the apical half of the testis. Typical large spermatocyte-like cells are not clearly obvious to me in Fig. 3.

      We thank the Reviewer for suggesting we add more clarity to how we identified spermatocytes. We state in the revised manuscript how we identify spermatocytes:

      “Cells in the testis region occupied by primary spermatocytes were identified by their large cell size and decondensed chromosome staining occupying three nuclear domains [120].”

      Also, we note that while it is difficult to see where the bmm1 testis has spermatocytes in Fig. 4E, this is due to the large number of early-stage cells in this close-up image. The spermatocytes can be more easily seen in Fig. 4I and 4I’ when the whole testis is included in the image.

      Reviewer #2 (Recommendations For The Authors):

      • Lines 197-198 mention "Boule-positive area," "individualization complexes," and "waste bags." It would be helpful to the reader to explain what these measurements are to help contextualize the data shown related to these statements.

      We thank the Reviewer for this comment. We added the following text to the revised manuscript:

      “Because Boule-positive area, individualization complexes, and waste bags are all markers for later stages in sperm development, these data indicate the loss of bmm causes a reduction in differentiated cell types.”

      • Line 162 states a defect in sperm development observed in 14-day-old bmm[1] males, but the data presented in Figure S3D does not show a significant difference. The words "sperm development" should be removed from this sentence.

      We thank the Reviewer for pointing out this inaccurate statement. We fixed the statement as follows in the revised manuscript:

      “Defects in testis size were also observed at 14-day post eclosion; suggesting testis size defects persist later into the life course (Figure S4C; Welch two-sample t-test). In contrast, the number of spermatid bundles per testis was not significantly different between bmm1 and bmmrev males at this age (Figure S4D; Welch two-sample t-test), potentially due to a large decrease in the number of spermatid bundles in 14-day-old bmmrev males (Figure 4C, S4D).”

      • Line 294 has a typo: "regulating" should likely be "regulated"

      We thank the Reviewer for pointing out this mistake, which we corrected.

      • Line 456 should include the length of time for heat shock

      We thank the Reviewer for pointing out this omission. We now include these details:

      “Adult males were collected at 3-5 days post-eclosion and heat-shocked three times at 37°C for 30 min followed by a 10 min rest period at room temperature between heat shocks.”

      • Methods section beginning on Line 442 might include an explanation of how hub area was quantified.

      We thank the Reviewer for this suggestion. We now include the following information:

      “Hub size was measured by quantifying FasIII-positive area of the testis.”

      • Figure 1 legend could benefit from adding a statement on how spermatocytes (arrowheads) were identified

      We thank the Reviewer for this suggestion. We now refer the reader to the more detailed description in the methods section.

      • Figure 2A should present the merged panel in A' first. The legend states that Panel A shows Lipid Droplets, but LipidTox is not shown until A'.

      We thank the Reviewer for this suggestion. We now clarify that the text refers to panels A-A''''.

      • Figure 2I would benefit from a key, to emphasize that these are individual cell clones, highlighting the idea of cell autonomous effects of bmm in the spermatocytes. Showing example images of spermatocyte clones with increased lipid droplets could also emphasize this result. The legend for this panel should note the statistical test done to confirm significance in the SC result.

      We agree with the Reviewer and have added images of the LD in bmm1 spermatocyte clones in Figure 3B, and the quantification in Figure 3C. We explicitly state the significance of this result and the statistical test in Figure 3 legend.

      • In Figure 3, the cell autonomous data clearly indicates that there are higher proportions of bmm mutant GSCs occupying the hub compared to control GSCs. It could be worth stating whether this observation indicates an increased ability of bmm mutant GSCs to compete for occupying space at the hub.

      We thank the Reviewer for pointing out this potential implication of our data, which we acknowledge in the revised version of our manuscript:

      “Future studies will also need to confirm whether bmm1 mutant GSCs show an increased ability to occupy space at the hub.”

      • In Figure 4, I suggest changing the title of Panel B to "Proportion of significant species in each lipid class" for clarity.

      We made this change in the Figure 5 legend (Figure 5 is the corresponding figure in the revised manuscript).

      • It could be valuable to quantify the number of spermatids in the germline specific mdy knockdown, which would lend additional support to a cell autonomous requirement for bmm in spermatogenesis

      We added a sentence to the revised manuscript recognizing that this is an interesting experiment for studies on the role of germline triglyceride in promoting spermatogenesis.

      “While future studies will need to test whether germline-specific loss of mdy also rescues spermatid number defects in bmm1 males, our data suggest bmm-mediated regulation of testis triglyceride plays a previously unrecognized role in regulating sperm development.”

      Reviewer #3 (Recommendations For The Authors):

      1) bmm-GFP does not show expression in somatic cells yet previous work by the same group has shown a requirement for bmm in the testis soma using C587-Gal4.

      We thank the Reviewer for raising this issue. While the reporter shows low GFP expression in the somatic cells, the single-cell RNA sequencing data we analyze suggests bmm is expressed in these cells. We address this issue in the revised manuscript as follows:

      “While levels of the bmm-GFP reporter were lower in somatic cells, single-cell RNA sequencing data identified bmm expression in the somatic lineage that was higher in cells at later stages of development (Figure S2D).”

      2) p.11 l.200-202 "Because we recovered fewer bmm1 spermatocyte and spermatid clones 14 days after clone induction (Figure 3K,3L; Kruskal-Wallis rank sum test), this effect on germline development represents a cell-autonomous role for bmm." This sentence should be rephrased as the phenotype could be a combination of autonomous roles within the germline and non-autonomous roles in supporting cyst cells.

      “We also reveal a potential non cell-autonomous role for somatic bmm. While there was no difference in the ratio of Zd-1-positive cells between homozygous clones and heterozygous clones in animals carrying the bmm1 or bmmrev alleles at 14 days post clone induction (Figure S4O; Kruskal-Wallis rank sum test), the distance from the hub to the Zd-1 positive clones reside was significantly decreased in bmm1 homozygous clones (Figure S4P; Kruskal-Wallis rank sum test). Together, these data indicate bmm may play a cell-autonomous role in germline cells, and potentially a non-cell-autonomous role in somatic cells, to regulate spermatogenesis.”

      3) The labelling in Fig. 3 is confusing - presumably the graph in 3C refers to spermatid bundles [this comment applies to other figures showing spermatid bundle numbers], not individual spermatids, while the graph in 3G refers to the proportion of the total GSC pool that is contained within the clone. The data in Fig. 3C are not described in the main text.

      We adjusted the confusing labelling to ‘spermatid bundles’ from ‘number of spermatids’, as suggested. We also changed the title of panel Fig. 3G (now 4G) as suggested and mentioned Fig. 3C (now Fig. 4C) in the text.

      4) On p.9, comments are speculative or seek to draw comparisons with the broader literature and would seem to belong more to the discussion (eg "our data suggests flies are a good model to study how bmm/ATGL influences sperm development" - also there is a typo, it should be "suggest").

      We thank the Reviewer for raising concern about our speculative statement; we changed the text as follows in the revised manuscript:

      “This identifies similarities between flies and mice in fertility-related phenotypes associated with whole-body loss of bmm/ATGL.”

      5) The length of the heat shocks used for clone induction should be specified in the methods (rather than just the period in between heat shocks).

      We now include more information on clone induction:

      “Adult males were collected at 3-5 days post-eclosion and heat-shocked three times at 37°C for 30 min followed by a 10 min rest period at room temperature between heat shocks. After heat-shock, the flies were incubated at room temperature until dissection.”

      6) p.8 l.132 "bmm-GFP accurately reproduces changes to bmm mRNA levels". This sentence should be rephrased.

      We thank the Reviewer for this comment and rephrased the sentence:

      “We first examined bmm expression in the testis by isolating this organ from flies carrying a bmm promoter driven GFP transgene (bmm-GFP) that recapitulates many aspects of bmm mRNA regulation [77].”

      7) p.9 l.172 "we used germline-specific marker" should read "we used an antibody against the germline-specific marker".

      We corrected this inaccurate statement in our revised manuscript.

      8) p.10 several lines, "GSC" should be "GSCs".

      We corrected this inaccurate use of GSC in our revised manuscript.

      9) p.13 l.247 should read "variance in GSC numbers".

      Thank you, this error was fixed.

    2. eLife assessment

      This important study identifies a role for triglycerides and lipid droplets in spermatogenesis, with data supporting relevance of this finding across phyla. The work shows with convincing data that a triglyceride lipase is required cell-autonomously for germline differentiation into meiotic stages and haploid spermatids and that an increase in triglycerides is detrimental to spermatogenesis. This paper would be of interest to developmental and cell biologists working on gametogenesis.

    3. Reviewer #1 (Public Review):

      In this study, the authors investigate the role of triglycerides in spermatogenesis. This work is based on their previous study (PMID: 31961851) on triglyceride sex differences in which they showed that somatic testicular cells play a role in whole body triglyceride homeostasis. In the current study, they show that lipid droplets (LDs) are significantly higher in the stem and progenitor cell (pre-meiotic) zone of the adult testis than in the meiotic spermatocyte stages. The distribution of LDs anti-correlates with the expression of the triglyceride lipase Brummer (Bmm), which has higher expression in spermatocytes than in early germline stages. Analysis of a bmm mutant (bmm[1]) - a P-element insertion that is likely hypomorphic - and its revertant (bmm[rev]) as a control shows that bmm acts autonomously in the germline to regulate LDs. In particular, the number of LDs is significantly higher in spermatocytes from bmm[1] mutants than from bmm[rev] controls. Testes from males with global loss of bmm (bmm[1]) are shorter than controls and have fewer differentiated spermatids. The zone of bam expression, typically close to the niche/hub in WT, is now many cell diameters away from the hub in bmm[1] mutants. There is an increase in the number of GSCs in bmm[1] homozygotes, but this phenotype is probably due to the enlarged hub. However, clonal analyses of GSCs lacking bmm indicate that a greater percentage of the GSC pool is composed of bmm[1]-mutant clones than of bmm[rev]-clones. This suggests that loss of bmm could impart a competitive advantage to GSCs, but this is not explored in greater detail. Despite the increase in number of GSCs that are bmm[1]-mutant clones, there is a significant reduction in the number of bmm[1]-mutant spermatocyte and post-meiotic clones. This suggests that fewer bmm[1]-mutant germ cells differentiate than controls. To gain insights into triglyceride homeostasis in the absence of bmm, they perform mass spec-based lipidomic profiling. Analyses of these data indicate that triglycerides are the class of lipid most affected by loss of bmm, supporting their model that excess triglycerides are the cause of spermatogenetic defects in bmm[1]. Consistent with their model, a double mutant of bmm[1] and a diacylglycerol O-acyltransferase 1 called midway (mdy) reverts the bmm-mutant germline phenotypes.

      There are numerous strengths of this paper. First, the authors report rigorous measurements and statistical analyses throughout the study. Second, the authors utilize robust genetic analyses with loss-of-function mutants and lineage-specific knockdown. Third, they demonstrate the appropriate use of controls and markers. Fourth, they show rigorous lipidomic profiling. Lastly, their conclusions are appropriate for the results. In other words, they don't over-state the results. Overall, the rigorously quantified results support the major aim that appropriate regulation of triglycerides is needed in a germline cell-autonomous manner for spermatogenesis.

      This paper should have a positive impact on the field. First and foremost, there is limited knowledge about the role of lipid metabolism in spermatogenesis. The lipidomic data will be useful to researchers in the field who study various lipid species. Going forward, it will be very interesting to determine what triglycerides regulate in germline biology. In other words, what functions/pathways/processes in germ cells are negatively impacted by elevated triglycerides? And as the authors point out in the discussion, it will be important to determine what regulates bmm expression such that bmm is higher in later stages of germline differentiation.

    4. Reviewer #2 (Public Review):


      Here, the authors show that neutral lipids play a role in spermatogenesis. Neutral lipids are components of lipid droplets, which are known to maintain lipid homeostasis, and to be involved in non-gonadal differentiation, survival, and energy. Lipid droplets are present in the testis in mice and Drosophila, but not much is known about the role of lipid droplets during spermatogenesis. The authors show that lipid droplets are present in early differentiating germ cells, and absent in spermatocytes. They further show a cell autonomous role for the lipase brummer in regulating lipid droplets and, in turn, spermatogenesis in the Drosophila testis. The data presented show that a relationship between lipid metabolism and spermatogenesis is congruous in mammals and flies, supporting Drosophila spermatogenesis as an effective model to uncover the role lipid droplets play in the testis.

      Strengths and weaknesses:

      The authors do a commendably thorough characterization of where lipid droplets are detected in normal testes: located in young somatic cells, and early differentiating germ cells. They use multiple control backgrounds in their analysis, including w[1118], Canton S, and Oregon R, which adds rigor to their interpretations. The authors employ markers that identify which lipid droplets are in somatic cells, and which are in germ cells. The authors use these markers to present measured distances of somatic and germ cell-derived lipid droplets from the hub. Because they can also measure the distance of somatic and germ cells with age-specific markers from the hub, these results allow the authors to correlate position of lipid droplets with the age of cells in which they are present. This analysis is clearly shown and well quantified.

      The quantification of lipid droplet distance from the hub is applied well in comparing brummer mutant testes to wild type controls. The authors measure the number of lipid droplets of specific diameters, and the spatial distribution of lipid droplets as a function of distance from the hub. These measurements quantitatively support their findings that lipid droplets are present in an expanded population of cells further from the hub in brummer mutants. The authors further quantify lipid droplets in germline clones of specified ages; the quantitative analysis here is displayed clearly and supports a cell autonomous role for brummer in regulating lipid droplets in spermatocytes.

      Data examining testis size and number of spermatids in brummer mutants clearly indicates the importance of regulating lipid droplets to spermatogenesis. The authors show beautiful images supported by rigorous quantification supporting their findings that brummer mutants have both smaller testes with fewer spermatids at both 29 and 25C. There is also significant data supporting defects in testis size, but not spermatid number, in 14-day-old brummer mutant animals compared to controls. Their analysis clearly shows an expanded region beyond the testis apex that includes younger germ cells, supporting a role for lipid droplets influencing germ cell differentiation during spermatogenesis.

      The authors present a series of data exploring a cell autonomous role for brummer in the germline, including clonal analysis and tissue specific manipulations. The clonal data indicating increased lipid droplets in spermatocyte clones, and a higher proportion of brummer mutant GSCs at the hub are convincing and supported by quantitation. The authors also show a tissue specific rescue of the brummer testis size phenotype by knocking down mdy specifically in germ cells, which is also supported by statistically significant quantitation. The authors present data examining the number of spermatocyte and post-meiotic clones 14 days after clonal induction. Their finding is significant with a p-value of 0.0496, which they acknowledge is less robust than their other data reported in this study, and could be a result of a low sample size. They indicate that future studies might validate these results with additional samples.

      The authors do a beautiful job of validating where they detect brummer-GFP by presenting their own pseudotime analysis of publicly available single cell RNA sequencing data. Their data is presented very clearly, and supports expression of brummer in older somatic and germline cells of the age when lipid droplets are normally not detected. The authors also present a thorough lipidomic analysis of animals lacking brummer to identify triglycerides as an important lipid droplet component regulating spermatogenesis.


      The authors present data supporting the broad significance of their findings across phyla. This data represents a key strength of this manuscript. The authors show that loss of a conserved triglyceride lipase impacts testis development and spermatogenesis, and that these impacts can be rescued by supplementing diet with medium-chain triglycerides. The authors point out that these findings represent a biological similarity between Drosophila and mice, supporting the relevance of the Drosophila testis as a model for understanding the role of lipid droplets in spermatogenesis. The connection buttresses the relevance of these findings and this model to a broad scientific community.

    1. eLife assessment

      This manuscript describes a model to estimate what fraction of DNA from specific human tissues becomes cell-free DNA in plasma. This fundamental study, supported by convincing evidence, will be of great interest to the community, as the amount of DNA from a certain tissue (for example, a tumor) that becomes available for detection in the blood has significant implications for disease detection.

    1. eLife assessment

      This fundamental study identifies the homeodomain transcription factor Meis2 as a transcriptional regulator of maturation and end-organ innervation of low-threshold mechanoreceptors (LTMRs) in the dorsal root ganglia (DRG) of mice. The authors use histology, behavioral tests, RNA-sequencing, and electrophysiological recordings to provide evidence that conditional deletion of Meis2 in postmitotic DRG neurons causes gene expression changes together with targeting errors and altered sensory neuron responses, ultimately resulting in reduced sensitivity to light touch in mutant animals. The data presented are convincing, the discussion comprehensive, and the conclusions drawn justified.

    1. eLife assessment

      This is an important follow-up study to a previous paper in which the authors reconstituted CO2 metabolism (autotrophy) in Escherichia coli. Here, the authors define a set of just three mutations that promote autotrophy, highlighting the malleability of E. coli metabolism. The authors make a convincing case that mutations in pgi are loss-of-function mutations that prevent metabolic efflux from the reductive pentose phosphate autocatalytic cycle, and their data suggest possible roles of mutations in two other genes - crp and rpoB. This research will be particularly interesting to synthetic biologists, systems biologists, and metabolic engineers aiming to develop synthetic autotrophic microorganisms.

    1. eLife assessment

      Yang et al. investigate whether distinct sources of conflict are represented in a common cognitive space. The study uses an interesting task that mixes different sources of difficulty and reports that the brain appears to represent these sources as a mixture on a continuum in prefrontal areas. While the findings could be valuable to theory in this area, there are some concerns with the design and results, that raise uncertainty regarding the main conclusion of a shared cognitive space. The authors appropriately acknowledge these limitations while also highlighting the valid contributions that the study makes. Thus, while solid evidence is reported here, consistent with the central hypothesis, further experiments are required to support the strictest interpretation.

    1. Author Response

      We thank the editors and the reviewers for their assessment of our revised manuscript. Please see below our answers to the recommendations by reviewer #2.

      Figure S2F - Seems like a very narrow range of parameters. Is there some fine tuning here?

      The range of values of tau_P that yields previous-trial biases is bounded from below and above for the following reasons: above a certain value of tau_P (therefore large integration time), the bump that had formed in the previous trial is not strong enough to remain stable for a long time, and therefore dissipates by the time the current trial starts (especially when adaptation is fast, towards the left of the third panel). Below a certain value, instead, this integration timescale is small enough to quickly form a representation of the current trial, hence the bump from the previous trial quickly dissipates (due to mutual inhibition). This interplay between the integration and the adaptation timescales, together with the fact that we are considering a phenomenon that is bounded in time (how close the activity bump is to the second stimulus of the previous trial, which is presented between -22.4 and -5.6 seconds from the moment we are considering), yields a region for tau_P which is bounded. This region, however, appears narrow due to the limited number of points we have considered for the simulation grid.

      Regarding my comment on lapse at the boundaries (old line 221). Lapse parameters in psychometric curves correspond to errors on the "easy" trials. But the mechanistic explanation for lapse trials is that there is a non-zero probability for the subject to respond in a manner that is random and independent of the stimulus. In the case of extreme stimuli, this is the only reason for errors, and thus looking at the edges of the psychometric curves allows to calculate lapse rate. But - the usual assumption for underlying mechanism is that the subject lapses in all trials, regardless of stimulus. If I understand correctly, this is different than the mechanistic reason for lapses in the network model, which was described as something that happens more in the edges than in the center. Or more generally, to be a stimulus-dependent effect.

      We thank the reviewer for this clarification. The reviewer is right that in our mechanistic model, lapses (as defined by errors on easy trials) are more likely to occur for extreme stimuli, due to the vicinity to the boundary of the attractor. Such errors also occur for non-extreme stimuli, when delay intervals are long enough for the bump in PPC to drift to the boundaries. In experiments, lapse trials as described by the reviewer occur for multiple reasons; for lapses that are independent of the stimuli, mechanisms such as attention have been thought to play a role. This, however, is not included in our model.
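
      To make the reviewer's distinction concrete, the following is a minimal sketch of the conventional, stimulus-independent lapse formulation of a psychometric curve. It is not the network model discussed here; the function names, the cumulative-Gaussian form, and the parameter values are illustrative assumptions.

```python
import math


def psychometric(x, mu=0.0, sigma=1.0, lapse=0.1):
    """P(respond 'x > mu') with a stimulus-independent lapse rate.

    Assumption: on a fraction `lapse` of trials the subject guesses at random
    (p = 0.5); otherwise the choice follows a cumulative Gaussian.
    """
    phi = 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
    return 0.5 * lapse + (1.0 - lapse) * phi


def error_rate(x, mu=0.0, sigma=1.0, lapse=0.1):
    """Probability of an incorrect response for stimulus x."""
    p = psychometric(x, mu, sigma, lapse)
    return 1.0 - p if x > mu else p


# For extreme stimuli the only remaining errors come from lapses,
# so the error rate tends to lapse / 2 on both sides.
print(error_rate(10.0), error_rate(-10.0))
```

      Under this formulation the lapse probability is the same for every stimulus, whereas in the attractor model the corresponding errors depend on how close the bump is to the boundary.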

      What are the parameters for the distributions (skewed, bimodal, ...)?

      These parameters are reported in the legend of Fig.6, where the distributions appear.

      Bump with adaptation. Sorry for the draft-like comment. I don't think the existing studies are in the form you describe. I do think it might be useful to point readers to these studies. If an interested reader wishes to understand network dynamics in this and similar scenarios, it might be useful to have the pointers. The reference I had in mind was Romani, S., & Tsodyks, M. (2015). Short‐term plasticity based network model of place cells dynamics. Hippocampus, 25(1), 94-105.

      We thank the reviewer for the clarification, and we will include this reference in the Version of Record.

      The following is the authors’ response to the original reviews.

      eLife assessment

      This is an important study about the mechanisms underlying our capacity to represent and hold recent events in our memory and how they are influenced by past experiences. A key aspect of the model put forward here is the presence of discrete jumps in neural activity with the posterior parietal region of the cortex. The strength of evidence is largely solid, with some weaknesses noted in the methodology. Both reviewers suggested ways in which this aspect of the model can be tested further to resolve conflicts with previously published experimental results, in particular the study by Papadimitriou et al 2014 in Journal of Neurophysiology.

      We thank the editors for their assessment. As mentioned in the cover letter, we have addressed all the reviewers’ concerns and would like to request an update of the assessment to reflect the revisions we have made.

      Public Reviews:

      We thank both reviewers for their careful reading and feedback that helped clarify many aspects of the model. Below, we address their comments.

      Reviewer #1 (Public Review):

      This paper aims to explain recent experimental results that showed deactivating the PPC in rats reduced both the contraction bias and the recent history bias during working memory tasks. The authors propose a two-component attractor model, with a slow PPC area and a faster WM area (perhaps mPFC, but unspecified). Crucially, the PPC memory has slow adaptation that causes it to eventually decay and then suddenly jump to the value of the last stimulus. These discrete jumps lead to an effective sampling of the distribution of stimuli, as opposed to a gradual drift towards the mean that was proposed by other models. Because these jumps are single-trial events, and behavior on single events is binary, various statistical measures are proposed to support this model. To facilitate this comparison, the authors derive a simple probabilistic model that is consistent with both the mechanistic model and behavioral data from humans and rats. The authors show data consistent with model predictions: longer interstimulus intervals (ISIs) increase biases due to a longer effect over the WM, while longer intertrial intervals (ITIs) reduce biases. Finally, they perform new experiments using skewed or bimodal stimulus distributions, in which the new model better fits the data compared to Bayesian models.

      The proposed mechanistic model is simple and elegant, and it captures both biases that were previously observed in behavior, and how these are affected by the ISI and ITI (as explained above). Their findings help rethink whether our understanding of contraction bias is correct.

      On the other hand, the main proposal - discrete jumps in PPC - is only indirectly verified.

      We agree with the reviewer that the evidence for discrete jumps in PPC has been provided in behavioural results (short-term, n-back trial biases), and not from neural data. However, we believe electrophysiological investigations are out of the scope of the current manuscript and future works are needed to further verify the results.

      The model predicts a systematic change in bias with inter-trial-interval. Unless I missed it, this is not shown in the experimental data. Perhaps the self-paced nature of the experiments allows to test this?

      We thank the reviewer for this great suggestion.

      We had not previously looked at this in the data for the reason that in the simulations, the ITI is set to either 2.2, 6 or 11 seconds, whereas the experiment is self-paced. Therefore, any comparison with the simulation should be made carefully.

      However, after the reviewer’s suggestion, we did look at the change in the bias with the inter-trial interval, by dividing trials according to ITIs lower than 3 seconds (“short” ITI), and higher than 3 seconds (“long” ITI). This choice was motivated by the shape of the distribution of ITIs, which is bimodal, with a peak around 1 second, and another after 3 seconds (new Fig 8F). Hence, we chose 3 seconds as it seemed a natural division. However, 3 seconds also happens to be approximately the 75th percentile of the distribution, and this means that there is much more data in the “short” ITI than the “long” ITI set. In order to have sufficient data in the “long” ITI set for clearer effects, we used all of our dataset – the negatively skewed, and also two bimodal distributions (of which only one was shown in the manuscript, for succinctness). This larger dataset allows us to clearly see not only a decreasing contraction bias with increasing ITI (Fig 8G), but also a decreasing one-trial-back attractive bias with increasing ITI (Fig 8H). We have uploaded all the datasets as well as scripts used to analyze them to this repository: https://github.com/vboboeva/ParametricWorkingMemory_Data.
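
      The split described above can be sketched as follows. This is a minimal illustration, not the analysis script from the repository: the function name, the input arrays, and the convention that an ITI of exactly 3 seconds counts as “long” are assumptions, and the toy values are not experimental data.

```python
import numpy as np


def split_by_iti(iti, values, threshold=3.0):
    """Average a per-trial quantity separately for short and long ITIs.

    iti       : inter-trial interval of each trial, in seconds
    values    : per-trial quantity of interest (e.g. 1 = correct, 0 = error)
    threshold : boundary between "short" and "long" ITIs, in seconds
    """
    iti = np.asarray(iti, dtype=float)
    values = np.asarray(values, dtype=float)
    short = iti < threshold  # assumption: ITI == threshold counts as "long"
    return {
        "short_mean": values[short].mean(),
        "long_mean": values[~short].mean(),
        "frac_short": short.mean(),  # share of trials in the "short" set
    }


# Toy input for illustration only (not experimental data).
summary = split_by_iti(iti=[1.0, 1.0, 2.0, 4.0], values=[1, 0, 1, 1])
print(summary)
```

      Reporting `frac_short` alongside the two means makes the imbalance between the sets explicit, which is the reason given above for pooling all stimulus distributions.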

      The data in some of the figures in the paper are hard to read. For instance, Figure 3B might be easier to understand if only the first 20 trials or so are shown with larger spacing. Likewise, Figure 5C contains many overlapping curves that are hard to make out.

      We have limited the dynamics in Fig 3B to the first 50 trials for better visibility. Likewise, as suggested, we report the standard error of the mean instead of the standard deviation in old Fig 5C (new Fig 6C) – this allows for the different curves to be better discernible.

      There is a gap between the values of tau_PPC and tau_WM. First - is this consistent with reports of slower timescales in PFC compared to other areas?

      Recent studies by Xiao-Jing Wang and colleagues (Refs. 1-3 below) suggest that this may be the case. In Wang et al 2023 (Ref 1 below), the authors use a generative model to study the concept of bifurcation in space in working memory, which is accompanied by an inverted-V shape of the time constants as a function of cortical hierarchy.

      Briefly, they propose a generative model of the cortex with modularity, incorporating repeats of a canonical local circuit connected via long-range connections. In particular, the authors define a hierarchy for each local circuit. At a critical point in this hierarchy axis, there is a phase transition from monostability to bistability in the firing rate. This means that a local circuit situated below the critical point will only display a low activity steady state, while those above the critical point additionally display a persistent activity steady state.

      The model predicts a critical slowing down of the neural fluctuations at the critical point, resulting in an inverted-V shape of the time constants as a function of the hierarchy. They test the predictions of their model (the bifurcation in space and the inverted-V-shaped time constants as a function of the hierarchy) on connectome-based models of the macaque and mouse cortex. Interestingly, both datasets show similar behavior. In particular, during working memory, frontal areas (higher in the hierarchy, e.g. area 24c in macaques) have a smaller time constant relative to posterior parietal areas (lower in the hierarchy, like LIP or f7). We have now cited this new work.

      [1] https://www.biorxiv.org/content/10.1101/2023.06.04.543639v1

      [2] https://elifesciences.org/articles/72136

      [3] https://www.biorxiv.org/content/10.1101/2022.12.05.519094v3.abstract

      Second - is it important for the model, or is it mostly the adaptation timescale in PPC that matters?

      We have run simulations producing a phase diagram with tau_theta^P on the x-axis, tau^P on the y-axis, and in color, the fraction of trials in which the bump is in the vicinity of a target (Fig S2 F), before the network is presented with the second stimulus. This target can be the first stimulus s_1 (left), the mean over stimuli (middle) or the previous trial’s stimulus (right). The white point corresponds to the parameters of the default network.

      In this phase diagram, the lowest value that tau_P takes is tau_WM=0.01. When tau_P=tau_WM, the bump is rarely in the vicinity of the 1-trial-back stimulus, and we can see that tau_PPC should be greater than tau_WM in order for the model to yield 1-trial-back effects. We conclude that it is indeed important that tau_PPC > tau_WM.

      We have included this in Fig S2 F of the manuscript.

      Regarding the relation to other models, the model by Hachen et al (Ref 45) also has two interacting memory systems. It could be useful to better state the connection, if it exists.

      The model proposed by Hachen et al is conceptually different in that one module stores the mean of the sensory stimulus; it could be related to a variant of our model where adaptation is turned off in the PPC network (Fig S2 A). However, the task they model is also different: subjects have to learn the location of a boundary according to which the stimulus is classified as ‘weak’ or ‘strong’, set by the experimenter. Hence, it is a task where learning is needed - this contrasts with the task we are modelling, where only working memory is required. How task demands reconfigure existing circuits via dynamics and/or learning to perform different computations is a fascinating area of research that is outside the scope of this work.

      Reviewer #2 (Public Review):

      Working memory is not error free. Behavioral reports of items held in working memory display several types of bias, including contraction bias and serial dependence. Recent work from Akrami and colleagues demonstrates that inactivating rodent PPC reduces both forms of bias, raising the possibility of a common cause.

      In the present study, Boboeva, Pezzotta, Clopath, and Akrami introduce circuit and descriptive variants of a model in which the contents of working memory can be replaced by previously remembered items. This volatility manifests as contraction bias and serial dependence in simulated behavior, parsimoniously explaining both sources of bias. The authors validate their model by showing that it can recapitulate previously published and novel behavioral results in rodents and neurotypical and atypical humans.

      Both the modeling and the experimental work are rigorous, providing compelling evidence that a model of working memory in which reports sometimes sample past experience can produce both contraction bias and serial dependence, and that this model is consistent with behavioral observations across rodents and humans in the parametric working memory (PWM) task.

      Evidence for the model advanced by the authors, however, remains incomplete. The model makes several bold predictions about behavior and neural activity, untested here, that either conflict with previous findings or have yet to be reported but are necessary to appropriately constrain the model.

      First, in the most general (descriptive) formulation of the Boboeva et al. model, on a fraction of trials items in working memory are replaced by items observed on previous trials. In delayed estimation paradigms, which allow a more direct behavioral readout of memory items on a trial-by-trial basis than the PWM task considered here, reports should therefore be locked to previous items on a fraction of trials rather than display a small but consistent bias towards previous items. However, the latter has been reported (e.g., in primate spatial working memory, Papadimitriou et al., J Neurophysiol 2014). The ready availability of delayed estimation datasets online (e.g., from Rademaker and colleagues, https://osf.io/jmkc9/) will facilitate in-depth investigation and reconciliation of this issue.

      As pointed out by the reviewer, in the PWM task that we are modelling here, the activity in the network is used to make a binary decision. However, it is possible to directly analyse the network activity before the onset of the second stimulus.

      In their manuscript, Papadimitriou et al. study a memory-guided saccade task in nonhuman primates and argue that the animals display a small but consistent bias towards previous items (Fig 2). In that figure, the authors compute the error as the difference between the saccade direction and target direction in each trial. They compute this error for all trials in which the preceding trial’s target direction is between 35° and 85° relative to the current trial (counterclockwise with respect to the current trial’s target). They discover that the residual error distribution is unimodal with a mode at 1.29° and a mean at 2.21° (positive, so towards the preceding target’s direction), from which they deduce a small but systematic bias towards previous trial targets.

      We have computed a similar measure for our network with default parameters (Table 1), by subtracting the location of the bump at the end of the delay interval (s_hat(t), ‘saccade’) from the initial location of the first stimulus in the current trial (s1(t) or the ‘target’). We have done this for all trials where s1(t)=0.2, and where s2(t-1) takes specific values. These distributions are characterized by two modes. The first corresponds to those trials where the bump is not displaced in WM (i.e. mean of zero). We can also see the appearance of a second mode at the location of s1(t) - s2(t-1), corresponding to the displacements towards the preceding trial’s stimulus described in the main text. If, instead, we limit the analysis to a small range of previous trials close to s1(t) (similar to Papadimitriou et al) then the distribution of residual errors will appear unimodal, as the two modes merge. Importantly, note that there is a large variability around the second mode, expressing a more complex dynamics in the network. As can be seen in Fig 3B, the location of the bump is not always slaved to the one in the PPC in a straightforward way -- due to the adaptation in the PPC, the global inhibition in the connectivity kernel, as well as interleaved design for various delay intervals, the WM bump can be displaced in nontrivial ways (see also Recommendation no 4), yielding the dispersion around the second peak. It remains to be seen whether such patterns can be observed in the data from previous works on continuous working memory recall (including Papadimitriou et al). However, to our knowledge, such detailed and full analysis of errors at the level of individual trials has not been done.

      In summary, this analysis shows that the type of dynamics in our network is not one of the two cases: 1) small and systematic bias in each and every trial or 2) large error that occurs only rarely; rather, the dispersion around both modes suggests that the dynamics in our model are a mixture of these two limit cases.
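
      The relation between the two limit cases can be illustrated with a toy simulation. This is a deliberately simplified descriptive sketch, not the network model: the function name, the jump probability, the noise level, and the stimulus values are all hypothetical.

```python
import numpy as np


def residual_errors(s1, s2_prev, p_jump, noise_sd, n_trials, seed=0):
    """Toy residual errors s1 - s_hat under a 'jump' account.

    On a fraction p_jump of trials the remembered value is replaced by the
    previous trial's second stimulus; otherwise it stays at s1. Gaussian
    noise adds dispersion around both modes.
    """
    rng = np.random.default_rng(seed)
    jumped = rng.random(n_trials) < p_jump
    s_hat = np.where(jumped, s2_prev, s1) + rng.normal(0.0, noise_sd, n_trials)
    return s1 - s_hat, jumped


errors, jumped = residual_errors(s1=0.2, s2_prev=0.6, p_jump=0.25,
                                 noise_sd=0.02, n_trials=20000)
print("mode 1 (no displacement):", errors[~jumped].mean())
print("mode 2 (displaced):      ", errors[jumped].mean())
print("average over all trials: ", errors.mean())
```

      In this sketch the average error is modest even though individual trials fall into two separate modes, and the modes would overlap if s2_prev were chosen close to s1 relative to the noise.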

      We have also performed another typical analysis, reported in several continuous recall tasks (e.g. Jazayeri and Shadlen 2010) where contraction bias has been reported. We plot WM bump locations after the delay period for every trial (s_hat(t)), and their averages, against the nominal value of s1(t). We see that the mean WM location deviates from the identity line toward the mean values of s1(t), again showing contraction bias as an average effect, while individual trials follow the dynamics explained above.

      We have now included a new section on continuous recall (Sect. 1.5 and a new figure (Fig 5)), which details the two above-mentioned analyses. The analysis of freely available datasets of delayed estimation tasks, unfortunately, is out of the scope of this work, and we leave such analyses to future studies.

      Second, the bulk of the modeling efforts presented here are devoted to a circuit-level description of how putative posterior parietal cortex (PPC) and working-memory (WM) related networks may interact to produce such volatility and biases in memory. This effort is extremely useful because it allows the model to be constrained by neural observations and manipulations in addition to behavior, and the authors begin this line of inquiry here (by showing that the circuit model can account for effects of optogenetic inactivation of rodent PPC).

      Further experiments, particularly electrophysiology in PPC and WM-related areas, will allow further validation of the circuit model. For example, the model makes the strong prediction that WM-related activity should display 'jumps' to states reflecting previously presented items on some trials. This hypothesis is readily testable using modern high-density recording techniques and single-trial analyses.

      As mentioned in response to the previous comment, we note again that in the WM network, the bump ‘displacement’ has a complex dynamics -- the examples we have provided in Fig 1A and 2B mainly show the cases in which jumps occur in the WM network, but this is not the only type of dynamics we observe in the model. We do have instances in which the continuity of the model causes drift across values, and we have now replaced the right panel in Fig 2B with one such instance, in order to emphasize that this displacement towards the previous trial’s stimulus (s2(t-1)) can occur in various ways. For a more thorough analysis, we have analyzed the distance between s1(t) and the position of the bump in the WM network at the end of the delay period s_hat(t), conditioned on specific values of s1(t) and s2(t-1) (Fig 5C). In this figure, we can see the appearance of two modes: one centered around 0, corresponding to the correct trials where the stimulus is kept in WM (s1(t) = s_hat(t)), and another mode centered around s2(t-1), the location of the second stimulus of the previous trial, where the bump is displaced. Note, as we explain in Sect. 1.5, the large dispersion around this second mode, which suggests that the bump is not always displaced to that specific location and may undergo drift.

      We agree with the reviewer that future electrophysiological experiments (or analysis of existing datasets) are necessary for validation of these results.

      Finally, while there has been a refreshing movement away from an overreliance on p-values in recent years (e.g., Amrhein et al., PeerJ 2017), hypothesis testing, when used appropriately, provides the reader with useful information about the amount of variability in experimental datasets. While the excellent visualizations and apparently strong effect sizes in the paper mitigate the need for p-values to an extent, the paucity of statistical analysis does impede interpretation of a number of panels in the paper (e.g., the results for the negatively skewed distribution in 5D, the reliability of the attractive effects in 6a/b for 2- and 3- trials back).

      We share the reviewer’s criticism towards the misuse of p-values – in order for a clearer interpretation of old Fig 5D (new Fig 7E), we have looked at the 2 and 3 trials-back biases by using all of our dataset – the negatively skewed, and also two bimodal distributions (of which only one was shown in the manuscript). This larger dataset of 43 subjects (approximately 17,200 trials) allows us to clearly see the 2 and 3 trial back attractive biases, and the effect that the delay interval exerts on them.

      Reviewer #1 (Recommendations For The Authors):

      Fig 5 A&C - It might be beneficial to separate the distribution of stimuli from the performance. It is hard to read the details of the performance, especially with error bars.

      Following the next recommendation, we have replaced the standard deviation with the standard error of the mean; we hope this makes the performance easier to read.

      Fig 5C. The number of participants should be written. Perhaps standard errors instead of standard deviation?

      We have now changed the standard deviation to standard errors of the mean and included the number of participants in the figure.
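
      For reference, the relation between the two error measures can be sketched as below. This assumes one summary value per participant and the sample standard deviation (ddof=1); the function name and the toy values are illustrative only.

```python
import numpy as np


def sd_and_sem(per_subject_values):
    """Return the sample standard deviation and the standard error of the mean."""
    x = np.asarray(per_subject_values, dtype=float)
    sd = x.std(ddof=1)           # spread across participants
    sem = sd / np.sqrt(x.size)   # uncertainty of the group mean
    return sd, sem


# Toy values for four hypothetical participants (not experimental data).
sd, sem = sd_and_sem([0.6, 0.7, 0.8, 0.9])
print(sd, sem)
```

      Because the SEM shrinks with the square root of the number of participants, error bars based on it are narrower than those based on the standard deviation, which is why the participant count should be reported with the figure.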

      Fig 2B - hard to understand, because there is no marking of where "perfect" memory of s1 would be.

      The perfect memory of s1 is shown in the upper panel as black bars.

      Fig 3B. dot number 9 (blue, around 0.7) - why is WM higher than stimulus?

      This trial has a long ISI (blue means 10s). During this delay, the bump in the PPC, under the influence of adaptation, drifts far below the first stimulus (note that the previous trial also had its first stimulus in the same location, as a result of which the adaptive thresholds have built up significantly, causing the bump to move away from that location). During this delay period, neurons in the WM network receive inputs from the PPC network: if this input is strong enough, it can disrupt an existing bump; if not, this input still exerts an inhibiting influence on the existing bump via the global inhibition in the connectivity. This can cause an existing bump to slowly drift in a random direction, and finally dissipate. Note that the lines in Fig 2B represent the neuron with the maximal activity; this activity may be a stable bump, or an unstable bump that may soon dissipate.

      Other examples with similar dynamics include trials 43 and 54.

      L167 fewer -> smaller

      We have now corrected this.

      Fig 3C - bump can also be in between. Is this binned?

      We have not binned the length of the attractor; to produce that figure, we check whether the position of the neuron with the maximal firing rate is within a distance of ±5% of the length of the whole line attractor from the target location.
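
      The criterion described above can be sketched as follows. This is a minimal illustration rather than the code from the repository: it assumes the N neurons tile the line attractor uniformly on [0, 1], and the function name and the synthetic Gaussian bump are hypothetical.

```python
import numpy as np


def bump_near_target(rates, target, tol=0.05):
    """Check whether the most active neuron lies within +/- tol of the target.

    rates  : firing rates of the N neurons at one time point
    target : target location in stimulus space, in [0, 1]
    tol    : tolerance as a fraction of the attractor length (5% by default)
    """
    rates = np.asarray(rates, dtype=float)
    positions = np.linspace(0.0, 1.0, rates.size)  # assumed uniform tiling
    peak = positions[np.argmax(rates)]
    return bool(abs(peak - target) <= tol)


# Synthetic bump centred at 0.3, for illustration only.
positions = np.linspace(0.0, 1.0, 2000)
rates = np.exp(-((positions - 0.3) ** 2) / (2 * 0.02 ** 2))
print(bump_near_target(rates, target=0.32), bump_near_target(rates, target=0.4))
```

      Because the test is applied to the position of the peak rather than to a binned attractor, a bump sitting between two candidate targets is simply counted as near neither.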

      L221 Lapse at the boundary of attractor. This seems very different from behavior. Specifically, if it is in the boundaries, it should be stimulus dependent.

      Very sorry, we did not manage to understand the reviewer’s comment.

      L236 are -> is

      We have now corrected this.

      Fig S4 - should be mostly in main text.

      Part of this figure is in Fig 6A, but given the amount of detail, we think Supplementary Material is better suited.

      L253-254. Differences across all distributions - very minor except the bimodal case.

      That is correct, this is why we conducted the experiment with the bimodal distribution, to better differentiate the predictions of the two models.

      L273 extra comma after "This probability"

      We have now corrected this.

      ITI was only introduced in section 1.5.2. Perhaps worth mentioning the default 5s value earlier in the paper.

      We have now mentioned this in line 97-98.

      Fig S6B title: perhaps "previous stimuli"?

      We have now corrected this.

      L364 "in A given trial"

      Equation 2 - no decay term?

      Thank you for pointing out this error, we have now corrected this.

      Equation 5,6 are j^W and j^P indices of neurons in those populations?

      Yes, j^W indexes neurons in the WM network, and j^P those in the PPC. We have now added this in the text for clarity.

      Bump with adaptation - other REFs? Sandro?

      We are aware of continuous bump attractors implementing short-term synaptic plasticity in various studies (including by Sandro Romani), but not in the form we have described. Could the reviewer kindly point us towards the relevant literature?

      Free boundary - what is the connectivity for neurons 1 and N? Is it weaker than others? Is the integral still 1? Does this induce some bias on the extreme values?

      The connectivity of the network is all-to-all. However, as expressed by Eq. (3), the distance-dependent contribution to the weights, K, decreases exponentially as we move from neuron 1 onwards, and from neuron N down. The sum (or integral, in the large-N limit) of the K_ij for j on either side of neuron i is unity only when i is sufficiently far from 1 or N. We have rephrased the paragraph starting in line 516 to make this clearer.
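
      The boundary effect on the kernel can be illustrated numerically. The sketch below is not Eq. (3) itself: it assumes a purely exponential kernel K_ij proportional to exp(-|x_i - x_j| / d_0), normalised so that its integral over an unbounded line is one, with N = 2000 neurons tiling [0, 1). The function name, the normalisation, and the value of N are assumptions made for illustration.

```python
import numpy as np


def exponential_kernel(n_neurons=2000, d0=0.02):
    """Distance-dependent kernel with free (non-periodic) boundaries.

    Assumed form: K_ij = exp(-|x_i - x_j| / d0) * dx / (2 * d0), so that the
    sum over j approximates an integral equal to 1 far from the edges.
    """
    x = np.arange(n_neurons) / n_neurons
    dx = 1.0 / n_neurons
    dist = np.abs(x[:, None] - x[None, :])
    return np.exp(-dist / d0) * dx / (2.0 * d0)


K = exponential_kernel()
row_sums = K.sum(axis=1)
print("bulk neuron:", row_sums[1000])
print("edge neuron:", row_sums[0])
```

      With these assumptions the row sum is close to one in the bulk and drops to roughly half for neurons 1 and N, since only one side of the kernel is populated there.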

      The presence of a boundary could introduce a bias in theory, but in practice, it affects the dynamics only when the bump drifts sufficiently close to it. The smallest stimulus in the simulated task has amplitude 0.2, with width 0.05, which implies the activation of 50 neurons on either side of neuron 400. If one compares this with the width of the kernel K in stimulus space (d_0 = 0.02), which spans ~10 neurons, we can see that the bump of activity stays mostly far from the boundary. It is possible, though it is observed rarely, when several consecutive long delay intervals happen to occur, that the bump in PPC drifts beyond the location corresponding to either the minimum or maximum stimulus.

      Code availability?

      Code simulating the dynamics of the network as well as analysing the resulting data can be found in the following repository: https://github.com/vboboeva/ParametricWorkingMemory

      Code used to analyse human behavioural data and fit them with our statistical model can be found in this repository: https://github.com/vboboeva/ParametricWorkingMemory_Data

      Code used to run the auditory PWM experiments with human subjects (adapted from Akrami et al 2018) can be found here: https://github.com/vboboeva/Auditory_PWM_human

      L547 stimuli

      We have now corrected this.

      Equation 14 uses both stimuli. Was this the same for the rest of analysis in the paper (first figures for instance)?

      This equation was used for all GLM analyses (Figs 9 and S6).

      D0 is very small (0.02). Does this mean that activity is essentially discrete in the model? Fig 1A & 2B - the two examples of model activity suggest this is the case. In other words - are there cases where the continuity of the model causes drift across values? Can you show an example (similar to Fig 1A)?

      Since this point has been raised beforehand, we refer to the first comment, Fig 2B and Sect. 1.5 for the response to this question.

      Table 1 - inter trial interval 6. Text says 5

      We have now corrected this in the text.

      Reviewer #2 (Recommendations For The Authors):

      In addition to my review above, I just have a few minor comments:

      • If I understood correctly, the squares inside the purple rectangle in Figure 1B are meant to show a gradation from red to blue, but this was hard to make out in the pdf.

      Actually the squares are all on one side or the other of the diagonal, therefore they do not have any gradation.

      • line 164: "The resulting dynamics... [are]?"

      We have corrected this in the text.

      • Fig 7B legend: "The network performance is on average worse for longer ITIs" – correct?

      This was a mistake; we have replaced "worse" with "better".

      Other comments

      We realized that the colorbar reported the incorrect fraction classified in Figs 1B, 2C, 7B (new 8B), S2C, S3A, S5B. We have corrected this in the new version of the manuscript.

      We also found a minor mistake in one of our analysis codes that computed the n-trial back biases for different delay intervals. Correcting it did not change our results; it actually made the effects clearer. The figures concerned are Fig 3F and new Fig 7E.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This article describes a useful python-based image-analysis tool for bacteria growing in the 'mother-machine' microfluidic device. This new method for image segmentation and tracking offers a user-friendly graphical interface based on the previously developed, promising environment for image analysis 'Napari'. The authors demonstrate the usefulness of their software and its robust performance by comparing it to other methods used for the same purpose. The comparison provides solid support for the new method, although it would have been even stronger if tested using data sets from other groups. This article will be of interest for scientists who utilize the 'mother machine', not least because it also provides a short overview of how to set up this widely used device.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors aim to develop an easy-to-use image analysis tool for the mother machine that is used for single-cell time-lapse imaging. Compared with related software, they tried to make this software more user-friendly for non-experts with a design of "What You Put Is What You Get". This software is implemented as a plugin of Napari, which is an emerging microscopy image analysis platform. The users can interactively adjust the parameters in the pipeline with good visualization and interaction interface.


      • Updated platform with great 2D/3D visualization and annotation support.

      • Integrated one-stop pipeline for mother machine image processing.

      • Interactive user-friendly interface.

      • The users can have a visualization of intermediate results and adjust the parameters.

      We thank the reviewer for their positive comments.


      • Based on the presentation of the manuscript, it is not clear that the goals are fully achieved.

      • Although there is great potential, there is little evidence that this tool has been adopted by other labs.

      • The comparison of Otsu and U-Net results does not make much sense to me. The systematic bias could be adjusted by threshold change. The U-Net output is a probability map with floating point numbers. This output is probably thresholded to get a binary mask, which is not mentioned in the manuscript. This threshold could also be adjusted. Actually, Otsu is a segmentation method and U-Net is an image transformation method and they should not be compared together. U-Net output could also be segmented using Otsu.

      We agree that the comparison of the classical and U-Net results may be misleading. As the reviewer points out, the issue ultimately comes down to thresholding. Indeed, the threshold of both the Otsu and U-Net outputs could be adjusted to bring them into line with each other. The comparison between the Otsu pipeline and U-Net pipeline is meant to illustrate that any pipeline (making use of a variety of methods) may be highly susceptible to the value of a user-input (or hard-coded) threshold.

      We have clarified the discussion to emphasize that the comparison is not specifically between U-Net and Otsu but between the two pipelines (lines 238 - 257).

      We have also clarified that the U-Net probability map output was binarized with a threshold of 0.5 (lines 538-541). We note the same activation function and threshold are used in DeLTA. As the reviewer points out, Otsu’s method could indeed be applied to threshold the U-Net output as well. What we referred to as the “Otsu” MM3 method itself uses Otsu thresholding coupled with a Euclidean distance transform and a Random Walker algorithm. For clarity we now refer to it as a classical or non-learning method in the text.
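      The thresholding point can be illustrated with a minimal sketch. It is not taken from napari-MM3 or DeLTA: the synthetic probability values, the helper names `otsu_threshold` and `binarize`, and the bin count are all assumptions made for illustration. It binarizes a toy "probability map" once with the fixed 0.5 cutoff mentioned above and once with a threshold chosen by a plain histogram-based Otsu criterion.

```python
import random


def otsu_threshold(values, nbins=256):
    """Return an Otsu threshold for values assumed to lie in [0, 1]."""
    # Histogram the values into equal-width bins over [0, 1].
    hist = [0] * nbins
    for v in values:
        hist[min(int(v * nbins), nbins - 1)] += 1
    centers = [(i + 0.5) / nbins for i in range(nbins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_t, best_var = 0.0, -1.0
    w0 = 0       # number of values in the lower class
    sum0 = 0.0   # weighted sum of bin centers in the lower class
    for i in range(nbins - 1):
        w0 += hist[i]
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor).
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var = between
            best_t = (i + 1) / nbins  # upper edge of bin i
    return best_t


def binarize(values, threshold):
    """Label each value 1 (foreground) if it reaches the threshold, else 0."""
    return [1 if v >= threshold else 0 for v in values]


def clip01(x):
    return min(max(x, 0.0), 1.0)


# Hypothetical probability map, flattened: background near 0.1, cells near 0.8.
rng = random.Random(0)
prob_map = [clip01(rng.gauss(0.1, 0.05)) for _ in range(700)]
prob_map += [clip01(rng.gauss(0.8, 0.08)) for _ in range(300)]

fixed_mask = binarize(prob_map, 0.5)
otsu_t = otsu_threshold(prob_map)
otsu_mask = binarize(prob_map, otsu_t)

print(f"Otsu threshold: {otsu_t:.3f}")
print("Foreground pixels (fixed 0.5):", sum(fixed_mask))
print("Foreground pixels (Otsu):", sum(otsu_mask))
```

      On a cleanly bimodal map such as this one, the two masks should largely agree; the choice of threshold matters most when probabilities near cell boundaries are spread between the two modes, which is where a systematic size bias can enter.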

      • The diversity of datasets used in this study is limited.

      We have added a section “Testing napari-MM3 on other datasets” (lines 187-196) evaluating the performance of MM3 on 4 datasets (3 E. coli, 1 Corynebacterium glutamicum) from outside our lab, demonstrating its versatility.

      • There is some ambiguity in the main point of this manuscript: the title and figures illustrate a complete pipeline, including imaging, image segmentation, and analysis, while the abstract focuses only on the software MM3. If only MM3 is the focus and contribution of this manuscript, more of the presentation should focus on this software tool. It is also not clear whether the analysis features are integrated with MM3 or not.

      We have added a line (lines 160-162) clarifying that final analysis and plotting must be done outside of napari. MM3 itself processes raw microscopy images, segments cells and reconstructs cell lineages (Figure 2).

      • The impact of this work depends on the adoption of the software MM3. Napari is a promising platform with expanding community. With good software user experience and long-term support, there is a good chance that this tool could be widely adopted in the mother machine image analysis community.

      We thank the reviewer for their endorsement of MM3’s potential.

      • The data analysis in this manuscript is used as a demo of MM3 features, rather than scientific research.

      Reviewer #2 (Public Review):

      The authors present an image-analysis pipeline for mother-machine data, i.e., for time-lapses of single bacterial cells growing for many generations in one-dimensional microfluidic channels. The pipeline is available as a plugin of the python-based image-analysis platform Napari. The tool comes with two different previously published methods to segment cells (classical image transformation and thresholding as well as UNet-based analysis), which compare qualitatively and quantitatively well with the results of widely accessible tools developed by others (BACNET, DeLTA, Omnipose). The tool comes with a graphical user interface and example scripts, which should make it valuable for other mother-machine users, even if this has not been demonstrated yet.

      We thank the reviewer for their positive comments.

      The authors also add a practical overview of how to prepare and conduct mother-machine experiments, citing their previous work and giving more advice on how to load cells using centrifugation. However, the latter part lacks detailed instructions.

      We have added a more detailed experimental protocol, including the procedure we use for cell loading, to the lab github page https://github.com/junlabucsd/mother-machine-protocols (linked in the main text).

      Finally, the authors emphasize that machine-learning methods for image segmentation reproduce average quantities of training datasets, such as the length at birth or division. Therefore, differences in training can propagate to differences in measured average quantities. This result is not surprising and is normally considered a desired property of any machine-learning algorithm as also commented on below.

      Points for improvement:

      Different datasets: The authors demonstrate the use of their method for bacteria growing in different growth conditions in their own microscope. However, they don't provide details on whether they had to adjust image-analysis parameters for each dataset. Similarly, they say that their method also works for other organisms including yeast and C. elegans (as part of the Results section) but they don't show evidence nor do they write whether the method needs to be tuned/trained for those datasets. Finally, they don't demonstrate that their method works on data from other labs, which might be different due to differences in setup or imaging conditions.

      We have added a section “Testing napari-MM3 on other datasets” (lines 187-196) evaluating the performance of MM3 on 4 datasets (3 E. coli, 1 Corynebacterium glutamicum) from outside our lab, demonstrating its versatility. We provide details of the procedure and parameters used in the Methods section. (“Analysis of external datasets” lines 476-486).

      Bias due to training sets:

      The bias in ML-methods based on training datasets is not surprising but arguably a desired property of those methods. Similarly, threshold-based classical segmentation methods are biased by the choice of threshold values and other segmentation parameters. A point that would have profited from discussion in this regard: How to make image segmentation unbiased, that is, how to deliver physical cell boundaries? This can be done by image simulations and/or by comparison with alternative methods such as fluorescence microscopy.

      We agree this is an important point. We have revised the relevant sections (lines 238 - 270) to add context to the discussion of bias in both classical and deep learning methods. We have added a subsection (lines 401 - 410) discussing methods to this end, such as synthetic training data generation or calibrating the segmentation to fluorescence images.

      The authors stress the user-friendliness of their method in comparison to others. For example, they write: 'Unfortunately, many of these tools present a steep learning curve for most biologists, as they require familiarity with command line tools, programming, and image analysis methods.' I suggest instead emphasizing that many of the tools published in recent years are designed to be very user-friendly. And as with all methods, MM3 also comes at a price, which is to install Napari followed by the installation of MM3, which, according to their own instructions, is not easy either.

      We have modified our language to acknowledge that indeed recent software such as DeLTA and BACMMAN make a point to be user-friendly and accessible (lines 52-53).

      Reviewer #1 (Recommendations For The Authors):

      -The resources, including documentation and code, are referenced and are not easy to find. It should be easier for readers to curate them in a separate Resources section.

      We have created a Resources section in the Methods (top of first page) with the documentation, code and protocols hyperlinked.

      • It would be easier to understand the usage of MM3 with a screen recording video. I found a video on the GitHub page, but the resolution is a bit low. Attaching a high-resolution screen recording would be helpful.

      A high resolution tutorial video has been made more visible on the github page.

      • In Table 1, AMD GPU is used which is not easy to use for Deep Learning. It is not clear whether the GPU is used for Deep Learning training and inference.

      We have clarified this point in the Table 1 caption, and linked to a reference on how to use AMD GPUs with Tensorflow on Macs.

      • Some paragraphs in the Discussion section are like blogs with general recommendations. Although the suggestions look pretty useful, it is not the focus of this manuscript. It might be more appropriate to put it in the GitHub repo or a documentation page. The discussion should still focus on the software, such as features, software maintenance, software development roadmap, and community adoption.

      • It would be easier for reviewers to add line numbers in the manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Software Installation: This might be something for the GitHub forum, but briefly trying to install the plugin myself, I already failed at the first line of the GitHub instructions, which is to use mamba for installation. This relates to my point above: Any program that is not stand-alone requires some user-savviness and trial-and-error, which is just hard to avoid for any method. I suggest being less critical of 'other methods' and instead focus on the advantage of the mother-machine-specific aspects of napari-mm3.

      The authors write 'Still, most labs do not have the time and resources to evaluate other tools they do not use critically, [...]'. The sentence is not very clear. Evaluating tools not used is obviously difficult/impossible.

      We have reworded this sentence to be more clear (lines 54-55).

      The authors write: 'The supervised learning method uses a convolutional neural net (CNN) with the U-Net architecture [20].' Can the authors cite previous work that has taken advantage of this approach before (e.g., DeLTA)?

      We have added citations to DeLTA and other previous software (line 151).

      Cell tracking and lineage reconstruction should be described in more detail and/or with reference to previous work.

      We have added more details to the SI (lines 554 - 567) discussing the method in the context of existing mother machine analysis software.

      The authors provide a figure for a '3D printed cell loader', but as far as I can see, they don't give instructions including a CAD file and the model of the fan used for spinning. The same holds for the stage inset (which, as far as I see, is not referred to in the manuscript text nor described in a figure caption).

      Thank you for pointing out this omission. The centrifuge is referenced in Box 1. We have updated the manuscript with a link to a Github repository containing CAD files & details of the centrifuge construction. We decided to remove the stage insert from the figure.

      Figure S3: Is the asymmetry in growth rate due to the expression of a fluorescent protein, due to strain differences, or due to imaging artifacts? Maybe this is impossible to tell based on the available datasets, but this could be discussed.

      Based on previous work (DOI 10.1099/mic.0.057240-0) it is likely due to the expression of the fluorescent protein and fluorescence imaging. We have added a brief discussion in the Figure S3 caption.

    2. eLife assessment

      This article provides a review and test of image-analysis methods for bacteria growing in the 'mother-machine' microfluidic device, also introducing a new graphical user interface for the computational analysis of mother-machine movies based on the 'Napari' environment. The tool allows users to segment cells based on two previously published methods (classical image transformation and thresholding as well as UNet-based analysis), with solid evidence for their robust performance based on comparison with other methods and use of datasets from other labs. While it was difficult to assess the user-friendliness of the new GUI, it appears to be valuable and promising for the field.

    3. Reviewer #1 (Public Review):

      The authors aim to develop an easy-to-use image analysis tool for the mother machine that is used for single-cell time-lapse imaging. Compared with related software, they tried to make this software more user-friendly for non-experts with a design of "What You Put Is What You Get". This software is implemented as a plugin of Napari, which is an emerging microscopy image analysis platform. The users can interactively adjust the parameters in the pipeline with good visualization and interaction interface.

      Strengths:<br /> - Updated platform with great 2D/3D visualization and annotation support.<br /> - Integrated one-stop pipeline for mother machine image processing.<br /> - Interactive user-friendly interface.<br /> - The users can have a visualization of intermediate results and adjust the parameters.

      Weaknesses:<br /> - Based on the presentation of the manuscript, it is not clear that the goals are fully achieved.<br /> - Although there is great potential, there is little evidence that this tool has been adopted by other labs.<br /> - The diversity of datasets used in this study is limited.<br /> - Some paragraphs in the Discussion section are like blogs with general recommendations. Although the suggestions look pretty useful, they are not the focus of this manuscript. It might be more appropriate to put them in the GitHub repo or a documentation page. The discussion should still focus on the software, such as features, software maintenance, software development roadmap, and community adoption.

      A discussion of the likely impact of the work on the field, and the utility of the methods and data to the community.<br /> - The impact of this work depends on the adoption of the software MM3. Napari is a promising platform with an expanding community. With good software user experience and long-term support, there is a good chance that this tool could be widely adopted in the mother machine image analysis community.<br /> - The data analysis in this manuscript is used as a demo of MM3 features, rather than scientific research.

    4. Reviewer #2 (Public Review):

      The authors present an image-analysis pipeline for mother-machine data, i.e., for time-lapses of single bacterial cells growing for many generations in one-dimensional microfluidic channels. The pipeline is available as a plugin of the python-based image-analysis platform Napari. The tool comes with two different previously published methods to segment cells (classical image transformation and thresholding as well as UNet-based analysis), which compare qualitatively and quantitatively well with the results of widely accessible tools developed by others (BACNET, DeLTA, Omnipose). The tool comes with a graphical user interface and example scripts, which should make it valuable for other mother-machine users, even if this has not been demonstrated yet.

      The authors also add a practical overview of how to prepare and conduct mother-machine experiments, citing their previous work, referring to detailed instructions on their github page, and giving more advice on how to load cells using centrifugation.

      Finally, the authors emphasize that machine-learning methods for image segmentation reproduce average quantities of training datasets, such as the length at birth or division. Therefore, differences in training can propagate to differences in measured average quantities. This result is not surprising but good to remember before interpreting absolute measurements of cell shape.

    1. eLife assessment

      SCARF1 is a scavenger membrane-bound receptor that binds modified versions of lipoproteins and has a major role in maintaining lipid homeostasis. This useful study reports the crystal structure of SCARF1 and identifies putative binding sites for modified lipoproteins. While some aspects of the analysis are incomplete, others are solid, and overall, the study advances our knowledge of how scavenger receptors clear modified lipoproteins to maintain lipid homeostasis.

    2. Reviewer #1 (Public Review):

      Summary:<br /> This study provides an incremental advance to the scavenger receptor field by reporting the crystal structures of the domains of SCARF1 that bind modified LDL such as oxidized LDL and acylated LDL. The crystal packing reveals a new interface for the homodimerization of SCARF1. The authors characterize SCARF1 binding to modified LDL using flow cytometry, ELISA, and fluorescent microscopy. They identify a positively charged surface on the structure that they predict will bind the LDLs, and they support this hypothesis with a number of mutant constructs in binding experiments.

      Strengths:<br /> The authors have crystallized domains of an understudied scavenger receptor and used the structure to identify a putative binding site for modified LDL particles. An especially interesting set of experiments is the SCARF1 and SCARF2 chimeras, where they confer binding of modified LDLs to SCARF2, a related protein that does not bind modified LDLs, and show that the key residues in SCARF1 are not conserved in SCARF2.

      Weaknesses:<br /> While the data largely support the conclusions, the figures describing the structure are cursory and do not provide enough detail to interpret the model or quality of the experimental X-ray structure data. Additionally, many of the flow cytometry experiments lack negative controls for non-specific LDL staining and controls for cell surface expression of the SCARF constructs. In several cases, the authors interpret single data points as increased or decreased affinity, but these statements need dose-response analysis to support them. These deficiencies should be readily addressable by the authors in the revision.

      The paper is a straightforward set of experiments that identify the likely binding site of modified LDL on SCARF1 but adds little in the way of explaining or predicting other binding interactions. That a positively charged surface on the protein could mediate binding to LDL particles is not particularly surprising. This paper would be of greater importance if the authors could explain the specificity of the binding of SCARF1 to the various lipoparticles that it does or does not bind. Incorporating these mutants into an assay for the biological role of SCARF1 would be powerful.

    3. Reviewer #2 (Public Review):

      Summary:<br /> The manuscript by Wang and colleagues provided mechanistic insights into SCARF1 and its interactions with the lipoprotein ligands. The authors reported two crystal structures of the N-terminal fragments of SCARF1 ectodomain (ECD). On the basis of the structural analysis, the authors further investigated the interactions between SCARF1 and modified LDLs using cell-based assays and biochemical experiments. Together with the two structures and supporting data, this work provided new insights into the diverse mechanisms of scavenger receptors and especially the crucial role of SCARF1 in lipid metabolism.

      Strengths:<br /> The authors started by determining the crystal structures of two fragments of SCARF1 ECD. The superposition of the two high-resolution structures, together with the predicted model by AlphaFold, revealed that the ECD of SCARF1 adopts a long-curved conformation with multiple EGF-like domains arranged in tandem. Non-crystallographic and crystallographic two-fold symmetries were observed in crystals of f1 and f2 respectively, indicating the formation of SCARF1 homodimers. Structural analysis identified critical residues involved in dimerization, which were validated through mutational experiments. In addition, the authors conducted flow cytometry and confocal experiments to characterize cellular interactions of SCARF1 with lipoproteins. The results revealed the vital role of the 133-221aa region in the binding between SCARF1 and modified LDLs. Moreover, four arginine residues were identified as crucial for modified LDL recognition, highlighting the contribution of charge interactions in SCARF1-lipoprotein binding. The lipoprotein binding region is further validated by designing SCARF1/SCARF2 chimeric molecules. Interestingly, the interaction between SCARF1 and modified LDLs could be inhibited by teichoic acid, indicating potential overlap in or sharing of binding sites on SCARF1 ECD.

      The authors employed a nice collection of techniques, namely crystallography, SEC, DLS, flow cytometry, ELISA, and confocal imaging. The experiments are technically sound and the results are clearly written, with a few concerns as outlined below. Overall, this research represents an advancement in the mechanistic investigation of SCARF1 and its interaction with ligands. The role of scavenger receptors is critical in lipid homeostasis, making this work of interest.

    4. Reviewer #3 (Public Review):

      Summary:<br /> The manuscript by Wang et al. described the crystal structures of the N-terminal fragments of Scavenger receptor class F member 1 (SCARF1) ectodomains. SCARF1 recognizes modified LDLs, including acetylated LDL and oxidized LDL, and it plays an important role in both innate and adaptive immune responses. They characterized the dimerization of SCARF1 and the interaction of SCARF1 with modified lipoproteins by mutational and biochemical studies. The authors identified the critical residues for dimerization and demonstrated that SCARF1 may function as homodimers. They further characterized the interaction between SCARF1 and LDLs and identified the lipoprotein ligand recognition sites, the highly positively charged areas. Their data suggested that the teichoic acid inhibitors may interact with SCARF1 in the same areas as LDLs.

      Strengths:<br /> The crystal structures of SCARF1 were high quality. The authors performed extensive site-specific mutagenesis studies using soluble proteins for ELISA assays and surface-expressed proteins for flow cytometry.

      Weaknesses:<br /> 1. The schematic drawing of human SCARF1 and SCARF2 in Fig 1A did not show the differences between them. It would be useful to have a sequence alignment showing the polymorphic regions.<br /> 2. The description of structure determination was confusing. The f1 crystal structure was determined by SAD with Pt derivatives. Why did they need molecular replacement with a native data set? The f2 crystal structure was solved by molecular replacement using the structure of the f1 fragment. Why did they need to use EGF-like fragments predicted by AlphaFold as search models?<br /> 3. It's interesting to observe that SCARF1 binds modified LDLs in a Ca2+-independent manner. The authors performed the binding assays between SCARF1 and modified LDLs in the presence of Ca2+ or EDTA on Page 9. However, EDTA is not an efficient Ca2+ chelator. The authors should have performed the binding assays in the presence of EGTA instead.<br /> 4. The authors claimed that SCARF1Δ353-415, the deletion of a C-terminal region of the ectodomain, might change the conformation of the molecule and generate hindrance for the C-terminal regions. Why didn't SCARF1Δ222-353 have a similar effect? Could the deletion change the interaction between SCARF1 and the membrane? Is the SCARF1Δ353-415 region hydrophobic?<br /> 5. What was the point of having Figure 8? Was it to show that SCARF1 homodimers could form the two proposed types of dimers on the membrane surface? The authors didn't have any data to support that.

    1. Author Response

      The authors appreciate the reviewers' thoughtful and constructive feedback. We are pleased to have the opportunity to address their comments through a revised version to strengthen our work. In particular:

      (1) As suggested, we will add references/details in Methods to further help readers to establish the cohort as population-derived and clarify details about the analysis and specificity of results.

      (2) We agree that reserve, inefficiency, and compensation are complex issues needing more discussion. We will add definitions and discussion to clarify our approaches, including multivariate/univariate analyses and addressing the specificity of results. We also appreciate the suggestions for future research directions.

      A revised version addressing these valuable recommendations will improve our study's contribution towards quantitative methods for understanding reserve and compensation in healthy cognitive ageing.

    2. eLife assessment

      This study provides an important advancement of knowledge by showing neural functional compensation in the brains of healthy older adults completing a fluid-intelligence task. Validated whole-brain voxel-wide analyses and multivariate Bayesian approaches provide compelling evidence that supports the claims of the authors. The work delivers methods for quantifying reserve and compensation in future studies and will be of interest to researchers in the field of the neuroscience of healthy aging.

    3. Reviewer #1 (Public Review):

      Summary:<br /> This work addresses how to quantify functional compensation throughout the aging process and identifies brain regions that engage in compensatory mechanisms during the Cattell task, a measure of fluid cognition. The authors find that regions of the frontal cortex and cuneus showed unique effects of both age and performance. Interestingly, these two regions demonstrated differential activation patterns taking into account both age and performance. Specifically, the researchers found that the relationship between performance and activation in the cuneal ROI was strongest in older adults and was not found in younger adults. These findings suggest that, specifically within the cuneus, greater activation is needed by older adults to maintain performance, suggestive of functional compensation.

      Strengths:<br /> The conclusions derived from the study are well supported by the data. The authors validated the use of the in-scanner Cattell task by demonstrating high reliability in the same sample with the standard out-of-scanner version. Some strengths of the study include the large sample size and wide age range of participants. The authors use a stringent Bayes factor of 20 to assess the strength of evidence. The authors used a whole-brain approach to define regions of interest (ROIs) based on activation patterns that were jointly related to age and performance. Overall, the methods are technically sound and support the authors' conclusions.

      Weaknesses:<br /> While the manuscript is methodologically sound, the following aspects of image acquisition and data analysis need to be clarified to ensure replicability and reproducibility. The authors state that the sample is a "population-derived adult lifespan sample", but the lack of demographic information makes it impossible to know if the sample is truly representative. Though this may seem inconsequential, education may impact both cognitive performance and functional activation patterns. Moreover, the authors do not report race/ethnicity in the manuscript. This information is essential to ensure representativeness in the sample. It is imperative that barriers to study participation within minoritized groups are addressed to ensure rigor and reproducibility of findings.

      For the whole-brain analysis in which the ROIs were derived, the authors used threshold-free cluster enhancement (TFCE; Smith & Nichols 2009). The methodological paper cited suggests that individuals' TFCE image should still be corrected for multiple comparisons using the following: "to correct for multiple comparisons, one [...] has to build up the null distribution (across permutations of the input data) of the maximum (across voxels) TFCE score, and then test the actual TFCE image against that. Once the 95th percentile in the null distribution is found then the TFCE image is simply thresholded at this level to give inference at the p < 0.05 (corrected) level." (Smith & Nichols, 2009). Although the authors mention that clusters were estimated using 2000 permutations, there is no mention of the TFCE image itself being thresholded. While this would impact the overall size of the ROIs used in the study, the remaining analyses are methodologically sound.
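      The max-statistic correction quoted above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the per-voxel statistic is a plain across-subject mean standing in for a TFCE score, the permutation scheme is one-sample sign-flipping, and the data, effect size, and function names (`voxel_means`, `max_stat_threshold`) are all hypothetical.

```python
import random


def voxel_means(data):
    """Mean across subjects for each voxel (a stand-in for a TFCE score)."""
    n_subj = len(data)
    n_vox = len(data[0])
    return [sum(row[v] for row in data) / n_subj for v in range(n_vox)]


def max_stat_threshold(data, n_perm=1000, alpha=0.05, seed=0):
    """Corrected threshold from the permutation null of the maximum statistic."""
    rng = random.Random(seed)
    null_max = []
    for _ in range(n_perm):
        # Randomly flip the sign of each subject's whole map.
        signs = [rng.choice((-1, 1)) for _ in data]
        flipped = [[s * x for x in row] for s, row in zip(signs, data)]
        # Keep only the maximum statistic across voxels for this permutation.
        null_max.append(max(voxel_means(flipped)))
    null_max.sort()
    # Position of the (1 - alpha) quantile in the sorted null distribution.
    k = min(int((1 - alpha) * n_perm), n_perm - 1)
    return null_max[k]


# Synthetic data: 30 subjects x 40 voxels of unit-variance noise,
# with a hypothetical true effect of 2.0 in the first five voxels.
rng = random.Random(1)
n_subj, n_vox, n_signal = 30, 40, 5
data = [
    [rng.gauss(2.0 if v < n_signal else 0.0, 1.0) for v in range(n_vox)]
    for _ in range(n_subj)
]

observed = voxel_means(data)
threshold = max_stat_threshold(data)
significant = [v for v, m in enumerate(observed) if m > threshold]

print(f"Corrected threshold: {threshold:.3f}")
print("Voxels surviving correction:", significant)
```

      Because each permutation contributes only its maximum across voxels, thresholding the observed map at the 95th percentile of that null distribution controls the family-wise error rate at 0.05, which is the step the quoted passage describes.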

    4. Reviewer #2 (Public Review):

      This work by Knights et al. makes use of the Cam-CAN dataset to investigate functional compensation during a fluid processing task in older adults, in a fairly large sample of approximately 200 healthy adults ranging in age from 19 to 87. Using univariate methods, the authors identify two brain regions in which activity increases as a function of both age and performance and conduct further investigations to assess whether the activity of these regions provides information regarding task difficulty. The authors conclude that the cuneal cortex - a region of the brain previously implicated in visual attention - shows evidence of compensation in older adults.

      The conclusions of the paper are well supported by the data, and the authors use appropriate statistical analyses. The use of multivariate methods over the last 20 years has demonstrated many effects that would have been missed using more traditional univariate analysis techniques. The data set is also of an appropriate size, and as the authors note, fluid processing is an extremely important domain in the field of cognition in aging, due to its steep decline over aging. However, it might have been nice to see an analysis of a more crystallised intelligence task included too, as a contrast since this is an area that does not demonstrate such a decline (and perhaps continues to improve over aging).

    5. Reviewer #3 (Public Review):

      This neuroimaging study investigated how brain activity related to visual pattern-based reasoning changes over the adult lifespan, addressing the topic of functional compensation in older age. To this end, the authors employed a version of the Cattell task, which probes visual pattern recognition for identifying commonalities and differences within sets of abstract objects in order to infer the odd object among a given set. Using a state-of-the-art univariate analysis approach on fMRI data from a large lifespan sample, the authors identified brain regions in which the activation contrast between hard and easy Cattell task conditions was modulated by both age and performance. Regions identified comprised prefrontal areas and bilateral cuneus. Applying a multivariate decoding approach to activity in these regions, the authors went on to show that only in older adults, the cuneus, but not the prefrontal regions, carried information about the task condition (hard vs. easy) beyond that already provided by activity patterns of voxels that showed a univariate main effect of task difficulty. This was taken as compelling evidence for task-specific compensatory activity in the cuneus in advanced age.

      The study is well-motivated and well-written. The authors used appropriate, rigorous methods that allowed them to control for a range of possible confounds or alternative explanations. Laudable aspects include the large sample with a wide and even age distribution, the validation of the in-scanner task performance against previous results obtained with a more standard version outside the scanner, and the control for vascular age-related differences in hemodynamic activity via a BOLD signal amplitude measure obtained from a separate resting-state fMRI scan. Overall, the conclusions are well-supported by the data.

      In the following, I list some points of discussion that I would like to see addressed by the authors in a revision:

      1) I don't quite follow the argumentation that compensatory recruitment would need to show via non-redundant information carried by any given non-MDN region (cf. p14). Wouldn't the fact that a non-MDN region carries task-related information be sufficient to infer that it is involved in the task and, if activated increasingly with increasing age, that its stronger recruitment reflects compensation, rather than inefficiency or dedifferentiation? Put differently, wouldn't "more of the same" in an additional region suffice to qualify as compensation, as compared to the "additional information in an additional region" requirement set by the authors? As a consequence, in my honest opinion, showing that decoding task difficulty from non-MDN ROIs works better with higher age would already count as evidence for compensation, rather than asking for age-related increases in decoding boosts obtained from adding such ROIs. It would be interesting to see whether the arguably redundant frontal ROI would satisfy this less demanding criterion. At any rate, it seems useful to show whether the difference in log evidence for the real vs. shuffled models is also related to age.

      2) Relatedly, does the observed boost in decoding by adding the cuneal ROI (in older adults) really reflect "additional, non-redundant" information carried by this ROI? Or could it be that this boost is just a statistical phenomenon that is obtained because the cuneus just happens to show a more clear-cut, less noisy difference in hard vs. easy task activation patterns than does the MDN (which itself may suffer from increased neural inefficiency in older age), and thus the cuneus improves decoding performance without containing additional (novel) pieces of information (but just more reliable ones)? If so, the compensation account could still be maintained by reference to the less demanding rationale for what constitutes compensation laid out above.

      3) On page 21, the authors state that "...traditional univariate criteria alone are not sufficient for identifying functional compensation." To me, this conclusion is quite bold as I'd think that this depends on the univariate criterion used. For instance, it could be argued that compensation should be more clearly indicated by an over-additive interaction as observed for the relationship of cuneal activity with age and performance (i.e., the activity increase with better performance becomes stronger with age), rather than by an additive effect of age and performance as observed for the prefrontal ROI (see Fig. 2C). In any case, I'd appreciate it if the authors discussed this issue and the relationship between univariate and multivariate results in more detail (e.g. how differences in sensitivity between the two approaches may have contributed), in particular since the sophisticated multivariate approach used here is not widely established in the field yet.

      4) As to the exclusion of poorly performing participants (see p24): If only based on the absolute number of errors, wouldn't you miss those who worked (overly) slowly but made few errors (possibly because of adjusting their speed-accuracy tradeoff)? Wouldn't it be reasonable to define a criterion based on the same performance measure (correct - incorrect) as used in the main behavioural analyses?

      5) Did the authors consider testing for negative relationships between performance and brain activity, given that there is some literature arguing that neural efficiency (i.e. less activation) is the hallmark of high intelligence (i.e. high performance levels in the Cattell task)? If that were true, at least for some regions, the set of ROIs putatively carrying task-related information could be expanded beyond that examined here. If no such regions were found, it would provide some evidence bearing on the neural efficiency hypothesis.

    1. Author Response

      Reviewer #1 (Public Review):

      In this work, the authors have explored how treating C. albicans fungal cells with EDTA affects their growth and virulence potential. They then explore the use of EDTA-treated yeast as a whole-cell vaccine in a mouse model of systemic infection. In general, the results of the paper are unsurprising. Treating yeast cells with EDTA affects their growth and the addition of metals rescues the phenotype. Because of the significant growth defects of the cells, they don't infect mice and you see reduced virulence. Injection with these cells effectively immunises the mice, in the same way that heat-killed yeast cells would. The data is fairly sound and mostly well-presented, and the paper is easy to follow. However, I feel the data is an incremental advance at best, and the immune analysis in the paper is very basic and descriptive.


      Detailed analysis of EDTA-treated yeast cells


      • Basic immune data with little advance in knowledge.

      • No comparison between their whole-cell vaccine and others tried in the field.

      • The data is largely unsurprising and not novel.

      Thank you so much for appreciating our effort to generate a live whole-cell vaccine by EDTA treatment. Also, we appreciate your comment that the manuscript is sound and well-presented. However, we are afraid that the respected reviewer assumed the CAET cells to be dead. CAET cells are alive; they simply replicate more slowly than the wild type. Since the respected reviewer presumed CAET to be a dead strain similar to heat-killed cells, most of his/her comments were partly negative.

      Reviewer #2 (Public Review):


      Invasive fungal infections are very difficult to treat with limited drug options. With the increasing concern of drug resistance, developing an antifungal vaccine is a high priority. In this study, the authors studied the metal metabolism in Candida albicans by testing some chelators, including EDTA, to block the metal acquisition and metabolism by the fungus. Interestingly, they found EDTA-treated yeast cells grew poorly in vitro and were non-pathogenic in vivo in a murine model. Mice immunized by EDTA-treated Candida (CAET) were protected against challenge with wild-type Candida cells. RNA-Seq analysis to survey the gene expression profile in response to EDTA treatment in vitro revealed upregulation of genes in metal homeostasis and downregulation of ribosome biogenesis. They also revealed an induction of both pro- and anti-inflammatory cytokines involved in Th1, Th2 and Th17 host immune response in response to CAET immunization. Overall, this is an interesting study with translational potential.


      The main strength of the report is that the authors identified a potential whole-cell live vaccine strain that can provide full protection against candidiasis. Abundant data both on in vitro phenotype, gene expression profile, and host immune response have been presented.


      A weakness is that the immune mechanism of CAET-mediated host protection remains unclear. The immune data is somewhat confusing. The authors only checked cytokines and chemokines in blood. The immune response in infected tissues and antibody response may be investigated.

      Thank you very much for appreciating our work and finding our strain to be a live whole-cell vaccine strain with translational potential. Since the current study focused on the identification and detailed characterization of a non-genetically modified live attenuated strain and its safety and efficacy as a potential vaccine candidate in the preclinical model, we have excluded the possible immune mechanisms involving CAET. We are in the process of developing another manuscript where we describe both cellular and molecular mechanisms that provide protective immunity in CAET-vaccinated mice.

      Reviewer #3 (Public Review):


      The authors are trying to find a vaccine solution for invasive candidiasis.


      The testing of the antifungal activity of EDTA on Candida is not new as many other papers have examined this effect. The novelty here is the use of this EDTA-treated strain as a vaccine to protect against a secondary challenge with wild-type Candida.


      However, data presented in Figure 5 and Figure 6 are not convincing and need further experimental controls and analysis as the authors do not show a time-dependent effect on the CFU of their vaccine formulation. The methodology used is also an issue. As it stands, the impact is minor.

      Thank you so much for appreciating our efforts to develop a novel vaccine against fungal infections. Although Figs. 5 and 6 are the main strength of the paper, we are afraid that this respected reviewer found them not convincing.

    2. Reviewer #1 (Public Review):

      In this work, the authors have explored how treating C. albicans fungal cells with EDTA affects their growth and virulence potential. They then explore the use of EDTA-treated yeast as a whole-cell vaccine in a mouse model of systemic infection. In general, the results of the paper are unsurprising. Treating yeast cells with EDTA affects their growth and the addition of metals rescues the phenotype. Because of the significant growth defects of the cells, they don't infect mice and you see reduced virulence. Injection with these cells effectively immunises the mice, in the same way that heat-killed yeast cells would. The data is fairly sound and mostly well-presented, and the paper is easy to follow. However, I feel the data is an incremental advance at best, and the immune analysis in the paper is very basic and descriptive.


      Detailed analysis of EDTA-treated yeast cells


      - Basic immune data with little advance in knowledge.

      - No comparison between their whole-cell vaccine and others tried in the field.

      - The data is largely unsurprising and not novel.

    3. Reviewer #2 (Public Review):


      Invasive fungal infections are very difficult to treat with limited drug options. With the increasing concern of drug resistance, developing an antifungal vaccine is a high priority. In this study, the authors studied the metal metabolism in Candida albicans by testing some chelators, including EDTA, to block the metal acquisition and metabolism by the fungus. Interestingly, they found EDTA-treated yeast cells grew poorly in vitro and were non-pathogenic in vivo in a murine model. Mice immunized by EDTA-treated Candida (CAET) were protected against challenge with wild-type Candida cells. RNA-Seq analysis to survey the gene expression profile in response to EDTA treatment in vitro revealed upregulation of genes in metal homeostasis and downregulation of ribosome biogenesis. They also revealed an induction of both pro- and anti-inflammatory cytokines involved in Th1, Th2 and Th17 host immune response in response to CAET immunization. Overall, this is an interesting study with translational potential.


      The main strength of the report is that the authors identified a potential whole-cell live vaccine strain that can provide full protection against candidiasis. Abundant data both on in vitro phenotype, gene expression profile, and host immune response have been presented.


      A weakness is that the immune mechanism of CAET-mediated host protection remains unclear. The immune data is somewhat confusing. The authors only checked cytokines and chemokines in blood. The immune response in infected tissues and antibody response may be investigated.

    4. Reviewer #3 (Public Review):


      The authors are trying to find a vaccine solution for invasive candidiasis.


      The testing of the antifungal activity of EDTA on Candida is not new as many other papers have examined this effect. The novelty here is the use of this EDTA-treated strain as a vaccine to protect against a secondary challenge with wild-type Candida.


      However, data presented in Figure 5 and Figure 6 are not convincing and need further experimental controls and analysis as the authors do not show a time-dependent effect on the CFU of their vaccine formulation.

      The methodology used is also an issue. As it stands, the impact is minor.

    1. Author Response

      Public Reviews:

      Reviewer #1 (Public Review):

      The paper by Perovic and colleagues describes how important blood vessels called collaterals form during development and remodel/expand upon injury to the brain. These vessels are conduits between arteries that do not have strong blood flow physiologically but upon injury can compensate for conduit loss. Published work by others is largely descriptive and does not address the cellular sources of collaterals over time. Here elegant lineage tracing is used to better understand the source of vascular endothelial cells during embryonic development, and how these lineages contribute to remodeling upon injury. The work is ambitious and important as collateral capacity can strongly influence the trajectory of outcomes with vascular blockage. The work reveals that proliferative arterial EC is the primary contributor to the collaterals developmentally, with a small contribution from capillary/venous EC, and that this shifts to almost completely arterial contribution from birth onward. There are several aspects of the work that, if addressed, would strengthen the study and better support the interesting and novel conclusions, including analysis of non-collateral lineage contributions, more careful interpretation of fixed image data, and more careful annotation of the image panels.

      We thank the reviewer for appreciating the ambition, importance and novelty of our work, and for the constructive suggestions for improvements.

      Reviewer #2 (Public Review):

      Pial collateral vessels are anastomotic connections that cross-connect distal arterioles of the middle, anterior, and posterior cerebral arteries. With respect to ischemic stroke, good pial collateral flow positively correlates with decreased infarct volume and improved recovery; accordingly, optimizing collateral flow represents an important intervention for limiting stroke damage. The goal of this study was to determine the endothelial cell (EC) subtype(s) that contribute to the embryonic and neonatal development of pial collaterals and their expansion in response to stroke. To this end, the authors used lineage tracing methods in the mouse, labeling arterial endothelial cells (using Bmx-CreERT on switch line, R26mTmG) or venous and microvascular endothelial cells (using Vegfr3-CreERT on R26mTmG) and assessing pial collaterals via confocal microscopy. The authors convincingly demonstrate that arterial-lineage ECs comprise the majority of pial collateral ECs during development and in adulthood, with a minor contribution from pial plexus-derived microvascular ECs that decline over time. They also convincingly demonstrate that pial collateral outward remodeling after experimentally-induced stroke (distal middle cerebral artery occlusion, or dMCAO) involves, at least in part, local proliferation of arterial-lineage ECs. The latter is intriguing given that arterial ECs generally leave the cell cycle. While these conclusions are quite solid, some key details are missing that could improve analysis, and some important caveats are not addressed. Moreover, less convincing are mechanistic claims that pial collaterals form via a migratory process of "mosaic colonization" of a preexisting vessel.

      We thank the reviewer for the careful assessment and suggestions for improvements. Claiming migratory behaviour from static images is indeed always tricky and comes with caveats. Our conclusions however are based on the appearance of cells in locations where they are not found at earlier stages. Given that we could exclude persistent recombination, a sound conclusion must be that cells appear in the new location through some means of translocation. Given our experience with the morphology of migrating cells in vivo, the appearance of polarized filopodial structures coinciding with the direction of observed appearance of cells at progressively later stages strongly suggests active migration. Moreover, these highly migrating cells also exhibit ICAM2 positivity, suggesting that they are directly lining the pre-collateral lumen. In our explanation of how the immigration might occur, we would need to consider solitary cell migration through interstitial space, or rather intercalation movement. The active participation of migrating cells in lumen formation of the nascent pre-collateral suggests intercalation, but further analysis needs to be performed (such as a detailed analysis of cell-cell junctions or sustained apico-basal polarity). The conclusion that such a process highlights mosaic colonization of preexisting vessels is tightly linked to the demonstration of continuous lumen, whilst being found in a vessel without lineage marker, but beginning expression of arterial markers such as Cx40.

      1) It is difficult to understand whether individual collaterals are truly mosaic vessels, or whether arterial or venous/microvascular lineage ECs predominate in any particular region of the pial collateral vasculature. This is due to a number of methodological reasons: arterial and venous/microvascular contributions to pial collaterals were assessed independently, only a few (and in some cases, just one) collaterals were analyzed in each mouse, and regionality/location of collaterals was not addressed. Additionally, the inefficiency and variability of EC labeling, especially with the Vegfr3-CreERT line (Fig. S1, ~6-30%), compounds this problem.

      Factual error: 6 - 22% (not 30)

      The reviewer is correct in their statement that the independent assessment of contribution makes it difficult to locally demonstrate mosaicism. However, we are not aware of a method that could trace two different populations from different sources using recombination genetics simultaneously. Mosaicism however can be concluded from two observations independently. One, we find contribution from an alternative source that at the time point of labelling does not colocalize with arterial BMX lineage cells. Second, the BMX-lineage labelling is never complete in the collaterals, at least at developmental stages. Future work using scRNA seq may shed more light onto the degree of mosaicism. However at this point, the data strongly suggest mosaicism, even if the majority of the cells are of the BMX-lineage. The comment on inefficiency or variability of labelling in particular with the Vegfr3-CreERT line is interesting. At this point, we cannot rule out that the observed variability is due to intrinsic variability in expression, rather than inefficient recombination, or variability thereof. With our current tools we cannot easily distinguish between the two. Again, we hope that future studies with scRNA seq will be able to shed more light onto this interesting biology. Finally, we have not carefully assessed regionality, but have not seen obvious correlations with the degree of mosaicism. It is however important to note that in no case did we just examine one collateral per hemisphere. Each data point is an average of all collaterals from a part of a given collateral zone (imaging region). Usually, it is possible to image 2-4 collateral regions in each embryo. We always imaged multiple collaterals per animal, but sometimes only one region was imaged (due to technical issues).

      2) The identification of "pre-collateral" vessels requires further support. The authors define these vessels by their connection to the feeding artery, their (often) larger diameter, and their more pronounced ICAM2 expression. While most of these criteria are demonstrated in Figure S3, it is not apparent how these vessels were defined in Figure 4, which lacks specific annotation of each of these identifying criteria. As the identification of these novel vessels is one of the key findings of this paper, a more robust method of unambiguously defining them is warranted.

      We agree that it would be fabulous to have a unique marker at hand that identifies pre-collaterals. Our careful analysis of the distribution of the markers we tested, firmly established that the levels of ICAM2 expression nicely highlight structures that become colonized by these BMX lineage cells. Cx40 staining also confirmed this impression. We will attempt better annotation based on these markers to help the reader appreciate these findings. The combination of anatomical location and connection pattern with the stronger ICAM2 staining in our hands is a highly reliable and unambiguous identifier of what we called “pre-collaterals”.

      3) The conclusion that collateral-forming ECs migrate in the direction of flow into preexisting vessels is not well supported. The authors state that the presence of filopodial projections (Figure 4) supports this conclusion. However, filopodia number and directional polarization/orientation were not quantified, and "intercalation movements"/migration, per se, cannot be inferred from these static images.

      The reviewer is correct that claiming migration from static images is always difficult. As stated above, we base our conclusions on the progressive appearance of cells exhibiting migratory behavior, as well as the morphology including filopodia. Although we indeed didn’t quantify filopodia, these structures are in our experience not found on endothelial cells that do not engage in migration. Their consistent presence and directionality are strongly suggestive of movement. We will attempt to clarify this better in the text and the figures.

      4) In Figure 5, the simplest explanation for relative Cx40 expression in different vessels is the absence (low expression) or presence (high expression) of flow. This figure provides little mechanistic insight beyond this already-known relationship, and it is unclear how many times this experiment was performed (there is no N, no quantification or correlation).

      Flow is indeed one component of what regulates Cx40. However, a key point of this figure is to show that Cx40 expression can precede the recruitment of BMX lineage cells. This is important to distinguish whether arterial identity is only achieved by recruitment of BMX lineage cells, or exists in certain vessels (for example because they may have more flow) already before this colonization event. It suggests that the BMX population may rather serve to consolidate arterial state, as other structures that may have been Cx40-positive before, but do not become colonized, lose arterial identity. We disagree that this finding does not contribute important information. If only BMX-lineage cells would express Cx40, the conclusion would be very different. This is not a question of how much, but of whether arterialization requires the recruitment of particular cells, or is induced in vessels that adopt arterial identity. This is not a singular observation and we will add the N number onto the figure legend.

      5) There is no statistical analysis in this work. This is justified by the authors by their admission that the study is of a "descriptive nature and...exploratory design."

      This is correct.

      Reviewer #3 (Public Review):


      These studies focus on a very interesting, understudied phenomenon in vascular development - the formation of pial collaterals between cerebral arteries. Understanding the mechanism(s) that regulates this process during normal development could provide important insights for the treatment of adult stroke patients, for which repair is highly dependent on collateral formation. Insights may also be relevant to other collateral-dependent diseases, such as heart disease and chronic peripheral ischemia.


      The investigators use lineage tracing and 3D imaging to show that, in mouse embryos, endothelial cells (ECs) predominantly from Bmx+ arteries and some from the Vegfr3+ microvasculature, invade pre-existing pre-collateral vascular structures in a process they termed "mosaic colonization", and arterialization of the vessel segments is said to occur concurrently with colonization, although details about EC phenotypes are lacking. Growth of the collaterals in response to ischemic injury relies on local replication of the ECs within the collaterals and not further recruitment from veins and the microvasculature. Although detailed molecular mechanisms are not provided, demonstration of the "cellular mechanism" of pial collateral vascularization is novel.


      Nonetheless, there are some issues that should be addressed, particularly to clarify the phenotype of the ECs forming the collaterals and expanding in response to injury; only their "origin" was traced and not their identity/growth after labeling in Bmx+ vessels.

      We thank the reviewer for pointing out the importance and novelty of our findings, and for the constructive suggestions for improvements. We indeed focussed here on origin and an attempt to distinguish how the cells arrive in their location rather than on their phenotype. We have performed detailed phenotypic analysis including EM analysis of collaterals but without the ability to connect these to the traced lineages. We therefore chose to leave these data for a separate manuscript. Future work will attempt to fully characterize these populations including their transcriptome using scRNA seq. However, isolating collateral ECs to faithfully characterize them is very challenging, and will not be a part of this manuscript. We have performed stainings for various arterial markers, with variable success. Nevertheless, a full functional study will be part of future work.

    2. eLife assessment

      This study provides insights into the developmental origin of endothelial cells found in blood vessels called pial collaterals. The work is important, as collateral capacity can strongly influence the trajectory of outcomes with vascular blockage, and the approaches are novel and overall convincing; however, some mechanistic claims are only partially supported, and collateral characterization is incomplete. Given the clear positive correlation between pial collateral flow and improved stroke outcome, this study will be of interest to vascular biologists and clinicians caring for stroke patients.

    3. Reviewer #1 (Public Review):

      The paper by Perovic and colleagues describes how important blood vessels called collaterals form during development and remodel/expand upon injury to the brain. These vessels are conduits between arteries that do not have strong blood flow physiologically but upon injury can compensate for conduit loss. Published work by others is largely descriptive and does not address the cellular sources of collaterals over time. Here elegant lineage tracing is used to better understand the source of vascular endothelial cells during embryonic development, and how these lineages contribute to remodeling upon injury. The work is ambitious and important as collateral capacity can strongly influence the trajectory of outcomes with vascular blockage. The work reveals that proliferative arterial EC is the primary contributor to the collaterals developmentally, with a small contribution from capillary/venous EC, and that this shifts to almost completely arterial contribution from birth onward. There are several aspects of the work that, if addressed, would strengthen the study and better support the interesting and novel conclusions, including analysis of non-collateral lineage contributions, more careful interpretation of fixed image data, and more careful annotation of the image panels.

    4. Reviewer #2 (Public Review):

      Pial collateral vessels are anastomotic connections that cross-connect distal arterioles of the middle, anterior, and posterior cerebral arteries. With respect to ischemic stroke, good pial collateral flow positively correlates with decreased infarct volume and improved recovery; accordingly, optimizing collateral flow represents an important intervention for limiting stroke damage. The goal of this study was to determine the endothelial cell (EC) subtype(s) that contribute to the embryonic and neonatal development of pial collaterals and their expansion in response to stroke. To this end, the authors used lineage tracing methods in the mouse, labeling arterial endothelial cells (using Bmx-CreERT on switch line, R26mTmG) or venous and microvascular endothelial cells (using Vegfr3-CreERT on R26mTmG) and assessing pial collaterals via confocal microscopy. The authors convincingly demonstrate that arterial-lineage ECs comprise the majority of pial collateral ECs during development and in adulthood, with a minor contribution from pial plexus-derived microvascular ECs that decline over time. They also convincingly demonstrate that pial collateral outward remodeling after experimentally-induced stroke (distal middle cerebral artery occlusion, or dMCAO) involves, at least in part, local proliferation of arterial-lineage ECs. The latter is intriguing given that arterial ECs generally leave the cell cycle. While these conclusions are quite solid, some key details are missing that could improve analysis, and some important caveats are not addressed. Moreover, less convincing are mechanistic claims that pial collaterals form via a migratory process of "mosaic colonization" of a preexisting vessel.

      1. It is difficult to understand whether individual collaterals are truly mosaic vessels, or whether arterial or venous/microvascular lineage ECs predominate in any particular region of the pial collateral vasculature. This is due to a number of methodological reasons: arterial and venous/microvascular contributions to pial collaterals were assessed independently, only a few (and in some cases, just one) collaterals were analyzed in each mouse, and regionality/location of collaterals was not addressed. Additionally, the inefficiency and variability of EC labeling, especially with the Vegfr3-CreERT line (Fig. S1, ~6-30%), compounds this problem.

      2. The identification of "pre-collateral" vessels requires further support. The authors define these vessels by their connection to the feeding artery, their (often) larger diameter, and their more pronounced ICAM2 expression. While most of these criteria are demonstrated in Figure S3, it is not apparent how these vessels were defined in Figure 4, which lacks specific annotation of each of these identifying criteria. As the identification of these novel vessels is one of the key findings of this paper, a more robust method of unambiguously defining them is warranted.

      3. The conclusion that collateral-forming ECs migrate in the direction of flow into preexisting vessels is not well supported. The authors state that the presence of filopodial projections (Figure 4) supports this conclusion. However, filopodia number and directional polarization/orientation were not quantified, and "intercalation movements"/migration, per se, cannot be inferred from these static images.

      4. In Figure 5, the simplest explanation for relative Cx40 expression in different vessels is the absence (low expression) or presence (high expression) of flow. This figure provides little mechanistic insight beyond this already-known relationship, and it is unclear how many times this experiment was performed (there is no N, no quantification or correlation).

      5. There is no statistical analysis in this work. This is justified by the authors by their admission that the study is of a "descriptive nature and...exploratory design."

    5. Reviewer #3 (Public Review):

      Summary:<br /> These studies focus on a very interesting, understudied phenomenon in vascular development - the formation of pial collaterals between cerebral arteries. Understanding the mechanism(s) that regulates this process during normal development could provide important insights for the treatment of adult stroke patients, for which repair is highly dependent on collateral formation. Insights may also be relevant to other collateral-dependent diseases, such as heart disease and chronic peripheral ischemia.

      Strengths:<br /> The investigators use lineage tracing and 3D imaging to show that, in mouse embryos, endothelial cells (ECs) predominantly from Bmx+ arteries and some from the Vegfr3+ microvasculature, invade pre-existing pre-collateral vascular structures in a process they termed "mosaic colonization", and arterialization of the vessel segments is said to occur concurrently with colonization, although details about EC phenotypes are lacking. Growth of the collaterals in response to ischemic injury relies on local replication of the ECs within the collaterals and not further recruitment from veins and the microvasculature. Although detailed molecular mechanisms are not provided, demonstration of the "cellular mechanism" of pial collateral vascularization is novel.

      Weaknesses:<br /> Nonetheless, there are some issues that should be addressed, particularly to clarify the phenotype of the ECs forming the collaterals and expanding in response to injury; only their "origin" was traced and not their identity/growth after labeling in Bmx+ vessels.

    1. eLife assessment

      This important work describes a compelling analysis of DNA damage-induced changes in nascent RNA transcripts, and a genome-wide screening effort to identify the responsible proteins. A significant discovery is the inability of arrested cells to undergo DNA damage-induced gene silencing, which is attributed to an inability to mediate ATM-induced transcriptional repression. Revisions are suggested that would significantly enhance and support the central claims of the study. This work will be of general interest to the DNA damage, repair, and transcription fields, with a potential impact on the cancer field.

    2. Reviewer #1 (Public Review):

      This manuscript by Tyler and colleagues describes a thorough analysis of IR-induced changes in nascent RNA transcripts, and a genome-wide screening effort to identify the responsible proteins. The findings extend previous work describing DNA damage-induced transcriptional repression from DNA breaks in cis to bulk genomic DNA damage. A significant discovery is the inability of arrested cells to undergo DNA damage-induced gene silencing, which, at least at the rDNA locus, is attributed to an inability to mediate ATM-induced transcriptional repression. While the findings add to our knowledge of how DNA damage affects gene expression, there are several limitations to the current study that remain inadequately addressed. In addition, some of the proposed conclusions seem speculative and should be marked as such, omitted, or experimentally supported.

      Two major concerns are as follows:

      1) The CRISPR screen designed to detect regulators of damage-induced transcriptional repression is based on EU incorporation following a 7-day selection of stable knockout cells. As the authors point out, cell cycle arrest reduces rDNA transcription on its own. The screen, which assesses changes in sgRNA distribution in EU high cells, is thus likely to be dominated by factors that affect cell cycle progression. This is exemplified in the analyses of top hits related to neddylation. The screen's limitations in terms of identifying DDR effectors of damage-induced silencing need to be clearly stated.

      2) The authors confirm previous findings of DNA damage-induced repression of rDNA and histone gene transcription. The authors propose that these highly transcribed genes are more susceptible to silencing than the bulk of protein-coding genes and propose a global damage-induced signaling event that is independent of DNA breaks in cis. While this is possible, it is not demonstrated in this manuscript, and the authors should acknowledge alternative explanations. For example, the loci found to be repressed by bulk IR are highly repetitive gene arrays that tend to form nuclear sub-compartments (nucleoli, histone bodies). As such, their likelihood of being in the vicinity of DNA damage is high, at least for a fraction of gene copies. The findings, therefore, remain consistent with cis-induced silencing. Moreover, silencing may spread through the relevant nuclear sub-compartments, consistent with the formation of DNA damage compartments described recently (PMID: 37853125).

      Other comments:<br /> 1) The statement that silencing is due to transcription initiation rather than elongation is not sufficiently supported by the data. Could equivalent nascent transcript reduction not be the result of the suppression of elongating RNA PolII? To draw the proposed conclusion, the authors would need to demonstrate that RNA PolII initiation is altered, using RNA PolII ChIP and/or analysis of relevant RNA PolII phosphorylation patterns.

      2) The lack of rDNA silencing in arrested cells is interesting, though the underlying mechanism remains unclear. To further corroborate the proposed defect in ATM-mediated signaling, the authors should look directly at ATM and Treacle phosphorylation upstream of TOPBP1.

      3) The "change in relative heights of the EU low (G1) and EU high (S/G2) peaks" in Figures 5D, 5E, and 6B is central to the proposed model of transcriptional changes being affected by cell cycle arrest. These differences should be visualized more clearly and quantified across independent experiments. Ideally, the cell cycle stage should be dissected as in Figure 2B. How do the authors envision that cell cycle arrest triggers the defect in transcriptional silencing?

    3. Reviewer #2 (Public Review):

      Summary:<br /> In this manuscript, the authors attempted to study mechanisms of transcription inhibition in cells treated with IR. They observed that, unlike histone chaperone HIRA-dependent transcription inhibition during UV-induced damage, IR-induced transcription inhibition does not depend on HIRA. Through a CRISPR/Cas9 screen, they identified protein neddylation as important for transcription inhibition. By sequencing nascent RNA, they observed that transcripts down-regulated upon IR treatment are largely from highly transcribed genes, including histone genes and rDNA.

      Strengths:<br /> The authors utilized comprehensive approaches to fill in the knowledge gap of IR-induced transcription inhibition.

      Weaknesses:<br /> It is not clear whether inhibition of histone genes by IR is due to a reduction of S phase progression.

    1. eLife assessment

      This study is partly useful as it corroborates what is already known about the elevated proliferation capacity of midlobular hepatocytes in liver regeneration. Lineage tracing and scRNAseq studies are powerful for the investigation of such heterogeneous hepatocyte proliferation capacity. Nevertheless, based on experimental limitations, an incomplete method description, and inadequate data analyses, the presented data are insufficient to support the proposed conclusions of a mesenchymal-hepatocyte hybrid population in the murine liver.

    2. Reviewer #1 (Public Review):


      This valuable study by Gui Yu and colleagues aims to investigate the function of a subtype of hepatocyte, which expressed Twist2 at some point in its lineage. First, using reporter mice, they show that hepatocytes can be labelled using Twist2-cre mice. Importantly, Twist2-cre also labels liver mesenchymal cells. Using scRNA seq of P1 and P14 Twist2-Cre tomato-labelled cells, they identify both mesenchymal cells and hepatocytes, and (using trajectory analyses) propose that Twist-traced hepatocytes and mesenchymal cells are derived from Epcam+ progenitors. The authors propose that the Twist2-traced hepatocytes occupy the midzone, and are polyploid. Using partial hepatectomy and CCl4 as models for regeneration, the authors show that, after insult, there is an increase in Twist2-Tomato+ cells, which are more proliferative. Next, the authors propose that Notch signaling is suppressed during regeneration (based on the downregulation of HES1 protein in midzone hepatocytes during regeneration). They therefore knock out Notch1 in Twist2-expressing cells and show that this leads to hepatocyte proliferation (but fewer liver lobes) in homeostatic conditions. Finally, the authors also interfere with mTOR and VEGF signaling and show that both interventions suppress the excess hepatocyte proliferation in Twist2-conditional Notch1-knockout mice.


      Overall, the data show that Twist2 is expressed at some point during hepatocyte development or homeostasis, and that Notch1 in hepatocytes or mesenchymal cells plays a role in limiting homeostatic hepatocyte proliferation.


      The study relies heavily on the use of Twist2-Cre mice, which label both mesenchymal cells and hepatocytes. Several experiments on hepatocytes (such as scRNA seq or bulk RNA seq) could be confounded by doublets or contamination with mesenchymal cells.

    3. Reviewer #2 (Public Review):


      There are two potential contributions made by this study, neither of which is fully supported by the data presented. First, that Twist-positive hepatocytes in the midlobular zone are derived from Twist2-expressing cells in embryonic livers via intermediate EpCAM-expressing cells. Second, that there is a population of hepatocytes with mesenchymal features that drive regeneration after various injuries. The concept that midlobular hepatocytes are more regenerative in adult injury settings has already been established and this paper further supports that body of knowledge.


      There are copious scRNA-seq data that are supportive of the claims, but these analyses were not definitive.


      1. There is not sufficient evidence to support the following assertion: "markers identified a mesenchymal-hepatocyte hybrid population (13.7% of total hepatocytes) that express signature genes of both lineages." Twist-Cre reporter mice mark hepatocytes and mesenchymal populations, but it is not clear whether or not this means that the hepatocyte population labeled by Twist is mesenchymal. It is very possible for hepatocytes to express mesenchymal genes without being a true hybrid population. There is not much evidence that zone 2 cells are a mix of hepatocyte and mesenchymal. The idea of a hybrid population needs to be defined. The definition probably needs to involve the concept that hybrid cells must have morphologic or functional features of mesenchymal cells, rather than just expressing some genes from each cell type.

      Related to this, the authors claim that co-expression of Twist and EpCAM in E10.5 liver cells might support the existence of a hepatomesenchymal cell type. This is possible, but one should note that adult hepatocytes can express EpCAM, especially during ductular reactions, so it is not necessarily a mesenchymal marker per se.

      2. The authors assert several times that Twist-Cre mice appear to have no effect on overall liver regeneration phenotypes. They use this to suggest a lack of an effect for heterozygous deletion of Twist by the Cre allele. It is still possible for these mice to have altered lineage tracing results. It is very difficult to rule this out. For example, Axin2-CreER mice did not have any overt liver function or regeneration phenotypes, but the lineage tracing results from these mice differed from other CreER mice.

      3. The central problem with this study is that the authors use a Cre strain and not a CreER strain. With a Cre strain, there could be new labeling of Twist-positive cells at multiple later time points. Thus, it is very difficult to assert that the Tomato-positive cells at later time points are really descendants of the originally labeled population. It is very difficult to interpret the results of Cre-based lineage tracing experiments.

      With this technical limitation in mind, I do not think that there is enough evidence to support the assertion made on page 6: "These findings suggest that EpCAMlow progenitor cells give rise to hepatocytes and MCs." The authors use scRNA-seq trajectory analysis to come to the conclusion that mesenchymal cells give rise to hepatocytes between p1 and p14. Much more evidence is needed before the authors can arrive at this conclusion. It is much more likely that midlobular hepatocytes arise from other hepatocytes. To support their arguments, the authors would have to use a CreER line that exclusively labels mesenchymal cells in the liver, then lineage trace them until p14 to determine if they become hepatocytes. Without such an experiment, I do not think the current experiments are interpretable.

      4. The injury experiments are again limited in their interpretability because they do not use CreER. It is very possible that Twist is turned on after CCl4 or surgical injury, and thus new hepatocytes might activate Tomato. It is unclear if previously Tomato-positive midzone hepatocytes were proliferating to increase the Tomato-positive population. The authors use expression-based studies to argue against ectopic activation of Twist, but it is very difficult to exclude Cre activation using these types of studies.

    1. eLife assessment

      This valuable study investigates likely molecular mechanisms underlying the increasingly common deletions of the hrp2 and hrp3 genes of the human malaria parasite Plasmodium falciparum, that render parasites undetectable by widely used rapid diagnostic tests. The generation of additional long-read data, alongside a new analysis of 19,000 public short-read sequenced genomes, makes this the most detailed investigation currently available on this topic. The authors provide solid evidence for chromosomal breakage with subsequent telomere healing as the mechanism for hrp2 deletion, with more complex patterns for hrp3 deletion, but further methodological details would bolster confidence in the conclusions and enable replication of the results.

    2. Reviewer #1 (Public Review):

      Summary:<br /> Deletion of the hrp2 and hrp3 loci in P. falciparum poses an immediate public health threat. This manuscript provides a more complete understanding of the dynamic nature with which these deletions are generated. By delving into the likely mechanisms behind their generation, the authors also provide interesting insight into general Plasmodium biology that can inform our broader understanding of the parasite's genomic evolution.

      Strengths:<br /> The sub-telomeric regions of P. falciparum (where hrp2 and hrp3 are located) are notoriously difficult to study with short-read sequence data. The authors take an appropriate, targeted approach toward studying the loci of interest, which includes read-depth analysis and local haplotype reconstruction. They additionally use both long-read and short-read data to validate their major findings. There is an extensive set of supplementary plots, which helps clarify several aspects of the data.

      Weaknesses:<br /> In this first version, there are a few factors that hinder a full assessment of the robustness and replicability of the results. First, a number of the analyses lack basic details in the methods; for instance, one must visit the authors' personal website to find some of the tools used. Second, there are several tricky methodological points that are not fully documented. Read depths are treated (and plotted) discretely as 0/1/2 without any discussion of how thresholds were used and determined. For read mapping to standard vs hybrid chromosomes, there is no documentation on how assignments were made if partially ambiguous or how final sample calls were determined when some reads were discordant. There is no mention of how missing data were handled. Without this, it is difficult to know when conclusions were based on analyses that were more quantitative (for instance, using pre-determined read thresholds) or more subjective (with patterns being extracted visually). Third, while a new method is employed for local haplotype reconstruction (PathWeaver), the manuscript does not include details on this approach or benchmarking data with which to evaluate its performance and understand any potential artifacts.

    3. Reviewer #2 (Public Review):

      This work investigates the mechanisms, patterns, and geographical distribution of pfhrp2 and pfhrp3 deletions in Plasmodium falciparum. Rapid diagnostic tests (RDTs) detect P. falciparum histidine-rich protein 2 (PfHRP2) and its paralog PfHRP3, whose genes are located in subtelomeric regions. However, laboratory and field isolates with deletions of pfhrp2 and pfhrp3 that can escape diagnosis by RDTs are spreading in some regions of Africa. The authors find that pfhrp2 deletions are less common and likely occur through chromosomal breakage with subsequent telomeric healing. Pfhrp3 deletions are more common and show three distinct patterns: loss of chromosome 13 from pfhrp3 to the telomere with evidence of telomere healing at the breakpoint (Asia; Pattern 13-); duplication of a chromosome 5 segment containing pfmdr1 on chromosome 13 through non-allelic homologous recombination (NAHR) (Asia; Pattern 13-5++); and the most common pattern, duplication of a chromosome 11 segment on chromosome 13 through NAHR (Americas/Africa; Pattern 13-11++). The loss of these genes impacts the sensitivity of RDTs, and knowing these patterns and their geographic distribution makes it possible to make better decisions for malaria control.

    4. Reviewer #3 (Public Review):

      Summary:<br /> The study provides a detailed analysis of the chromosomal rearrangements related to the deletions of histidine-rich protein 2 (pfhrp2) and pfhrp3 genes in P. falciparum that have clinical significance since malaria rapid diagnostic tests detect these parasite proteins. A large number of publicly available short sequence reads for the whole genome of the parasite were analyzed, and data on coverage and discordant mapping allowed the authors to identify deletions, duplications, and chromosomal rearrangements related to pfhrp3 deletions. Long-read sequences showed support for the presence of a normal chromosome 11 and a hybrid 13-11 chromosome lacking pfhrp3 in some of the pfhrp3-deleted parasites. The findings support that these translocations have repeatedly occurred in natural populations. The authors discuss the implications of these findings and how they do or do not support previous hypotheses on the emergence of these deletions and the possible selective pressures involved.

      Strengths:<br /> The genomic regions where these genes are located are challenging to study since they are highly repetitive and paralogous, and the use of long-read sequencing made it possible to span the duplicated regions, giving support to the identification of the hybrid 13-11 chromosome.

      All publicly available whole-genome sequences of the malaria parasite from around the world were analysed which allowed an overview of the worldwide variability, even though this analysis is biased by the availability of sequences, as the authors recognize.

      Despite the reduced sample size, the detailed analysis of haplotypes and identification of the location of breakpoints gives support to a single origin event for the 13-5++ parasites.

      The analysis of haplotype variation across the duplicated chromosome-11 segment identified breakpoints at varied locations that support multiple translocation events in natural populations. The authors suggest these translocations may be occurring at high frequency in meiosis in natural populations but are strongly selected against in most circumstances, which remains to be tested.

      Weaknesses:<br /> Relying on publicly available sequence data that were collected based on diagnostic test positivity, and that are limited by sequencing availability, limits the interpretation of the occurrence and relative frequency of the deletions. In the discussion, caution is needed when identifying the least common and most common mechanisms and their geographical associations. The identification of only one type of deletion pattern for pfhrp2 may be related to these biases.

      The specific objectives of the study are not stated clearly, and it is sometimes difficult to know which findings are new to this study. Is it the first study analyzing all the worldwide available sequences? Is it the first one to do long-read sequencing to span the entire duplicated region?

      Another aspect that should be explained in the introduction is that there was previous information about the association of the deletions with patterns found in chromosomes 5 and 11. In the short-read sequence results, it is not clear if these chromosomes were analysed because of the associations found in this study (and no associations were found with putative duplications or deletions in other chromosomes), or if they were specifically included in the analysis because of the previous information (and the other chromosomes were not analysed).

      An interesting statement in the discussion is that existing pfhrp3 deletions in a low-transmission environment may provide a genetic background on which less frequent pfhrp2 deletion events can occur. Does it mean that the occurrence of pfhrp3 deletions would favor the pfhrp2 deletion events? How, and is there any evidence for that?

    1. eLife assessment

      This fundamental study addresses the question of how certain zooplankton achieve barotaxis, directed locomotion in response to changes in hydraulic pressure. The authors provide compelling evidence that the response involves ciliary photoreceptors interacting with motoneurons. This work should be of broad interest to scientists working on mechanosensation, cilia, locomotion, and photoreceptors.

    2. Reviewer #1 (Public Review):

      In this work, the authors address a fundamental question in the biological physics of many marine organisms, across a range of sizes: what is the mechanism by which they measure and respond to pressure? Such responses are classed under the term "barotaxis", with a specific response termed "barokinesis", in which swimming speed increases with depth (hence with pressure). While macroscopic structures such as gas-filled bladders are known to be relevant in fish, the mechanism for smaller organisms has remained unclear. In this work, the authors use ciliated larvae of the marine annelid Platynereis dumerilii to investigate this question. This organism has previously been of great importance in unravelling the mechanism of multicellular phototaxis associated with a ciliated band of tissue directed by light falling on photoreceptors.

      In the present work, the authors use a bespoke system to apply controlled pressure changes to organisms in water and to monitor their transient response in terms of swimming speed and characteristics of swimming trajectories. They establish that those changes are based on relative pressure, and are reflected in changes in the ciliary beating. Significantly, by imaging neuronal activity during pressure stimulation, it was shown that ciliary photoreceptor cells are activated during the pressure response. That these photoreceptors are implicated in the response was verified by the reduced response of certain mutants, which appear to have defective cilia. Finally, serotonin was implicated in the synaptic response of those neurons.

      This work is an impressive and synergistic combination of a number of different biological and physical probes into this complex problem. The ultimate result, that ciliary photoreceptors are implicated, is fascinating and suggests an interesting interplay between photoreception and pressure detection. I see no obvious weaknesses.

    3. Reviewer #2 (Public Review):


      Bezares Calderon et al demonstrate that the planktonic larva of the marine annelid Platynereis dumerilii responds to increased pressure in the water column by swimming upward. The authors show the larvae do so via their ciliated photoreceptors that recruit serotoninergic motor neurons to elicit swimming via an increased ciliary beat frequency of the multiciliary band of their head.


      The authors built original setups to increase water pressure and monitor behavior or calcium activity in the cells. Using their original setups, they combined behavioral and imaging experiments on wild type and mutant larvae for an opsin to show how photoreceptors encode the response to pressure and recruit in response serotoninergic motor neurons that increase the ciliary beating frequency of the multiciliary band in the head.


      Technical note:<br /> The authors should use DF/F to quantify the calcium response in photoreceptors over time. Furthermore, they should show that motion artifacts do not confound the signal when the pressure changes.
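      For readers unfamiliar with the metric, the DF/F normalization mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `delta_f_over_f`, the choice of baseline window (the mean of the first few pre-stimulus frames), and the example trace are all assumptions made for the sketch.

```python
from statistics import mean


def delta_f_over_f(trace, baseline_frames):
    """Return (F - F0) / F0 for every frame of a fluorescence trace.

    F0 is taken as the mean of the first `baseline_frames` samples,
    i.e. a hypothetical pre-stimulus window (an assumption of this sketch).
    """
    if baseline_frames <= 0 or baseline_frames > len(trace):
        raise ValueError("baseline_frames must be within the trace length")
    f0 = mean(trace[:baseline_frames])
    if f0 == 0:
        raise ValueError("baseline fluorescence is zero; DF/F is undefined")
    return [(f - f0) / f0 for f in trace]


# Made-up raw fluorescence values: three baseline frames, then a response.
trace = [100.0, 100.0, 100.0, 150.0, 200.0, 100.0]
dff = delta_f_over_f(trace, baseline_frames=3)
```

      Expressing the signal relative to each cell's own baseline makes responses comparable across cells with different resting brightness; it does not by itself rule out motion artifacts, which need a separate control.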

      The authors have not shown<br /> 1- how the off response to a decrease of pressure is mediated,<br /> 2- which receptor/channel in photoreceptors mediates the response to increased pressure,<br /> 3- nor how light and pressure information is integrated by photoreceptors in order to guide the behavior of the larvae.

      These points are beyond the scope of the study. However, if possible within a short time frame, it would be really interesting to find out whether conflicting stimuli or converging stimuli (light & pressure) can cancel each other out or synergize. In particular since the authors cite unpublished results in the discussion: "Our unpublished results indeed suggest that green light determines the direction of swimming and can override upward swimming induced by pressure, which only influences the speed of swimming (LABC and GJ, unpublished)." Showing in one panel this very cool phenomenon would be exciting & open tons of questions for the field.

    1. eLife assessment

      This important study provides a new way to enhance organ preservation. The authors provide solid evidence that an existing drug, SNC80, can rapidly and reversibly slow biochemical and metabolic activities while preserving cell and tissue viability. This study will be of interest to a broad set of readers interested in organ transplantation, tissue engineering, regenerative medicine, organoids, and organ-on-a-chip engineering.

    2. Reviewer #1 (Public Review):

      Summary:<br /> In this study of metabolism using Xenopus, explanted porcine hearts and limbs, and human organs-on-chips, Sperry et al studied the ability of WB3 to slow metabolism and mobility. The group developed WB3, an analog of SNC80 devoid of SNC80's delta-opioid receptor binding capacity, and studied its metabolic impact. The authors concluded that SNC80 and its analog WB3 can induce "biostasis" and produce a hypometabolic state, which holds promise for prolonging organ viability in transplant surgery as well as other potential clinical benefits.

      Strengths:<br /> This study also opens new avenues for therapeutic possibilities in areas such as trauma, acute infection, and brain injuries. The overall methodology is acceptable, but certain concerns should be addressed.

      Weaknesses:<br /> 1. In cardiac and renal transplantation, cold preservation on ice remains a common practice for transporting explanted organs to recipients, as it is a cheap and easily accessible way of preserving them. While ex-vivo mechanical circulatory platforms have been developed and are increasingly being utilized to prolong organ viability, cold preservation remains widely used. The authors perfused explanted hearts with oxygenated perfusion preservation devices at subnormothermic temperatures (20-23 °C), which is much lower than the range routinely used in clinical cardiopulmonary bypass (28-32 °C) (in the discussion, the authors allude to SNC80's possible "protective effect" in cardiac bypass). It is unclear how much of the hypometabolic state is related to WB3 administration versus hypothermia. The study would benefit from comparing WB3 administration combined with hypothermia against cold preservation alone, in Xenopus and in explanted porcine organs, to show the distinction in biostasis parameters.

      2. The authors selected SNC80 based on a literature survey, where it was identified by its ability to induce hypothermia and protect against the effects of spinal cord ischemia in rodents. While this makes sense, were other drugs (e.g., puerarin) considered? The induction of hypothermia and the spinal cord protective effect of SNC80 may be multifactorial and not necessarily related to its biostatic effects as the authors describe. Please provide some more context on the background of SNC80.

      3. In most of the models, the primary metric that the authors utilize to characterize metabolic activity is oxygen consumption, which is a somewhat limited indicator. For instance, it does not provide any information on anaerobic metabolic activity. In addition, the ATP/ADP ratio was found to decrease in the organ chips where SNC80 was utilized, but similar findings were not presented for the other models.

      4. The authors should provide a more detailed explanation of SNC80's mechanisms of interaction with proteins related to transmembrane transport, mitochondrial activity, and metabolic processes. What is the impact of SNC80 on mitochondrial function, particularly ATP production and mitochondrial respiration? Are there changes in mitochondrial membrane potential, electron transport chain activity, or oxidative phosphorylation? In this context, the authors discuss the potential role of NCX1 as a binding target for SNC80 and its various mechanisms in slowing metabolism. However, no experiments have been done to confirm this binding in the present study. Co-immunoprecipitation studies using appropriate antibodies against SNC80 and NCX1 should be considered to demonstrate their direct binding. Additionally, surface plasmon resonance (SPR) or isothermal titration calorimetry (ITC) experiments could be employed to quantify the binding affinity between SNC80 and NCX1, providing further evidence of their interaction. These experiments would elucidate the binding mechanism between SNC80 and NCX1 and reveal more information on the mechanism of action for SNC80.

      5. The manuscript notes that histological analysis was conducted, but it seems that only example images are provided, such as Figure 4f. Quantified histological data would provide a more thorough understanding of tissue integrity.

      6. Some of the points mentioned in the discussion and conclusion are rather strong and based on possible associations such as SNC80's potential vasodilatory capacity conferring a cardioprotective effect, and ability to reversibly suppress metabolism across different temperatures and species. Please tone this down and stay limited to the organs studied. Further, the reversibility of the findings may be more objectively assessed by biomarkers with decreased immunofluorescence in response to ischemia such as troponin I for the heart and albumin for the liver. Additionally, an investigation of proteins involved in inflammation, hypoxia, and key cell death pathways using immunohistochemistry analysis can better describe the impact of treatment on apoptosis/necroptosis.

      7. What could be the underlying cause of the observed increase in intercellular spacing after SNC80 administration in porcine limbs which also seems to be evident in the heart histology samples? This seems to be more prominent in the SNC80 compared to the vehicle group.

      8. In the Discussion section, it would be valuable to provide a concise interpretation of the lipidomic data, particularly explaining how changes in acylcarnitine and cholesterol ester levels may relate to tadpole metabolism, hibernation, or other biological processes.

      9. What are the limitations or disadvantages of the study? Does SNC80 possess any immunomodulatory properties that might affect the outcomes of organ transplantation? Are there specific organs for which SNC80 may not be a suitable preservation agent, and if so, what are the reasons behind this?

    3. Reviewer #2 (Public Review):

      Summary:<br /> This manuscript titled "Identification of pharmacological inducers of a reversible hypometabolic state for whole organ preservation" reports the effects of delta opioid receptor activator SNC80 and its modified analog WB3 with ~1,000 times less delta opioid receptor binding activity on metabolic state.

      Strengths:<br /> This is an interesting study with potentially broad implications for organ preservation.

      Weaknesses:<br /> There are several limitations that raise concerns.

      1. The authors developed WB3, an analog of the known delta opioid receptor activator SNC80 with three orders of magnitude weaker binding to the delta opioid receptor. This will likely reduce the undesirable effects of SNC80 while preserving the metabolic slowing needed for organ preservation. Yet most experiments were done with SNC80 rather than the superior modification, WB3, which appears in only a limited set of experiments (Figure 3).

      2. The heart is one of the most challenging organs to preserve, and some experiments are done to establish the metabolic effects of SNC80. However, the biodistribution study, shown in Figure 2, conspicuously omitted the heart.

      3. I do not understand the design of the electrophysiology and contractility experiments with the porcine hearts. How did you defibrillate the hearts after removal and establishment of perfusion? Lines 173-175 on Page 7 state: "After defibrillation with epinephrine, the P and QRS waveforms were visible in ECGs from 3 of 4 SNC80-treated hearts (Table S1), suggesting that those hearts regain atrial and ventricular polarization." Please clarify. Defibrillation is done with an electric shock. Also, please show the ECG recordings to support your conclusions about "polarization." What did you mean by "polarization"? Depolarization? Repolarization? Or resting potential? To establish a normal physiological state, please show ECG waveforms and present data on basic ECG characteristics: heart rate, PQ and QT intervals, and P and QRS durations. I recommend perfusion of the porcine heart with WB3, not only SNC80.

      4. Pathology data also raise concerns. The histology images shown in Figure 4f are not quantified, and they show apparently higher levels of tissue disruption in SNC80-treated tissue vs vehicle-treated tissue. The text (lines 169-171) confirms this concern: "In some hearts treated with SNC80, greater waviness of muscle fibers was observed, possibly indicating a state of muscle contraction." It will be helpful to measure markers of apoptosis and necrosis and to apply TTC viability staining.

      5. The apparent state of contracture suggests a higher degree of myocardial damage and a high intracellular calcium level in SNC80-treated hearts. The authors suggested that the sodium-calcium exchanger NCX is a possible target of SNC80 and could be responsible for the "hypometabolic state." However, NCX1 is critically important in the extrusion of cytosolic Ca2+ during the diastolic phase. Failure to remove excessive calcium and restore ionic homeostasis would lead to calcium overload and heart failure.

      6. I am surprised the authors did not consider using the gold standard assay for measuring mitochondrial function in cells by the Seahorse Cell Mito Stress Test.

    4. Reviewer #3 (Public Review):

      In this manuscript, Sperry and colleagues identify SNC80 as a compound that can slow metabolism and mimic hibernation, thereby prolonging tissue viability in organ transplantation and cardiovascular disease settings. Overall, the use of varied and relevant model systems is a strength of this study.

      The authors perform a literature search to identify SNC80 as a promising hit. However, the details of the literature search, a list of other potential hits, and the criteria for identification of SNC80 are not described. The hypometabolic effect of SNC80 exposure is well-characterized in the Xenopus model. Furthermore, the authors show that SNC80 localises to the brain, but do not discuss several studies that have pointed to convulsions induced by exposure to high doses of SNC80, and whether this would be apparent in the Xenopus studies. The authors have promising data on the WB3 morpholino that retains or even improves on the hypometabolism phenotype of SNC80 while likely not retaining delta opioid activity. However, this is not functionally demonstrated. Moreover, WB3 is not used in any of the other assays and models used in the study. In the setting of cardiac transplant surgery, co-administration of SNC80 reduces metabolic activity and inflammation, although it is unclear if there is an improvement in recovery of organ function due to SNC80. The reversible induction of hypometabolic status is also demonstrated in two different organ chips. These models could identify the differential response of epithelial cells and vascular cells to drug perfusion, but the authors have mostly focused on the former. Finally, the authors identify specific targets for the hypometabolic effect of SNC80, which is a valuable resource for other screening studies and can form the basis for future work.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary: The authors study the appearance of oscillations in motifs of linear threshold systems, coupled in specific topologies. They derive analytical conditions for the appearance of oscillations, in the context of excitatory and inhibitory links. They also emphasize the higher importance of the topology, compared to the strength of the links. Finally, the results are confirmed with WC oscillators, which are also linear. The findings are to some extent confirmed with spiking neurons, though here results are less clear, and they are not even mentioned in the Discussion.

      Overall, the results are sound from a theoretical perspective, but I still find it hard to believe that they are of significant relevance for biological networks, or in particular for the oscillations of BG-thalamus-cortex loop in PD. I find motifs in general to be too simplistic for multiscale and generally large networks as is the case in the brain. Moreover, the division of regions is more or less arbitrary by definition, and having such a strong dependence on an odd/even number of inhibitory links is far from reality. Another limitation is the fact that the cortex is considered a single node. Similarly, decomposing even such a coarse network in all possible (238 in this case) motifs doesn't seem of much relevance, when I assume that the emergence of pathological rhythms is more of an emergent phenomenon.


      From the point of view of nonlinear dynamics, the results are solid, and the intuition behind the proofs of the theorems is well explained.


      As stated in the summary, I find the work to be too theoretical without a real application in biological systems or the brain, where the networks are generally very large.

      We respectfully disagree with the reviewer here. The second half of the paper is all about explaining a biological problem. We have shown the validity of our theoretical results (which indeed were obtained in idealized settings) to explain emergence of oscillations in the basal ganglia. We clearly show that our theoretical results hold both in a rate-based model and in a network model with spiking neurons. The model with spiking neurons is one of the most complete network models of the basal ganglia available in the literature. So we emphasize that we have provided a clear application of our results for the brain networks.

      The problem is not the simplicity of the model or of the topology; phenomena are often explained by very reduced systems. The problem is that the applicability of the finding cannot be extended. E.g., the Kuramoto model uses all-to-all coupling, or, similarly, QIF neurons, which also need to follow a Lorentzian distribution in order to derive a mean field.

      We do not understand this comment. There is no need to extend these results to a network of Kuramoto models because in that setting we already assume that individual nodes/populations are oscillating – there is no problem of emergence of oscillations. Here, we are specifically considering a setting in which nodes themselves are not oscillators. We agree that we, at this point, have no insight into how to extend our analytical proof to a situation where individual nodes are spiking.

      But in those cases, relaxing the strict conditions that were necessary for the derivations, still conserves the main findings of the analysis, which I don't see being the case here. The odd/even number rule is too strict, and talking about a fixed and definite number of cycles in the actual brain seems too simplistic.

      We have clearly relaxed most of our assumptions when we considered a network model of basal ganglia in which each subpopulation is a collection of spiking neurons. And as we have shown our results still hold (see Figure 5). Again our model is about oscillations in a network of networks i.e. network of brain regions.

      At meso-scale it is not unreasonable to find such cycles and even-odd number rules. We have shown this for the case of a cortico-basal ganglia model. We can also extend this to cortico-thalamic networks and so on. We have already emphasized this point in the introduction to avoid any confusion: see lines 62-66 – “We prove this conjecture for the threshold-linear network (TLN) model without delays which can closely capture the dynamics of neural populations. Therefore, it is implicit that our results do not hold at the neuronal level but rather at the level of neuron populations/brain regions e.g. the basal ganglia (BG) network which can be described a network of different nuclei.” and lines 69-70 – ’Within the framework of the odd-cycle theory, distinct nuclei are associated with either excitatory or inhibitory nodes.’

      Being linear is another strong assumption, and it is not clear how much of the results are preserved for spiking neurons, even though there is such an analysis, or maybe for other nonlinear types of neuronal masses.

      Clearly our results hold in a network of spiking neurons (see Figure 5). It is of course interesting to ask whether our results hold in a network where individual spiking neurons have more complex spiking behavior like AdEx or Quadratic IF. But that kind of analysis deserves a full manuscript on its own.

      Delays are also mentioned, and their impact on the oscillatory networks is as expected: they reduce the amplitude, but there is no link to the literature, where this is an established phenomenon during synchronization. Finally, the authors should also discuss time delays as a known phenomenon that causes or amplifies oscillations at different frequencies in a network of coupled oscillators, e.g., Petkoski & Jirsa Network Neuroscience 2022, Tewarie et al. NeuroImage 2019, Davis et al. Nat Commun 2021.

      This is indeed a weakness of our model. But as the reviewer already knows, dynamical systems with delays are very difficult to analyze analytically. We have mentioned this in the limitations of the model and the analysis. In our simulations we have considered delays and when the delays are within reasonable limits our results hold.

      Reviewer #2 (Public Review):


      The authors present here a mathematical and computational study of the topological/graph theory requirements to obtain sustained oscillations in neural network models. A first approach mathematically demonstrates that a given network of interconnected neural populations (understood in the sense of dynamical systems) requires an odd number of inhibitory populations to sustain oscillations. The authors extend this result via numerical simulations of (i) a simplified set of Wilson-Cowan networks, (ii) a simplified circuit of the cortico-basal ganglia network, and (iii) a more complex, spike-based neural network of basal ganglia network, which provides insight on experimental findings regarding abnormal synchrony levels in Parkinson's Disease (PD).


      The work elegantly and effectively combines solid mathematical proof with careful numerical simulations at different levels of description, which is uncommon and provides additional layers of confidence to the study. Furthermore, the authors included detailed sections to provide intuition about the mathematical proof, which will be helpful for readers less inclined to the perusal of mathematical derivations. Its insightful and well-informed connection with a practical neuroscience problem, the presence of strong beta rhythms in PD, elevates the potential influence of the study and provides testable predictions.


      In its current form, the study lacks a more careful consideration of the role of delays in the emergence of oscillations. Although they are addressed at certain points during the second part of the study, there are sections in which this could have been done more carefully, perhaps with additional simulations to solidify the authors' claims. Furthermore, there are several results reported in the main figures which are not explained in the main text. From what I can infer, these are interesting and relevant results and should be covered. Finally, the text would significantly benefit from a revision of the grammar, to improve the general readability at certain sections. I consider that all these issues are solvable and this would make the study more complete.

      This point has been made by the first reviewer as well. So we repeat our answer:

      This is indeed a weakness of our model. But as the reviewer already knows, dynamical systems with delays are very difficult to analyze analytically. We have mentioned this in the limitations of the model and the analysis. In our simulations we have considered delays and when the delays are within reasonable limits our results hold.

      Reviewer #2 (Recommendations For The Authors):

      As mentioned in my comments above, I think that the work is already quite solid and relevant but would significantly improve if some issues were addressed:

      We would like to thank the reviewer for valuable comments and constructive feedback which has helped us greatly improve the manuscript.

      1) While the authors acknowledge early on the limitations of this study in terms of not considering plasticity or neuron biophysics (line 72), I think that the absence of propagation delays should be explicitly included here. This absence leads to inaccuracies --for example, the sentence "Consider a small network of two nodes. If we connect them mutually with excitatory synapses, intuitively we can say that the two-population network will not oscillate" (line 74) is only correct if the delays (or signal latencies) are zero. With a proper delay, two excitatory neurons can engage in oscillations with a period given by two times the value of the delay.

      A similar situation happens for inhibitory neurons, where the winner-take-all dynamics described in line 77 is only valid for zero delay. It is known that a homogeneous population of inhibitory spiking neurons with delayed synapses can lead to fast oscillations (Brunel and Hakim 1999), something which is also valid for the equivalent inhibitory single node with delayed self-inhibition. Indeed, a circuit of two inhibitory populations with delayed self- and cross-inhibition can generate oscillations, contradicting the main conclusion of the odd number of inhibitory nodes needed for oscillations.

      Because of these considerations, I think the authors should be more careful when explaining the effects of delays, and state that their main results on the link between oscillations and having an odd number of inhibitory nodes are not valid when delays are considered. They could modify the sentences in lines 72-77 above and include a supplementary figure right after their simulation study for the Wilson-Cowan (to explain the examples above, and also the one in the next point).
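      The delayed self-inhibition case raised in this comment can be illustrated with a minimal sketch. It assumes a single threshold-linear rate unit, tau dx/dt = -x + [b - w x(t - d)]+, integrated with a forward Euler scheme; the function name and all parameter values (w, b, tau, d) are hypothetical choices for illustration and are not taken from the manuscript or from Brunel and Hakim (1999). With these settings the delayed unit keeps oscillating while the zero-delay unit settles to a fixed point, which is the contrast the comment describes.

```python
import numpy as np


def simulate_delayed_self_inhibition(w=3.0, b=1.0, tau=1.0, delay=2.0,
                                     dt=0.01, t_max=100.0, x0=0.1):
    """Forward-Euler integration of tau * dx/dt = -x + max(b - w * x(t - delay), 0).

    All parameter values are illustrative assumptions, not values from the manuscript.
    The history before t = 0 is held constant at x0.
    """
    n_steps = int(round(t_max / dt))
    lag = int(round(delay / dt))  # delay expressed in integration steps
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        # Use the constant history x0 until the delayed sample exists.
        x_delayed = x[k - lag] if k >= lag else x0
        drive = max(b - w * x_delayed, 0.0)  # threshold-linear transfer function
        x[k + 1] = x[k] + dt / tau * (-x[k] + drive)
    return x


x_with_delay = simulate_delayed_self_inhibition(delay=2.0)
x_no_delay = simulate_delayed_self_inhibition(delay=0.0)

# Standard deviation over the second half of the run: clearly non-zero for a
# sustained oscillation, essentially zero for convergence to a fixed point.
half = len(x_with_delay) // 2
print("late std, delay = 2.0:", x_with_delay[half:].std())
print("late std, delay = 0.0:", x_no_delay[half:].std())
```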

      The reviewer has brought up a critical point regarding the impact of propagation delays, and we completely concur with this assessment. In our study, we indeed did not comprehensively consider the effects of propagation delays in cycles with even inhibition, which may introduce inaccuracies in our conclusions.

      We note that in the Wilson-Cowan model with delays, certain cycles with an even number of inhibitory links can also generate oscillations with a period equal to twice the delay value. However, in our hands such oscillations were transient and dissipated quickly.

      To better reflect the limitations of our research, we have made significant modifications to the relevant sections in our manuscript.

      In line 100, we've added text to explicitly state that we considered delays in our simulations and acknowledged their potential to generate oscillations ("Given the importance of delays in biological network such as BG, we will consider them in the simulations.").

      In line 102, we've clarified that our conclusions are based on a scenario without delays ("In this following, we give simple examples of the possibility of oscillation (or not) based on the connectivity characteristics of small networks without delays. Let us start with a network of two nodes.").

      Additionally, in line 230, we've included a reference to figure supplement 3-2 to highlight the outcomes in terms of oscillations ("EII network only resulted in transient oscillations (Fig. 3, figure supplement 3-1, figure supplement 3-2)").

      In lines 234-237, we've added a sentence discussing the role of synaptic delays in generating transient oscillations in cycles with an even number of inhibitory components, referring to figure supplement 3-2 ("In networks with even number of inhibitory connections (e.g. EII, EEE, II), synaptic delays are the sole mechanism for initiating oscillations, however, unless delays are precisely tuned such oscillations will remain transient (see Supplementary figure supplement 3-2)").

      Moreover, in response to the reviewer’s suggestion, we have included an additional figure supplement 3-2 to illustrate how cycles with even inhibitory components generate transient oscillations when propagation delays are taken into account. This figure provides a visual representation of the phenomenon and enhances the clarity of our findings.
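      For the delay-free setting that the revised line 102 refers to, the contrast between an even and an odd number of inhibitory links can be sketched as follows. This is a minimal sketch, assuming the threshold-linear form tau dx/dt = -x + [Wx + b]+ with forward Euler integration; `simulate_tln`, `late_std`, the weight w = 3, and the inputs are hypothetical illustration choices, not the authors' code or parameters. Under these assumptions the two-node mutually inhibitory motif (II) settles into a winner-take-all state, while the three-node inhibitory ring (III) keeps oscillating.

```python
import numpy as np


def simulate_tln(W, b, x0, tau=1.0, dt=0.01, t_max=100.0):
    """Forward-Euler integration of tau * dx/dt = -x + max(W @ x + b, 0), without delays."""
    W = np.asarray(W, dtype=float)
    b = np.asarray(b, dtype=float)
    n_steps = int(round(t_max / dt))
    x = np.empty((n_steps + 1, len(b)))
    x[0] = x0
    for k in range(n_steps):
        drive = np.maximum(W @ x[k] + b, 0.0)  # threshold-linear transfer function
        x[k + 1] = x[k] + dt / tau * (-x[k] + drive)
    return x


def late_std(x):
    """Standard deviation of node 0 over the second half of the run."""
    return x[len(x) // 2:, 0].std()


w = 3.0  # illustrative inhibitory weight (assumption)

# W[i][j] is the weight of the link from node j to node i.
W_ii = [[0, -w],
        [-w, 0]]       # two mutually inhibiting nodes: 2 inhibitory links (even)
W_iii = [[0, 0, -w],
         [-w, 0, 0],
         [0, -w, 0]]   # ring 0 -> 1 -> 2 -> 0: 3 inhibitory links (odd)

x_ii = simulate_tln(W_ii, b=[1, 1], x0=[0.1, 0.2])
x_iii = simulate_tln(W_iii, b=[1, 1, 1], x0=[0.1, 0.2, 0.3])

print("II  late std:", late_std(x_ii))   # near zero: converges to a fixed point
print("III late std:", late_std(x_iii))  # clearly non-zero: sustained oscillation
```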

      2) In Figure 3, two motifs (III and EII) are explored to demonstrate the validity of the results across different parameters. Delays don't seem to play a disruptive role in these two cases, but the results seem to be different for other motifs not considered here. Aside from the examples mentioned above, I can imagine how a motif of EEE (i.e. a circle of three excitatory Wilson-Cowan neurons) would display oscillations when delays are included, as the activation would 'circulate' along the ring. However, this EEE motif has an even number of inhibitory units (or perhaps zero is considered an exception, but if so it's not mentioned in the text).

      We thank the reviewer for this observation regarding Figure 3. Indeed, the impact of delays may differ for other motifs not considered in our study. For example, as the reviewer has correctly anticipated, a motif of EEE (a circular network of three excitatory Wilson-Cowan neurons) would exhibit oscillations when delays are included, as activation could 'circulate' along the ring.

      To address this concern, we have performed new simulations (added as a new figure supplement 3-2). As illustrated in figure supplement 3-2, oscillations may indeed arise in the EEE motif when delays are introduced. However, these oscillations eventually dissipate, at least with our settings.

      3) Figures 1b, 1c, and 4e display interesting results, but these are absent from the main text. Please include the description of those results. Particularly the case of Figs 1b and 1c seems very relevant to understanding the main results in the context of more complex networks, in which multiple loops with odd and even numbers of inhibitory units would coexist in the network. Does the number of odd-inhibitory loops in a given network affect somehow the power or frequency of the resulting network oscillations? It would be interesting to show this.

      Indeed, we did not explain Figs 1b,c and 4e properly. Now we have revised the manuscript in the following way to incorporate these results:

      In lines 124-128, we added the following text to introduce the concept: "We can generalize these results to cycles of any size, categorizing them into two types based on the count of their inhibitory connections in one direction (referred to as the odd cycle rule, as illustrated in Fig. 1b). More complex networks can also be decomposed into cycles of size 2…N (where N is number of nodes), and predict the ability of the network to oscillate (as shown in Fig. 1c)".

      In line 298, we included the following text to highlight the relevant result: "Next, we removed the STN output (equivalent to inhibition of STN), the Proto-D2-Arky subnetwork generated oscillations for weak positive inputs to the D2-SPNs (Fig.4e, bottom)."

      How the number of odd/even loops affects the frequency is an interesting question. Intuitively, there should be a relation between the two. A complete treatment of this question is beyond the scope of the manuscript, but we think that in a network with identical node properties, more odd cycles should imply higher oscillation power.
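      The decomposition into cycles described in this response can be sketched in a few lines. This is a minimal sketch, assuming the network is given as a dictionary of signed directed links; the function names and the three-node toy network are hypothetical and do not correspond to the basal ganglia model. It lists each simple directed cycle once and labels it by the parity of its inhibitory links; per the odd cycle rule as stated in the manuscript, only the "odd" cycles are candidates for oscillation in the delay-free case, and only with sufficiently strong connections.

```python
def simple_cycles(edges):
    """Yield each simple directed cycle once, as a tuple starting at its smallest node."""
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
    nodes = sorted({node for edge in edges for node in edge})
    for start in nodes:
        # Depth-first search that only visits nodes larger than `start`,
        # so every cycle is reported from its smallest node exactly once.
        stack = [(start, (start,))]
        while stack:
            node, path = stack.pop()
            for nxt in successors.get(node, []):
                if nxt == start:
                    yield path
                elif nxt > start and nxt not in path:
                    stack.append((nxt, path + (nxt,)))


def classify_cycles(edges):
    """Map each cycle to 'odd' or 'even' by its number of inhibitory (negative) links."""
    labels = {}
    for cycle in simple_cycles(edges):
        links = zip(cycle, cycle[1:] + cycle[:1])
        n_inhibitory = sum(1 for link in links if edges[link] < 0)
        labels[cycle] = "odd" if n_inhibitory % 2 else "even"
    return labels


# Hypothetical toy network: +1 marks an excitatory link, -1 an inhibitory link.
edges = {
    ("A", "B"): +1,
    ("B", "A"): -1,
    ("B", "C"): -1,
    ("C", "B"): -1,
    ("C", "A"): -1,
}

for cycle, parity in classify_cycles(edges).items():
    print(" -> ".join(cycle), ":", parity)
```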

      4) The cortico-BG model is focused on how inactivating STN could suppress (or not) beta oscillations, following experimental observations. However, besides mechanisms for extinguishing oscillations, it would be interesting to see if the progressive emergence of pathological beta oscillations could be explained by the modification of some of the nodes in the model (for example, explicitly mimicking the loss of dopaminergic neurons in the substantia nigra). This could be a very interesting additional figure in the main text.

      This is an interesting suggestion. Something similar has already been done: e.g., Kumar et al. (2010) showed that a progressive increase of inhibition of GPe can lead to oscillations. Similarly, Holgado et al. (2008) showed how a progressive change in the mutual connectivity between STN and GPe can cause oscillations. More recently, Ortone et al. (PLoS Comput. Biol. 2023) and Azizpour et al. (bioRxiv 2023) have also shown the effect of progressive change in individual node properties on oscillations in basal ganglia using numerical simulations. Our work in a way provides the theoretical backing to their work. Therefore, we think it is not necessary to show these results again in our model. Instead, we have cited these papers (lines 392-396).

      5) I observed some grammatical inconsistencies in the text, some of them are indicated below. I would suggest carefully going through the text to correct those issues or seeking help with editing.

      -line 32 "...which can closely capture the neural population dynamics". Which population dynamics? Do the authors refer to general neural dynamics?

      -line 33 "long term behavior" -> long-term behavior

      -line 68 "given the ionic channel composition" -> "given its ionic channel composition"

      We apologize for the grammatical inconsistencies in our manuscript. We have made the necessary corrections to improve the clarity and accuracy of our text.

      Reviewer #3 (Recommendations For The Authors):

      This manuscript is useful for analytically showing that a cyclic network of threshold-linear neural populations can only oscillate if it has an odd number of inhibitory nodes with strong enough connections. Establishing this result, which holds under rather narrow assumptions, relies on standard tools from dynamical system theory. I find the strength of support for this result to be incomplete for the reasons detailed below:

      Although the mathematical arguments used appear to be correct, the manuscript lacks rigor and clarity. For instance, the main result presented in theorem 2 is stated in a very unclear fashion: aside from the odd number of inhibitory nodes, there are two conditions to check, which determine four cases. This can be explained in a much more straightforward way without introducing four relations in equations 4-7.

      We acknowledge the reviewer’s concern regarding the presentation of the main result in Theorem 2.

      We would like to emphasize that the introduction of four relations in equations 4-7 was intended to provide a detailed and transparent exposition of the conditions for the main result. While we understand that this approach may appear less straightforward, it allows for a more comprehensive understanding of the underlying logic and the multiple factors influencing the outcomes.

      However, we are open to suggestions for more concise and clear ways to express these conditions if the reviewer has specific recommendations or if there are alternative approaches that the reviewer believes would be more effective in conveying the information.

      Moreover, equation 3 in that same theorem is clearly wrong.

      We sincerely apologize for the typographical error in equation 3 within the same theorem. We thank the reviewer for noticing this. We have revised the text to rectify this mistake. The equation has now been corrected to ensure its accuracy.

      The proof of theorem 2 relies on standard linear algebra and can be improved as well: there are typos, approximations, and missing words (see line 664). The rigor of the exposition is also unsatisfactory. For instance, the proof of Lemma 1 ends with the sentence: "Similarly as before, the convergence of the dynamics driven by the left and right terms ends the proof". I don't know what this means.

      We thank the reviewer for the comments and suggestions. We have made the necessary adjustments to enhance the rigor and clarity of our mathematical reasoning in the revised manuscript.

      In line 644, we have provided clarification for the sentence the reviewer found unclear. The revised version now offers a more precise explanation that should help in understanding the proof.

      At the same time, the intuitive arguments presented in the main text are vague at best and do not really help grasping the possible generalizability of the results. For instance, I do not understand the message of panel B in Figure 2 and there seems to be no explanation about it in the main text.

      The main purpose of Figure 2B is to offer a visual representation of the concept and to serve as an aid for readers who may prefer a graphical illustration over extensive equations. While we understand that the figure may not provide a complete explanation on its own, it is intended to complement the text and mathematical content presented in the main text. In the revised version we have added the explanation of Figure 2B.

      Aside from the analytical result, most of the paper consists in simulating networks with distinct inhibitory cyclic structure to validate the theoretical argument. I do not find this approach particularly convincing due to the qualitative nature of the numerical results presented. There is little quantitative analysis of the network structure in relation to the emergence of oscillations. It is also hard to judge whether the examples discussed are cherry picked or truly representative of a large class of dynamics.

      The reviewer has a valid concern about numerical simulations and qualitative nature of the results. We would like to provide some perspective on our approach.

      In our paper, the primary focus is on the mathematical proof, which rigorously establishes the existence of our results. However, we understand that numerical simulations are valuable for illustrating the applicability of the theoretical framework and providing insights into the practical implications.

      If we get into the quantitative description of all the results, the manuscript will become prohibitively long. We acknowledge that there is a balance to be struck between theory and numerical examples in a research paper. We believe that, in conjunction with the mathematical proof, the numerical simulations serve the purpose of illustrating the existence of our results in specific examples. While we cannot provide an exhaustive exploration of all possible network structures, we have chosen representative cases to demonstrate the applicability of our findings. Some of these are already provided in figure supplements S3-1 and S3-3. In the absence of specific suggestions from the reviewer we would like to leave it as is.

      Moreover, the authors apply their cycle analysis to real-world networks by considering cycles of inhibitory nodes independently, whereas the same nodes can belong to several cycles. I find it hard to believe that considering these cycles independently should be enough to make predictions about the emergence of oscillations, as these cycles must interact with one another via shared nodes. I do not understand the color coding used to mark distinct cycles in supplementary figures. There is also not enough information to understand figures in the main text. For instance, I do not understand what the grids are representing in panel B, Figure 4.

      We have clarified the color coding and added more information to understand the figures. We appreciate the reviewer’s concern about our application of cycle analysis to real-world networks and the clarity of our figures. It is not a matter of belief – we have provided a mathematical proof and complemented that with illustrative examples from real-world networks i.e. cortico-basal ganglia network with both rate-based and spiking neurons. Clearly our results hold.

      Regarding the color coding in the supplementary figures, we have revised the color scheme to make it more intuitive and have explained it in the caption of Figure 4: different colors mark the potential oscillators in each BG motif, and each color corresponds to an oscillator from panel a. For more details, see figure supplements 4-1–4-6. The colors now represent distinct cycles more clearly, helping readers better interpret the figures.

    2. eLife assessment

      The present study offers valuable insights into the emergence of oscillations in neural networks. It underscores the importance of achieving a delicate balance between excitatory and inhibitory links, and deals with the topological conditions for oscillations. The study provides solid evidence in simple networks based on formal mathematical theory and advanced simulations, but the wider implications to biological networks would require a more detailed investigation into delays and nonlinearities.

    3. Reviewer #2 (Public Review):

      The authors present here a mathematical and computational study of the topological/graph theory requirements to obtain sustained oscillations in neural network models. A first approach mathematically demonstrates that a given network of interconnected neural populations (understood in the sense of dynamical systems) requires an odd number of inhibitory populations to sustain oscillations. The authors extend this result via numerical simulations of (i) a simplified set of Wilson-Cowan networks, (ii) a simplified circuit of the cortico-basal ganglia network, and (iii) a more complex, spike-based neural network model of the basal ganglia, which provides insight into experimental findings regarding abnormal synchrony levels in Parkinson's Disease (PD).
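
      The parity condition summarized above can be made concrete with a short sketch. The Python below is an illustration only, not the authors' code: it assumes the network is given as a signed directed graph (the `Edges` mapping, the function names, and the three-population `toy` motif are all hypothetical), enumerates simple cycles, and flags those with an odd number of inhibitory links, which is the necessary condition for oscillation as described in the reviews. Reviewer #2 phrases the condition in terms of inhibitory populations and Reviewer #1 in terms of inhibitory links; the sketch counts links.

```python
from typing import Dict, List, Tuple

# Signed directed graph: edges[(source, target)] = +1 (excitatory) or -1 (inhibitory).
# This representation is an assumption made for the sketch.
Edges = Dict[Tuple[str, str], int]


def simple_cycles(edges: Edges) -> List[List[str]]:
    """Enumerate simple directed cycles, each reported once from its smallest node."""
    nodes = sorted({n for edge in edges for n in edge})
    successors = {n: sorted(t for (s, t) in edges if s == n) for n in nodes}
    cycles: List[List[str]] = []

    def extend(start: str, path: List[str]) -> None:
        for nxt in successors[path[-1]]:
            if nxt == start:
                cycles.append(path[:])
            elif nxt > start and nxt not in path:
                extend(start, path + [nxt])

    for start in nodes:
        extend(start, [start])
    return cycles


def inhibitory_links(cycle: List[str], edges: Edges) -> int:
    """Count inhibitory links traversed when going once around the cycle."""
    hops = zip(cycle, cycle[1:] + cycle[:1])
    return sum(1 for hop in hops if edges[hop] < 0)


def candidate_oscillators(edges: Edges) -> List[List[str]]:
    """Cycles with an odd number of inhibitory links (net negative feedback)."""
    return [c for c in simple_cycles(edges) if inhibitory_links(c, edges) % 2 == 1]


# Hypothetical three-population motif, not taken from the manuscript.
toy = {
    ("A", "B"): +1,  # A excites B
    ("B", "A"): -1,  # B inhibits A: cycle A-B has one inhibitory link
    ("B", "C"): -1,  # B inhibits C
    ("C", "B"): -1,  # C inhibits B: cycle B-C has two inhibitory links
}
print(candidate_oscillators(toy))
```

      Each cycle is checked in isolation, so the sketch says nothing about how cycles that share nodes interact, and passing the parity check is at most a necessary condition, not a guarantee of oscillation.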

      The work elegantly and effectively combines a solid mathematical proof with careful numerical simulations at different levels of description, which is uncommon and provides additional layers of confidence to the study. Furthermore, the authors included detailed sections to provide intuition about the mathematical proof, which will be helpful for readers less inclined to the perusal of mathematical derivations. Its insightful and well-informed connection with a practical neuroscience problem, the presence of strong beta rhythms in PD, elevates the potential influence of the study and provides testable predictions.

      In its updated form, the authors have solved the most pressing issues of the study, by acknowledging the limitations of their work regarding the effects of delays in oscillations, and addressing some of these effects in new simulations. Although some interesting simulations are still not presented in the revised version, they could constitute the focus of future work to complement the conclusions presented here. The absence of explanations for some of the figures and panels has been corrected, and the issues with grammar and lack of clarity have been improved. This important work is therefore now improved.

    4. Reviewer #1 (Public Review):


      The authors study the appearance of oscillations in motifs of linear threshold systems coupled in specific topologies. They analytically derive conditions for the appearance of oscillations in the context of excitatory and inhibitory links. They also emphasize the greater importance of the topology compared to the strength of the links, though it is not straightforward to apply this to brain networks, where the weights can be distributed over several orders of magnitude. Finally, the results are confirmed with WC oscillators. The findings are to some extent confirmed with spiking neurons, though here the results are less clear.

      Overall, the results are sound from a theoretical perspective, but I still find it hard to believe that they are of significant relevance for biological networks, or in particular for the oscillations of the BG-thalamus-cortex loop in PD. I find motifs in general to be too simplistic for multiscale and generally large networks, as is the case in the brain. Moreover, the division into regions is more or less arbitrary by definition, and having such a strong dependence on an odd/even number of inhibitory links is far from reality. Another limitation is the fact that the cortex is considered as a single node. Similarly, decomposing even such a coarse network into all possible motifs (238 in this case) doesn't seem of much relevance, when I'd assume that the emergence of pathological rhythms is more of an emergent phenomenon.


      From the point of nonlinear dynamics, the results are solid, and the intuition behind the proofs of the theorems is well explained.


      As stated in the summary, I find the work to be too theoretical, without a real application to brain dynamics, where the networks are generally very large. The odd/even number rule is too strict, and talking about a fixed and definite number of cycles in an actual brain seems too simplistic. Moreover, the cortex is considered as a single node. Finally, the impact of delays is ignored even though they define the synchronizability of the brain network, and previous work on amplitude reduction due to time delays in difference-coupled networks of oscillators is not mentioned.

    1. eLife assessment

      This study uses carefully designed experiments to generate a useful behavioural and neuroimaging dataset on visual cognition. The results provide solid evidence for the involvement of higher-order visual cortex in processing visual oddballs and asymmetry. However, the evidence provided for the very strong claims of homogeneity as a novel concept in vision science, separable from existing concepts such as target saliency, is inadequate.

    2. Reviewer #1 (Public Review):


      The authors define a new metric for visual displays, derived from psychophysical response times, called visual homogeneity (VH). They attempt to show that VH is explanatory of response times across multiple visual tasks. They use fMRI to find visual cortex regions with VH-correlated activity. On this basis, they declare a new visual region in the human brain, area VH, whose purpose is to represent VH for the purpose of visual search and symmetry tasks.


      The authors present carefully designed experiments, combining multiple types of visual judgments and multiple types of visual stimuli with concurrent fMRI measurements. This is a rich dataset with many possibilities for analysis and interpretation.


      The datasets presented here should provide a rich basis for analysis. However, in this version of the manuscript, I believe that there are major problems with the logic underlying the authors' new theory of visual homogeneity (VH), with the specific methods they used to calculate VH, and with their interpretation of psychophysical results using these methods. These problems with the coherency of VH as a theoretical construct and metric value make it hard to interpret the fMRI results based on searchlight analysis of neural activity correlated with VH. In addition, the large regions of VH correlations identified in Experiments 1 and 2 vs. Experiments 3 and 4 are barely overlapping. This undermines the claim that VH is a universal quantity, represented in a newly discovered area of the visual cortex, that underlies a wide variety of visual tasks and functions.

      Maybe I have missed something, or there is some flaw in my logic. But, absent that, I think the authors should radically reconsider their theory, analyses, and interpretations, in light of the detailed comments below, to make the best use of their extensive and valuable datasets combining behavior and fMRI. I think doing so could lead to a much more coherent and convincing paper, albeit possibly supporting less novel conclusions.


      1) VH is an unnecessary, complex proxy for response time and target-distractor similarity.

      VH is defined as a novel visual quality, calculable for both arrays of objects (as studied in Experiments 1-3) and individual objects (as studied in Experiment 4). It is derived from a distance-to-center calculation in a perceptual space. That space in turn is derived from the multi-dimensional scaling of response times for target-distractor pairs in an oddball detection task (Experiments 1 and 2) or in a same-different task (Experiments 3 and 4). Distance between objects in the space is inversely proportional to response times for arrays in which they were paired. These response times are higher for more similar objects. Hence, proximity is proportional to similarity. This is visible in Fig. 2B as the close clustering of complex, confusable animal shapes.

      VH, i.e. distance-to-center, for target-present arrays, is calculated as shown in Fig. 1C, based on a point on the line connecting the target and distractors. The authors justify this idea with previous findings that responses to multiple stimuli are an average of responses to the constituent individual stimuli. The distance of the connecting line to the center is inversely proportional to the distance between the two stimuli in the pair, as shown in Fig. 2D. As a result, VH is inversely proportional to the distance between the stimuli and thus proportional to stimulus similarity and response times. But this just makes VH a highly derived, unnecessarily complex proxy for target-distractor similarity and response time. The original response times on which the perceptual space is based are far simpler and more direct measures of similarity for predicting response times.
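
      The geometry behind this argument can be sketched in a few lines. The Python below is a hedged illustration under stated assumptions, not the authors' method: coordinates are two-dimensional, the center is placed at the origin, both stimuli are taken to lie at the same radius from the center, and the point on the connecting line is the midpoint (`w = 0.5`). The function names `vh_absent` and `vh_present` and the weighting parameter `w` are hypothetical.

```python
import math
from typing import Sequence

Point = Sequence[float]


def vh_absent(item: Point, center: Point) -> float:
    """Target-absent display: distance of the single repeated item from the center."""
    return math.dist(item, center)


def vh_present(target: Point, distractor: Point, center: Point, w: float = 0.5) -> float:
    """Target-present display: distance from the center to a point on the
    target-distractor line. w = 0.5 (the midpoint) is an assumed weighting."""
    mix = [w * t + (1.0 - w) * d for t, d in zip(target, distractor)]
    return math.dist(mix, center)


# Hypothetical 2-D layout: both stimuli sit at radius 1 from the center,
# mirrored about the x-axis, so their separation is 2*sin(half_angle).
center = (0.0, 0.0)
for half_angle_deg in (10, 45, 80):
    a = math.radians(half_angle_deg)
    target, distractor = (math.cos(a), math.sin(a)), (math.cos(a), -math.sin(a))
    separation = math.dist(target, distractor)
    vh = vh_present(target, distractor, center)
    print(f"separation={separation:.2f}  VH_present={vh:.2f}")
```

      Under these assumptions the midpoint lies at distance sqrt(1 - (d/2)^2) from the center for a pair separated by d, so the printed VH_present falls as the separation grows. With unequal radii or a different weighting the relationship need not be this simple.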

      2) The use of VH derived from Experiment 1 to predict response times in Experiment 2 is circular and does not validate the VH theory.

      The use of VH, a response time proxy, to predict response times in other, similar tasks, using the same stimuli, is circular. In effect, response times are being used to predict response times across two similar experiments using the same stimuli. Experiment 1 and the target present condition of Experiment 2 involve the same essential task of oddball detection. The results of Experiment 1 are converted into VH values as described above, and these are used to predict response times in Experiment 2 (Fig. 2F). Since VH is a derived proxy for response values in Experiment 1, this prediction is circular, and the observed correlation shows only consistency between two oddball detection tasks in two experiments using the same stimuli.

      3) The negative correlation of target-absent response times with VH as it is defined for target-absent arrays, based on the distance of a single stimulus from the center, is uninterpretable without understanding the effects of center-fitting. Most likely, center-fitting and the different VH metrics for target-absent trials produce an inverse correlation of VH with target-distractor similarity.

      The construction of the VH perceptual space also involves fitting a "center" point such that distances to center predict response times as closely as possible. The effect of this fitting process on distance-to-center values for individual objects or clusters of objects is unknowable from what is presented here. These effects would depend on the residual errors after fitting response times with the connecting line distances. The center point location and its effects on the distance-to-center of single objects and object clusters are not discussed or reported here.

      Yet, this uninterpretable distance-to-center of single objects is chosen as the metric for VH of target-absent displays (VHabsent). This is justified by the idea that arrays of a single stimulus will produce an average response equal to one stimulus of the same kind. However, it is not logically clear why response strength to a stimulus should be a metric for homogeneity of arrays constructed from that stimulus, or even what homogeneity could mean for a single stimulus from this set. It is not clear how this VHabsent metric based on single stimuli can be equated to the connecting line VH metric for stimulus pairs, i.e. VHpresent, or how both could be plotted on a single continuum.

      It is clear, however, what *should* be correlated with difficulty and response time in the target-absent trials, and that is the complexity of the stimuli and the numerosity of similar distractors in the overall stimulus set. The complexity of the target, similarity with potential distractors, and the number of such similar distractors all make ruling out distractor presence more difficult. The correlation seen in Fig. 2G must reflect these kinds of effects, with higher response times for complex animal shapes with lots of similar distractors and lower response times for simpler round shapes with fewer similar distractors.

      The example points in Fig. 2G seem to bear this out, with higher response times for the deer stimulus (complex, many close distractors in the Fig. 2B perceptual space) and lower response times for the coffee cup (simple, few close distractors in the perceptual space). While the meaning of the VH scale in Fig. 2G, and its relationship to the scale in Fig. 2F, are unknown, it seems like the Fig. 2G scale has an inverse relationship to stimulus complexity, in contrast to the expected positive relationship for Fig. 2F. This is presumably what creates the observed negative correlation in Fig. 2G.

      Taken together, points 1-3 suggest that VHpresent and VHabsent are complex, unnecessary, and disconnected metrics for understanding target detection response times. The standard, simple explanation should stand. Task difficulty and response time in target detection tasks, in both present and absent trials, are positively correlated with target-distractor similarity.

      I think my interpretations apply to Experiments 3 and 4 as well, although I find the analysis in Fig. 4 especially hard to understand. The VH space in this case is based on Experiment 3 oddball detection in a stimulus set that included both symmetric and asymmetric objects. However, the response times for a very different task in Experiment 4, a symmetric/asymmetric judgment, are plotted against the axes derived from Experiment 3 (Fig. 4F and 4G). It is not clear to me why a measure based on oddball detection that requires no use of symmetry information should be predictive of within-stimulus symmetry detection response times. If it is, that requires a theoretical explanation not provided here.

      4) Contrary to the VH theory, same/different tasks are unlikely to depend on a decision boundary in the middle of a similarity or homogeneity continuum.

      The authors interpret the inverse relationship of response times with VHpresent and VHabsent, described above, as evidence for their theory. They hypothesize, in Fig. 1G, that VHpresent and VHabsent occupy a single scale, with maximum VHpresent falling at the same point as minimum VHabsent. This is not borne out by their analysis, since the VHpresent and VHabsent value scales are mainly overlapping, not only in Experiments 1 and 2 but also in Experiments 3 and 4. The authors dismiss this problem by saying that their analyses are a first pass that will require future refinement. Instead, the failure to conform to this basic part of the theory should be a red flag calling for revision of the theory.

      The reason for this single scale is that the authors think of target detection as a boundary decision task, along a single scale, with a decision boundary somewhere in the middle, separating present and absent. This model makes sense for decision dimensions or spaces where there are two categories (right/left motion; cats vs. dogs), separated by an inherent boundary (equal left/right motion; training-defined cat/dog boundary). In these cases, there is less information near the boundary, leading to reduced speed/accuracy and producing a pattern like that shown in Fig. 1G.

      This logic does not hold for target detection tasks. There is no inherent middle point boundary between target present and target absent. Instead, in both types of trials, maximum information is present when the target and distractors are most dissimilar, and minimum information is present when the target and distractors are most similar. The point of greatest similarity occurs at the limit of any metric for similarity. Correspondingly, there is no middle point dip in information that would produce greater difficulty and higher response times. Instead, task difficulty and response times increase monotonically with the similarity between targets and distractors, for both target present and target absent decisions. Thus, in Figs. 2F and 2G, response times appear to be highest for animals, which share the largest numbers of closely similar distractors.


      1) The area VH boundaries from different experiments are nearly completely non-overlapping.

      In line with their theory that VH is a single continuum with a decision boundary somewhere in the middle, the authors use fMRI searchlight to find an area whose responses positively correlate with homogeneity, as calculated across all of their target present and target absent arrays. They report VH-correlated activity in regions anterior to LO. However, the VH defined by symmetry Experiments 3 and 4 (VHsymmetry) is substantially anterior to LO, while the VH defined by target detection Experiments 1 and 2 (VHdetection) is almost immediately adjacent to LO. Fig. S13 shows that VHsymmetry and VHdetection are nearly non-overlapping. This is a fundamental problem with the claim of discovering a new area that represents a new quantity that explains response times across multiple visual tasks. In addition, it is hard to understand why VHsymmetry does not show up in a straightforward subtraction between symmetric and asymmetric objects, which should show a clear difference in homogeneity.

      2) It is hard to understand how neural responses can be correlated with both VHpresent and VHabsent.

      The main paper results for VHdetection are based on both target-present and target-absent trials, considered together. It is hard to interpret the observed correlations, since the VHpresent and VHabsent metrics are calculated in such different ways and have opposite correlations with target similarity, task difficulty, and response times (see above). It may be that one or the other dominates the observed correlations. It would be clarifying to analyze correlations for target-present and target-absent trials separately, to see if they are both positive and correlated with each other.

      3) The definition of the boundaries and purpose of a new visual area in the brain requires circumspection, abundant and convergent evidence, and careful controls.

      Even if the VH metric, as defined and calculated by the authors here, is a meaningful quantity, it is a bold claim that a large cortical area just anterior to LO is devoted to calculating this metric as its major task. Vision involves much more than target detection and symmetry detection. The cortex anterior to LO is bound to perform a much wider range of visual functionalities. If the reported correlations can be clarified and supported, it would be more circumspect to treat them as one byproduct of unknown visual processing in the cortex anterior to LO, rather than treating them as the defining purpose for a large area of the visual cortex.

    3. Reviewer #2 (Public Review):


      This study proposes visual homogeneity as a novel visual property that enables observers to perform several seemingly disparate visual tasks, such as finding an odd item, deciding if two items are the same, or judging if an object is symmetric. In Experiment 1, the reaction times on several objects were measured in human subjects. In Experiment 2, the visual homogeneity of each object was calculated based on the reaction time data. The visual homogeneity scores predicted reaction times. This value was also correlated with the BOLD signals in a specific region anterior to LO. Similar methods were used to analyze reaction time and fMRI data in a symmetry detection task. It is concluded that visual homogeneity is an important feature that enables observers to solve these two tasks.


      1) The writing is very clear. The presentation of the study is informative.

      2) This study includes several behavioral and fMRI experiments. I appreciate the scientific rigor of the authors.


      1) My main concern with this paper is the way visual homogeneity is computed. On page 10, lines 188-192, it says: "we then asked if there is any point in this multidimensional representation such that distances from this point to the target-present and target-absent response vectors can accurately predict the target-present and target-absent response times with a positive and negative correlation respectively (see Methods)". This is also true for the symmetry detection task. If I understand correctly, the reference point in this perceptual space was found by deliberately satisfying the negative and positive correlations in response times. And then on page 10, lines 200-205, it shows that the positive and negative correlations actually exist. This logic is confusing. The positive and negative correlations emerge only because this method is optimized to do so. It seems more reasonable to identify the reference point of this perceptual space independently, without using the reaction time data. Otherwise, the inference process sounds circular. A simple way is to just use the mean point of all objects in Exp 1, without any optimization towards reaction time data.
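
      The alternative suggested in this point can be sketched as follows. This is a minimal illustration under stated assumptions, not an analysis from the manuscript: the reference point is the plain mean of the object coordinates, fixed before any response times are consulted, and the correlation with target-absent response times is then computed as a separate check. The function names and the placeholder coordinates and response times are hypothetical.

```python
import math
from typing import List, Sequence


def centroid(points: Sequence[Sequence[float]]) -> List[float]:
    """Mean point of all objects; computed without any response-time data."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def centroid_distance_vs_rt(
    points: Sequence[Sequence[float]], target_absent_rts: Sequence[float]
) -> float:
    """Correlate distance-to-centroid with target-absent response times."""
    ref = centroid(points)
    distances = [math.dist(p, ref) for p in points]
    return pearson(distances, target_absent_rts)


# Placeholder values for illustration only; not data from the study.
toy_points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (4.0, 4.0)]
toy_rts = [0.9, 1.1, 1.1, 0.6]
print(centroid(toy_points), centroid_distance_vs_rt(toy_points, toy_rts))
```

      Because the reference point here never sees the response times, any correlation it produces is a test of the idea and not a by-product of the fitting procedure.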

      2) On page 11, lines 214-221. It says: "these findings are non-trivial for several reasons". However, the first reason is confusing. It is unclear to me why "it suggests that there are highly specific computations that can be performed on perceptual space to solve oddball tasks". In fact, these two sentences provide no specific explanation for the results.

      3) The second reason is interesting. Reaction times in target-present trials can be easily explained by target-distractor similarity. But why does reaction time vary substantially across target-absent stimuli? One possible explanation is that the objects that are distant from the feature distribution elicit shorter reaction times. Here, all objects constitute a statistical distribution in the feature (perceptual) space. There is certainly a mean of this distribution. Some objects look like outliers and these outliers elicit shorter reaction times in the target-absent trials because outlier detection is very salient.

      One might argue that the above account is merely a rephrasing of the idea of visual homogeneity proposed in this study. If so, feature saliency is not a new account. In other words, the idea of visual homogeneity is another way of reiterating the old feature saliency theory.

      4) One way to reject the feature saliency theory is to compare the reaction times of the objects that are very different from other objects (i.e., no surrounding objects in the perceptual space, e.g., the wheel in the lower right corner of Fig. 2B) with the objects that are surrounded by several similar objects (e.g., the horse in the upper part of Fig. 2B). Also, please choose the two objects with similar distance from the reference point. I predict that the latter will elicit longer reaction times because they can be easily confounded by surrounding similar objects (i.e., four-legged horses can be easily confounded by four-legged dogs). If the density of object distribution per se influences the visual homogeneity score, I would say that the "visual homogeneity" is essentially another way of describing the distributional density of the perceptual space.
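
      The comparison proposed in this point could be set up along the following lines. The sketch is illustrative and rests on assumptions not found in the manuscript: local density is measured as the number of other objects within a fixed radius in the perceptual space, and two objects count as matched when their distances to the reference point differ by no more than a tolerance. The names `neighbor_count` and `density_contrast_pairs`, the `radius` and `tolerance` parameters, and the coordinates are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Space = Dict[str, Sequence[float]]


def neighbor_count(name: str, space: Space, radius: float) -> int:
    """Number of other objects lying within `radius` of the named object."""
    here = space[name]
    return sum(
        1 for other, point in space.items()
        if other != name and math.dist(here, point) <= radius
    )


def density_contrast_pairs(
    space: Space, reference: Sequence[float], radius: float, tolerance: float
) -> List[Tuple[str, str]]:
    """(sparse, crowded) pairs whose distances to the reference differ by at most
    `tolerance` while the second object has more neighbors than the first."""
    dist_to_ref = {n: math.dist(p, reference) for n, p in space.items()}
    density = {n: neighbor_count(n, space, radius) for n in space}
    return [
        (a, b)
        for a in space
        for b in space
        if density[a] < density[b] and abs(dist_to_ref[a] - dist_to_ref[b]) <= tolerance
    ]


# Hypothetical coordinates, not read from Fig. 2B.
toy_space = {
    "isolated": (3.0, 0.0),
    "crowded": (0.0, 3.0),
    "neighbor_1": (0.3, 3.2),
    "neighbor_2": (-0.3, 3.2),
}
print(density_contrast_pairs(toy_space, reference=(0.0, 0.0), radius=0.5, tolerance=0.1))
```

      Reaction times for the two members of each returned pair could then be compared directly, since the pair is matched on distance to the reference and differs only in how crowded its neighborhood is.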

      5) The searchlight analysis looks strange to me. One can easily perform a parametric modulation by setting visual homogeneity as the trial-by-trial parametric modulator and reaction times as a covariate. This parametric modulation produces a brain map with the correlation of every voxel in the brain. On page 17 lines 340-343, it is unclear to me what the "mean activation" is.

      Minor points:

      1) In the intro, it says: "using simple neural rules...". It is actually very unclear what "neural rules" means here. Would it be better to change it to "computational principles" or "neural network models"?

      2) In the intro, it says: "while machine vision algorithms are extremely successful in solving feature-based tasks like object categorization (Serre, 2019), they struggle to solve these generic tasks (Kim et al., 2018; Ricci et al. 2021)." These are not generic tasks. They are just a specific type of visual task: judging relationships between multiple objects. Moreover, a large number of studies in machine vision have shown that DNNs are capable of solving these tasks and even more difficult tasks. Two survey papers are listed here.

      Wu, Q., Teney, D., Wang, P., Shen, C., Dick, A., & Van Den Hengel, A. (2017). Visual question answering: A survey of methods and datasets. Computer Vision and Image Understanding, 163, 21-40.

      Małkiński, M., & Mańdziuk, J. (2022). Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices. arXiv preprint arXiv:2201.12382.

    1. eLife assessment

      This study presents useful findings that describe how activity in the corticotropin-releasing hormone neurons in the paraventricular nucleus of the hypothalamus (PVHCRH neurons) modulates sevoflurane anesthesia, as well as a phenomenon the authors term a "sevoflurane general anesthetic-elicited stress response". The technical approaches are solid, and the data presented is largely clear. However, the primary conclusion, that the PVHCRH neurons are critical for the mechanisms of sevoflurane anesthesia, is incompletely supported by the data.

    2. Joint Public Review:

      This study describes a group of CRH-releasing neurons, located in the paraventricular nucleus of the hypothalamus, which, in mice, affects both the state of sevoflurane anesthesia and a grooming behavior observed after it. PVHCRH neurons showed elevated calcium activity during the post-anesthesia period. Optogenetic activation of these PVHCRH neurons during sevoflurane anesthesia shifts the EEG from burst-suppression to a seemingly activated state (an apparent arousal effect), although without a behavioral correlate. Chemogenetic activation of the PVHCRH neurons delays sevoflurane-induced loss of righting reflex (another apparent arousal effect). On the other hand, chemogenetic inhibition of PVHCRH neurons delays recovery of righting reflex and decreases sevoflurane-induced stress (an apparent decrease in the arousal effect). The authors conclude that PVHCRH neurons "integrate" sevoflurane-induced anesthesia and stress. The authors also claim that their findings show that sevoflurane itself produces a post-anesthesia stress response that is independent of any surgical trauma, such as an incision. In its revised form, the article does not achieve its intended goal and will not have an impact on the clinical practice of anesthesiology or on anesthesiology research.


      The manuscript uses targeted manipulation of the PVHCRH neurons with state-of-the-art methods, and is technically sound. Also, the number of experiments is substantial.


      The most significant weaknesses remain: a) overinterpretation of the significance of their findings, b) the failure to use another anesthetic as a control, c) a failure to compellingly link their post-sevoflurane measures in mice to anything measured in humans, and d) limitations in the novelty of the findings. These weaknesses are related to the primary concerns described below:

      Concerns about the primary conclusion that PVHCRH neurons integrate the anesthetic effects and post-anesthesia stress response of sevoflurane GA:

      1) After revision, there remain multiple places where it is claimed that PVHCRH neurons mediate the anesthetic effects of sevoflurane (impact statement: we explain "how sevoflurane-induced general anesthesia works..."; introduction: "the neuronal mechanisms that mediate the anesthetic effects...of sevoflurane GA remain poorly understood" and "PVHCRH neurons may act as a crucial node integrating the anesthetic effect and stress response of sevoflurane"). The manuscript simply does not support these statements. The authors show that a short-duration exposure to sevoflurane inhibits PVHCRH neurons, but this is followed by hyperexcitability of these neurons for a short period after anesthesia is terminated. They show that the induction of and recovery from sevoflurane anesthesia can be modulated by PVHCRH neuronal activity, most likely through changes in brain state (measured by EEG). They also show that PVHCRH neuronal activity modulates corticosterone levels and grooming behavior observed post-anesthesia (which the authors argue are two stress responses). These two things (effects during anesthesia and effects post-anesthesia) may be mechanistically unrelated to each other. None of these observations relate to the primary mechanism of action for sevoflurane. All claims relating to "anesthetic effects" should be removed. Even the term "integration" seems wrong: it implies the PVH is combining information about the anesthetic effect and post-anesthesia stress responses.

      2) It is important to compare the effects of sevoflurane with at least one other inhaled ether anesthetic as one step towards elevating the impact of this paper. Isoflurane, desflurane, and enflurane are ether anesthetics that are very similar to each other, as well as being similar to sevoflurane. For example, one study cited by the authors (Marana et al. 2013) concludes that there is weak evidence for differences in stress-related hormones between sevoflurane and desflurane, with lower levels of cortisol and ACTH observed during the desflurane intraoperative period. It is important to determine whether desflurane activates PVHCRH neurons in the post-anesthesia period, and whether this is accompanied by excess grooming in the mice, because this will distinguish whether the effects of sevoflurane generalize to other inhaled anesthetics, or, alternatively, relate to unique idiosyncratic properties of this gas that may not be a part of its anesthetic properties.

      Concerns about the clinical relevance of the experiments:

      In anesthesiology practice, perioperative stress observed in patients is more commonly related to the trauma of the surgical intervention, with inadequate levels of antinociception or unconsciousness intraoperatively and/or poor post-operative pain control. The authors seem to be suggesting that the sevoflurane itself is causing stress because their mice receive sevoflurane but no invasive procedures, but there is no evidence of sevoflurane inducing stress in human patients. It is important to know whether sevoflurane effectively produces behavioral stress in the recovery room in patients that could be related to the putative stress response (excess grooming) observed in mice. For example, in surgeries or procedures which required only a brief period of unconsciousness that could be achieved by administering sevoflurane alone (comparable to the 30 min administered to the mice), is there clinical evidence of post-operative stress? It is also important to describe a rationale for using a 30 min sevoflurane exposure. What proportion of human surgeries using sevoflurane use exposure times that are comparable to this?

      It is the experience of one of the reviewers that human patients who receive sevoflurane as the primary anesthetic do not wake up more stressed than if they had had one of the other GABAergic anesthetics. If there were signs of stress upon emergence (increased heart rate, blood pressure, thrashing movements) from general anesthesia, this would be treated immediately. The most likely cause of post-operative stress behaviors in humans is probably inadequate anti-nociception during the procedure, which translates into inadequate post-op analgesia and likely delirium. It is the case that children receiving sevoflurane do have a higher likelihood of post-operative delirium. Perhaps the authors' studies address a mechanism for delirium associated with sevoflurane, but this is barely mentioned. Delirium seems likely to be the closest clinical phenomenon to what was studied. As noted by the Besnier et al (2017) article cited by the authors, surgery can elevate postoperative glucocorticoid stress hormones, but it generally correlates with the intensity of the surgical procedure. Besnier et al also note the elevation of glucocorticoids is generally considered to be adaptive. Thus, reducing glucocorticoids during surgery with sevoflurane may hamper recovery, especially as it relates to tissue damage, which was not measured or considered here. This paper only considers glucocorticoid release as a negative factor, which causes "immunosuppression", "proteolysis", and "delays postoperative recovery and...leads to increased morbidity".

      It is also the case that there are explicit published findings showing that mild and moderate surgical procedures in children receiving sevoflurane (which might be the closest human proxy to the brief 30 minute sevoflurane exposure used here) do not have elevated cortisol (Taylor et al, J Clin Endocrinol Metab, 2013). This again raises the question of whether the enhanced grooming or elevated corticosterone observed in the mice here has any relevance to humans.

      Concerns about the novelty of the findings:

      The key finding here is that CRH neurons mediate measures of arousal, and arousal modulates sevoflurane anesthesia induction and recovery. However, CRH is associated with arousal in numerous studies. In fact, the authors' own work, published in eLife in 2021, showed that stimulating the hypothalamic CRH cells leads to arousal and that their inhibition promotes hypersomnia. In both papers the authors use fos expression in CRH cells during a specific event to implicate the cells, then manipulate them and measure EEG responses. In the previous work, the cells were active during wakefulness; here, they were active in the awake state that follows anesthesia (Figure 1). Thus, the findings in the current work are incremental and not particularly impactful. Claims like "Here, a core hypothalamic ensemble, corticotropin-releasing hormone neurons in the paraventricular nucleus of the hypothalamus, is discovered" are overstated. PVHCRH cell populations were discovered in the 1980s. Suggesting that it is novel to identify that hypothalamic CRH cells regulate post-anesthesia stress is unfounded as well: this PVH population has been shown over four decades to regulate a plethora of different responses to stress. Anesthesia stress is no different. Their role in arousal is not being discovered in this paper. Even their role in grooming is not discovered in this paper.

      The activation of CRH cells in PVH has already been shown to result in grooming by Jaideep Bains (a paper cited by the authors). Thus, the involvement of these cells in this behavior is not surprising. The authors perform elaborate manipulations of CRH cells and numerous analyses of grooming and related behaviors. For example, they compare grooming and paw licking after anesthesia with those after other stressors such as forced swim, spraying mice with water, physical attack and restraint. The authors have identified a behavioral phenomenon in a rodent model that does not have a clear correlation with a behavioral state observed in humans during the use of sevoflurane as part of an anesthetic regimen. The grooming behaviors are not a model of the emergence delirium or the cognitive dysfunction observed commonly in patients receiving sevoflurane for general anesthesia. Emergence delirium is commonly seen in children after sevoflurane is used as part of general anesthesia, and cognitive dysfunction is commonly observed in adults, particularly the elderly, following general anesthesia. No features of delirium or cognitive dysfunction are measured here.

      Other concerns:

      In Figure 2, cFos was measured in the PVH at different points before, during and after sevoflurane. The greatest cFos expression was seen in Post 2, the latest time point after anesthesia. However, this may simply reflect the fact that there is a delay between activity levels and expression of cFos (as noted by the authors, 2-3 hours). Thus, sacrificing mice 30 minutes after the onset of sevoflurane application would be expected to drive minimal cFos expression, and the cFos observed at 30 minutes would not accurately reflect the activity levels during the sevoflurane. Also, the authors state that the hyperactivity, as measured by cFos, lasted "approximately 1 hours before returning to baseline", but there is no data to support this return to baseline.

      In Figure 7, the number of animals appears to change from panel to panel even though they are supposed to show animals from the same groups. For example, cort was measured in only 3 saline-treated O2 animals (Fig 7E), but cFos and CRH were assessed in 4 (Fig 7C,D). Similarly, grooming time and time spent in open arms were measured in 6 saline-treated O2 controls (Fig 7F,H) but central distance was measured in 8 (Fig 7G). There are other group number discrepancies in this figure: the number of data points in the plots does not match what is reported in the legend for numerous groups. Similarly, Figure 4 has a mismatch between the Ns reported in the legend and the number of points plotted per bar. For example, there were 10 animals in the hM3Di group; all are shown for the LORR and time to emergence plots, but only 8 were used for time to induction. The legends reported N=7 for the mCherry group, yet 9 are shown for the time to emergence panel. No reason for exclusions is cited. These figures (and their statistics) should be corrected.

    3. Author Response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents potentially useful findings describing how activity in the corticotropin-releasing hormone neurons in the paraventricular nucleus of the hypothalamus modulates sevoflurane anesthesia, as well as a phenomenon the authors term a "general anesthetic stress response". The technical approaches are solid and the data presented are largely clear. However, the primary conclusion, that the PVHCRH neurons are a mechanism of sevoflurane anesthesia, is inadequately supported.

      We thank the editors and reviewers for their thorough assessment and constructive feedback. We have provided clarifications and updated the manuscript to better interpret our results; please see below. We have revised the primary conclusion to state that PVH CRH neurons potently modulate states of anesthesia in sevoflurane general anesthesia, as part of the anesthesia regulatory network of sevoflurane.

      Combined Public Review:

      This study describes a group of CRH-releasing neurons, located in the paraventricular nucleus of the hypothalamus, which, in mice, affects both the state of sevoflurane anesthesia and a grooming behavior observed after it. PVH-CRH neurons showed elevated calcium activity during the post-anesthesia period. Optogenetic activation of these PVH-CRH neurons during sevoflurane anesthesia shifts the EEG from burst-suppression to a seemingly activated state (an apparent arousal effect), although without a behavioral correlate. Chemogenetic activation of the PVH-CRH neurons delays sevoflurane-induced loss of righting reflex (another apparent arousal effect). On the other hand, chemogenetic inhibition of PVH-CRH neurons delays recovery of the righting reflex and decreases sevoflurane-induced stress (an apparent decrease in the arousal effect). The authors conclude that PVH-CRH neurons are a common substrate for sevoflurane-induced anesthesia and stress. The PVH-CRH neurons are related to behavioral stress responses, and the authors claim that these findings provide direct evidence for a relationship between sevoflurane anesthesia and sevoflurane-mediated stress that might exist even when there is no surgical trauma, such as an incision. In its current form, the article does not achieve its intended goal.

      Thank you for the detailed review. We have carefully considered your comments and have revised the manuscript to provide a clearer interpretation of our findings. We did not propose a direct relationship between sevoflurane anesthesia and sevoflurane-mediated stress in the absence of incision. Rather, our findings indicate that PVH CRH neurons integrate the anesthetic effect and post-anesthesia stress response of sevoflurane general anesthesia (GA). This provides new evidence for understanding the neuronal regulation of sevoflurane GA, identifies a potential brain target for further investigation into modulating the post-anesthesia stress response, and points to an important but overlooked potential cause of that response.


      The manuscript uses targeted manipulation of the PVH-CRH neurons, and is technically sound. Also, the number of experiments is substantial.

      Thank you.


      The most significant weaknesses are a) the lack of consideration and measurement of GABAergic mechanisms of sevoflurane anesthesia, b) the failure to use another anesthetic as a control, c) a failure to document a compelling post-anesthesia stress response to sevoflurane in humans, d) limitations in the novelty of the findings. These weaknesses are related to the primary concerns described below:

      Concerns about the primary conclusion, that PVH-CRH neurons mediate "the anesthetic effects and post-anesthesia stress response of sevoflurane GA".

      Thank you for the advice. Our responses are below:

      1) Just because the activity of a given neural cell type or neural circuit alters an anesthetic's response, this does not mean that those neurons play a role in how the anesthetic creates its anesthetic state. For example, sevoflurane is commonly used in children. Its primary mechanism of action is through enhancement of GABA-mediated inhibition. Children with ADHD on Ritalin (a dopamine reuptake inhibitor) who take it on the day of surgery can often require increased doses of sevoflurane to achieve the appropriate anesthetic state. The mesocortical pathway through which Ritalin acts is not part of the mechanism of action of sevoflurane. Through this pathway, Ritalin is simply increasing cortical excitability making it more challenging for the inhibitory effects of sevoflurane at GABAergic synapses to be effective. Similarly, here, altering the activity of the PVHCRH neurons and seeing a change in anesthetic response to sevoflurane does not mean that these neurons play a role in the fundamental mechanism of this anesthetic's action. With the current data set, the primary conclusions should be tempered.

      Thank you for your comments. Our results show that PVH CRH neurons modulate the state of consciousness as well as the stress response in sevoflurane GA, but they are insufficient to demonstrate that these neurons play a role in the underlying mechanism of sevoflurane anesthesia. We have revised our conclusions to make them more precise. The primary conclusion now states that PVH CRH neurons potently modulate states of anesthesia in sevoflurane GA, as part of the anesthesia regulatory network of sevoflurane.

      2) It is important to compare the effects of sevoflurane with at least one other inhaled ether anesthetic. Isoflurane, desflurane, and enflurane are ether anesthetics that are very similar to each other, as well as being similar to sevoflurane. It is important to distinguish whether the effects of sevoflurane pertain to other anesthetics, or, alternatively, relate to unique idiosyncratic properties of this gas that may not be a part of its anesthetic properties.

      For example, one study cited by the authors (Marana et al., 2013) concludes that there is weak evidence for differences in stress-related hormones between sevoflurane and desflurane, with lower levels of cortisol and ACTH observed during the desflurane intraoperative period. It is not clear that this difference in some stress-related hormones is modeled by post-sevoflurane excess grooming in the mice, but using desflurane as a control could help determine this.

      Thank you for your suggestions. We completely agree on the importance of determining whether the effects of sevoflurane apply to other anesthetics or arise from unique idiosyncratic attributes separate from its anesthetic properties. However, it is challenging to definitively conclude whether the effects of sevoflurane observed in our study extend to other inhaled anesthetics, even with desflurane as a control. While sevoflurane shares many common anesthetic properties with other inhalation agents, it also exhibits distinct characteristics and potential idiosyncrasies that set it apart from its counterparts. Regarding studies of desflurane's impact on hormone levels or stress-like behaviors, one study involving 20 women scheduled for elective total abdominal hysterectomy found no significant correlation between the intra-operative depth of anesthesia achieved with desflurane and the extent of the endocrine-metabolic stress response (as indicated by the concentrations of plasma cortisol, glucose, and lactate) [1]. In addition, a study conducted in mice suggested that sensorimotor function, anxiety, and depression-related behaviors did not change significantly 7 days after anesthesia with 8.0% desflurane for 6 h [2]. Furthermore, a study involving 50 Caucasian women undergoing laparoscopic surgery for benign ovarian cysts showed that, in low-stress surgery, desflurane exhibited superior control over the intraoperative cortisol and ACTH response compared to sevoflurane [3]. Based on these findings, we propose that the effect we observed in this study is likely attributable to the unique idiosyncratic properties of sevoflurane. We will conduct additional experiments with other commonly used anesthetics in future studies to test this proposal.

      Concerns about the clinical relevance of the experiments

      In anesthesiology practice, perioperative stress observed in patients is more commonly related to the trauma of the surgical intervention, with inadequate levels of antinociception or unconsciousness intraoperatively and/or poor post-operative pain control. The authors seem to be suggesting that the anesthetic itself is causing stress, but there is no evidence of this from human patients cited. We were not aware that this is a documented clinical phenomenon. It is important to know whether sevoflurane effectively produces behavioral stress in the recovery room in patients that could be related to the putative stress response (excess grooming) observed in mice. For example, in surgeries or procedures that required only a brief period of unconsciousness that could be achieved by administering sevoflurane alone (comparable to the 30 min administered to the mice), is there clinical evidence of post-operative stress?

      Thank you for your question. There is currently no direct evidence available. Studies on sevoflurane in humans primarily focus on its use during surgical interventions, making it difficult to find studies that administer sevoflurane alone, as was done in our study with mice. Generally, a short anesthesia time refers to procedures that last less than one hour, while a long anesthesia time refers to procedures lasting several hours or more [4]. A study published in eLife investigated the patterns of reemerging consciousness and cognitive function in 30 healthy adults who underwent GA for three hours [5]. Its findings suggest that the cognitive dysfunction observed immediately and persistently after GA in healthy animals may not necessarily apply to humans, and that postoperative neurocognitive disorders could be influenced by factors other than GA, such as surgery or patient comorbidity. Therefore, further studies are needed to verify whether post-operative stress occurs after short, sevoflurane-only anesthesia.

      Indeed, stress after surgery can result from multiple factors aside from anesthesia, including pain, anxiety, and inflammation. What we want to illustrate in this study is that anesthesia could be one of these factors, and one that has been overlooked in previous studies. In our current study, we did not propose a direct relationship between sevoflurane anesthesia and sevoflurane-mediated stress without incision. We observed stress-related behavioral changes after exposure to sevoflurane GA in a mouse model, indicating that sevoflurane-mediated stress might exist without surgical trauma. Importantly, whether anesthetic administration alone causes post-operative stress is worth studying in different species, especially humans.

      Patients who receive sevoflurane as the primary anesthetic do not wake up more stressed than if they had had one of the other GABAergic anesthetics. If there were signs of stress upon emergence (increased heart rate, blood pressure, thrashing movements) from general anesthesia, the anesthesiologist would treat this right away. The most likely cause of post-operative stress behaviors in humans is probably inadequate anti-nociception during the procedure, which translates into inadequate post-op analgesia and likely delirium. It is the case that children receiving sevoflurane do have a higher likelihood of post-operative delirium. Perhaps the authors' studies address a mechanism for delirium associated with sevoflurane, but this is not considered. Delirium seems likely to be the closest clinical phenomenon to what was studied.

      We agree with this suggestion. We aim to establish a connection between post-operative delirium in humans and the stress-like behaviors observed in mice following sevoflurane anesthesia. Specifically, we observed that the increased grooming behavior exhibited by mice after sevoflurane anesthesia resembles the fuzzy state of consciousness experienced during post-operative delirium [6]. In our discussion, we also emphasized the occurrence of sevoflurane-induced emergence agitation, a common phenomenon reported in clinical studies with an incidence of up to 80%. This state is characterized by hyperactivity, confusion, delirium, and emotional agitation [7,8]. In addition, in the open field test (OFT) and the elevated plus maze (EPM) test, mice exposed to sevoflurane inhalation displayed reduced movement distances (Figure 7G and I). These findings suggest a decline in behavioral activity similar to what is observed in cases of delirium.

      Concerns about the novelty of the findings

      CRH is associated with arousal in numerous studies. In fact, the authors' own work, published in eLife in 2021, showed that stimulating the hypothalamic CRH cells leads to arousal and their inhibition promotes hypersomnia. In both papers, the authors use fos expression in CRH cells during a specific event to implicate the cells, then manipulate them and measure EEG responses. In the previous work, the cells were active during wakefulness; here, they were active in the awake state that follows anesthesia (Figure 1). Thus, the findings in the current work are incremental.

      Thank you for acknowledging our previous work on the changes in the sleep-wake state of mice when PVH CRH neurons are manipulated. In this study, our primary objective was to identify the neuronal mechanisms mediating the anesthetic effects and post-anesthetic stress response of sevoflurane GA. While our previous study showed that activation of PVH CRH neurons leads to arousal, the present study provides evidence that PVH CRH neurons may play a role in the regulation of conscious states in GA. Our current findings show that PVH CRH neurons modulate the state of consciousness as well as the stress response in sevoflurane GA, and that manipulating PVH CRH neurons bidirectionally altered the induction and recovery of sevoflurane GA. This identifies a new brain region involved in sevoflurane GA that goes beyond the arousal-related regions.

      The activation of CRH cells in PVN has already been shown to result in grooming by Jaideep Bains (cited as reference 58). Thus, the involvement of these cells in this behavior is expected. The authors perform elaborate manipulations of CRH cells and numerous analyses of grooming and related behaviors. For example, they compare grooming and paw licking after anesthesia with those after other stressors such as forced swim, spraying mice with water, physical attack, and restraint. However, the relevance of these behaviors to humans and generalization to other types of anesthetics is not clear.

      The hyperactivity of PVH CRH neurons and the associated behavior (e.g., excessive self-grooming) in mice may partially mirror the agitation observed during emergence from sevoflurane GA in patients, and its underlying mechanisms. As mentioned in the Discussion section (page 16, lines 371-374), sevoflurane-induced emergence agitation represents a prevalent manifestation of the post-anesthesia stress response. It is frequently observed, with an incidence of up to 80% in clinical reports, and is characterized by hyperactivity, confusion, delirium, and emotional agitation [7,8]. Our aim in this study is to distinguish the excessive stress responses of patients to sevoflurane GA from stress triggered by other factors. Other stimuli, such as forced swimming, can be considered sources of both physical and emotional stress, which are associated with depression and anxiety in humans.

      Regarding generalization to other types of anesthetics, we propose that the stress-related behavioral effects observed in this study might also occur with certain other anesthetics. For example, one study showed that intravenous ketamine infusion (10 mg/kg, 2 hours) elevated plasma corticosterone and progesterone levels in rats and reduced locomotor activity (sedation) [9]. In another study, intravenous anesthesia with propofol combined with sevoflurane caused greater postoperative stress than propofol alone [10]. However, desflurane, a common inhaled ether anesthetic, was associated with better control of the intraoperative cortisol and ACTH response than sevoflurane in low-stress surgeries [3]. Thus, the behaviors observed after exposure to sevoflurane GA may be related to the post-anesthesia stress response in humans, which might also occur with certain other anesthetics.

      Recommendations for the authors:

      Reviewer 1

      1) The CRH-Cre mouse line should be validated. There are several lines of these mice, and their fidelity varies.

      The CRH-Cre mouse line we used in this study is from The Jackson Laboratory (https://www.jax.org/strain/012704) with the name B6(Cg)-Crhtm1(cre)Zjh/J (Strain #: 012704). These CRH-ires-CRE knock-in mice have Cre recombinase expression directed to CRH-positive neurons by the endogenous promoter/enhancer elements of the corticotropin releasing hormone locus (Crh). We performed standard PCR to validate the mouse line, following the genotyping protocols provided by The Jackson Laboratory. The protocol primers were: 10574 (SEQUENCE 5' → 3': CTT ACA CAT TTC GTC CTA GCC); 10575 (SEQUENCE 5' → 3': CAC GAC CAG GCT GCG GCT AAC); 10576 (SEQUENCE 5' → 3': CAA TGT ATC TTA TCA TGT CTG GAT CC). In mutant (CRH-Cre+/+) mice, only the 468-bp CRH-specific PCR product was amplified; in heterozygote (CRH-Cre+/-) mice, both the 468-bp and the 676-bp PCR products were detected; in wild type (WT) mice, only the 676-bp WT allele-specific PCR product was amplified. An example of PCR results is presented below. The heterozygote and mutant mice were included in our study.

      Author response image 1.
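
      The band pattern described above can be written down as a simple decision rule. The following is a minimal sketch only, not the authors' genotyping pipeline: the function name `call_genotype` and the 20-bp matching tolerance are assumptions made for illustration, while the 468-bp and 676-bp product sizes come from the response text.

```python
# Expected PCR product sizes stated in the response (in base pairs).
MUTANT_BAND_BP = 468   # CRH-Cre allele-specific product
WT_BAND_BP = 676       # wild-type allele-specific product


def call_genotype(band_sizes_bp, tolerance_bp=20):
    """Map observed gel band sizes to a genotype label.

    tolerance_bp is a hypothetical allowance for sizing error on a gel;
    it is not specified in the source.
    """
    has_mutant = any(abs(b - MUTANT_BAND_BP) <= tolerance_bp for b in band_sizes_bp)
    has_wt = any(abs(b - WT_BAND_BP) <= tolerance_bp for b in band_sizes_bp)
    if has_mutant and has_wt:
        return "CRH-Cre+/-"   # heterozygote: both products present
    if has_mutant:
        return "CRH-Cre+/+"   # mutant: only the 468-bp product
    if has_wt:
        return "WT"           # wild type: only the 676-bp product
    return "no call"          # neither expected band detected


if __name__ == "__main__":
    for bands in ([468], [470, 680], [676], []):
        print(bands, "->", call_genotype(bands))
```

      Under this rule, animals called CRH-Cre+/- or CRH-Cre+/+ correspond to the heterozygote and mutant mice that the response says were included in the study.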

      2) It would be very helpful to validate the CRH antibody. Using any antiserum at 1:800 suggests that it may not be potent or highly specific.

      As requested, we used the same CRH antibody at a dilution of 1:800, following the procedure described in the Methods section. The results are displayed below.

      Author response image 2.

      3) In Figure 1C, the control sections are out of focus and the cells are blurry, reducing confidence in the analyses (locus ceruleus cells appear confluent in the control?)

      We apologize for the confusing figure. We have revised the control sections of Figure 1C:

      Author response image 3.

      Reviewer 2

      1) In the Abstract, to say that "General anesthetics benefit patients undergoing surgeries without consciousness. ..." is a gross understatement of the essential role that general anesthesia plays today to make surgery not only tolerable but humane. This opening sentence should be rewritten. General anesthesia is a fundamental process required to undertake safely and humanely a high fraction of surgeries and invasive diagnostic procedures.

      As requested, we rewrote this opening sentence as follows:

      GA is a fundamental process required to undertake surgeries and invasive diagnostic procedures safely and humanely. However, the undesired stress response associated with GA can lead to delayed recovery and even increased morbidity in clinical settings.

      2) In the Abstract, when discussing the response of the PVN-CRH neurons to chemogenetic inhibition, say exactly what the "opposite effect" is.

      Thanks for your insights. We have rewritten our abstract as follows:

      Chemogenetic activation of these neurons delayed the induction and accelerated emergence from sevoflurane GA, whereas chemogenetic inhibition of PVH CRH neurons promoted induction and prolonged emergence from sevoflurane GA.

      3) In all spectrograms the dynamic range is compressed between 0.5 and 1. Please make use of the full range, as some details might be missed because of this compression.

      We apologize for the incorrect unit in the spectrograms. We have provided corrected spectrograms that use the full range; please see below:

      Author response image 4.

      Author response image 5.

      4) The spectrogram in Figure 2D has several frequency chirps that do not seem physiological.

      Thank you for your comments. The frequency chirps in the spectrogram during the During and Post 1 phases were caused by recording noise. To avoid confusion, we have deleted the spectrogram in Figure 2D.

      5) The 3D plots in Figures 3G and H are not helpful.

      Thanks for the comment. We'd like to keep the 3D plots as they aid visual comparison of three different features of grooming, which complements other panels in Figure 3.

      6) The spectrograms in Figures 5A and B are too small, while the spectra in Figures 5C and D are too large. Please invert this relationship, as it is interesting and important to see the details in the spectrograms. The same happens in Figure 6.

      We adjusted the layout of Figures 5 and 6 as requested; please see below:

      Author response image 6.

      Author response image 7.

      7) In Figure 6H, the authors compute the burst-suppression ratio during a period that seemingly has no bursts or suppressions (Figure 6B).

      The burst-suppression ratio was computed from data with the minimum duration of burst and suppression periods set at 0.5 s. We apologize for the confusion. We have added a new supplementary figure (Figure 6-figure supplement 8) displaying a 40-second EEG trace with a burst-suppression period to better visualize the burst suppression.

      Author response image 8.
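
      To make the role of the 0.5 s minimum duration concrete, the following is a minimal sketch of one common way to compute a burst-suppression ratio. It is not the authors' analysis code: the amplitude threshold (`threshold_uv`), the function name, and the rule of absorbing too-short segments into the preceding segment are all assumptions made for illustration. Only the 0.5 s minimum duration is taken from the response above.

```python
from itertools import groupby


def burst_suppression_ratio(eeg_uv, fs_hz, threshold_uv=15.0, min_duration_s=0.5):
    """Return the percentage of samples classified as suppression.

    threshold_uv is a hypothetical amplitude cut-off; only min_duration_s
    (0.5 s) is taken from the response text.
    """
    min_len = int(round(min_duration_s * fs_hz))
    # Step 1: label each sample as suppressed (True) or not (False).
    flags = [abs(v) < threshold_uv for v in eeg_uv]
    # Step 2: collapse the labels into runs of [state, length].
    runs = [[state, len(list(group))] for state, group in groupby(flags)]
    # Step 3: absorb runs shorter than the minimum duration into the
    # preceding run, and fuse neighbours that end up with the same state.
    # A too-short run at the very start has no predecessor and is kept.
    merged = []
    for state, length in runs:
        if merged and (length < min_len or merged[-1][0] == state):
            merged[-1][1] += length
        else:
            merged.append([state, length])
    suppressed = sum(length for state, length in merged if state)
    return 100.0 * suppressed / len(flags) if flags else 0.0


if __name__ == "__main__":
    fs = 10  # Hz, toy sampling rate for illustration only
    # 2 s burst, 0.3 s dip (too short to count), 1.7 s burst, 1 s suppression
    signal = [50.0] * 20 + [0.0] * 3 + [50.0] * 17 + [0.0] * 10
    print(burst_suppression_ratio(signal, fs))
```

      In this toy signal the 0.3 s low-amplitude dip is shorter than the minimum duration and is therefore not counted as suppression, which illustrates how the minimum-duration setting affects whether brief events contribute to the ratio.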

      8) The data analyses are done in terms of p-values. They should be reported as confidence intervals so that any effect the authors wish to establish is measured along with its uncertainty.

      Thank you for your valuable suggestions regarding our manuscript. We understand your concern, but we would like to justify our choice of reporting p-values and explain why we believe they are appropriate for our study. First, the use of p-values for hypothesis testing and significance assessment is common practice in our field, and many previous studies in our area of research report results in terms of p-values. For example, Xu et al. [11] reported in 2020 that sevoflurane inhibits MPB neurons through postsynaptic GABAA-Rs and background potassium channels, and Ao et al. [12] demonstrated that activation of the TH:LC-PVT projections facilitates the transition from isoflurane anesthesia to an arousal state; both studies reported their analyses using p-values. By adhering to this convention, we ensure that our findings are consistent with the existing body of literature, which makes it easier for readers to compare and integrate our results with previous work. Second, while confidence intervals can provide a measure of effect size and uncertainty, p-values offer a concise way to communicate statistical significance. They help readers quickly assess whether an effect is statistically significant, which is often the primary concern when interpreting research findings. We hope that these reasons address your concern while maintaining the integrity and consistency of our study. If you believe there are specific instances where reporting confidence intervals would be more informative, please highlight them, and we will consider your suggestion on a case-by-case basis.
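
      As an illustration of the kind of reporting requested in point 8, the following is a minimal sketch of a percentile bootstrap confidence interval for a difference in group means. It is one possible approach, not a method used in the manuscript; the function name, the number of resamples, and the example values are hypothetical and are not data from the study.

```python
import random
import statistics


def bootstrap_mean_diff_ci(group_a, group_b, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(group_a) - mean(group_b).

    Returns (observed difference, lower bound, upper bound).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.mean(resample_a) - statistics.mean(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return observed, lower, upper


if __name__ == "__main__":
    # Hypothetical induction times in seconds; illustrative numbers only.
    treated = [120, 135, 150, 142, 128, 160, 138, 145]
    control = [95, 110, 102, 88, 105, 99, 115, 92]
    diff, lo, hi = bootstrap_mean_diff_ci(treated, control)
    print(f"difference = {diff:.1f} s, 95% CI [{lo:.1f}, {hi:.1f}] s")
```

      Reporting the interval alongside the p-value conveys both the size of an effect and the uncertainty around it, which is the information the reviewer asked for.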


      1. Baldini, G., Bagry, H. & Carli, F. Depth of anesthesia with desflurane does not influence the endocrine-metabolic response to pelvic surgery. Acta Anaesthesiol Scand 52, 99-105, doi:10.1111/j.1399-6576.2007.01470.x (2008).
      2. Niikura, R. et al. Exploratory analyses of postanesthetic effects of desflurane using behavioral test battery of mice. Behav Pharmacol 31, 597-609, doi:10.1097/fbp.0000000000000567 (2020).
      3. Marana, E. et al. Desflurane versus sevoflurane: a comparison on stress response. Minerva Anestesiol 79, 7-14 (2013).
      4. Vutskits, L. & Xie, Z. Lasting impact of general anaesthesia on the brain: mechanisms and relevance. Nat Rev Neurosci 17, 705-717, doi:10.1038/nrn.2016.128 (2016).
      5. Mashour, G. A. et al. Recovery of consciousness and cognition after general anesthesia in humans. eLife 10, doi:10.7554/eLife.59525 (2021).
      6. Mattison, M. L. P. Delirium. Ann Intern Med 173, ITC49-ITC64, doi:10.7326/aitc202010060 (2020).
      7. Dahmani, S. et al. Pharmacological prevention of sevoflurane- and desflurane-related emergence agitation in children: a meta-analysis of published studies. Br J Anaesth 104, 216-223, doi:10.1093/bja/aep376 (2010).
      8. Lim, B. G. et al. Comparison of the incidence of emergence agitation and emergence times between desflurane and sevoflurane anesthesia in children: A systematic review and meta-analysis. Medicine (Baltimore) 95, e4927, doi:10.1097/MD.0000000000004927 (2016).
      9. Radford, K. D. et al. Association between intravenous ketamine-induced stress hormone levels and long-term fear memory renewal in Sprague-Dawley rats. Behav Brain Res 378, 112259, doi:10.1016/j.bbr.2019.112259 (2020).
      10. Yang, L., Chen, Z. & Xiang, D. Effects of intravenous anesthesia with sevoflurane combined with propofol on intraoperative hemodynamics, postoperative stress disorder and cognitive function in elderly patients undergoing laparoscopic surgery. Pak J Med Sci 38, 1938-1944, doi:10.12669/pjms.38.7.5763 (2022).
      11. Xu, W. et al. Sevoflurane depresses neurons in the medial parabrachial nucleus by potentiating postsynaptic GABA(A) receptors and background potassium channels. Neuropharmacology 181, 108249, doi:10.1016/j.neuropharm.2020.108249 (2020).
      12. Ao, Y. et al. Locus Coeruleus to Paraventricular Thalamus Projections Facilitate Emergence From Isoflurane Anesthesia in Mice. Front Pharmacol 12, 643172, doi:10.3389/fphar.2021.643172 (2021).
    1. eLife assessment

      This useful study provides incomplete evidence for the functional roles of the human DCP1 paralogs in regulating RNA decay by DCP2. Using a combination of cellular-based assays and in vitro assays, the authors conclude that DCP1a/b plays a role in regulating DCP2 activity. This study makes a number of interesting and potentially relevant observations; however, a number of outstanding questions remain to be addressed. These results will be of interest to the RNA community.

    2. Reviewer #1 (Public Review):

      Summary & Assessment:

      The catalytic core of the eukaryotic decapping complex consists of the decapping enzyme DCP2 and its key activator DCP1. In humans, there are two paralogs of DCP1, DCP1a and DCP1b, that are known to interact with DCP2 and recruit additional cofactors or coactivators to the decapping complex; however, the mechanisms by which DCP1 activates decapping and the specific roles of DCP1a versus DCP1b remain poorly defined. In this manuscript, the authors used CRISPR/Cas9-generated DCP1a/b knockout cells to begin to unravel some of the differential roles of human DCP1a and DCP1b in mRNA decapping, gene regulation, and cellular metabolism. While this manuscript presents some new and interesting observations on human DCP1 (e.g. human DCP1a/b KO cells are viable and can be used to investigate DCP1 function; only the EVH1 domain, and not its disordered C-terminal region which recruits many decapping cofactors, is apparently required for efficient decapping in cells; DCP1a and b target different subsets of mRNAs for decay and may regulate different aspects of metabolism), there are several major issues that undercut some of the main conclusions of the paper, and some key claims that are incompletely or inconsistently supported by the presented data.

      Strengths & well-supported claims:

      • Through in vivo tethering assays in CRISPR/Cas9-generated DCP1a/b knockout cells, the authors show that DCP1 depletion leads to significant defects in decapping and the accumulation of capped, deadenylated mRNA decay intermediates.

      • DCP1 truncation experiments reveal that only the EVH1 domain of DCP1 is necessary to rescue decapping defects in DCP1a/b KO cells.

      • RNA and protein immunoprecipitation experiments suggest that DCP1 acts as a scaffold to help recruit multiple decapping cofactors to the decapping complex (e.g. EDC3, DDX6, PATL1, PNRC1, and PNRC2), but that none of these cofactors are essential for DCP2-mediated decapping in cells.

      • The authors investigated the differential roles of DCP1a and DCP1b in gene regulation through transcriptomic and metabolomic analysis and found that these DCP1 paralogs target different mRNA transcripts for decapping and have different roles in cellular metabolism and their apparent links to human cancers. (Although I will note that I can't comment on the experimental details and/or rigor of the transcriptomic and metabolomic analyses, as these are outside my expertise.)

      Weaknesses & incompletely supported claims:

      1) A central mechanistic claim of the paper is that "DCP1a can regulate DCP2's cellular decapping activity by enhancing DCP2's affinity to RNA, in addition to bridging the interactions of DCP2 with other decapping factors. This represents a pivotal molecular mechanism by which DCP1a exerts its regulatory control over the mRNA decapping process." Similar versions of this claim are repeated in the abstract and discussion sections. However, this appears to be entirely at odds with the observation from in vitro decapping assays with immunoprecipitated DCP2 that showed DCP1 knockout does not significantly affect the enzymatic activity of DCP2 (Figures 2B-D; I note that there may be a very small change in DCP2 activity shown in panel C, but this may be due to slightly different amounts of immunoprecipitated DCP2 used in the assay, as suggested by panel D). If DCP1 pivotally regulates decapping activity by enhancing RNA binding to DCP2, why is no difference in decapping activity observed in the absence of DCP1? Furthermore, the authors show only weak changes in relative RNA levels immunoprecipitated by DCP2 with versus without DCP1 (~2-3 fold change; consistent with the Valkov 2016 NSMB paper, which shows what looks like only modest changes in RNA binding affinity for yeast Dcp2 +/- Dcp1). Is the argument that only a 2-3 fold change in RNA binding affinity is responsible for the sizable decapping defects and significant accumulation of deadenylated intermediates observed in cells upon Dcp1 depletion? (and if so, why is this the case for in-cell data, but not the immunoprecipitated in vitro data?)

      The authors acknowledge this apparent discrepancy between the in vitro DCP2 decapping assays and in-cell decapping data, writing: "this observation could be attributed to the inherent constraints of in vitro assays, which often fall short of faithfully replicating the complexity of the cellular environment where multiple factors and cofactors are at play. To determine the underlying cause, we postulated that the observed cellular decapping defect in DCP1a/b knockout cells might be attributed to DCP1 functioning as a scaffold." This is fair. They next show that DCP1 acts as a scaffold to recruit multiple factors to DCP2 in cells (EDC3, DDX6, PatL1, and PNRC1 and 2). However, while DCP1 is shown to recruit multiple cofactors to DCP2 (consistent with other studies in the decapping field, and primarily through motifs in the Dcp1 C-terminal tail), the authors ultimately show that *none* of these cofactors are actually essential for DCP2-mediated decapping in cells (Figures 3A-F). More specifically, the authors showed that the EVH1 domain was sufficient to rescue decapping defects in DCP1a/b knockout cells, that PNRC1 and PNRC2 were the only cofactors that interact with the EVH1 domain, and finally that shRNA-mediated PNRC1 or PNRC2 knockdown has no effect on in-cell decapping (Figures 3E and F). Therefore, based on the presented data, while DCP1 certainly does act as a scaffold, it doesn't seem to be the case that the major cellular decapping defect observed in DCP1a/b knockout is due to DCP1's ability to recruit specific cofactors to DCP2.

      So as far as I can tell, the discrepancy between the in vitro (DCP1 not required) and in-cell (DCP1 required) decapping data, remains entirely unresolved. Therefore, I don't think that the conclusions that DCP1 regulates decapping by (a) changing RNA binding affinity (authors show this doesn't matter in vitro, and that the change in RNA binding affinity is very small) or (b) by bridging interactions of cofactors with DCP2 (authors show all tested cofactors are dispensable for robust in-cell decapping activity), are supported by the evidence presented in the paper (or convincingly supported by previous structural and functional studies of the decapping complex).

      2) Related to the RNA binding claims mentioned above, are the differences shown in Figure 3H statistically significant? Why are there no error bars shown for the MBP control? (I understand this was normalized to 1, but presumably, there were 3 biological replicates here that have some spread of values?). The individual data points for each replicate should be displayed for each bar so that readers can better assess the spread of data and the significance of the observed differences. I've listed these points as major because the key mechanistic claim that DCP1 enhances RNA binding to DCP2 hinges in large part on this data.

      3) Also related to point (1) above, the kinetic analysis presented in Figure 2C shows that the large majority of transcript is already decapped at the first 5-minute timepoint; it may be that DCP2-mediated decapping activity is actually different in vitro with or without DCP1, but that this is being missed because the reaction is basically done in less than 5 minutes under the conditions being assayed (i.e. these are basically endpoint assays under these conditions). It may be that if kinetics were done under conditions to slow down the reaction somewhat (e.g. lower Dcp2 concentration, lower temperatures), so that more of the kinetic behavior is captured, the apparent discrepancy between in vitro and in-cell data would be much less. Indeed, previous studies have shown that in yeast, Dcp1 strongly activates the catalytic step (kcat) of decapping by ~10-fold, and reduces the KM by only ~2 fold (Floor et al, NSMB 2010). It might be beneficial to use purified proteins here (only a Western blot is used in Figure 2D to show the presence of DCP2 and/or DCP1, but do these complexes have other, and different, components immunoprecipitated along with them?), if possible, to better control reaction conditions.

      This contradiction between the in vitro and in-cell decapping data undercuts one of the main mechanistic takeaways from the first half of the paper. This needs to be addressed/resolved with further experiments to better define the role of DCP1-mediated activation, or the mechanistic conclusions significantly changed or removed.

      4) The second half of the paper compares the transcriptomic and metabolic profiles of DCP1a versus DCP1b knockouts to reveal that these target a different subset of mRNAs for degradation and have different levels of cellular metabolites. This is a great application of the DCP1a/b KO cells developed in this paper and provides new information about DCP1a vs b function in metazoans, which to my knowledge has not really been explored at all. However, the analysis of DCP1 function/expression levels in human cancer seems superficial and inconclusive: for example, the authors conclude that "...these findings indicate that DCP1a and DCP1b likely have distinct and non-redundant roles in the development and progression of cancer", but what is the evidence for this? I see that DCP1a and b levels vary in different cancer cell types, but is there any evidence that these changes are actually linked to cancer development, progression, or tumorigenesis? If not, these broader conclusions should be removed.

      5) The authors used CRISPR-Cas9 to introduce frameshift mutations that result in premature termination codons in DCP1a/b knockout cells (verified by Sanger sequencing). They then use Western blotting with DCP1a or DCP1b antibodies to confirm the absence of DCP1 in the knockout cell lines. However, the DCP1a antibody used in this study (Sigma D5444) is targeted to the C-terminal end of DCP1a. Can the authors conclusively rule out that the CRISPR/Cas-generated mutations result in the production of a truncated DCP1a that simply cannot be detected by the C-terminally targeted antibody? It is likely that the introduced premature termination codon in the DCP1a gene results in nonsense-mediated decay of the resulting transcript, and this outcome is supported by the knockout results showing large defects in cellular decapping which can be rescued by the addition of the EVH1 domain. Nevertheless, it would be better to carefully validate the success of the DCP1a knockout and conclusively show that no truncated DCP1a is produced by using N-terminally targeted DCP1a antibodies (as was the case for DCP1b).
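
      The endpoint-assay concern raised in weakness 3 above can be illustrated with a minimal sketch. It assumes simple single-exponential (pseudo-first-order) decapping kinetics and uses hypothetical rate constants that are not taken from the manuscript; the names `fraction_decapped`, `K_WITH_DCP1`, and `K_WITHOUT_DCP1` are illustrative only.

```python
import math


def fraction_decapped(k_per_min: float, t_min: float) -> float:
    """Fraction of substrate decapped at time t for a first-order reaction."""
    return 1.0 - math.exp(-k_per_min * t_min)


# Hypothetical observed rate constants (per minute), chosen only to
# represent a 10-fold difference; they are NOT measured values.
K_WITH_DCP1 = 10.0
K_WITHOUT_DCP1 = 1.0

for t_min in (0.1, 0.5, 5.0):
    with_dcp1 = fraction_decapped(K_WITH_DCP1, t_min)
    without_dcp1 = fraction_decapped(K_WITHOUT_DCP1, t_min)
    print(f"t = {t_min:>4} min: +DCP1 {with_dcp1:.3f}, -DCP1 {without_dcp1:.3f}")
```

      Under these assumed values, the two conditions differ by less than one percentage point at 5 minutes but by more than 50 percentage points at 0.1 minutes, which is why earlier timepoints or slower reaction conditions might reveal a difference that a 5-minute first timepoint hides.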

      Some additional minor comments:

      • More information would be helpful on the choice of DCP1 truncation boundaries; why was 1-254 chosen as one of the truncations?

      • Figure S2D is a pretty important experiment because it suggests that the observed deadenylated intermediates are in fact still capped; can a positive control be added to these experiments to show that removal of cap results in rapid terminator-mediated degradation?

    3. Reviewer #2 (Public Review):


      Chen et al. investigate the role of DCP1 paralogs in regulating RNA decay in human tissue culture cells. They assess the impact of the absence of DCP1a and/or DCP1b on the interaction of DCP2 with mRNA and other members of the decapping complex. In vitro RNA decay assays were performed to demonstrate that DCP1a/b plays a minor role in DCP2-mediated decapping and decay. The impacts of DCP1a and/or DCP1b knockout on the transcriptome and metabolome were determined.


      Analysis of RNA abundance and metabolite differences in human tissue culture cells lacking DCP1a and/or DCP1b was performed.

      The protein-protein interactions between DCP2 and other members of the decapping machinery mediated by DCP1a and/or DCP1b were assessed.

      The functional role of DCP1a and/or DCP1b in mediating mRNA decapping/decay in human tissue culture cell extracts was determined.

      Human tissue culture cells lacking DCP1a and/or DCP1b appear to have altered metabolomes; however, the significance and meaning of these differences are not clear.


      The direct targets of DCP1a and/or DCP1b were not determined as the analysis was restricted to RNA-seq to assess RNA abundance, which can be a result of direct or indirect regulation by DCP1a/b.

      P-bodies appear to be larger in human cells lacking DCP1a and DCP1b but a lack of image quantification prevents this conclusion from being drawn.

      The lack of detail in the methodology and figure legends limits reader understanding.

    1. eLife assessment

      In their valuable study, Chen et al. aim to define the neuronal role of HMMR, a microtubule-associated protein typically associated with cell division. Their findings suggest that HMMR is necessary for proper neuronal morphology and the generation of polymerizing microtubules within neurites, potentially by promoting the function of TPX2. While the study is recognized as a first step in deciphering the influence of HMMR on microtubule organization in neurons, the reviewers note that the current work is incomplete, with significant gaps, and that it would benefit from further exploration of the mechanism of microtubule stability by HMMR, the link between HMMR-mediated microtubule generation and morphogenesis, and the physiological implications of disrupting HMMR during neuronal morphogenesis.

    2. Reviewer #1 (Public Review):

      The microtubule cytoskeleton is essential for basic cell functions, enabling intracellular transport, and establishment of cell polarity and motility. Microtubule-associated proteins (MAPs) contribute to the regulation of microtubule dynamics and stability - mechanisms that are specifically important for the development and physiological function of neurons. Here, the authors aimed to elucidate the neuronal function of the MAP Hmmr, which they had previously identified in a quantitative study of the proteome associated with neuronal microtubules.

      The authors conduct well-controlled experiments to demonstrate the localization of endogenous as well as exogenous Hmmr on microtubules within the soma as well as all neurites of hippocampal neurons. Functional analysis using gain- and loss-of-function approaches demonstrates that Hmmr levels are crucial for neuronal morphogenesis, as the length of both dendrites and axons decreases upon loss of Hmmr and increases upon Hmmr overexpression. In addition to length alterations, the branching pattern of neurites changes with Hmmr levels. To uncover the mechanism of how Hmmr influences neuronal morphology, the authors follow the lead that Hmmr overexpression induces looped microtubules in the soma, indicative of an increase in microtubule stability. Microtubule acetylation indeed decreases and increases with Hmmr LOF and GOF, respectively. Together with a rescue of nocodazole-induced microtubule destabilization by Hmmr GOF, these results argue that Hmmr regulates microtubule stability. Highlighted by the altered movement of a plus-end-associated protein, Hmmr also has an effect on the dynamic nature of microtubules. The authors present evidence suggesting that the nucleation frequency of neuronal microtubules depends on Hmmr's ability to recruit the microtubule nucleator Tpx2. Together, these data add novel insight into MAP-mediated regulation of microtubules as a prerequisite for neuronal morphogenesis. While the data shown support the authors' conclusions, the study also has several weaknesses:

      - The study appears incomplete as the initial proteomics analysis which is referenced as an entry into the study is not presented. This surely is the authors' choice, however, without presenting this data set, it would make more sense if the authors first showed the localization of Hmmr on neuronal microtubules and then started with the functional analysis.

      - Neurite branching is quantified, but the methods used are not consistent (normalized branch density vs. Sholl analysis) and there is no distinction between alterations of branching in dendrites vs. axons. This information should be added as it could prove informative with respect to the physiological function of Hmmr in neurite branching.

      - The authors show that altered Hmmr levels affect neurite branching and identify an effect on microtubule stability and dynamics as a molecular mechanism. However, how branching correlates with or is regulated by Hmmr-mediated microtubule dynamics is neither addressed experimentally nor discussed by the authors. The physiological significance of altered neuronal morphogenesis also lacks discussion.

      - In multiple places, the manuscript lacks a rationale for an experimental approach, choice of cell type, time points, regions of interest, etc. Also, a meaningful description of the methods and of how the data were analyzed is missing, making the paper hard to read for someone not directly from the field.

    3. Reviewer #2 (Public Review):

      The mechanism of microtubule formation, stabilization, and organization in neurites is important for neuronal function. In this manuscript, the authors examine the phenotype of neurons following alteration in the level of the protein HMMR, a microtubule-associated protein with established roles in mitosis. Neurite morphology is measured as well as microtubule stability and dynamic parameters using standard assays. A binding partner of HMMR, TPX2, is localized. The results support a role for HMMR in neurons.

      The work presented in this manuscript seeks to determine if a MAP called HMMR contributes to microtubule dynamics in neurons. Several steps, including validation of the RNAi, additional statistical analysis, use of cells at the same age in culture, and better documentation in figures, would increase the impact of the work.

      In many places, the data can be improved, which might make the story more convincing. As presented, the results show that HMMR is distributed as puncta on neurons, with data coming from a single HMMR antibody and some background staining that was not discussed. In the discussion the authors state that HMMR impacts microtubule stability, which was evaluated by the presence of post-translational modification and resistance to nocodazole; the data are suggestive but not entirely convincing. The discussion also states that HMMR increases the "amount" of growing microtubules, which was measured as the frequency of comet appearance. The authors did not comment on how the number of growing microtubules results in the observed morphological changes.

    1. Reviewer #3 (Public Review):


      Machhua et al. in their work focused on unravelling the molecular mechanism of daptomycin binding and interaction with bacterial cell membranes. Daptomycin (Dap) is an acidic, cyclic lipopeptide composed of 13 amino acids, known for preferential binding to anionic lipids, particularly phosphatidylglycerol (PG), which are prevalent components in the membranes of Gram-positive bacteria. The process of binding and antimicrobial efficacy of Dap is significantly influenced by the ionic composition of the surrounding environment, especially the presence of Ca2+ ions. The authors underscore the presence of significant knowledge gaps in our understanding of daptomycin's mode of action. Several critical questions remain unanswered, including the basis for selective recognition and accumulation in membranes of Gram-positive strains, the specific role of Ca2+ ions in this process, and the mechanisms by which daptomycin binds to and inserts into the cell membrane.

      Dap is intrinsically fluorescent due to its kynurenine residue (Kyn-13) and this property allows direct imaging of Dap binding to model cell membranes without the need for additional labeling. Taking advantage of this Dap autofluorescence, authors monitored the emission intensity of micelles, composed of varying DMPG content upon their exposure to Dap and compared it with the kinetics of fluorescence observed for zwitterionic DMPC and other negatively charged lipids such as cardiolipin (CA), POPA and POPS. The authors noted that the linear relationship between DMPG content and Dap fluorescence is strongly lipid-specific, as it was not observed for other anionic lipids. The manuscript sheds light on the specificity of Dap's interaction with CA and DMPG lipids. Through Ca2+ sequestration with EGTA, the authors demonstrated that the binding of Dap with CA is reversible, while its interaction with DMPG results in the irreversible insertion of Dap into the lipid membrane structure, caused by the significant conformational change of this lipopeptide. The formation of a stable DMPG-Dap complex was also verified in bacterial cells isolated from Gram-positive bacteria B. subtilis, where Dap exhibited a permanent binding to PG lipids.

      Altogether, the authors endeavored to illuminate novel insights into the molecular basis of Dap binding, interaction, and the mechanism of insertion into bacterial cell membranes. Such understanding holds promise for the development of innovative strategies in combating drug resistance and the emergence of the so-called superbugs.


      - The manuscript by Machhua et al. provides a comprehensive analysis of the Dap mechanism of binding and interaction with the membrane. It discusses various aspects of this only apparently trivial interaction, such as the importance of PG presence in the membrane, the impact of Ca2+ ions, and the different mechanisms of Dap binding to other negatively charged lipids.

      - The authors focused not only on model membranes (micelles) but also extended their research to bacterial cell membranes obtained from B. subtilis.

      - The research is not only a report of the experimental findings but tries to give potential hypotheses explaining the molecular mechanisms behind the observed results.


      - The authors overestimate their findings, stating that they propose a novel mechanism of Dap interaction with bacterial cell membranes. In fact, they rather extend the already reported hypotheses.

      - The literature study was not done as thoroughly as it should be. Many publications discussing the importance and mechanism of action of Ca2+ ions or conformational changes of daptomycin were not cited.

    2. Reviewer #1 (Public Review):


      In this manuscript, the molecular mechanism of interaction of daptomycin (DAP) with bacterial membrane phospholipids has been explored by fluorescence and CD spectroscopy, mass spectrometry, and RP-HPLC. The mechanism of binding was found to be a two-step process: a fast reversible step of binding to the surface and a slow irreversible step of membrane insertion. Fluorescence-based titrations were performed and analysed to infer that daptomycin simultaneously bound two molecules of PG with nanomolar affinity in the presence of calcium. Conformational change but not membrane insertion was observed for DAP in the presence of cardiolipin and calcium.


      The strength of the study is the skillful execution of biophysical experiments, especially stopped-flow kinetics that capture the first surface binding event, and the careful delineation of the stoichiometry.


      The weakness of the study is that it does not add substantially to the previously known information and fails to provide additional molecular details. The current study provides incremental information on DAP-PG-calcium association but fails to capture the complex in mass spectrometry. The ITC and NMR studies with G3P are inconclusive. There are no structural models presented. Another aspect missing from the study is the reconciliation between PG in the monomer, micellar, and membrane forms.

    3. eLife assessment

      This valuable study describes the molecular mechanism of daptomycin insertion into bacterial membranes. The authors provide solid in vitro evidence for the early events of daptomycin interaction with phospholipid headgroups and stronger, specific interaction with phosphatidylglycerol. This work will be of interest to bacterial membrane biologists and biochemists working in the antimicrobial resistance field.

    4. Reviewer #2 (Public Review):

      The authors provide evidence for the early events of the lipopeptide daptomycin inserting into bacterial membranes. The authors utilize several biochemical and biophysical methods to characterize the nature of daptomycin interactions with a diverse set of phospholipids. The authors found that daptomycin, when complexed with calcium ions, can transiently interact with the headgroups of anionic phospholipids. In particular, the authors found that daptomycin rapidly interacts with the headgroup of cardiolipin and that this interaction is reversible and dependent on calcium. The authors provide evidence supporting previously published data that daptomycin interacts with phosphatidylglycerol (PG) with high affinity in a 1:2 ratio. The authors showed that this interaction includes both a calcium-dependent headgroup interaction (denoted the pre-insertion complex) and a distinct, irreversible interaction that is likely occurring between the hydrophobic tail of daptomycin with the tails of the PG molecules (denoted the quaternary complex of daptomycin, calcium, and 2 PG). The authors also isolated a daptomycin-containing complex from Bacillus subtilis cells following exposure to daptomycin and calcium. PG was identified from the isolated complex, albeit with a different acyl chain length from that used in vitro. Taken together, these data deepen our understanding of the stages of daptomycin interaction and intercalation in a membrane and can contribute to translational research on the development of structural analogs that could augment the efficacy of daptomycin treatment.

      The authors have provided sufficient evidence to support a very specific interaction between daptomycin and PG, but their conclusions drawn from the data are exceedingly broad. In particular, the role of lipid II and lipid II precursors in the insertion and flipping events of daptomycin in the membrane are only briefly addressed despite the recently described pivotal role assigned to lipid II in the formation of a membrane-active daptomycin complex (Grein et al. Nature Communications 2020). While the authors put forth an intriguing and probable hypothesis that there are potentially multiple complexes and conformations of daptomycin as it incorporates within the membrane, the strength of the study's results and conclusions lies in its examination of the early headgroup interactions and distinctive PG interaction rather than the later events of daptomycin insertion in the membrane. The in vivo data presented supports the authors' model, but the conclusions do not address critical differences between the two very different systems i.e., in the behavior of micelles versus cell bilayer membranes.

    1. eLife assessment

      This manuscript provides a useful presentation of a new method for assessing the adhesion strength of axons with the use of a laser-induced shock wave. However, the strength of the evidence is incomplete as critical controls for calibration and time course are lacking.

    2. Reviewer #1 (Public Review):


      Axon growth is of course essential to the formation of neural connections. Adhesion is generally needed to anchor and rectify such motion, but whether the tenacity or forces of adhesion must be optimal for maximal axon extension is unknown. Measurements and contributing factors are generally lacking and are pursued here with a laser-induced shock wave approach near the axon growth cone. The authors claim to make measurements of the pressure required to detach axons from low to high matrix density. The results seem to support the authors' conclusions, and the work - with further support - is likely to impact the field of cell adhesion. In particular, the methods could be of some utility to the adhesion field and to those interested in aspects of axon growth.


      A potential ability to control the pressure simply via proximity of the laser spot is convenient and perhaps reasonable. The 0 to 1 scale for matrix density is a good and appropriate measure for comparing adhesion and other results. The attention to detachment speed, time, F-actin, and adhesion protein mutant provides key supporting evidence. Lastly, the final figure of traction force microscopy with matrix varied on a gel is reasonable and more physiological because neural tissue is soft (cite PMID: 16923388); an optimum in Fig.6 also perhaps aligns with axon length results in Fig.5.


      The results seem incomplete and less than convincing. This is because the force calibration curve seems to be from a >10 yr old paper without any more recent checks or validating measurements. Secondly, the claimed effect of pressure on the detachment of the growth cone does not consider other effects such as cavitation or temperature, and certainly needs validation with additional methods that overcome such uncertainties. The authors need to check whether the laser perturbs the matrix, particularly local density. A relation between traction stresses of ~20-50 pN/um2 in Fig.6 and the adhesion pressure of 3-5 kPa of Fig.3 needs to be carefully explained; the former units equate to 0.02-0.05 kPa, and would perhaps suggest cells cannot detach themselves and move forward.
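
      The unit comparison in the preceding paragraph can be checked with a short sketch. It uses only the ranges quoted above (20-50 pN/um2 and 3-5 kPa) and the SI definitions 1 pN = 1e-12 N and 1 um2 = 1e-12 m2; the function name `pn_per_um2_to_kpa` is illustrative and does not come from the manuscript.

```python
import math

PN_TO_N = 1e-12    # 1 piconewton in newtons
UM2_TO_M2 = 1e-12  # 1 square micrometre in square metres


def pn_per_um2_to_kpa(stress_pn_per_um2: float) -> float:
    """Convert a stress in pN/um^2 to kPa (1 pN/um^2 equals 1 Pa)."""
    pascals = stress_pn_per_um2 * PN_TO_N / UM2_TO_M2
    return pascals / 1000.0


# Ranges quoted in the review text above.
traction_kpa = [pn_per_um2_to_kpa(v) for v in (20.0, 50.0)]
adhesion_kpa = [3.0, 5.0]

# Smallest and largest ratio of adhesion pressure to traction stress.
fold_min = adhesion_kpa[0] / traction_kpa[1]
fold_max = adhesion_kpa[1] / traction_kpa[0]
print(f"traction: {traction_kpa[0]:.2f}-{traction_kpa[1]:.2f} kPa")
print(f"adhesion exceeds traction by about {fold_min:.0f}- to {fold_max:.0f}-fold")
```

      On these numbers the reported adhesion pressure is roughly 60- to 250-fold larger than the reported traction stress, which is the gap the reviewer asks the authors to explain.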

      The authors need to measure axon length on gels (Fig.6) as more physiological because neural tissue is soft. The studies are also limited to a rudimentary in vitro model without clear relevance to in vivo.

    3. Reviewer #2 (Public Review):


      The authors measure axon outgrowth rate, laminin adhesion strength, and actin rearward flow rate. They find that the axon outgrowth rate has a biphasic dependence on adhesion strength. In interpreting the results, they suggest that the results "imply that adhesion modulation is key to the regulation of axon guidance"; however, they measure elongation rate, not guidance.


      The measurements of adhesion strength by laser-induced shock waves are reasonable as is the measurement of actin flow rates by speckle microscopy.


      They only measure the length of the axons after 3 days and have no measurements of the actual rate of growth cone movements when they are moving. They do not measure the rate of actin growth at the leading edge to know its contribution to the extension rate. This is inadequate.

      These studies are unlikely to have an impact on the field because the measurement of axon growth rate at short times is missing.

    4. Reviewer #3 (Public Review):


      Yamada et al. build on classic and more recent studies (Chen et al., 2023; Lemmon et al., 1992; Nichol et al., 2016; Zheng et al., 1994; Schense and Hubbell, 2000) to better understand the relationship between substrate adhesion and neurite outgrowth.


      The primary strength of the manuscript lies in developing a method for investigating the role of adhesion in axon outgrowth and traction force generation using a femtosecond laser technique. The most exciting finding is that both outgrowth and traction force generation have a biphasic relationship with laminin concentration.


      The primary weaknesses are a lack of discussion of prior studies that have directly measured the strength of growth cone adhesions to the substrate (Zheng et al., 1994) and traction forces (Koch et al., 2012), the inverse correlation between retrograde flow rate and outgrowth (Nichol et al., 2016), and prior studies noting a biphasic effect of substrate concentration on neurite outgrowth (Schense and Hubbell, 2000).

      Overall, the claims and conclusions are well justified by the data. The main exception is that the data is more relevant to how the rate of neurite outgrowth is controlled rather than axonal guidance.

      This manuscript will help foster interest in the interrelationship between neurite outgrowth, traction forces, and substrate adhesion, and the use of a novel method to study this problem.

    1. eLife assessment

      This useful study explores the phenomenon of glucose-induced mitochondrial repression, known as the Crabtree effect, in cells cultured under high glucose conditions. The study uses a variety of well-designed assays to support the authors' hypothesis, shedding light on inorganic phosphate-mediated metabolic regulation. The analysis and conclusions could benefit from a more rigorous approach, given the limited replication on a single yeast strain background and inadequacies in the normalization methods, leaving the evidence in parts incomplete.

    1. eLife assessment

      This important study highlights how a single protein transporter dysfunction can significantly alter brain biochemistry, potentially playing a crucial role in the intellectual disability in creatine transporter deficiency (CTD) patients. The evidence is compelling that the new in vitro CTD model using CTD patient's brain organoid cultures will be widely applicable. Despite minor areas for further exploration, the study significantly enhances our understanding of CTD, offering potential therapeutic targets and a robust foundation for continued research in the field.

    1. eLife assessment

      This solid study addresses the role of the extracellular matrix (ECM) in neuronal migration. The authors showed that the interaction between the ternary complex formed by tenascin-C, the chondroitin sulfate proteoglycan neurocan, and hyaluronic acid is important for the multipolar to bipolar transition in the intermediate zone (IZ) of the developing cortex.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      The work is a useful contribution towards understanding the role of archaeal and plant D-aminoacyl-tRNA deacylase 2 (DTD2) in deacylation and detoxification of D-Tyr-tRNATyr modified by various aldehydes produced as metabolic byproducts in plants. It integrates convincing results from both in vitro and in vivo experiments to address the long-standing puzzle of why plants outperform bacteria in handling reactive aldehydes and suggests a new strategy for stress-tolerant crops. The impact of the paper is limited by the fact that only one modified D-aminoacyl tRNA was examined, by the lack of evidence that plant eEF1A mimics EF-Tu in protecting L-aminoacyl tRNAs from modification, and by the failure to measure accumulation of toxic D-aminoacyl tRNAs or impairment of translation in plant cells lacking DTD2.

      We have now addressed all the drawbacks as follows:

      ‘only one modified D-aminoacyl tRNA was examined’

      We wish to clarify that only D-Leu (Yeast), D-Asp (Bacteria, Yeast), D-Tyr (Bacteria, Cyanobacteria, Yeast) and D-Trp (Bacteria) show toxicity in vivo in the absence of known DTD (Soutourina J. et al., JBC, 2000; Soutourina O. et al., JBC, 2004; Wydau S. et al., JBC, 2009), and D-Tyr-tRNATyr is used in the field as a model substrate to test DTD activity because the toxicity of D-Tyr is conserved across organisms. DTD2 has been shown to recycle D-Asp-tRNAAsp and D-Tyr-tRNATyr with the same efficiency both in vitro and in vivo (Wydau S. et al., NAR, 2007), and it also recycles acetaldehyde-modified D-Phe-tRNAPhe and D-Tyr-tRNATyr in vitro, as shown in our earlier work (Mazeed M. et al., Science Advances, 2021). We have earlier shown that DTD1, another chiral proofreader conserved across bacteria and eukaryotes, acts via a side chain-independent mechanism (Ahmad S. et al., eLife, 2013). To check the biochemical activity of DTD2 on D-Trp-tRNATrp, we have now performed D-Trp, D-Tyr and D-Asp toxicity rescue experiments by expressing the archaeal DTD2 in dtd null E. coli cells. We found that DTD2 rescued D-Trp toxicity as efficiently as it rescued D-Tyr and D-Asp toxicity (Figure: 1). Given its action on multiple side chains of different chemistry and size, it can be proposed with reasonable confidence that DTD2 also operates in a side chain-independent manner.

      Author response image 1.

      DTD2 recycles multiple D-aa-tRNAs with different side chain chemistry and size. Growth of wildtype (WT), dtd null strain (∆dtd), and Pyrococcus horikoshii DTD2 (PhoDTD2) complemented ∆dtd strains of E. coli K12 cells with 500 µM IPTG along with A) no D-amino acids, B) 2.5 mM D-tyrosine, C) 30 mM D-aspartate and D) 5 mM D-tryptophan.

      ‘lack of evidence that plant eEF1A mimics EF-Tu in protecting L-aminoacyl tRNAs from modification’

      To understand the role of plant eEF1A in protecting L-aa-tRNAs from aldehyde modification, we have done a thorough sequence and structural analysis. We analysed the structure of the aa-tRNA-bound elongation factor from bacteria (PDB id: 1TTT) and found that the side chain of the amino acid in the amino acid binding site of EF-Tu projects outward (Figure: 2A; 3A). In addition, the amino group of the amino acid is tightly held by the main chain atoms of the elongation factor, leaving no space for aldehydes to enter and modify the L-aa-tRNAs and Gly-tRNAs (Figure: 2B; 3B). Modelling of a D-amino acid (D-phenylalanine and the smallest chiral amino acid, D-alanine) in the same site shows serious clashes with main chain atoms of EF-Tu, indicating D-chiral rejection during aa-tRNA binding by the elongation factor (Figure: 2C-E). Next, we superimposed the tRNA-bound mammalian eEF-1A cryoEM structure (PDB id: 5LZS) on the bacterial structure to identify structural differences in tRNA binding and found that both elongation factors bind tRNA in a similar way (Figure: 3C-D). Modelling of D-alanine in the amino acid binding site of eEF-1A shows serious clashes with main chain atoms, indicating a general theme of D-chiral rejection during aa-tRNA binding by the elongation factor (Figure: 2F; 3E). Structure-based sequence alignment of elongation factors from bacteria, archaea and eukaryotes (both plants and mammals) shows strict conservation of the amino acid binding site (Figure: 2G). This suggests that eEF-1A will mimic EF-Tu in protecting L-aa-tRNAs from reactive aldehydes. Minor differences near the amino acid side chain binding site (as indicated in Wolfson and Knight, FEBS Letters, 2005) might induce amino acid-specific binding differences (Figure: 3F). However, those changes will have no influence when a D-chiral amino acid enters the pocket, as the whole side chain would clash with the active site.
      We have now included this sequence and structural conservation analysis in our revised manuscript (in text: line no 107-129; Figure: 2 and S2). Overall, our structural analysis suggests a conserved mode of aa-tRNA selection by the elongation factor across life forms and therefore, our biochemical results with bacterial elongation factor Tu (EF-Tu) reflect the protective role of the elongation factor in general across species.

      Author response image 2.

      Elongation factor enantio-selects L-aa-tRNAs through a D-chiral rejection mechanism. A) Surface representation showing the cocrystal structure of EF-Tu with L-Phe-tRNAPhe. Zoomed-in image showing the binding of L-phenylalanine with its side chain projecting out of the binding site of EF-Tu (PDB id: 1TTT). B) Zoomed-in image of the amino acid binding site of EF-Tu bound with L-phenylalanine showing the selection of the amino group of the amino acid through main chain atoms (PDB id: 1TTT). C) Modelling of D-phenylalanine in the amino acid binding site of EF-Tu shows severe clashes with main chain atoms of EF-Tu. Modelling of the smallest chiral amino acid, alanine, in the amino acid binding site of EF-Tu shows D) no clashes with L-alanine and E) clashes with D-alanine. F) Modelling of D-alanine in the amino acid binding site of eEF-1A shows clashes with main chain atoms. (*Represents modelled molecule). G) Structure-based sequence alignment of elongation factor from bacteria, archaea and eukaryotes (both plants and animals) showing conserved amino acid binding site residues. (Key residues are marked with red star).

      Author response image 3.

      Elongation factor protects L-aa-tRNAs from aldehyde modification. A) Cartoon representation showing the cocrystal structure of EF-Tu with L-Phe-tRNAPhe (PDB id: 1TTT). B) Zoomed-in image of amino acid binding site of EF-Tu bound with L-phenylalanine (PDB id: 1TTT). C) Cartoon representation showing the cryoEM structure of eEF-1A with tRNAPhe (PDB id: 5LZS). D) Image showing the overlap of EF-Tu:L-Phe-tRNAPhe crystal structure and eEF-1A:tRNAPhe cryoEM structure (r.m.s.d. of 1.44 Å over 292 Cα atoms). E) Zoomed-in image of amino acid binding site of eEF-1A with modelled L-alanine (PDB id: 5LZS). (*Modelled) F) Overlap showing the amino acid binding site residues of EF-Tu and eEF-1A. (EF-Tu residues are marked in black and eEF-1A residues are marked in red).
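
      For readers less familiar with the r.m.s.d. value quoted for the overlap in panel D, the following is a minimal sketch of how such a number is computed from paired Cα coordinates once two structures have been superimposed. The function name `rmsd` and the toy coordinates are illustrative assumptions; they are not taken from 1TTT or 5LZS, and the superposition step itself is not shown.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumes both structures are already superimposed and that the i-th
    point in each list is an equivalent (paired) C-alpha atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in angstroms (made up for illustration only).
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(round(rmsd(model_a, model_b), 2))
```

      In this toy case every atom pair is displaced by exactly 1 Å, so the r.m.s.d. is 1 Å; a real comparison would use the 292 paired Cα atoms from the two superimposed models.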

      ‘failure to measure accumulation of toxic D-aminoacyl tRNAs or impairment of translation in plant cells lacking DTD2’

      We agree that measuring the accumulation of D-aa-tRNA adducts in plant cells lacking DTD2 is important. We tried extensively to characterise them in dtd2 mutant plants through Northern blotting as well as mass spectrometry. However, because the affected tissue (root or shoot), the identity of the aa-tRNA and the location of the aa-tRNA (cytosolic or organellar) are unknown, we have so far been unsuccessful in identifying them from plants. Efforts are still underway to identify them from plant systems lacking DTD2. However, we have used a bacterial surrogate system, E. coli, as used earlier in Mazeed M. et al., Science Advances, 2021, to show the accumulation of D-aa-tRNA adducts in the absence of dtd. We could identify the accumulation of both formaldehyde- and MG-modified D-aa-tRNA adducts via mass spectrometry (Figure: 4). These results are now included in the revised manuscript (in line no: 190-197 and Figure: S5).

      Author response image 4.

      Loss of DTD results in accumulation of modified D-aminoacyl adducts on tRNAs in E. coli. Mass spectrometry analysis showing the accumulation of aldehyde modified D-Tyr-tRNATyr in A) Δdtd E. coli, B) formaldehyde and D-tyrosine treated Δdtd E. coli, and C) MG and D-tyrosine treated Δdtd E. coli. ESI-MS based tandem fragmentation analysis for unmodified and aldehyde modified D-Tyr-tRNATyr in D) Δdtd E. coli, E) and F) formaldehyde and D-tyrosine treated Δdtd E. coli, G) and H) MG and D-tyrosine treated Δdtd E. coli.

      Response to Public Reviews:

      We are grateful for the reviewers’ positive feedback and their comments and suggestions on this manuscript. Reviewer 1 has indicated two weaknesses and Reviewer 2 has none. We have now addressed all the concerns of the Reviewers.

      Reviewer #1 (Public Review):


      This work is an extension of the authors' earlier work published in Sci Adv in 2021, wherein the authors showed that DTD2 deacylates N-ethyl-D-aminoacyl-tRNAs arising from acetaldehyde toxicity. In this study, the authors investigate the role of archaeal/plant DTD2 in the deacylation/detoxification of D-Tyr-tRNATyr modified by multiple other aldehydes and methylglyoxal (produced by plants). Importantly, the authors take their biochemical observations to plants, to show that deletion of DTD2 gene from a model plant (Arabidopsis thaliana) makes them sensitive to the aldehyde supplementation in the media especially in the presence of D-Tyr. These conclusions are further supported by the observation that the model plant shows increased tolerance to the aldehyde stress when DTD2 is overproduced from the CaMV 35S promoter. The authors propose a model for the role of DTD2 in the evolution of land plants. Finally, the authors suggest that the transgenic crops carrying DTD2 may offer a strategy for stress-tolerant crop development. Overall, the authors present a convincing story, and the data are supportive of the central theme of the story.

      We are happy that the reviewer found our work convincing and would like to thank the reviewer for finding our data supportive of the central theme of the manuscript.


      Data are novel and they provide a new perspective on the role of DTD2, and propose possible use of the DTD2 lines in crop improvement.

      We are happy for this positive comment on the manuscript.


      (a) Data obtained from a single aminoacyl-tRNA (D-Tyr-tRNATyr) have been generalized to imply that what is relevant to this model substrate is true for all other D-aa-tRNAs (term modified aa-tRNAs has been used synonymously with the modified Tyr-tRNATyr). This is not a risk-free extrapolation. For example, the authors see that DTD2 removes modified D-Tyr from tRNATyr in a chain-length dependent manner of the modifier. Why do the authors believe that the length of the amino acid side chain will not matter in the activity of DTD2?

      We thank the reviewer for bringing up this important point. As mentioned above, we wish to clarify that only half of the aminoacyl-tRNA synthetases are known to charge D-amino acids and only D-Leu (Yeast), D-Asp (Bacteria, Yeast), D-Tyr (Bacteria, Cyanobacteria, Yeast) and D-Trp (Bacteria) show toxicity in vivo in the absence of known DTD (Soutourina J. et al., JBC, 2000; Soutourina O. et al., JBC, 2004; Wydau S. et al., JBC, 2009). D-Tyr-tRNATyr is used in the field as a model substrate to test DTD activity because the toxicity of D-Tyr is conserved across organisms. DTD2 has been shown to recycle D-Asp-tRNAAsp and D-Tyr-tRNATyr with the same efficiency both in vitro and in vivo (Wydau S. et al., NAR, 2007). Moreover, we have previously shown that it recycles acetaldehyde-modified D-Phe-tRNAPhe and D-Tyr-tRNATyr in vitro (Mazeed M. et al., Science Advances, 2021). We have earlier shown that DTD1, another chiral proofreader conserved across bacteria and eukaryotes, acts via a side chain-independent mechanism (Ahmad S. et al., eLife, 2013). To check the biochemical activity of DTD2 on D-Trp-tRNATrp, we have now performed D-Trp, D-Tyr and D-Asp toxicity rescue experiments by expressing the archaeal DTD2 in dtd null E. coli cells. We found that DTD2 rescued D-Trp toxicity as efficiently as it rescued D-Tyr and D-Asp toxicity (Figure 1). Given its action on multiple side chains of different chemistry and size, it can be proposed with reasonable confidence that DTD2 also operates in a side chain-independent manner.

      (b) While the use of EFTu supports that the ternary complex formation by the elongation factor can resist modifications of L-Tyr-tRNATyr by the aldehydes or other agents, in the context of the present work on the role of DTD2 in plants, one would want to see the data using eEF1alpha. This is particularly relevant because there are likely to be differences in the way EFTu and eEF1alpha may protect aminoacyl-tRNAs (for example see description in the latter half of the article by Wolfson and Knight 2005, FEBS Letters 579, 3467-3472).

      We thank the reviewer for bringing up this important point. As mentioned above, to understand the role of plant eEF1A in protecting L-aa-tRNAs from aldehyde modification, we have done a thorough sequence and structural analysis. We analysed the structure of the aa-tRNA-bound elongation factor from bacteria (PDB id: 1TTT) and found that the side chain of the amino acid in the amino acid binding site of EF-Tu projects outward (Figure: 2A; 3A). In addition, the amino group of the amino acid is tightly held by the main chain atoms of the elongation factor, leaving no space for aldehydes to enter and modify the L-aa-tRNAs and Gly-tRNAs (Figure: 2B; 3B). Modelling of a D-amino acid (D-phenylalanine and the smallest chiral amino acid, D-alanine) in the same site shows serious clashes with main chain atoms of EF-Tu, indicating D-chiral rejection during aa-tRNA binding by the elongation factor (Figure: 2C-E). Next, we superimposed the tRNA-bound mammalian eEF-1A cryoEM structure (PDB id: 5LZS) on the bacterial structure to identify structural differences in tRNA binding and found that both elongation factors bind tRNA in a similar way (Figure: 3C-D). Modelling of D-alanine in the amino acid binding site of eEF-1A shows serious clashes with main chain atoms, indicating a general theme of D-chiral rejection during aa-tRNA binding by the elongation factor (Figure: 2F; 3E). Structure-based sequence alignment of elongation factors from bacteria, archaea and eukaryotes (both plants and mammals) shows strict conservation of the amino acid binding site (Figure: 2G). Minor differences near the amino acid side chain binding site (as indicated in Wolfson and Knight, FEBS Letters, 2005) might induce amino acid-specific binding differences (Figure: 3F). However, those changes will have no influence when a D-chiral amino acid enters the pocket, as the whole side chain would clash with the active site.
      We have now included this sequence and structural conservation analysis in our revised manuscript (in text: line no 107-129; Figure: 2 and S2). Overall, our structural analysis suggests a conserved mode of aa-tRNA selection by the elongation factor across life forms and therefore, our biochemical results with bacterial elongation factor Tu (EF-Tu) reflect the protective role of the elongation factor in general across species.

      Reviewer #2 (Public Review):

      In bacteria and mammals, metabolically generated aldehydes become toxic at high concentrations because they irreversibly modify the free amino group of various essential biological macromolecules. However, these aldehydes can be present in extremely high amounts in archaea and plants without causing major toxic side effects. This fact suggests that archaea and plants have evolved specialized mechanisms to prevent the harmful effects of aldehyde accumulation.

      In this study, the authors show that the plant enzyme DTD2, originating from archaea, functions as a D-aminoacyl-tRNA deacylase. This enzyme effectively removes stable D-aminoacyl adducts from tRNAs, enabling these molecules to be recycled for translation. Furthermore, they demonstrate that DTD2 serves as a broad detoxifier for various aldehydes in vivo, extending its function beyond acetaldehyde, as previously believed. Notably, the absence of DTD2 makes plants more susceptible to reactive aldehydes, while its overexpression offers protection against them. These findings underscore the physiological significance of this enzyme.

      We thank the reviewer for the positive comments on the manuscript.

      Response to recommendation to authors:

      Reviewer #1 (Recommendations For The Authors):

      I enjoyed reading the manuscript entitled, "Archaeal origin translation proofreader imparts multi aldehyde stress tolerance to land plants" from the Sankaranarayanan lab. This work is an extension of their earlier work published in Sci Adv in 2021, wherein they showed that DTD2 deacylates N-ethyl-D-aminoacyl-tRNAs arising from acetaldehyde toxicity. Now, the authors of this study (Kumar et al.) investigate the role of archaeal/plant DTD2 in the deacylation/detoxification of D-Tyr-tRNATyr modified by multiple other aldehydes and methylglyoxal (which are produced during metabolic reactions in plants). Importantly, the authors take their biochemical observations to plants, to show that deletion of DTD2 gene from a model plant (Arabidopsis thaliana) makes them sensitive to the aldehyde supplementation in the media especially in the presence of D-Tyr. These conclusions are further supported by the observation that the model plant shows increased tolerance to the aldehyde stress when DTD2 is overproduced from the CaMV 35S promoter. The authors propose a model for the role of DTD2 in the evolution of land plants. Finally, the authors suggest that the transgenic crops carrying DTD2 may offer a strategy for stress-tolerant crop development. Overall, the authors present a convincing story, and the data are supportive of the central theme of the story.

      We are happy that the reviewer enjoyed our manuscript and found our work convincing. We would also like to thank the reviewer for finding our data supportive of the central theme of the manuscript.

      I have the following observations that require the authors' attention.

      1) The title of the manuscript will be more appropriate if revised to, "Archaeal origin translation proofreader, DTD2, imparts multialdehyde stress tolerance to land plants".

      Both reviewers suggested changing the title. We have now changed the title based on Reviewer 2’s suggestion.

      2) Abstract (line 19): change, "physiologically abundantly produced" to "physiologically produced".

      As per the reviewer’s suggestion, we have now changed it to "physiologically produced".

      3) Introduction (line 50): delete, 'extremely'.

      We have removed the word 'extremely' from the Introduction.

      4) Line 79: change, "can be utilized" to "may be explored".

      We have changed "can be utilized" to "may be explored" as suggested by the reviewers.

      5) Results in general:

      (a) Data obtained from a single aminoacyl-tRNA (D-Tyr-tRNATyr) have been generalized to imply that what is relevant to this model substrate is true for all other D-aa-tRNAs (term modified aa-tRNAs has been used synonymously with the modified D-Tyr-tRNATyr). This is a risky extrapolation. For example, the authors see that DTD2 removes modified D-Tyr from tRNATyr in a chain-length dependent manner of the modifier. Why do the authors believe that the length of the amino acid side chain will not matter in the activity of DTD2?

      We thank the reviewer for bringing up this important point. As mentioned above, we wish to clarify that only half of the aminoacyl-tRNA synthetases are known to charge D-amino acids and only D-Leu (Yeast), D-Asp (Bacteria, Yeast), D-Tyr (Bacteria, Cyanobacteria, Yeast) and D-Trp (Bacteria) show toxicity in vivo in the absence of known DTD (Soutourina J. et al., JBC, 2000; Soutourina O. et al., JBC, 2004; Wydau S. et al., JBC, 2009). D-Tyr-tRNATyr is used in the field as a model substrate to test DTD activity because the toxicity of D-Tyr is conserved across organisms. DTD2 has been shown to recycle D-Asp-tRNAAsp and D-Tyr-tRNATyr with the same efficiency both in vitro and in vivo (Wydau S. et al., NAR, 2007). Moreover, we have previously shown that it recycles acetaldehyde-modified D-Phe-tRNAPhe and D-Tyr-tRNATyr in vitro (Mazeed M. et al., Science Advances, 2021). We have earlier shown that DTD1, another chiral proofreader conserved across bacteria and eukaryotes, acts via a side chain-independent mechanism (Ahmad S. et al., eLife, 2013). To check the biochemical activity of DTD2 on D-Trp-tRNATrp, we have now performed D-Trp, D-Tyr and D-Asp toxicity rescue experiments by expressing the archaeal DTD2 in dtd null E. coli cells. We found that DTD2 rescued D-Trp toxicity as efficiently as it rescued D-Tyr and D-Asp toxicity (Figure 1). Given its action on multiple side chains of different chemistry and size, it can be proposed with reasonable confidence that DTD2 also operates in a side chain-independent manner.

      (b) Interestingly, the authors do suggest (in the Materials and Methods section) that the experiments were performed with Phe-tRNAPhe as well as Ala-tRNAAla. If what is stated in Materials and Methods is correct, these data should be included to generalize the observations.

      We regret the confusing statement. We wish to clarify that L- and D-Tyr-tRNATyr were used for the TLC-based aldehyde modification assays, the EF-Tu-based protection assays and the deacylation assays; D-Phe-tRNAPhe was used to characterise aldehyde-based modification by mass spectrometry; and L-Ala-tRNAAla was used to check the modification propensity of multiple aldehydes. We used multiple aa-tRNAs to emphasize that aldehyde-based modifications are not specific to the identity of the aa-tRNA. All the data obtained with the respective aa-tRNAs are included in the manuscript.

      (c) While the use of EFTu supports that the ternary complex formation by the elongation factor can resist modifications of L-Tyr-tRNATyr by the aldehydes or other agents, in the context of the present work on the role of DTD2 in plants, one would want to see the data using eEF1alpha. This is particularly relevant because there are likely to be differences in the way EFTu and eEF1alpha may protect aminoacyl-tRNAs (for example see description in the latter half of the article by Wolfson and Knight 2005, FEBS Letters 579, 3467-3472).

      We thank the reviewer for bringing up this important point. As mentioned above, to understand the role of plant eEF1A in protecting L-aa-tRNAs from aldehyde modification, we have done a thorough sequence and structural analysis. We analysed the structure of the aa-tRNA-bound elongation factor from bacteria (PDB id: 1TTT) and found that the side chain of the amino acid in the amino acid binding site of EF-Tu projects outward (Figure: 2A; 3A). In addition, the amino group of the amino acid is tightly held by the main chain atoms of the elongation factor, leaving no space for aldehydes to enter and modify the L-aa-tRNAs and Gly-tRNAs (Figure: 2B; 3B). Modelling of a D-amino acid (D-phenylalanine and the smallest chiral amino acid, D-alanine) in the same site shows serious clashes with main chain atoms of EF-Tu, indicating D-chiral rejection during aa-tRNA binding by the elongation factor (Figure: 2C-E). Next, we superimposed the tRNA-bound mammalian eEF-1A cryoEM structure (PDB id: 5LZS) on the bacterial structure to identify structural differences in tRNA binding and found that both elongation factors bind tRNA in a similar way (Figure: 3C-D). Modelling of D-alanine in the amino acid binding site of eEF-1A shows serious clashes with main chain atoms, indicating a general theme of D-chiral rejection during aa-tRNA binding by the elongation factor (Figure: 2F; 3E). Structure-based sequence alignment of elongation factors from bacteria, archaea and eukaryotes (both plants and mammals) shows strict conservation of the amino acid binding site (Figure: 2G). Minor differences near the amino acid side chain binding site (as indicated in Wolfson and Knight, FEBS Letters, 2005) might induce amino acid-specific binding differences (Figure: 3F). However, those changes will have no influence when a D-chiral amino acid enters the pocket, as the whole side chain would clash with the active site.
      We have now included this sequence and structural conservation analysis in our revised manuscript (in text: line no 107-129; Figure: 2 and S2). Overall, our structural analysis suggests a conserved mode of aa-tRNA selection by the elongation factor across life forms and therefore, our biochemical results with bacterial elongation factor Tu (EF-Tu) reflect the protective role of the elongation factor in general across species.

      6) Results (line 89): Figure: 1C-G (not B-G).

      As correctly pointed out by the reviewer(s), we have changed it to Figure: 1C-G.

      7) Results (line 91): Figure: S1B-G (not C-G).

      We wish to clarify that this is correct.

      8) Line 97: change, "propionaldehyde" to "propionaldehyde (Figure: 1H)".

      As per the reviewer’s suggestion, we have now changed, "propionaldehyde" to "propionaldehyde (Figure: 1H)".

      9) Line 124: The statement, "DTD2 cleaved all modified D-aa-tRNAs at 50 pM to 500 nM range (Figure: 2A_D)" is not consistent with the data presented. For example, Figure 2D does not show any significant cleavage. Figure S2A-B also does not show cleavage.

      We thank the reviewers for pointing this out. We have changed the sentence to “DTD2 cleaved majority of aldehyde modified D-aa-tRNAs at 50 pM to 500 nM range".

      10) Line 131: Cleavage observed in Fig. S2E is inconsistent with the generalized statement on DTD1.

      We wish to clarify that the minimal activity seen in Fig. S2E is inconsistent with the general trend of DTD1’s biochemical activity seen on modified D-aa-tRNAs. In addition, we have earlier shown that D-aa-tRNA fits snugly in the active site of DTD1 (Ahmad S. et al., eLife, 2013), whereas the modified D-aa-tRNA cannot bind due to space constraints in the active site of DTD1 (Mazeed M. et al., Science Advances, 2021). Therefore, this minimal activity could be the result of a technical error during this biochemical experiment and could be considered as no activity.

      11) Lines 129-133: Citations of many figure panels particularly in the supplementary figures are inconsistent with generalized statements. This section requires a major rewrite or rearrangement of the figure panels (in case the statements are correct).

      We thank the reviewers for bringing forth this point and we have accordingly modified the statement into “DTD2 from archaea recycled short chain aldehyde-modified D-aa-tRNA adducts as expected (Figure: 3E-G) and, like DTD2 from plants, it did not act on aldehyde-modified D-aa-tRNAs longer than three chains (Figure: 3H; S3C-D; S4G-L)”.

      12) Line 142: I don't believe one can call PTH a proofreader. Its job is to recycle tRNAs from peptidyl-tRNAs.

      We thank the reviewers for pointing out this very important point. This is now corrected.

      13) Line 145: change, "DTD2 can exert its protection for" to "DTD2 may exert protection from".

      As per the reviewer’s suggestion, we have now changed "DTD2 can exert its protection for" to "DTD2 may exert protection from".

      14) Line 148: change, "a homozygous line (Figure: 3A) and checked for" to "homozygous lines (Figure: 3A) and checked them for".

      As per the reviewer’s suggestion, we have now changed, "a homozygous line (Figure: 3A) and checked for" to "homozygous lines (Figure: 3A) and checked them for".

      15) Line 148: Change, the sentence beginning with dtd2 as follows. Similar to earlier results30-32, dtd2-/- (dtd2 hereafter) plants were susceptible to ethanol (Figure: S4A) confirming the non-functionality DTD2 gene in dtd2 plants.

      As per the reviewer’s suggestion, we have now changed the sentence accordingly.

      16) Line 161: change, "linked" to "associated".

      As per the reviewer’s suggestion, we have now changed "linked" to "associated".

      17) Lines 173-176: It would be interesting to know how well the DTD2 OE lines do in comparison to the other known transgenic lines developed with, for example, ADH, ALDH, or AOX lines. Any ideas would help appreciate the observation with DTD2 OE lines!

      We greatly appreciate the reviewer’s suggestion. We have not done any comparison experiment with any transgenic lines so far. However, it can be potentially done in further studies with DTD2 OE lines.

      18) Line 194: change, "necessary" with "present".

      As per the reviewer’s suggestion, we have now changed "necessary" to "present".

      19) Line 210: what is meant by 'huge'? Would 'significant' sound better?

      As per the reviewer’s suggestion, we have now changed "huge" to "significant".

      20) Lines 239-243: This needs to be rephrased. Isn't alpha carbonyl of the carboxyl group that makes ester bond with the -CCA end of the tRNA required for DTD2 activity as well? Are you referring to the carbonyl group in the moiety that modifies the alpha-amino group? Please clarify. The cited reference (no. 64) of Atherly does not talk about it.

      We regret the confusing statement. To clarify, we were referring to the carbonyl carbon of the modification that follows the amino group of the amino acid in aa-tRNAs (Figure: 5). We have now included a figure (Figure: S4Q of revised manuscript) comparing the position of the carbonyl group for better clarity. The cited reference (Atherly A. G., Nature, 1978) shows the activity of PTH on peptidyl-tRNAs, which possess a carbonyl carbon at the alpha position following the amino group of the amino acid in L-aa-tRNAs.

      Author response image 5.

      Figure showing the difference in the position of carbonyl carbon in acetonyl and acetyl modification on aa-tRNAs.

      21) Line 261: thrive (not thrives).

      As per the reviewer’s suggestion, we have now changed it to thrive.

      22) In Fig3A: second last lane, it should be dtd-/-:: AtDTDH150A (not dtd-/-:: AtDTDH150A).

      We thank the reviewers for pointing this out; we have corrected it.

      23) Materials and methods: Please clarify which experiments used tRNAPhe, tRNAAla, PheRS, etc. Also, please carefully check all other details provided in this section.

      As per the reviewer’s suggestion, we would like to provide a table below explaining the use of different substrates as well as enzymes in our experiments.

      Author response table 1.

      24) Figure legends (many places): p values higher than 0.05 (not less than) are denoted as ns.

      We thank the reviewers for pointing this out. We have corrected it.

      Reviewer #2 (Recommendations For The Authors):

      I have only minor comments for the authors:

      Title: I would replace "Archeal origin translation proofreader" with " A translation proofreader of archeal origin"

      As per the reviewer’s suggestion, we have now changed the title.

      Abstract: This section could benefit from some rewriting. For instance, at the outset, the initial logical connection between the first and second sentences of the abstract is somewhat unclear. At the very least, I would suggest swapping their order to enhance the narrative flow. Later in the text, the term "chiral proofreading systems" is introduced; however, it is only in a subsequent sentence that these systems are explained to be responsible for removing stable D-aminoacyl adducts from tRNA. Providing an immediate explanation of these systems would enhance the reader's comprehension. The authors switch from the past participle tense to the present tense towards the end of the text. I would recommend that they choose one tense for consistency. In the final sentence, I would suggest toning down the statement and replacing "can be used" with "could be explored." (https://www.nature.com/articles/d41586-023-02895-w). The same comment applies to the introduction, line 79.

      As per the reviewer’s suggestion, we have now changed the abstract appropriately.

      General note: Conventionally, the use of italics is reserved for the specific species "Arabidopsis thaliana," while the broader genus "Arabidopsis" is not italicized.

      We thank the reviewer for this pertinent suggestion. This is now corrected in the revised version of our manuscript.

      General note: I would advise the authors against employing bold characters in conjunction with colors in the figures.

      We thank the reviewer for this suggestion. We have now changed it appropriately in the revised version of our manuscript.

      Figure 1A: I recommend including the concentrations of the various aldehydes used in the experiment within the figure legend. While this information is available in the materials and methods section, it would be beneficial to have it readily accessible when analyzing the figure.

      As per the reviewer’s suggestion, we have now included the concentrations in figure legend.

      Figure 1I, J: some error bars are invisible.

      We thank the reviewers for pointing this out; we have corrected it.

      Figure 2M: The table could be simplified by removing aldehydes for which it was not feasible to demonstrate activity. The letter "M" within the cell labeled "aldehydes" appears to be a typographical error, presumably indicating the figure panel.

      As per the reviewer’s suggestion, we have now changed this appropriately.

      Figure 3: For consistency with the other panels in the figure, I recommend including an additional panel to display the graph depicting the impact of MG on germination.

      As per the reviewer’s suggestion, we have now changed this appropriately.

      Figure 4: Considering that only one plant is presented, it would be beneficial to visualize the data distribution for the other plants used in this experiment, similar to what the authors have done in panel A of the same figure.

      We thank the reviewer for bringing up this point. We wish to clarify that we have performed the experiment with multiple plants. However, for the sake of clarity, we have included representative images. Moreover, we have included the quantitative data for multiple plants in Figure 3C-G.

      Figure 5E: The authors may consider presenting a chronological order of events as they believe they occurred during evolution.

      We thank the reviewer for the suggestion. However, it is very difficult to pinpoint the chronology of the events. Aldehydes are lethal for systems due to their hyper reactivity, and systems would require immediate solutions to survive. Therefore, we think that both the problem (toxic aldehyde production) and its solution (expansion of the aldehyde-metabolising repertoire and recruitment of archaeal DTD2) might have appeared simultaneously.

      Figure 6: The model appears somewhat crowded, which may affect its clarity and ease of interpretation. The authors might also consider dividing the legend sentence into two separate sentences for better readability.

      As per the reviewer’s suggestion, we have now changed this appropriately.

      Line 149: I recommend explicitly stating that ethanol metabolism produces acetaldehyde. This clarification will help the general reader immediately understand why DTD2 mutant plants are sensitive to ethanol.

      As per the reviewer’s suggestion, we have now changed this appropriately.

      Line 289: there is a typographical error, "promotor" instead of the correct term "promoter".

      We thank the referee for pointing this out; we have now corrected it.

      Figure S5: The root morphology of DTD2 OE plants appears to exhibit some differences compared to the WT, even in the absence of a high concentration of aldehydes. It would be valuable if the authors could comment on these observed differences unless they have already done so, and I may have overlooked it.

      We thank the referee for pointing this out. We do see minor differences in root morphology, but they are more pronounced with aldehyde treatments. The reason for this phenotype remains elusive, and we are trying to understand the role of DTD2 in root development in detail in further studies.

      Some Curiosity Questions (not mandatory for manuscript acceptance):

      1) Do DTD2 OE plants display an earlier flowering phenotype than wild-type Col-0?

      We have not done detailed phenotyping of DTD2 OE plants. However, our preliminary observations suggest no differences in flowering pattern as compared to wild-type Col-0.

      2) What is the current understanding of the endogenous regulation of DTD2?

      We have not done detailed analysis to understand the endogenous regulation of DTD2.

      3) Could the protective phenotype of DTD2 OE plants in the presence of aldehydes be attributed to additional functions of this enzyme beyond the removal of stable D-aminoacyl adducts from tRNAs?

      Based on the available evidence regarding the biochemical activity and in vivo phenotypes of DTD2, it appears that removal of stable D-aminoacyl adducts from tRNA is key for the protective phenotype of DTD2 OE.

      A Suggestion for Future Research (not required for manuscript acceptance):

      The authors could explore the possibility of overexpressing DTD2 in pyruvate decarboxylase transgenic plants and assess whether this strategy enhances flood tolerance without incurring a growth penalty under normal growth conditions.

      We thank the referee for this interesting suggestion for future research. We will surely keep this in mind while exploring the flood tolerance potential of DTD2 OE plants.

    2. eLife assessment

      The work is a fundamental contribution towards understanding the role of archaeal and plant D-aminoacyl-tRNA deacylase 2 (DTD2) in deacylation and detoxification of D-Tyr-tRNATyr modified by various aldehydes produced as metabolic byproducts in plants. It integrates convincing results from both in vitro and in vivo experiments to address the long-standing puzzle of why plants outperform bacteria in handling reactive aldehydes and suggests a new strategy for stress-tolerant crops. A limitation of the study is the lack of evidence for accumulation of toxic D-aminoacyl tRNAs and impairment of translation in plant cells lacking DTD2.

    3. Reviewer #1 (Public Review):

      Summary: This work is an extension of their earlier work published in Sci Adv in 2021, wherein they showed that DTD2 deacylates N-ethyl-D-aminoacyl-tRNAs arising from acetaldehyde toxicity. In this study, the authors (Kumar et al.) investigate the role of archaeal/plant DTD2 in the deacylation/detoxification of D-Tyr-tRNATyr modified by multiple other aldehydes and methylglyoxal (produced by plants). Importantly, the authors take their biochemical observations to plants, to show that deletion of the DTD2 gene from a model plant (Arabidopsis thaliana) makes the plants sensitive to aldehyde supplementation in the media, especially in the presence of D-Tyr. These conclusions are further supported by the observation that the model plant shows increased tolerance to the aldehyde stress when DTD2 is overproduced from the CaMV 35S promoter. The authors propose a model for the role of DTD2 in the evolution of land plants. Finally, the authors suggest that the transgenic crops carrying DTD2 may offer a strategy for stress-tolerant crop development. Overall, the authors present a convincing story, and the data are supportive of the central theme of the story.

      Strengths: Data are novel and they provide a new perspective on the role of DTD2, and propose possible use of the DTD2 lines in crop improvement.

      Weaknesses: (a) Data obtained from a single aminoacyl-tRNA (D-Tyr-tRNATyr) have been generalized to imply that what is relevant to this model substrate is true for all other D-aa-tRNAs (term modified aa-tRNAs has been used synonymously with the modified Tyr-tRNATyr). This is not a risk-free extrapolation. For example, the authors see that DTD2 removes modified D-Tyr from tRNATyr in a chain-length dependent manner of the modifier. Why do the authors believe that the length of the amino acid side chain will not matter in the activity of DTD2? (b) While the use of EFTu supports that the ternary complex formation by the elongation factor can resist modifications of L-Tyr-tRNATyr by the aldehydes or other agents, in the context of the present work on the role of DTD2 in plants, one would want to see the data using eEF1alpha. This is particularly relevant because there are likely to be differences in the way EFTu and eEF1alpha may protect aminoacyl-tRNAs (for example see description in the latter half of the article by Wolfson and Knight 2005, FEBS Letters 579, 3467-3472).

      Note added after revision: The authors have addressed all my concerns by doing additional experiments and by providing convincing arguments. I am happy to conclude that all my concerns on the weaknesses of the work have been nicely addressed. The already convincing story is now stronger.

    4. Reviewer #2 (Public Review):

      In bacteria and mammals, metabolically generated aldehydes become toxic at high concentrations because they irreversibly modify the free amino group of various essential biological macromolecules. However, these aldehydes can be present in extremely high amounts in archaea and plants without causing major toxic side effects. This fact suggests that archaea and plants have evolved specialized mechanisms to prevent the harmful effects of aldehyde accumulation.

      In this manuscript, the authors show that the plant enzyme DTD2, originating from archaea, functions as a D-aminoacyl-tRNA deacylase. This enzyme effectively removes stable D-aminoacyl adducts from tRNAs, enabling these molecules to be recycled for translation. Furthermore, they demonstrate that DTD2 serves as a broad detoxifier for various aldehydes in vivo, extending its function beyond acetaldehyde, as previously believed. Finally, the authors suggest a potential application of their findings by showing that the absence of DTD2 renders plants more susceptible to reactive aldehydes, while its overexpression provides protection against them.

      Overall, this study provides a molecular explanation for the remarkable efficiency of plants in handling reactive aldehydes. However, direct evidence that translation is impaired in plants lacking DTD2 is currently lacking. Furthermore, because root morphology of DTD2-overexpressing plants appears to differ from that of WT, a thorough phenotypic analysis of DTD2-overexpressing plants will be essential to accurately assess the potential translational application of this enzyme for engineering stress-tolerant plants.

    1. eLife assessment

      The study presents valuable findings concerning how a highly conserved signal transduction pathway helps budding yeast cells adapt their growth to nitrogen sources of differing qualities. However, the evidence is incomplete for the authors' main claim that the pathway adopts three distinct states depending on the nitrogen source. The presented data, particularly phospho-proteomic datasets, will be of interest to the cell growth signaling community.

    2. Reviewer #1 (Public Review):


      TOR complex 1 (TORC1) is a key regulator of cell growth in response to nutrients, and it therefore integrates inputs from multiple nutrient-sensing regulators. However, we still do not understand how each upstream regulatory branch contributes to TORC1 activity under different nutrient conditions. The authors set out to answer this question using budding yeast (Saccharomyces cerevisiae) as a model eukaryote. Yeast TORC1 is activated by two upstream regulators: the highly conserved GTPases Gtr1/2 and the PI3P-binding protein Pib2. The cooperation of these regulators towards TORC1 activation has been unclear, with some studies suggesting that they act in parallel (i.e. redundantly), and others suggesting a more complex picture. By exploring the dependence of different TORC1 substrates on Gtr1/2 and Pib2 activity, the authors have discovered that Gtr1/2 and Pib2 do not act redundantly, but instead are part of a mechanism that drives the TORC1 pathways into three distinct activity levels: i) both Gtr1/2 and Pib2 ON in rich nutrients (leading to the highest TORC1 activity), ii) Gtr1/2 OFF and Pib2 ON in poor quality nitrogen sources (intermediate TORC1 activity), and iii) both Gtr1/2 and Pib2 OFF under starvation conditions (lowest TORC1 activity).


      The relation between Gtr1/2 and Pib2 has remained a mystery for a long time, making it difficult to interpret the results of experiments in which one of the two regulators is inactive or missing. By employing a phosphoproteomics assay, the authors were able to monitor the phosphorylation of multiple TORC1 substrates in response to TORC1 inhibition (via rapamycin) and in mutants carrying deletions of Gtr1/2 or Pib2. In this way, they could identify two groups of substrates: those that require the activity of both regulators, and those that remain active when a single regulator is active. These data clearly demonstrate the non-redundancy of Gtr1/2 and Pib2, especially since the different groups of substrates seem to correspond to groups of proteins with distinct functions.


      - The first section of the Results contains an analysis of Gtr1/2- and Pib2-dependent signaling using Rps6 as a TORC1 reporter. I do not think that Rps6 is an appropriate readout for this type of work, as it is not a direct TORC1 substrate, and it also lies downstream of TORC2 [Yerlikaya et al. 2016]. The authors obtain several puzzling results with Rps6, and later on (pg. 8) remark that the level of Rps6 phosphorylation does not always correspond to TORC1 activity. While this is an interesting finding in its own right and will certainly be interesting for the yeast TOR community, I do not see why the Results need to open with such a confusing section, and why Rps6 features so prominently throughout the manuscript.
      - There is very large ambiguity regarding the types of media and strains that are used (prototrophic vs auxotrophic). The authors use SC medium which, if I understand correctly, contains ammonium and a supplement of amino acids. They then use single amino acid dropouts (e.g. SC -gln and SC -leu) to probe TORC1 activity under "partial starvation" conditions. However, the cells are anything but starved in these experiments, and I do not know how to interpret results obtained with such media. Even when amino acids are completely removed, the cells are still able to grow on ammonium. The matter gets further complicated because it appears that the authors use prototrophic strains with single nitrogen source media, but not with complete or "partial starvation" media. Since this study aims to elucidate the roles of nutrient-sensing regulators upstream of TORC1, I would expect that matters related to media composition and strain usage should be addressed more carefully and described more explicitly in the text, especially since nutritional complementation of auxotrophic strains is not always equivalent to genetic complementation [Pronk, 2002].
      - A recent publication (Zeng et al. 2023, doi: 10.1016/j.celrep.2023.113599) identified Ser33 and Ser3 as TORC1 substrates and examined their dependence on Pib2 activity. More importantly, the publication addressed a question that is very similar to the one addressed here (i.e. how different amino acids require Gtr1/2 or Pib2 to activate TORC1). I would recommend that the authors cite that publication and compare their findings with the results reported there.
      - The GO analysis of TORC1 substrates (from Fig.4) is mentioned in the text but is not shown. The authors should present the GO analysis more explicitly, e.g. in a supplementary table.
      - Similar to Rps6, it should be kept in mind that Par32 is not a TORC1 substrate. While I understand the rationale behind the choice of Par32 as a readout, this point needs to be emphasized more. Additionally, previous work [Brito et al. 2019, doi: 10.1016/j.isci.2019.09.025] has suggested that Npr1 and Par32 are implicated in a feedback loop with Pib2. The potential relevance of that work should be discussed more here.
      - Besides Sch9, Tod6 phosphorylation is also regulated by PKA [Huber et al. 2011, doi: 10.1038/emboj.2011.221]. This point should be discussed and taken into account in the interpretation of the Tod6 results. I also find it puzzling that Tod6 persists one hour after rapamycin treatment, because the protein seems to be unstable and gets quickly degraded when TORC1 activity is lost [Kusama 2022, doi: 10.1016/j.isci.2022.103986].
      - Given the points raised above, I remain skeptical about the three-state model proposed by the authors. On a conceptual level, the intermediate activity state of TORC1 proposed here seems to depend absolutely on Pib2 (since Gtr1/2 appear to be off in that state). The authors make a similar point in the Discussion, where they claim that yeast growth on poor nitrogen sources can be halted by deletion of Pib2. However, they do not test this conjecture experimentally.
      - Fig. 6F compares the growth of different strains on different media, but the doubling times are not quantified.
      - The Introduction describes regulatory pathways of mTORC1, several of which do not exist in budding yeast. The transition from the second to third paragraph is very abrupt and confusing.

    3. Reviewer #2 (Public Review):

      This work examines the roles of Gtr1/Gtr2 and Pib2 in activation of TORC1 in S. cerevisiae and proposes they are non-redundant in activating TORC1. Previous work from many groups has suggested that the Gtr complex and Pib2 activate TORC1 in a parallel manner. One contribution of this study is the suggestion that the standard readout(s) of TORC1 activation are not sufficient to assess the separate roles of these two components in the complex network of amino acid and starvation response signaling. The overall conclusion of the work, based on phosphoproteome analyses of deletion strains and comparison to rapamycin treatment, with some supporting experimentation, is that Pib2 signaling sustains the starvation response in poor amino acid/nitrogen sources, whereas the additional activation of the Gtr complex is required for the full spectrum of TORC1 effects on growth.

      At first, the authors recapitulate and extend studies on TORC1 inactivation using the Rps6 reporter. Here, Pib2 could inactivate TORC1 on glutamine starvation only if the Gtr complex is partially compromised. The authors speculated that Gtr and Pib2 do lead to different responses, but these cannot be detected by monitoring the phospho state of Rps6.

      The authors determined the phosphoproteome in wild type cells and a variety of knockout strains, in rich media and in the presence of rapamycin. The authors identified 175 phosphosites that are downregulated on rapamycin treatment, at least under these conditions. Many were dependent on both Pib2 and the Gtr complex but, of particular interest for this work, were the phosphosites on Ser33 that were dependent on the presence of Pib2 but not the Gtr complex. The authors noted that phosphosites not dependent on Pib2 or Gtr1/2 included Sch9 and other common readouts of TORC1 activation.

      Focusing on Ser33, the authors next show that rapamycin, amino acid and nitrogen starvation result in loss of Ser33 phosphorylation. Further analysis showed that the Ser33 phosphorylation status depends on the quality of the amino acid and nitrogen source.

      Then the authors use this to develop a model where TORC1 has three states depending on whether either Gtr1/2, or Pib2, or both are active in signaling to TORC1, depending on the nutrient state and quality of amino acids/nitrogen available. The new state is state III, where TORC1 is active to promote growth and the starvation response remains active, via the Npr1/Par32 branch. The remainder of the work involves developing tools to assess the growth (Sch9) and starvation (Par32) branches under various amino acid/nutrient states. While moving from media with an excess of all amino acids to glutamine or leucine led to only transient occupation of state III, the new state was already occupied when the cells were in a poor amino acid/nitrogen source and moved to a better one. In other words, the Pib2 signalling permitted aspects of a starvation response to be maintained in the background of a Sch9 growth signal.

      Finally, the authors address a puzzle: Sch9 phosphorylation does not have the dynamic range to account for the difference in growth rates of yeast cells in SC or proline medium. Tod6 was dephosphorylated in the absence of Gtr1/Gtr2 or Pib2 in the phosphoproteomics and is the likely connection, as it moves to the nucleus on growth on proline media (or on rapamycin), where it may control the chromatin accessibility of ribosome growth and biogenesis genes.

      Overall, the core of this work, the phosphoproteome analyses, convincingly demonstrates that activation of TORC1 relies on a nuanced interplay of signaling pathways and that to fully appreciate and dissect the consequences of the Gtr- and Pib2-responsive signaling pathways a more comprehensive range of readouts is required. The work elegantly shows a scenario where Pib2-based signaling is active, required to sustain some growth even when the amino acid/nitrogen mix is poor.

      There are some areas, however, where the work could be strengthened. The model proposed in this work is based on nuanced signaling responses to various states of nitrogen/amino acid starvation. However, the phosphoproteome was determined in a synthetic rich background, supplemented with rapamycin where relevant, and comparing the phosphoproteome of pib2 del and gtr1 del/gtr2 del to this. The phosphoproteome is by far the strongest data in this work suggesting multi-level regulation so an appropriately matched phosphoproteome condition screen would likely significantly substantiate the model: the conditions used might miss all the nuanced signaling responses the authors develop throughout the paper. Not unrelated, the authors show that Pib2 can transmit glutamine starvation signals to TORC1 in the presence of a partial Gtr1/2 complex (gtr1 del or gtr2 del) but not a complete deletion of the complex (Fig. 2). Similar to the above comment, the phosphoproteome was determined only with full loss of the gtr complex, and then only in a rich background, which may miss this entire branch of Pib2 signaling. Perhaps in support of this, Pib2Ser113 phosphorylation apparently decreased significantly on rapamycin treatment but not on loss of the Gtr complex (TableS1), whereas other Pib2 phospho sites were not similarly affected by rapamycin treatment. Adding to the notion of complexity, the other sites may themselves be subject to other signaling pathways that could regulate Pib2 - and these may change on nutrient starvation.

      The data showing the enrichment of Pib2 with Ser33 are weak (Fig. 5G, mostly because of the significant precipitation of Ser33 in the absence of Pib2), particularly without the contribution of the immunopurifications of Fig5S1. Might Ser3 be a better candidate for assessing binding?

    4. Reviewer #3 (Public Review):

      Summary: This work addresses an important question of how Gtr1/2 small GTPases and Pib2, two major regulators of the TORC1 cell growth controller, differentially operate in yeast. They found that not all the TORC1 downstream targets respond to Gtr1/2 and Pib2 equally. In fact, they demonstrate that TORC1-dependent phosphorylation of Ser33, a 3-phosphoglycerate dehydrogenase, is responsive to only Pib2. They attributed this specificity to the physical interaction between Ser33 and Pib2. This part is novel and important, revising the canonical view in the field that Gtr1/2 and Pib2 branches act towards the same TORC1 downstream targets. Of note, this claim largely agrees with a recent independent study (PMID: 38127619).

      Moving on, the authors describe different behaviors of TORC1 downstream readouts in intermediate nutrient conditions with a poor nitrogen source, with some readouts still active while others inactive. They argue that selective activation of certain TORC1 downstream targets reflects the "Gtr1/2 off, Pib2 on" state. However, this claim is not sufficiently supported by the presented data.

      Strengths: The data presented in this paper has high value to the TOR community. In particular, a rigorous and comprehensive phospho-proteomic dataset that compares the Gtr1/2- and Pib2-dependency of diverse TORC1 downstream targets is very informative, potentially stimulating follow-up studies on each target.

      Identification of Ser33 as a Pib2-specific TORC1 downstream target is important and convincing (although whether Ser33 is a direct substrate of TORC1 was not addressed in this work). Physical interaction between Ser33 and Pib2 could represent a novel layer of TORC1 signaling regulation, in line with the mammalian Rag-TFEB interaction model, as discussed by the authors.

      Weaknesses: The authors' three-state model, particularly the claim that cells are in the "Gtr1/2 off, Pib2 on" state in a poor nitrogen condition (e.g., proline medium), is not convincing enough for the following reasons.

      1) The "Pib2 on" claim contradicts the observation that Ser33, a Pib2-specific readout, is hypo-phosphorylated in proline medium (Fig 5F).

      2) In the genetic experiments (Figure 8), the authors compare pib2D with Gtr1/2OFF. This is not appropriate, because Gtr1/2OFF (Gtr1-GDP and Gtr2-GTP) actively inhibits TORC1, differing from the null nature of pib2D. pib2D should be compared with gtr1/2D instead.

      3) In general, diverse behaviors of TORC1 targets are not unexpected because their phosphorylation levels should have different dynamic ranges depending on how "good" they are as TORC1 substrates, with some requiring a higher TORC1 activity than others to be detectably phosphorylated. Although this aspect can be physiologically meaningful, and it is indeed important to look at multiple substrates as the authors suggest, this approach does not inform whether the signal is coming from Gtr1/2 or Pib2. An informative way in this context would be to look at the Gtr1/2- or Pib2-specific targets, but the former have not been identified, and observations on the latter, Ser33, do not support the "Pib2 on" claim, as mentioned in point 1) above.

      4) In addition, comparisons made between direct TORC1 substrates (e.g., Sch9) and indirect downstream targets (e.g., Rps6 and Par32) are not very informative, because indirect targets can be impacted by TORC1-independent regulation of the mediating factors (e.g., Ypk3 for Rps6 and Npr1 for Par32).

      In summary, the presented data do not tell us which of the two branches (Gtr1/2 or Pib2) is "more active" in the poor nitrogen condition. Their observations do not necessarily favor their 3-state on/off model (Figure 8) over the more natural assumption that both branches have a gradation of activity depending on the nutrient status.

    1. eLife assessment

      This valuable work investigates the role of boundary elements in the formation of 3D genome architecture. The authors established a specific model system that allowed them to manipulate boundary elements and examine the resulting genome topology. The work yielded the first demonstration of the existence of stem and circle loops in a genome and confirms a model which had been posited based on extensive prior genetic work, providing valuable insights into how 3D genome topologies affect enhancer-promoter communication. The evidence is solid, although the degree of generalization remains uncertain.

    2. Reviewer #1 (Public Review):

      Summary: In this study, the authors engineer the endogenous left boundary of the Drosophila eve TAD, replacing the endogenous Nhomie boundary by either a neutral DNA, a wildtype Nhomie boundary, an inverted Nhomie boundary, or a second copy of the Homie boundary. They perform Micro-C on young embryos and conclude that endogenous Nhomie and Homie boundaries flanking eve pair with head-to-tail directionality to form a chromosomal stem loop. Abrogating the Nhomie boundary leads to ectopic activation of genes in the former neighboring TAD by eve embryonic stripe enhancers. Replacing Nhomie by an inverted version or by Homie (which pairs with itself head-to-head) transformed the stem loop into a circle loop. An important finding was that stem and circle loops differentially impact endogenous gene regulation both within the eve TAD and in the TADs bracketing eve. Intriguingly, an eve TAD with a circle loop configuration leads to ectopic activation of flanking genes by eve enhancers - indicating compromised regulatory boundary activity despite the presence of an eve TAD with intact left and right boundaries.

      Strengths: Overall, the results obtained are of high quality and are meticulously discussed. This work advances our fundamental understanding of how 3D genome topologies affect enhancer-promoter communication.

      Weaknesses: Though convincingly demonstrated at eve, the generalizability of TAD formation by directional boundary pairing remains unclear, though the authors propose this mechanism could underlie the formation of all TADs in Drosophila and possibly even in mammals. Strong and ample evidence has been obtained to date that cohesin-mediated chromosomal loop extrusion explains the formation of a large fraction of TADs in mammals. Moreover, given the unique specificity with which Nhomie and Homie are known to pair (and exhibit "homing" activity), it is conceivable that formation of the eve TAD by boundary pairing represents a phenomenon observed at exceptional loci rather than a universal rule of TAD formation. Indeed, characteristic Micro-C features of the eve TAD are only observed at a restricted number of loci in the fly genome, and many TADs lack focal 3D interactions between their boundaries.

    3. Reviewer #2 (Public Review):

      "Chromatin Structure II: Stem-loops and circle-loops" by Ke*, Fujioka*, Schedl, and Jaynes reports a set of experiments and subsequent analyses focusing on the role of Drosophila boundary elements in shaping 3D genome structure and regulating gene expression. The authors primarily focus on the region of the fly genome containing the even skipped (eve) gene; eve is expressed in a canonical spatial pattern in fly embryos and its locus is flanked by the well-characterized neighbor of homie (nhomie) and homie boundary elements. The main focus of investigation is the orientation dependence of these boundary elements, which had been observed previously using reporter assays. In this study, the authors use CRISPR/Cas9 editing followed by recombination-mediated cassette exchange to create a series of recombinant fly lines in which the nhomie boundary element is replaced with either exogenous sequence from phage 𝝀, an inversion of nhomie, or a copy of homie that has the same orientation as the endogenous homie sequence. The nhomie sequence is also regenerated in its native orientation to control for effects introduced by the transgenesis process.

      The authors then perform high-resolution Micro-C to analyze 3D structure and couple this with fluorescent and colorimetric RNA in situ hybridization experiments to measure the expression of eve and nearby genes during different stages of fly development. The major findings of these experiments are that total loss of boundary sequence (replacement with 𝝀 DNA) results in major 3D structure changes and the most prominent observed gene changes, while inversion of the nhomie boundary or replacement with homie resulted in more modest effects in terms of 3D structure and gene expression changes and a distinct pattern of gene expression change from the 𝝀 DNA replacement. As the samples in which the nhomie boundary is inverted or replaced with homie have similar Micro-C profiles at the eve locus and show similar patterns of spurious gene activation relative to the control, the observed effects appear to be driven by the relative orientation of the nhomie and homie boundary elements to one another.

      Collectively, the findings reported in the manuscript are of broad interest to the 3D genome field. Although extensive work has gone into characterizing the patterns of 3D genome organization in a whole host of species, the underlying mechanisms that structure genomes and their functional consequences are still poorly understood. Perhaps the best understood system, mechanistically, is the coordinated action of CTCF with the cohesin complex, which in vertebrates appears to shape 3D contact maps through a loop extrusion-pausing mechanism that relies on orientation-dependent sequence elements found at the boundaries of interacting chromatin loops. Despite having a CTCF paralog and cohesin, the Drosophila genome does not appear to be structured by loop extrusion-pausing. The identification of orientation-dependent elements with pronounced structural effects on genome folding thus may shed light on alternative mechanisms used to regulate genome structure, which in turn may yield insights into the significance of particular folding patterns.

      On the whole, this study is comprehensive and represents a useful contribution to the 3D genome field. The transgenic lines and Micro-C datasets generated in the course of the work will be valuable resources for the research community. Moreover, the manuscript, while dense in places, is generally clearly written and comprehensive in its description of the work. However, I have a number of comments and critiques of the manuscript, mainly centering on the framing of the experiments and presentation of the Micro-C results and on the manner in which the data are analyzed and reported. They are as follows:

      Major Points:

      1. The authors motivate much of the introduction and results with hypothetical "stem loop" and "circle loop" models of chromosome conformation, which they argue are reflected in the Micro-C data and help to explain the observed ISH patterns. While such structures may possibly form, the support for these specific models vs. the many alternatives is not in any way justified. For instance, no consideration is given to important biophysical properties such as persistence length, packing/scaling, and conformational entropy. As the biophysical properties of chromatin are a heavily studied topic, both in terms of experimentation and computational modeling, and are generally considered in the analysis of chromosome conformation data, the study would be strengthened by acknowledgement of this body of work and more direct integration of its findings.

      2. Similar to Point 1, while there is a fair amount of discussion of how the observed results are or are not consistent with loop extrusion, there is no discussion of the biophysical forces that are thought to underlie compartmentalization, such as block-polymer co-segregation, and their potential influence. I found this absence surprising, as it is generally accepted that A/B compartmentalization essentially can explain the contact maps observed in Drosophila and other non-vertebrate eukaryotes (Rowley, ..., Corces 2017; PMID 28826674). The manuscript would be strengthened by consideration of this phenomenon.

      3. The contact maps presented in the study represent many cells and distinct cell types. It is clear from single-cell Hi-C and multiplexed FISH experiments that chromosome conformation is highly variable even within populations of the same cell, let alone between cell types, with structures such as TADs being entirely absent at the single cell level and only appearing upon pseudobulking. It is difficult to square these observations with the models of relatively static structures depicted here. The authors should provide commentary on this point.

      4. The analysis of the Micro-C data appears to be largely qualitative. Key information about the number of reads sequenced, reads mapped, and data quality is not presented. No quantitative framework for identifying features such as the "plumes" is described. The study and its findings would be strengthened by a more rigorous analysis of these rich datasets, including the use of systematic thresholds for calling patterns of organization in the data.

      5. Related to Point 4, the lack of quantitative details about the Micro-C data makes it difficult to evaluate whether the changes observed are due to biological or technical factors. It is essential that the authors provide quantitative means of controlling for factors like sampling depth, normalization, and data quality between the samples.

      6. The ISH effects reported are modest, especially in the case of the HCR. The details provided for how the imaging data were acquired and analyzed are minimal, which makes evaluating them challenging. It would strengthen the study to provide much more detail about the acquisition and analysis and to include depictions of intermediates in the analysis process, e.g. showing the segmentation of stripes.

    1. eLife assessment

      This valuable work presents elegant experimental data from the Drosophila embryo supporting the notion that interactions among specific loci, called boundary elements, contribute to topologically associated domain (TAD) formation and gene regulation. Although the evidence supporting boundary elements as determinants of 3D structures is compelling, the evidence rejecting loop extrusion is incomplete. This study will be of interest to the nuclear structure community, particularly those using Drosophila as a model.

    2. Reviewer #1 (Public Review):

      Summary:<br /> The authors addressed how long-range interactions between boundary elements are established and influence their function in enhancer specificity. Briefly, the authors placed two different reporters separated by a boundary element. They inserted this construct ectopically ~140 kb away from an endogenous locus that contains the same boundary element. The authors used expression patterns driven by nearby enhancers as an output to determine which enhancers the reporters interact with. They complemented this analysis with 3D DNA contact mapping. The authors found that the orientation of the boundary element determined which enhancers each reporter interacted with. They proposed that the 3D interaction topology, whether being circular or stem configuration, distinguished whether the interaction was cohesin mediated or through an independent mechanism termed pairing.

      Strengths:<br /> The transgene expression assays are built upon prior knowledge of the enhancer activities. The 3D DNA contacts confirm that transgene expression correlates with the contacts. Using 4 different orientations covers all combinations of the reporter genes and the boundary placement.

      Weaknesses:<br /> The interpretation of the data as a refutation of loop extrusion playing a role in TAD formation is not warranted, as the authors did not deplete the loop extruders to show that what they measure is independent of them. As the authors show, a single long DNA loop mediated by cohesin loop extrusion connecting the ectopic and endogenous boundaries is clearly inconsistent with the results; therefore, the main conclusion of the paper, that the 3D topology of the boundary elements is a consequence of pairing, is strong. However, loop extrusion and pairing are not mutually exclusive models for the formation of TADs. Loop-extruding cohesin complexes need not make a 140 kb loop; multiple smaller loops could bring together the two boundary elements, which are then held together by pairing proteins that can make circular topologies.

    3. Reviewer #2 (Public Review):

      In Bing et al, the authors analyze micro-C data from NC14 fly embryos, focusing on the eve locus, to assess different models of chromatin looping. They conclude that fly TADs are less consistent with conventional cohesin-based loop extrusion models and instead rely more heavily on boundary-boundary pairings in an orientation-dependent manner.

      Overall, I found the manuscript to be interesting and thought-provoking. However, this paper reads much more like a perspective than a research article. I strongly suggest the authors spend some time editing their introduction to the most salient points as well as organizing their results section in a more conventional way with conclusion-based titles. It was very difficult to follow the authors' logic throughout the manuscript as written. It was also not clear as written which experiments were performed as part of this study and which were reanalyzed but published elsewhere. This should be made clearer throughout.

      It has been shown several times that Drosophila Hi-C maps do not contain all of the features (frequent corner peaks, stripes, etc.) observed when compared to mammalian cells. Considering these features are thought to be products of extrusion events, it is not an entirely new concept that Drosophila domains form via mechanisms other than extrusion. That being said, the authors' analyses do not distinguish between the formation and the maintenance of domains. It is not clear to this reviewer why a single mechanism should explain the formation of the complex structures observed in static Hi-C heatmaps from a population of cells at a single developmental time point. For example, how can the authors rule out that extrusion initially provides the necessary proximity and possibly the cis preference of contacts required for boundary-boundary pairing whereas the latter may more reflect the structures observed at maintenance? Future work aimed at analyzing micro-C data in cohesin-depleted cells might shed additional light on this.

      Additional mechanisms at play include compartment-level interactions driven by chromatin states. Indeed, in mammalian cells, these interactions often manifest as a "plume" on Hi-C maps similar to what the authors attribute to boundary interactions in this manuscript. How do the chromatin states in the neighboring domains of the eve locus impact the model if at all?

      How does intrachromosomal homolog pairing impact the models proposed in this manuscript (Abed et al. 2019; Erceg et al., 2019)? Several papers recently have shown that somatic homolog pairing is not uniform and shows significant variation across the genome with evidence for both tight pairing regions and loose pairing regions. Might loose pairing interactions have the capacity to alter the cis configuration of the eve locus?<br /> In summary, the transgenic experiments are extensive and elegant and fully support the authors' models. However, in my opinion, they do not completely rule out additional models at play, including extrusion-based mechanisms. Indeed, my major issue is the limited conceptual advance in this manuscript. The authors essentially repeat much of their previous work and analyses. The authors make no attempt to dissect the mechanism of this process by modifying extrusion components directly. Some discussion of Rollins et al., 1999 on the discovery of Nipped-B and its role in enhancer-promoter communication should also be included to reconcile their conclusions with the proposed absence of extrusion events.

    4. Reviewer #3 (Public Review):

      Bing et al. attempt to address fundamental mechanisms of TAD formation in Drosophila by analyzing gene expression and 3D conformation within the vicinity of the eve TAD after insertion of a transgene harboring a Homie insulator sequence 142 kb away in different orientations. These transgenes along with spatial gene expression analysis were previously published in Fujioka et al. 2016, and the underlying interpretations regarding resulting DNA configuration in this genomic region were also previously published. This manuscript repeats the expression analysis using smFISH probes in order to achieve more quantitative analysis, but the main results are the same as previously published. The only new data are the Micro-C and an additional modeling/analysis of what they refer to as the 'Z3' orientation of the transgenes. The rest of the manuscript merely synthesizes further interpretation with the goal of addressing whether loop extrusion may be occurring or if boundary:boundary pairing without loop extrusion is responsible for TAD formation. The authors conclude that their results are more consistent with boundary:boundary pairing and not loop extrusion; however, most of this imaging data seems to support both loop extrusion and the boundary:boundary models. This manuscript lacks support, especially new data, for its conclusions. Furthermore, there are many parts of the manuscript that are difficult to follow. There are some minor errors in the labelling of the figures that if fixed would help elevate understanding. Lastly, there are several major points that if elaborated on, would potentially be helpful for the clarity of the manuscript.

      Major Points:<br /> 1. The authors suggest, and attempt to visualize in the supplemental figures, that loop extrusion mechanisms would appear during crosslinking and show as vertical stripes in the micro-C data. In order to see stripes, a majority of the nuclei would need to undergo loop extrusion at the same rate, starting from exactly the same spots, and the loops would also have to be released and restarted at the same rate. If these patterns truly result from loop extrusion, the authors should provide experimental evidence from another organism undergoing loop extrusion.<br /> 2. On lines 311-314, the authors discuss that stem-loops generated by cohesin extrusion would possibly be expected to have more next-next-door neighbor contacts than next-door neighbor contacts and cite their models in Figure 1. Based on the boundary:boundary pairing models in the same figure, would the stem-loops created by head-to-tail pairing also have the same phenotype, making enrichment of next-next-door neighbor contacts possible in both situations? The concepts in the text are not clear, and the diagrams are not well-labeled relative to the two models.<br /> 3. The authors appear to cite Chen et al., 2018 as a reference for the location of these transgenes being 700 nm away in a majority of the nuclei. However, the exact transgenes in this manuscript do not appear to have been measured for distance. The authors could do this experiment and include expression measurements.<br /> 4. The authors discuss the possible importance of CTCF orientation in forming the roadblock to cohesin extrusion and discuss that Homie orientation in the transgene may impact Homie function as an effective roadblock. However, the Homie region inserted in the transgene does not contain the CTCF motif. Can the authors elaborate on why they feel the orientation of Homie is important in its ability to function as a roadblock if the CTCF motif is not present? Trans-acting factors responsible for Homie function have not been identified and this point is not discussed in the manuscript.<br /> 5. The imaging results seem to be consistent with both boundary:boundary interaction and loop extrusion stem looping.<br /> 6. The authors suggest that the eveMa TAD could only be formed by extrusion after the breakthrough of Nhomie and several other roadblocks. Additionally, the overall long-range interactions with Nhomie appear to be less than the interactions with endogenous Homie (Figures 7, 8, and supplemental 5). Is it possible that in some cases boundary:boundary pairing is occurring between only the transgenic Homie and endogenous Homie and not including Nhomie?<br /> 7. In Figure 4E, the GFP hebe expression shown in the LhomieG Z5 transgenic embryo does not appear in the same locations as the LlambdaG Z5 control. Is this actually hebe expression or just a background signal?<br /> 8. Figure 6: The LhomieG Z3 late-stage embryo appears to be showing the ventral orientation of the embryo rather than the lateral side of the embryo as was shown in the previous figure. Is this for a reason? Additionally, there are no statistics shown for the Z3 transgenic images. Were these images analyzed in the same way as the Z5 line images?<br /> 9. Do the Micro-C data align with the developmental time points used in the smFISH probe assays?

    1. eLife assessment

      In this valuable study, the authors investigate the role of the UFD-1/NPL-4 complex in the response of C. elegans to infection. While the work is of interest to the field, several pieces of evidence are incomplete, including a lack of validation of the inferences from the RNAi experiments with mutant analyses. There is also the question whether the UFD-1/NPL-4 complex might be better described as regulating "tolerance" to infection instead of inflammation.

    2. Reviewer #1 (Public Review):

      1. I suggest that the authors choose a different term in their title, abstract and manuscript to describe the phenotypes associated with ufd-1 and npl-4 knockdown other than an "inflammation-like response." Inflammation is a pathological term with four cardinal signs: redness (rubor), swelling (tumor), warmth (calor) and pain (dolor). These are not symptoms known to occur in C. elegans. The authors could consider using "tolerance" instead, as this term may better describe their findings.

      2. It would help the reader to better understand the novelty of the findings in this study if the authors include a paragraph in their introduction to put their results in context of the published literature that has examined the relationship between immune activation and nematode health and survival. In particular, I suggest that the authors discuss doi:10.7554/eLife.74206 (2022), a study that characterized a similar observation to what the authors are reporting. This study found that low cholesterol reduces pathogen tolerance and host survival during pathogen infection. Cholesterol scarcity increases p38 PMK-1 phosphorylation, priming immune effector induction in a manner that reduces pathogen accumulation in the intestine during a subsequent infection. I also suggest that the authors highlight in this introductory paragraph that the toxic effects of inappropriate immune activation in C. elegans have been widely catalogued. For example: doi.org/10.1371/journal.ppat.1011120 (2023); doi:10.1186/s12915-016-0320-z (2016); doi:10.1126/science.1203411 (2011); doi:10.1534/g3.115.025650 (2016).

      In this context, the authors could consider re-wording their novelty claim in the abstract and introduction to take into account this previous body of work.

      3. The authors rely on the use of RNAi of ufd-1 and npl-4 to study their effect on P. aeruginosa colonization and pathogen resistance throughout the manuscript. To address the possibility of off-target effects of the RNAi, the authors should consider both (i) showing with qRT-PCR that these genes are indeed targeted during RNAi, and (ii) confirming their phenotypes with an orthogonal technique, preferably by studying ufd-1 and npl-4 loss-of-function mutants [both in the wild-type and sek-1(km4) backgrounds]. If mutation of these genes is lethal, the authors could use Auxin Inducible Degron (AID) technology to induce the degradation of these proteins in post-developmental animals.

      4. I am confused about the authors' explanation regarding their observation that inhibition of the UFD-1/NPL-4 complex extends the lifespan of sek-1(km4) animals, but not pmk-1(km25) animals, as SEK-1 is the MAPKK that functions immediately upstream of the p38 MAPK PMK-1 to promote pathogen resistance.

      I am also confused why their RNA-seq experiment revealed a signature of intracellular pathogen response genes and not PMK-1 targets, which the authors propose is accounting for toxic immune activation. Activation of which immune response leads to toxicity?

      5. The authors did not test alternative explanations for why UFD-1/NPL-4 complex inhibition compromises survival during pathogen infection, other than exuberant immune activation. For example, it is possible that inhibition of this proteasome complex shortens lifespan by compromising the general health/normal physiology of nematodes. Immune responses could be activated as a secondary consequence of this stress, and not be a direct cause of early mortality. Does the sek-1(km4) mutation suppress the shortened lifespan of ufd-1 and npl-4 knockdown animals? This experiment should also be done with loss-of-function mutants, as noted in point 3.

      6. The conclusion of Figure 6 hinges on an experiment that uses double RNAi to knock down two genes at the same time (Fig. 6D and 6G), an approach that is inherently fraught in C. elegans biology owing to the likelihood that the efficiency of RNAi-mediated gene knockdown is compromised, which may account for the observed phenotypes. The proper control for double RNAi is not empty vector + ufd-1(RNAi), but rather gfp(RNAi) + ufd-1(RNAi), as the introduction of a second hairpin RNA is what may compromise knockdown efficiency. In this context, it is important to confirm that knockdown of both genes occurs as expected (with qRT-PCR) and to confirm this phenotype using available elt-2 loss-of-function mutants.

      7. A supplementary table with the source data for at least three replications (mean lifespan, n, statistical comparison) for each pathogenesis assay should be included in this manuscript.

    3. Reviewer #2 (Public Review):

      Summary:<br /> The authors aimed to uncover what role, if any, the UFD1/NPL4 complex might play in the innate immune responses of the nematode C. elegans. The authors find that loss of the complex renders animals more sensitive to both pathogenic and non-pathogenic bacteria. However, there appears to be a complex interplay with known innate immune pathways since the loss of UFD1/NPL4 actually results in increased survival of animals lacking the canonical innate immune pathways.

      Strengths:<br /> The authors perform robust genetic analysis to exclude and include possible mechanisms by which the UFD1/NPL4 pathway acts in the innate immune response.

      Weaknesses:<br /> The argument that the loss of the UFD1/NPL4 complex triggers a response that mimics that of an intracellular pathogen has not been thoroughly investigated. Additionally, the finding of a role of the GATA transcription factor, ELT-2, in this response is suggestive, but experiments showing sufficiency in the context of loss of the UFD1/NPL4 complex need to be explored.

    1. eLife assessment

      This manuscript describes an important NMR investigation of allosteric interactions within Abl kinase. The authors identify helix I as a major element that couples the Abl active site with the myristate-binding pocket. The convincing findings have implications for understanding Abl kinase activation and how to target Abl kinase in diseases.

    2. Reviewer #1 (Public Review):

      Summary:<br /> The authors identify a mechanical model of activation of Abelson kinase involving the modification of stability of an alpha helix by mutations and different classes of inhibitors. They use NMR chemical shifts of mutant sequences of the alpha helix in a model of Abelson kinase including the regulatory and kinase domains.

      Strengths:<br /> The mechanism of inhibition of this important drug target is highly complex, involving interactions among multiple domains. While crystal structures can establish end states well, the details of more dynamic interactions among the components can be assessed by NMR studies. The authors previously established (Sonti, 2018; PMID 29319304) that different inhibitors and assembled states result from changes in the stabilisation of the assembly involving the kinase and the SH3 domain. This is extended here to illuminate the contribution of the kinase C-terminal alpha helix I' to the domains' interface, expanding the previous identification of this area of the protein as key to agonist/antagonist action at the allosteric myristoylation binding site.

    3. Reviewer #2 (Public Review):

      In this paper, Paladini and colleagues investigate the concerted motions within the Abl kinase that control its conformational transition between the active (disassembled) and inactive (assembled) states. This work follows their previously published findings that binding of the type II inhibitor imatinib to the active site of Abl leads to kinase core disassembly via the force imposed by the P-loop and other regions of the N-lobe on the SH3 domain. Interestingly, imatinib-induced disassembly is prevented when an allosteric inhibitor, asciminib, binds to the myristate-binding pocket. Key to asciminib and myristate binding are motions of helix I, located in the C-lobe, and thus, helix I is hypothesized to be the sensor of the imatinib-induced changes. Specifically, bending of helix I upon engagement of myristate or asciminib was postulated to be important for re-assembly of the autoinhibited Abl core, thus reducing the "force" with which the kinase N-lobe pushes against the SH2 domain upon binding imatinib.

      The authors use NMR to measure conformational transitions in several 15N-labeled Abl kinase constructs that display different degrees of helix I truncation. This analysis is slightly limited by the instability of the constructs that carry truncations beyond the helix I "bend". Nevertheless, it is sufficient to establish that truncation of helix I that removes its fragment, which is in contact with myristate or asciminib ligands, results in loss of the ability of helix I to impose "force" on the SH2 domain that results in kinase core disassembly, even in the presence of imatinib binding. In the absence of this force, the allosteric coupling between the helix I/SH2 and KD/SH3 interfaces is compromised. Principal component analysis is used to analyze the NMR data, and it is very clear and convincing.

      Compelling evidence in support of the proposed allosteric mechanism comes from the analysis of the E528K disease mutation, identified in the Abl1 malformation syndrome. The authors show that this mutant, poised to break a salt bridge formed between E528 in the C-terminal portion of helix I and R479 on the kinase domain, increases helix I outward motions resulting in core disassembly and higher Abl kinase activity. Together, these results reinforce that helix I motions are central to the mechanism of kinase activation via core disassembly.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable study advances our understanding of the forces that shape the genomic landscape of transposable elements. By exploiting both long-read sequencing of mutation accumulation lines and in vivo transposition assays, the authors offer compelling evidence that structural variation rather than transposition largely shapes transposable element copy number evolution in budding yeast. The work will be of interest to the transposable element and genome evolution communities.

      Public Reviews:

      Reviewer #1 (Public Review):

      Henault et al build on their own previous work investigating the longstanding hypothesis that hybridization between divergent populations can activate transposable element mobilization (transposition). Previously they created crosses of increasing sequence divergence, using both intra- and inter-species hybrids, and passaged them neutrally for hundreds of generations. Their previous work showed that neither hybrids isolated from natural environments nor hybrids from their mutation accumulation lines showed consistent evidence of increased transposable element content. Here, they sequence and assemble long-read genomes of 127 of their mutation-accumulation lines and annotate all existing and de novo transposable elements. They find only a handful of de novo transposition events, and instead demonstrate that structural variation (ploidy, aneuploidy, loss of heterozygosity) plays a much larger role in the transposable element load in a given strain. They then created transposable element reporter constructs using two different Ty1 elements from S. paradoxus lineages and measured the transposition rate in a number of intraspecific crosses. They demonstrate that the transposition rate is dependent on both the Ty1 sequence and the copy number of genomic transposable elements, the latter of which is consistent with what has been observed in the literature on transposable element copy number control in Saccharomyces. To my knowledge, others have not directly tested the effect of Ty1 sequence itself (have not created diverse Ty1 reporter constructs), and so this is an interesting advance. Finally, the authors show that mitotype has a moderate effect on transposition rate, which is an intriguing finding that will be interesting to explore in future work.

      This study represents a large effort to investigate how genetic background can influence transposable element load and transposition rate. The long read sequencing, assembly, and annotation, and the creation of these reporter constructs are non-trivial. Their results are straightforward, well supported, and a nice addition to the literature.

      The authors state that the results from their current work support results taken from their previous study using short-read sequencing data of the same lines. The argument that follows is whether the authors gained anything novel from long-read sequencing. I would like to see the authors make a stronger argument for why this new work was necessary, and a more detailed view of similarities or differences from their previous study (when should others choose to do long read vs. short read of evolved lines?).

      We thank the reviewer for the suggestion. While we initially aimed to justify the relevance and novelty of the current in relation to our previous study, we understand that this justification may not have been strong enough.

      In the second paragraph of the introduction, we explain how the multidimensional nature of TE load makes it more complex to characterize than simply reporting the abundance of a given TE family in a given genome. We added the following concluding sentence to further emphasize the importance of long reads in TE-focused genome inference:

      “As such, ongoing technological and computational advances in genome inference, including long-read sequencing, will certainly be key to getting a detailed understanding of the dynamics of TEs and the underpinning evolutionary forces.”

      In the penultimate introductory paragraph, we summarize our previous work from 2020 and highlight that the evolution of Ty contents in MA lines was inferred from aggregate measures of genomic abundance of TE families using short reads. We then make the point that combinations of multiple SVs could affect the landscape of TEs in ways that are not reflected by crude short-read measures. We added the following sentence to further emphasize this point and contrast it with the necessity of using more powerful methodologies for genome resolution:

      “Under this scenario, measuring Ty family abundance would yield no significant net change, and the dissection of the underlying SVs using short reads could often be challenging.”

      Relatedly, the authors should report the rates of structural variants that they observe. How are these results similar/different from other mutation-accumulation work in S. cerevisiae?

      Since this work does not attempt to provide an exhaustive report of all the SVs in the MA lines, but rather focuses on attributing an SV type to individual loci occupied by TEs, we cannot include these estimates, except for de novo transposition itself (see below). We added the following sentence to the Results section on the classification of Ty loci by SV types:

      “We note that the current methodology does not aim at providing an exhaustive quantification of all SVs in the MA lines, as previously done for some SV types (Marsit et al., 2021), but focuses solely on loci containing Ty elements.”

      We added estimates of the average retrotransposition rate in the MA experiment based on the number of de novo insertions detected in the MA line genomes.

      Figure 4:

      “The average retrotransposition rates estimated from the counts of de novo insertions (per line per generation per element) are the following: CC1, 1.0×10⁻⁵; CC2, 4.9×10⁻⁶; CC3, 7.6×10⁻⁶; BB1, 1.5×10⁻⁵; BC2, 1.7×10⁻⁵; BA1, 6.5×10⁻⁶; BA2, 2.2×10⁻⁵; BSc1, 3.6×10⁻⁵.”
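
      As a minimal sketch of how a rate expressed "per line per generation per element" can be obtained from insertion counts, the snippet below divides the number of de novo insertions by the product of the number of lines, generations, and elements. The function name and every number in the usage example are hypothetical and serve only as illustration; they are not values from the study, and the exact calculation used for the manuscript may differ.

```python
def retrotransposition_rate(n_insertions, n_lines, n_generations, n_elements):
    """Average number of de novo insertions per line, per generation, per element."""
    if min(n_lines, n_generations, n_elements) <= 0:
        raise ValueError("lines, generations and elements must all be positive")
    return n_insertions / (n_lines * n_generations * n_elements)


# Hypothetical inputs (not from the study): 6 insertions detected across
# 30 lines evolved for 1000 generations, each starting with 10 elements.
rate = retrotransposition_rate(6, 30, 1000, 10)
print(f"{rate:.1e}")
```

      Normalizing by the number of elements makes crosses with different initial Ty contents comparable, under the simplifying assumption that every element contributes equally.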

      We added the following paragraph in the Discussion section to specifically discuss these estimates in relation to the in vivo measurements.

      “We note that while the CC crosses tend to have the lowest retrotransposition rates as estimated from the de novo insertions (~1×10⁻⁵ per line per generation per element; Figure 4), these values are several orders of magnitude higher than the in vivo measures in SpC backgrounds. The discrepancy between these estimates could be due to uncharacterized biases inherent to each method. They could also be linked to differences between the parental genotypes used to generate the MA crosses and the fluctuation assays. One major difference is the use of ade2 genotypes in the MA parents, a strategy that was initially adopted to provide a marker for the loss of mitochondrial respiration (Joseph and Hall, 2004; Lynch et al., 2008). It has been shown that the induction of adenine starvation through minimal adenine concentration in the medium and deletion of ADE2, which inactivates the adenine de novo biosynthesis pathway, increases Ty1 transcript levels (Todeschini et al., 2005), resulting in higher transposition rates. Rich complex medium like the one that was used for the MA experiment (YPD) can exhibit substantial variation in adenine concentration (VanDusen et al., 1997), and adenine can quickly become the limiting nutrient for ade2 strains (Kokina et al., 2014). Thus, we cannot exclude that the choice of initial ade2 genotypes could have inflated the transposition rates in the MA experiment.”

      Since the authors show a small, but consistent influence of mitotype on transposition rates, adding further evidence for the role of mtDNA in regulating transposition, I'm curious what the transposition rate of a p0 strain is. I think including these results could make this observation more compelling.

      We agree that measuring in vivo transposition rates in ρ0 backgrounds would be an interesting avenue. However, there is a large distinction between having non-functional mitochondrial respiration in ρ0 strains and inheriting diverse functional mtDNA haplotypes. The effects we show are all linked to the reciprocal inheritance of intact mtDNAs, producing ρ+ strains that are all respiration-competent, as shown by our growth confirmations on non-fermentable carbon sources for all the diploid backgrounds generated. While potentially interesting, adding transposition rate measurements for ρ0 backgrounds seems hard to justify in the context of our results.

      Reviewer #2 (Public Review):

      This is an interesting follow-up study that uses long-read sequencing to examine previously constructed mutation accumulation lines between wild populations of S. cerevisiae and S. paradoxus. They also complement this work with reporter assays in hybrid backgrounds. The authors are attempting to test the hypothesis that hybridization leads to genome shock and unrestrained transposition. The paper largely confirms previous results (suggesting hybridization does not increase transposition) that are well cited and discussed in the paper, both from this group and from the Smukowski Heil/Dunham group but extends them to a new set of species/hybrids and with some additional resolution via the long read sequencing. The paper is well written and clear and I have no serious complaints.

      In the abstract, the authors make three primary claims:

      Structural variation plays a strong role in TE load.

      Transposition plays only a minor role in shaping the TE landscape in MA lines.

      Transposition rates are not increased by hybridization but are affected by genotype-specific factors.

      I found all three claims supported, albeit with some minor questions below:

      Structural variation plays a strong role in TE load.

      Convinced of this result. However:

      Line 185-187/Figure 3C: I'm curious given that the changes in Ty count are so often linked to changes in gross DNA sequence whether the count per total DNA sequence is actually changing on average in these genomes. I.e., does hybridization tend to increase TE count via CNV or does hybridization tend to increase DNA content in the MA lines and TEs come along for the ride?

      The Ty content definitely “rides along” with the rest of the genome that is affected by retrotransposition-unrelated SVs. To further highlight this point, we added a panel (E) to Figure 3 in which we correlate the net Ty copy number change (same as panel D, formerly C) to the corresponding genome size, which reflects the amount of DNA lost/gained by all SV types. We added the following to the results section:

      “The distributions of net Ty CN change per MA line showed that most crosses had significant gains (Figure 3D), suggesting that Ty load can often increase as a result of random genetic drift. Some (but not all) of these crosses also exhibited significant increases in genome size after evolution (Supplemental Figure S7A). The net Ty CN changes per MA line subgenome were globally correlated to the corresponding changes in subgenome size (Figure 3E). Even after excluding polyploid lines (which have the largest changes in both Ty CN and genome size), we found a significant relationship between the two variables (mixed linear model with random intercepts and slopes for MA crosses, P-value=3.71×10⁻⁹; Supplemental Figure S7B), indicating that SVs affecting large portions of the genome have a substantial impact on the Ty landscape.”

      One question about ploidy (lines 175-177):

      Both aneuploidy and triploidy seem easy to call from this data. A 3:1 tetraploidy as well. However, in Figure 2B there are tetraploids that are around the 1:1 line. How are the authors calling ploidy for these strains? This was not clear to me from the text.

      This detail was indeed missing from the manuscript. The ploidy level of all MA lines was previously measured by DNA staining and flow cytometry, and the ploidy level of the subgenomes of each polyploid MA line was previously inferred from short-read sequencing. We modified the figure captions and the main text to include this along with the corresponding references:

      Figure 2:

      “The ploidy level of each line was previously determined by DNA staining and flow cytometry (Charron et al., 2019; Marsit et al., 2021).”

      Main text:

      “The ratio of classified bases per subgenome was consistent with the corresponding ploidy levels: triploid BC lines had two copies of the SpC subgenome, while tetraploid lines had both SpC subgenomes duplicated (Charron et al., 2019; Marsit et al., 2021) (Figure 2B).”

      “Finally, we used the ploidy level of each MA line subgenome as previously measured by flow cytometry and short-read sequencing (Charron et al., 2019; Marsit et al., 2021).”

      Reviewer #3 (Public Review):

      Henault et al. address the important open question of whether hybridization could trigger TE mobilization. To do this they analysed MA lines derived from crosses of Saccharomyces paradoxus and Saccharomyces cerevisiae using long-read sequencing. These MA lines were already analysed in a previous publication using Illumina short-read data but the novelty of this work is the long-read sequencing data, which may reveal previously missed information. It is an interesting message of this study that hybridization between the two species did not lead to much TE activity. Due to this low activity, the authors performed an additional TE activity assay in vivo to measure transposition rates in hybrid backgrounds. The study is well written and I cannot spot any major problems. The study provides some important messages (like the influence of the genotype and mitochondrial DNA on transposition rates).

      Major comments

      • What I miss the most in this work is the perspective of the host defence against TEs in Saccharomyces. Based on such a mechanistic perspective, why do the authors think that hybridization could lead to a TE reactivation? For example, in Drosophila, small RNAs important for the defence against a TE are solely maternally transmitted. Hybrid offspring will thus solely have small RNAs complementary to the TEs of the mother but not to the TEs of the father, therefore a reactivation of the paternal TEs may be expected. I was thus wondering, what is the situation in yeast. Why would we expect an upregulation of TEs? Without such a mechanistic explanation the hypothesis that TEs should be upregulated in hybrids is a bit vague, based on a hunch.

      We agree with the reviewer that in the first version of the manuscript, the justification for the investigation of the reactivation hypothesis in the first place was not self-sufficient and relied too much on our previous work, upon which this article builds. We extensively remodeled the introduction to better justify the investigation of this hypothesis in the context of the current knowledge on the regulation of Ty elements in Saccharomyces.  

      Reviewer #1 (Recommendations For The Authors):

      It's interesting that the net change in transposable element copy number in mutation accumulation lines is either insignificant or gain, and never a significant loss. I think this could make a nice discussion point regarding the roles of drift and selection on TE load.

      We thank the reviewer for the suggestion and agree that this is an interesting perspective that we did not explore in the first version of the manuscript. We thus included a short discussion point in the Results:

      “The distributions of net Ty CN change per MA line showed that most crosses had significant gains (Figure 3D), suggesting that Ty load can often increase as a result of random genetic drift.”

      We also added the following paragraph to the discussion section:

      “Our experiments illustrate how under weakened natural selection efficiency, TE load can increase in hybrid genomes by the action of transposition-unrelated SVs. This offers a nuanced perspective on the classical interpretation of the transposition-selection balance model (Charlesworth et al., 1994; Charlesworth and Langley, 1989), in which increased TE load would be predominantly driven by the relaxation of purifying selection against TE insertions generated by de novo transposition. Our results suggest that SVs arising in the context of hybridization can act as a significant source of TE insertion polymorphisms which natural selection can purge more or less efficiently, depending on the population genetic context. This is closely related to the idea that sexual reproduction could favor the spread of TE families, contributing to their evolutionary success (Hickey, 1982; Zeyl et al., 1996). Since the insertion polymorphisms that contribute to increase TE load mostly originate from standing genetic variation, they could be less deleterious and thus harder for natural selection to purge efficiently.”

      The point about the role of LOH in TE load is cool!

      We thank the reviewer for their enthusiasm; it is one of our favorite results as well.

      Figure 1: Add a figure component of the green box and label it Ty1 or TE.

      We modified Figure 1 accordingly.

      Figure 2C: what is the assembly size ratio?

      We added the following sentence to the figure caption to clarify what we define as assembly size ratio:

      “Assembly size ratio refers to the ratio of subgenome assembly size to the corresponding parental assembly size.”

      Something cut off in the N50 plot axis

      Unfortunately, we could not determine what the reviewer meant with this comment: nothing appears to be cut off in figure panel 2C in any of our versions of the manuscript.

      Reviewer #2 (Recommendations For The Authors):

      These are all minor comments/suggestions that the authors can take or leave.

      Line 42: "fuels" should be "fuel".

      Since the verb refers to “source” and not “variants”, we believe it should be in the third person singular.

      Line 43: unclear what the authors mean by "regroup".

      We understand how this phrasing may sound strange. We modified the sentence accordingly:

      “Structural variation is a term that encompasses a broad variety of large-scale sequence alterations”

      Line 51-52: There are a couple of really nice papers that could be cited here from Anna Selmecki's group (Todd et al. 2020, Todd and Selmecki 2019, both in eLife).

      We thank the reviewer for the suggestions; we included some of these references in the manuscript.

      Figure 1: This is a nice cartoon! I'd suggest spelling out LOH here for a truly naive reader.

      We modified Figure 1 accordingly.

      Figure 3A: One thing that is slightly lost here in the presentation is the relative frequency of the different events because of the changing scales across 3A. I can see why you want to do it this way, but would consider whether there may be a way to present this that makes it more obvious how much more frequent polyploidy is than excision for example.

      We agree with the reviewer that the focus of this visualization is to compare crosses and individual MA lines within SV types, and fails to display the relative importance of each SV type. We solved this by including an additional panel (new 3A) that shows how the number of Ty loci affected by each SV type scales in comparison to others.

      Figure 5: I'm not a fan of the gray bars highlighting the individual strains. This made the graph less intuitively readable for me.

      We tend to agree with the reviewer and rolled back to a previous version of Figure 5 that was lighter on annotations.

      One thing I would like to see in the future from this data (definitely not in this paper) is genome rearrangements within these hybrid MA lines. How often are there structural changes and how often are those changes mediated by repeats including TEs?

      We completely agree with the reviewer that this would be a very interesting avenue, with a distinct (and likely higher) set of challenges at the analysis level compared to simply focusing on TE sequences like we did here. We hope to be able to tackle this goal in the future of this project.

      Reviewer #3 (Recommendations For The Authors):

      • I'm not from the yeast field. But why this focus on the Ty-load? Are Ty's the only active TEs in yeast? Provide some background on the TE landscape in yeast and a justification for focusing on Ty's.

      We agree with the reviewer that this point was only implicit in the introduction. We modified the introductory segment on Saccharomyces yeasts to mention that Ty retrotransposons are the only TEs found in these genomes, thus explaining the exclusive focus on them. It now reads as follows:

      “In the case of Saccharomyces cerevisiae, the only TEs found are five families of long terminal repeat (LTR) retrotransposons named Ty1-Ty5 (Kim et al., 1998).”

      • 56 I would argue that Petrov et al 2003 is not the best citation for arguing that TEs can lead to genomic rearrangement through ectopic recombination. Petrov solely showed that some long TE families are at lower population frequency than short TE families. This could be due to many reasons (e.g. recent activity of long TEs - mostly LTRs) but Petrov interpreted the data as being due to ectopic recombination. Petrov, therefore, did not demonstrate any direct evidence for the involvement of ectopic recombination.

      We agree with the reviewer that this reference is not the best choice to simply support the role of TEs in generating ectopic recombination events and modified the references accordingly.

      • For the assembly the authors used two steps: 1) separate the reads based on similarity to a subgenome and 2) assemble the reads from the resulting two sets separately. This is probably the only viable approach, but I'm wondering if this step can lead to some biases (many reads may not be assigned to one sub-genome or assigned to the wrong sub-genome). An alternative, possibly less biased approach, would be to use one of the emerging assemblers that promise to assemble sub-genomes. Maybe discuss why this approach was not pursued.

      We completely agree that our method has some level of bias. We adopted it because it seemed the most appropriate to answer our question, which required resolving individual TE insertions at the level of single haplotype sequences. One specific challenge of this dataset is that we have a relatively wide range of nucleotide divergence between parental subgenomes in the different MA crosses, from <1% to ~15%. The efficiency of haplotype separation from tools that are not necessarily designed to be tunable with respect to the level of nucleotide divergence seemed uncertain, which is why we opted for a custom methodology. Although read non-classification remains a problem that is hard to solve (and would remain so using orthogonal strategies), we believe that read misclassification is minimized by our stringent criteria for read classification. The goal of this study was neither to develop a tool nor to benchmark our approach against existing diploid assembly tools. Our approach yielded phased genome representations that were of sufficient completeness and contiguity to confidently answer our questions, and we believe that pushing the discussion towards technical considerations would fall outside of our main objective.

      • The authors used a decision tree to classify Ty loci. What were the training data? How were the trees validated? Decision tree is a technical term for a classifier in machine learning. I do not think the authors used machine learning in this work, but rather an "an ad-hoc set of rules". The term decision tree in this study is misleading.

      We believe that the term “decision tree” can simply refer to a hierarchy of conditional rules implemented as a classification algorithm. As the reviewer pointed out, it is clear from the manuscript that none of the analyses performed include any form of training or fitting of a machine learning classifier. However, we agree that its specific reference to the machine learning classifier can create unnecessary confusion. We thus removed this term from the manuscript and replaced all its instances with “a hierarchy of binary rules”.

      • 272: as it is the CNC explanation does not make a lot of sense to me; some information is missing, is p22 expression increasing with copy numbers?

      Yes, p22 expression correlates positively with the CN of p22-expressing Ty1 elements.

      Why are the two alternative downstream codons important?

      We thought it would be useful to mention the two start codons at this point because later in the discussion, we bring the conservation of the first start codon as an observation consistent with the putative expression of p22 in S. paradoxus. We also thought that it helped clarify the mechanism by which the N-truncated version of the protein is expressed.

      p22 interferes with the assembly of viral particles when in high copy numbers, but what happens when at low copy numbers, is it essential for retroviral activity? Is it even necessary for the virus or just some garbage product (they mention N-truncated).

      To our knowledge, these questions regarding the potential molecular functions of p22 outside of a retrotransposition restriction factor are still open. We added details to the background on CNC in the Introduction and Results section to help clarify some of the points raised:


      “The best known regulation mechanism in yeast is termed copy number control (CNC) and was characterized in the Ty1 family of S. cerevisiae. This mechanism is a potent copy-number dependent negative feedback loop by which increasing the CN of Ty1 elements strengthens their repression (Czaja et al., 2020; Garfinkel et al., 2003; Saha et al., 2015).”


      “The mechanism of negative copy-number dependent self-regulation of retrotransposition (CNC) was characterized in the Ty1 family of S. cerevisiae (Garfinkel et al., 2016). This mechanism relies on the expression of an N-truncated variant of the Ty1 capsid/nucleocapsid Gag protein (p22) from two downstream alternative start codons (Nishida et al., 2015; Saha et al., 2015). p22 expression scales up with the CN of Ty1 elements that encode it (Tucker et al., 2015), which gradually interferes with the assembly of the viral-like particles essential for Ty1 replication (Cottee et al., 2021; Saha et al., 2015). Thus, CNC yields a steep negative relationship between the retrotransposition rate measured with a tester element and the number of Ty1 copies in the genome (Garfinkel et al., 2003; Tucker et al., 2015).”

      • mtDNA influences transposition, is anything known about the mechanism?

      When presenting this result, we make it clear that this finding is not new and was previously observed in S. cerevisiae x S. uvarum hybrids by Smukowski-Heil et al. (2021). In this reference, the authors discuss multiple mechanisms by which mitochondrial biology and mito-nuclear interplay may affect transposition rate, although their data cannot support one specific hypothesis. Our data do not allow us to further dissect the mechanistic basis of the mtDNA effect, any more than that of the effect of distinct Ty1 natural variants. Since we simply provide new independent evidence for the mtDNA effect, it seems to us that repeating the discussion on putative mechanisms while bringing no support to any given hypothesis would be of limited relevance.

      • During the first reading, I got quite confused about what CN means (copy number as it turned out). I suggest using abbreviations only if absolutely necessary, and I'm not entirely convinced it is necessary here. But I leave this to the discretion of the authors.

      We agree that the excessive use of abbreviations in manuscripts is annoying. However, in this case, “copy number” is used so extensively that its abbreviation seemed to improve the reading experience. Thus, we would prefer to keep it unchanged.

      • Fig 3D: Wilcoxon Rank sum test. It is not clear to me what was tested here? Which data were used?

      We confirm that the statistical test employed is the Wilcoxon signed-rank test, and not the Wilcoxon rank-sum test (also known as Mann-Whitney U-test). The Wilcoxon signed-rank test is used here as a non-parametric one-sample test against the null hypothesis that the distribution is centered around zero.
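
      To make the one-sample use of the test concrete, here is a minimal sketch of an exact two-sided Wilcoxon signed-rank test against zero, written with the Python standard library only. It is an illustration rather than the code used for the manuscript: the function name `signed_rank_test` and the input values are hypothetical, zeros are simply dropped, and the exhaustive enumeration is practical only for small samples. In practice, a statistics library (for instance `scipy.stats.wilcoxon`) would normally be used.

```python
from itertools import product


def signed_rank_test(values):
    """Exact two-sided one-sample Wilcoxon signed-rank test against zero.

    Zeros are dropped and tied absolute values share their average rank.
    All 2**n sign assignments are enumerated, so keep n small.
    Returns (W+, p-value), where W+ is the sum of ranks of positive values.
    """
    x = [v for v in values if v != 0]
    n = len(x)
    if n == 0:
        raise ValueError("need at least one non-zero value")

    # Rank the absolute values, giving ties the mean of their ranks.
    order = sorted(range(n), key=lambda i: abs(x[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(x[order[j + 1]]) == abs(x[order[i]]):
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1

    w_plus = sum(r for r, v in zip(ranks, x) if v > 0)
    expected = sum(ranks) / 2  # mean of W+ under the null hypothesis
    observed_dev = abs(w_plus - expected)

    # Under the null, each rank carries a positive sign with probability 1/2.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - expected) >= observed_dev - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n


# Hypothetical net copy-number changes for six lines (all gains).
w, p = signed_rank_test([1, 2, 3, 4, 5, 6])
print(w, p)
```

      With six values that are all positive, only the two most extreme sign assignments are at least as far from the null expectation as the observed statistic, which is why small samples cannot produce very small P-values with this test.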

      • de novo -> italics

      We chose to follow the recommendation of the general style conventions of the ACS guide for scholarly communications not to italicize common Latin terms like “de novo”, “e.g.” and “i.e.”.

    2. eLife assessment

      This valuable study advances our understanding of the forces that shape the genomic landscape of transposable elements. By exploiting both long-read sequencing of mutation accumulation lines and in vivo transposition assays, the authors offer compelling evidence that structural variation rather than transposition largely shapes transposable element copy number evolution in budding yeast. The work will be of interest to the transposable element and genome evolution communities.

    3. Reviewer #1 (Public Review):

      Henault et al build on their own previous work investigating the longstanding hypothesis that hybridization between divergent populations can activate transposable element mobilization (transposition). Previously they created crosses of increasing sequence divergence, using both intra- and inter-species hybrids and passaged them neutrally for hundreds of generations. Their previous work showed that neither hybrids isolated from natural environments nor hybrids from their mutation accumulation lines showed consistent evidence of increased transposable element content. Here, they sequence and assemble long read genomes of 127 of their mutation-accumulation lines and annotate all existing and de novo transposable elements. They find only a handful of de novo transposition events, and instead demonstrate that structural variation (ploidy, aneuploidy, loss of heterozygosity) plays a much larger role in the transposable element load in a given strain. They then created transposable element reporter constructs using two different Ty1 elements from S. paradoxus lineages and measured transposition rate in a number of intraspecific crosses. They demonstrate that transposition rate is dependent on both the Ty1 sequence and the copy number of genomic transposable elements, the latter of which is consistent with what has been observed in the literature of transposable element copy number control in Saccharomyces. To my knowledge, others have not directly tested the effect of Ty1 sequence itself (have not created diverse Ty1 reporter constructs), and so this is an interesting advance. Finally, the authors show that mitotype has a moderate effect on transposition rate, which is an intriguing finding that will be interesting to explore in future work.

      The authors state that their results from their current work support results taken from their previous study using short read sequencing data of the same lines. The argument that follows is whether the authors gained anything novel from long read sequencing. While major results did not change from their previous work, the addition of long read sequencing did provide novel insight into the comparison of de novo transposition and structural variation that was not possible with short read sequencing. Additionally, this allowed the authors to compare estimates of transposition from two methods (inferred from mutation accumulation lines and from reporter assays).

      Overall, this study represents a large effort to investigate how genetic background can influence transposable element load and transposition rate. The long read sequencing, assembly, and annotation, and the creation of these reporter constructs are non-trivial. Their results are straightforward, well supported, and a nice addition to the literature.

    4. Reviewer #2 (Public Review):

      This is an interesting followup study that uses long read sequencing to examine previously constructed mutation accumulation lines between wild populations of S. cerevisiae and S. paradoxus. They also complement this work with reporter assays in hybrid backgrounds. The authors are attempting to test the hypothesis that hybridization leads to genome shock and unrestrained transposition. The paper largely confirms previous results (suggesting hybridization does not increase transposition) that are well cited and discussed in the paper, both from this group and from the Smukowski Heil/Dunham group but extends them to a new set of species/hybrids and with some additional resolution via the long read sequencing. The paper is well written and clear and I have no serious complaints.

      In the abstract, the authors make three primary claims:

      Structural variation plays a strong role in TE load.

      Transposition plays only a minor role in shaping the TE landscape in MA lines.

      Transposition rates are not increased by hybridization but are affected by genotype specific factors.

      Comments on revised submission:

      I found all three claims supported, albeit with some minor questions. Those questions were answered by the authors in revision. I appreciate the authors' revisions and feel the paper is now in better shape than upon the original submission.

    1. Author Response

      The following is the authors’ response to the original reviews.

      The reviewers make some suggestions aimed towards increasing the clarity of the manuscript, and I suggest that the authors examine those carefully. In particular, the figure is difficult to read and could contain additional information to help the reader's interpretation. For example, Reviewer 1 suggests including sample age estimates alongside depth, while Reviewer 3 also notes that there is missing information in the figure. Apart from the figure, Reviewer 1 suggests two additional analyses to help explain the amount of mammoth DNA recovered, which they observe is much higher than previous similar investigations. This would seem to be an important issue to address, given the surprising nature of the findings. In addition to this larger issue, the Reviewer makes a few important suggestions for supplementary material that may be needed to support the authors' statements.

      Some additional recommended edits -- in particular to the text and included references to related studies -- are suggested by Reviewers 2 and 3, and both commented on the lack of a publicly-available data repository. The authors may also wish to comment on or revisit their differential treatment of woolly mammoth vs. woolly rhinoceros samples, though I suspect this has more to do with low read numbers for the rhinos.

      Thank you very much for the positive assessment of our manuscript and clear suggestions for revision. We address these points below.

      Reviewer #1 (Recommendations For The Authors):

      I have a few suggestions that might further improve the manuscript:

      It is difficult for the reader to follow which core slices exactly have been sampled and sequenced. The authors mention 23 samples were taken from core LK-001 and 16 samples from core LK-007. From the text it remains unclear to me what the exact age of each of these samples is. Figure 1 shows the depth at which the LK-001 core was sampled, maybe sample age estimates could be included here.

      Thanks for pointing this out. We have added approximate ages to Figure 1, added the depth range to the text (“from 1.5 to 80 cm”; l. 73-74, caption Figure 1), and reworked the table of the sampling depths in the supplement.

      Line 84-87. The authors mention the retrieval of DNA from several expected Arctic taxa, however no further data regarding these findings are given in the manuscript. It would be useful to report the same numbers for these species as the ones given for the Mammuthus and woolly rhinoceros, which would allow for a comparison of the relative abundance of the DNA between these species. Are the expected Arctic species for instance at much higher (DNA) abundance in the samples? It would also be interesting to know if the authors discovered DNA from extant species that are unlikely to have occurred in the geographic region. A (supplementary) table listing the number of mapped reads to each of the respective mitogenomes for each sequence library would be useful for the reader.

      We added a supplementary table (S8) indicating the numbers of reads assigned to mammals.

      Line 90: I am somewhat amazed by the amount of mammoth DNA the authors recovered from these cores. A total depth of over 400X of the mitogenome is quite extraordinary and I am not aware of any ancient sediment study to date that has retrieved a similar amount of data. For instance, the Wang et al. 2021 paper, which the authors cite, sequenced over 400 samples and did not find any mammoth DNA in 70% of those. For the 30% of samples showing signs of mammoth DNA they retrieved on average 530 sequence reads. In this study the authors find on average ~20,000 reads, in 22 out of the 23 sequence libraries. This makes me wonder if the way the mapping was performed has been too lenient, resulting in possible spurious mappings? To really confirm the authenticity of the mammoth (and woolly rhino) data I would suggest two additional analyses:

      1) Mapping all the sequence libraries to a reference consisting of the complete Asian-elephant genome (for instance https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_024166365.1/), the complete human genome (+mitogenome) and the Asian elephant mitogenome. This could possibly reduce spurious mappings as conserved regions between the genomes are filtered out and could also reduce the possible mapping of NUMTs. If the authors could show that after such a mapping approach a significant number of reads are still assigned to the Asian elephant part (including the mitogenome) of the reference, the reported findings would be strengthened.

      2) I also suggest constructing a mitochondrial haplotype network from the obtained DNA, while also including previously published Asian and African elephants as well as previously published mammoth mitogenomes. If the obtained haplotypes indeed show that they cluster within the known haplotype diversity of mammoth, that would be strong support for the authenticity of the data.

      The same analysis could be considered for the woolly rhino data, although the lower read numbers might make this analysis challenging.

      We agree that the amount of mammoth DNA is surprising, which is why we opted for further laboratory experiments for confirmation of the hybridization capture results of the first core, i.e., 1) DNA extraction from a second core of a different lake, 2) a quantitative PCR approach (ddPCR), and 3) metabarcoding. Our results of the highly specific ddPCR and metabarcoding assays confirmed considerable amounts of mammoth DNA in two sediment cores of different lakes, thus we have no doubts regarding the authenticity of the data. Considering the large amount of mammoth DNA, the high number of reads, and particularly the high mitogenome coverage, we argue that the effect of some spurious mapping is negligible and does not affect the main outcome and conclusions of our study. Although we agree that a haplotype network would be interesting, such analyses would stretch beyond the focus of this publication.

      Line 91: The authors mention negative controls (extraction and library blanks) did not produce any reads assigned to mammals. This is quite remarkable, as in my experience low levels of (human) contamination are almost always present in the blanks. Could the authors comment on why they think the blanks did not show any signal of mammalian DNA?

      The hybridization capture enrichment and the filtration and mapping procedures likely eliminated human contamination. Also, the data were mapped against Arctic mammal mitogenomes, which did not include human reference sequences. However, six of the sediment samples contained human sequences (now shown in supplementary table S8), albeit at low read counts (mean = 65).

      Line 97: "mapping suggested that the sequences throughout the core originated from multiple individuals" The authors do not provide any supporting data showing this. I think that an analysis (for instance based on allele frequencies) has to be included in the manuscript to support this claim.

      We agree that this claim was not sufficiently supported. We performed further analyses including genomic data of previously retrieved mammoth remains and assigned our data to these haplogroups; the results were added to the main text and are shown as a figure (Fig. 2).

      Line 98: "Signatures of post-mortem DNA decay were comparably minor."

      Do the authors know if the hybridisation enrichment method used can distort the measurement of post-mortem damage? Are for instance reads with C-T substitutions less likely to be captured by the baits?

      To our knowledge, there is no study suggesting that damaged sites are less likely to be captured. In general, the hybridization capture procedure is not overly specific, and studies report that DNA is readily and preferentially captured as long as the difference between baits and DNA is not above 10%.
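
      To make the 10% figure concrete, the divergence between a bait and a target fragment can be expressed as the percentage of mismatched positions in a pairwise alignment. The following is only a minimal sketch under stated assumptions: the function name and both sequences are hypothetical, the sequences are assumed to be pre-aligned and of equal length, and the calculation is not part of the analysis pipeline used in the study.

```python
def percent_divergence(bait: str, target: str) -> float:
    """Percentage of mismatched positions between two pre-aligned, equal-length sequences."""
    if len(bait) != len(target):
        raise ValueError("sequences must be aligned to the same length")
    if not bait:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(bait.upper(), target.upper()) if a != b)
    return 100.0 * mismatches / len(bait)


# Hypothetical 20-bp bait and a fragment carrying two C-to-T substitutions
# (illustrative sequences only, not taken from the study).
bait = "ACGTCCGATCGATTGCACGT"
damaged = "ACGTTCGATCGATTGCATGT"

print(percent_divergence(bait, damaged))
```

      In this toy case, two substitutions in 20 positions place the fragment exactly at the 10% divergence mentioned above; real ancient fragments are longer, so a few damaged sites would typically contribute far less.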

      Line 100: "The proportions of bases did not suggest a substantial deviation from those in the reference genomes or in the closest extant relative of Mammuthus, the Asian elephant (Elephas maximus)."

      It is not clear to me what the authors mean by this. Could the authors explain how this was measured and what their interpretation of this result is?

      We realize that the sentence was unclear. We meant that the nucleotide composition was similar to that of the reference genomes or the closest extant relative. However, as we do not consider this important for the argument, we have removed this sentence from the manuscript.

      Given the high number of recovered mammoth reads in the samples, it would be interesting to know how many mammoth reads are present in the sample before enrichment capture with the baits. Shotgun sequencing the raw extract of one of the samples with the highest number of mammoth reads might allow for a rough estimate of mammoth DNA abundance compared to the other extant species (e.g. reindeer, Arctic lemming and hare) found in the sample(s). This could give further clarification about the extent of stratigraphy disturbance and its overall effect on the DNA based community reconstruction. However, this is just a suggested additional analysis and not something I believe crucial for supporting the overall findings in this manuscript.

      We fully agree that this would be a highly interesting and informative additional analysis to perform. It was, however, not possible to perform this additional analysis in the course of the current experiments.

      Finally, I could not find a public link to the (sequence) data produced in this study. I strongly encourage the authors to make their data publicly available.

      Thank you for pointing this out. We have added a Data Availability paragraph, including the respective reference.

      Reviewer #2 (Recommendations For The Authors):

      In the Discussion it is mentioned that the reasons for Mammoth extinction are not entirely clear but are largely attributed to sudden climate warming (and add some relevant citations). However, there is also abundant literature that suggests humans also played a role in their extinction (for instance, a recent one, Damien et al. (2022) at Ecology Letters 25: 127-137).

      We agree with the reviewer and have added the recent citation highlighting the possible influence of humans.

      One possibility to add further interest to this paper would be to construct a phylogenetic tree with the Mammoth mitogenome(s) retrieved and a reference dataset; it could be interesting to know where they fall in the phylogeny -already abundant with tens of individuals- and maybe it could be even possible to roughly estimate their date. There are some papers that report many Mammoth mitogenomes, including of course some from Siberia; for instance Chang et al. (2017) at Sci Reports and also Fellows Yates et al. (2017) also at Sci Reports (the latter mainly from Central Europe).

      We are well aware of the number of mt genomes available for mammoth, and such an analysis would be an interesting addition, potentially also offering the possibility to date the DNA. However, the analysis was hampered and would be less secure for this dataset, as our sequences display considerable variation among each other, suggesting that we have a mix of multiple mt genomes, which we cannot readily distinguish. We thus refrain from this, also because we instead provide multiple lines of evidence for the existence of the mammoth DNA in the surface sediment core (metabarcoding, ddPCR).

      Minor points:

      -Correct wooly to woolly


      -In the sampling description it is not totally clear if the samples were taken at 1 cm each (it is mentioned that core LK-001 is sliced in the field at 1-cm steps for radiometric dating and later it is explained that 23 samples were analyzed from this core, but it is unclear if they represent 23 cm of core)

      -Maybe the authors could briefly define some terms such as "talik"


      Reviewer #3 (Recommendations For The Authors):

      Maybe I missed this but I could not find a data availability statement or the location of the repository

      We have added a Data Availability paragraph, including the respective reference.

      It would be good to see some additional analysis on the distribution of the woolly rhinoceros DNA through the sediment core - like the figure for the mammoth i.e read numbers vs depth.

      We have added to the supplements a table showing the numbers of assigned mammal reads over the core depths (Table S8). However, as rhinoceros reads are considerably rarer in our results, we did not produce a figure.

      Would it be possible to be more explicit about the multiple mammoth individuals, could you calculate a minimum number of haplotypes for example?

      We agree that this claim was not sufficiently supported and added results from additional analyses (incl. Fig. 2). Please see our response above.

      Based on the aim stated in the introduction, the analysis of the Arctic biodiversity of this area is missing, it would be nice to see these results added or maybe the focus needs to be changed for clarity.

      We now explicitly state that this objective pertains to a different study, which is currently still in preparation for publication.

      The single main figure needs a bit more consideration. For example in panel A - there was no information on the transformation performed or what the general trend line refers to. Do the results in panel B refer to all 22 libraries? What is the x-axis in Panel C and what do the coloured lines refer to? Additionally, I think the figure needs to be in higher resolution with increased text size on all axes.

      We revised the figure and the caption for clarity and readability.

      Finally this might be an accidental typo - but when referring to the sample aged at around 8,677 years in text it states this is the 36.5 cm sample (lines 130 and 192), but the supplementary says this is the 51 cm sample (Table S6). This would maybe impact potential conclusions. Would you be able to clarify this?

      Thank you for noting this error; we have corrected it.

    2. eLife assessment

      This work presents convincing evidence for the presence of woolly mammoth/rhinoceros ancient environmental DNA (aeDNA) far from the time likely to host living individuals: what is effectively a genetic version of a geological inclusion. These are important findings that will have ramifications for the interpretation and conclusions extracted from aeDNA more generally.

    3. Reviewer #2 (Public Review):

      Summary: The authors report the successful retrieval of mitogenomes from extinct Pleistocene megafauna (woolly mammoth and woolly rhino) from recent sediment cores from two nearby Siberian lakes. The cores are too recent to represent real time points of these two extinct species (known to have been extinct for several thousands of years) and therefore, the most plausible interpretation is that permafrost thawing and similar physical processes in the lakes have brought old DNA to the surface, maybe from nearby, deep-buried carcasses.

      They have answered the comments and questions I raised in my review. I agree with them on the complexities of separating a potential mixture of different mammoth mitogenomes retrieved.

    4. Reviewer #3 (Public Review):

      Summary: In this study, the researchers used ancient environmental DNA (aeDNA) retrieved from sediment cores, from two lakes in the Arctic, on the Yamal peninsula, in Siberia. The dating of one of the cores showed that the sediment layers were very recent (ranging between the years 2019 - 1895). From this core they sequenced 23 libraries which were enriched for mammal mitochondrial genomes. They found a high proportion of two species that have been extinct for thousands of years, the mammoth and the woolly rhinoceros. The highest proportion of mammoth reads was found in a very young layer (~81 years old) and as this initial finding does not match the temporal occurrence of the species, they confirmed the identification with several other methods. Additionally, they applied a different dating method on some samples and found that the aging of the samples was not completely congruent. The authors suggest that the presence of these two Pleistocene megafauna in such recent sediment layers is a consequence of physical processes, specific to the study site, and that the high quality of the aeDNA recovered is a result of permafrost preservation.

      The strengths of the study are in the rigorous confirmation of the identification of the taxa with four different PCR and sequencing techniques being used, the initial enrichment panel, and then subsequent metabarcoding PCRs, and taxa specific PCR for COI and cytB. Along with the ancient DNA protocol applied, this is therefore very convincing that the DNA detected in the samples is indeed from the Pleistocene mammals. Additionally, two methods were used to age the sediment cores, and although the depth of the samples tested do not overlap, they give reasonable ages (apart from the anomalous sample) and all together these are robust results.

      There is now an analysis supporting the idea that there are multiple individual mammoths in the sample as well as a figure to display the locations of the haplotypes. The authors also confirm that too few woolly rhinoceros sequences were recovered for analysis. The aims have been clarified and no longer state that they are looking at mammal biodiversity through time, so the paper's focus is now more specifically on just the mammoth. But a supplementary table of the reads from common mammals has been added.

      Overall the results support that there has been some movement of DNA throughout the sediment core which may impact the dating of the last occurrence of particular extinct taxa. As highlighted, though, the geological processes by which this may have arisen are specific to this particular lake and may not be broadly relevant, therefore highlighting that knowledge of each system is important to understanding DNA distribution.

    1. Author Response

      Reviewer #1 (Public Review):


      Alonso-Calleja and colleagues explore the role of TGR5 in adult hematopoiesis at both steady state and post-transplantation. The authors utilize two different mouse models including a TGR5-GFP reporter mouse to analyze the expression of TGR5 in various hematopoietic cell subsets. Using germline Tgr5-/- mice it's reported that loss of Tgr5 has no significant impact on steady-state hematopoiesis, with a small decrease in trabecular bone fraction, associated with a reduction in proximal tibia adipose tissue, and an increase in marrow phenotypic adipocytic precursors. The authors further explored the role of stroma TGR5 expression in the hematopoietic recovery upon bone marrow transplantation of wild-type cells, although the studies supporting this claim are weak. Overall, while most of the hematopoietic phenotypes have negative results or small effects, the role of TGR5 in adipose tissue regulation is interesting to the field.

      We thank Reviewer 1 for having identified some strengths and weaknesses of our study. As summarized below, we will work to address the weaknesses of our study.


      • This is the first time the role of TGR5 has been examined in the bone marrow.

      • This paper supports further exploration of the role of bile acids in bone marrow transplantation and possible therapeutic strategies.


      • The authors fail to describe whether niche stroma cells or adipocyte progenitor cells (APCs) express TGR5.

      We are currently working to address this question using our reporter model and expect to be able to provide the data in the next version of the reviewed preprint.

      • Although the authors note a significant reduction in bone marrow adipose tissue in Tgr5-/- mice, they do not address whether this is white or brown adipose tissue especially since BA-TGR5 signaling has been shown to play a role in beiging.

      The nature of BMAT and how it relates to brown, white or brown/beige adipose tissue has been a persistent question in the field. Our understanding is that BMAT is currently considered a distinct adipose depot that is neither white nor brown/beige. BMAT does not express UCP1 to an appreciable extent, and reports of its expression possibly reflect contamination by tissues surrounding bone (Craft et al., 2019). Beyond this consideration, as the regulated BMAT in Tgr5-/- mice is almost absent, determination of the brown/beige vs white nature of the regulated BMAT remains technically challenging.

      In Figure 1, the authors explore different progenitor subsets but stop short of describing whether TGR5 is expressed in hematopoietic stem cells (HSCs).

      Figure 1 of the originally submitted manuscript described TGR5 expression in committed myeloid progenitors (CMP, GMP and MEP). Below we provide the requested data (expression in MPPs and HSCs in Author response image 1) and we have further expanded our data with the expression in megakaryocyte progenitors (MkProg - Lin-cKit+Sca1-CD41+CD150+) as shown in Author response image 2.

      Author response image 1.

      Frequencies of GFP+ cells in MPPs and HSCs in the BM of 8-12-week-old male TGR5:GFP mice and their controls (n=9 for Wild-type control mice, n=11 for TGR5:GFP mice). Results represent the mean ± s.e.m., n represents biologically independent replicates. Two-tailed Student’s t-test was used for statistical analysis. p-values (exact value) are indicated.

      Author response image 2.

      A, representative flow cytometry gating strategy used to identify megakaryocyte progenitors (MkProg) and GFP positivity in TGR5:GFP mice and their wild-type controls. B, frequencies of GFP+ cells in MkProg population in the BM of 8-12-week-old male TGR5:GFP mice and their controls (n=3 for Wild-type control mice, n=4 for TGR5:GFP mice). Results represent the mean ± s.e.m., n represents biologically independent replicates. Two-tailed Student’s t-test (B) was used for statistical analysis. p-values (exact value) are indicated.

      • Are there more CD45+ cells in the BM because hematopoietic cells are proliferating more due to a direct effect of the loss of Tgr5 or is it because there is just more space due to less trabecular bone?

      While we do not have direct evidence to address this question, we see an average increase of approximately 20% in CD45+ cell counts in the baseline Tgr5-/- mice. The absolute volume of bone and BMAT lost in these animals does not account for 20% of the total volume of the medullary cavity, so we speculate that the increase in CD45+ counts is not due exclusively to an increase in available volume.

      • In Figure 4 no absolute cell counts are provided to support the increase in immunophenotypic APCs (CD45-Ter119-CD31-Sca1+CD24-) in the stroma of Tgr5-/- mice. Accordingly, the absolute number of total stromal cells and other stroma niche cells such as MSCs, ECs are missing.

      We initially chose not to report the total number of cells per leg, as the processing of the bones for stroma isolation is less homogeneous than that of the HSPC populations (which we do by crushing whole bones with a mortar and pestle). Regardless of these considerations, the data for absolute counts of APCs (left panel), the stroma-enriched fraction (CD45-Ter119-CD31- - middle panel) and endothelial cells (CD45-Ter119-CD31+ - right panel) are provided in Author response image 3. Note that the number of cells plated for CFU-F and BMSC in vitro differentiation is constant between the genotypes, thus confirming the importance of the relative abundance data shown in the submitted version of the manuscript. In conclusion, we have prioritized the data showing the relative overrepresentation of APCs in the BM stroma as measured by flow cytometry on a per-cell basis, which is in line with the functional in vitro data. Further studies could address the specific question through 3D wholemount studies once APC in situ markers are firmly characterized.

      Author response image 3.

      Left panel: absolute number of adipocyte progenitor cells (APCs) in the CD45-Ter119-CD31- BM stromal gate for both Tgr5+/+ and Tgr5−/− (n=5). Middle panel: absolute number of cells isolated from the stroma-enriched BM fraction (CD45-Ter119-CD31-) in the same mice. Right panel: absolute number of endothelial cells, defined as CD45-Ter119-CD31+, in the same BM isolates.

      • There are issues with the reciprocal transplantation design in Fig 4. Why did the authors choose such a low dose (250 000) of BM cells to transplant? If the effect is true and relevant, the early recovery would be observed independently of the setup and a more robust engraftment dataset would be observed without having lethality post-transplant. On the same note, it's surprising that the authors report ~70% lethality post-transplant from wild-type control mice (Fig 4E), according to the literature 200 000 BM cells should ensure the survival of the recipient post-TBI. Overall, the results even in such a stringent setup still show minimal differences and the study lacks further in-depth analyses to support the main claim.

      We thank the reviewer for this comment. We disagree regarding the relevance of the effect size, as Tgr5-/- mice recover from low platelet levels significantly faster than the Tgr5+/+ controls. Underlining this relevance, in a clinical setting G-CSF is routinely administered to patients even when it accelerates recovery by only 1-2 days (Trivedi et al., 2009).

      Regarding mortality, we agree that it is higher than expected. Cases of swollen muzzles syndrome in our facilities have greatly hampered our ability to perform myeloablation experiments (Garrett et al., 2019), as even sublethal doses have resulted in severe side effects that are reasons for euthanasia under Swiss legislation. For example, a strong reduction in mobility requires immediate euthanasia. All experiments were performed blinded to genotype allocation, so we can reasonably exclude experimenter bias. Finally, it could be argued that mice with more marked symptomatology leading to euthanasia are more likely to have hematopoietic deficits, which in our case was mostly seen for Tgr5+/+ animals. We have therefore chosen to report mortality together with the longitudinal assessment of peripheral blood counts.

      • Mechanistically, how does the loss of Tgr5 impact hematopoietic regeneration following sublethal irradiation?

      The question of a non-lethal hematopoietic stress is a very relevant one. Unfortunately, and as delineated in the previous point, we have been seriously constrained by cases of swollen muzzles syndrome (Garrett et al., 2019) that have stopped us from proceeding with more irradiation studies. We will take advantage of the change of animal facility (Laboratory of Regenerative Hematopoiesis), which will be consolidated during the upcoming year, to address this point in follow-up studies.

      • Only male mice were used throughout this study. It would be beneficial to know whether female mice show similar results.

      We agree with this comment, and we expect to include the characterization of the BM microenvironment (Figure 3 of the current manuscript) in females in the revised version of the manuscript when a suitable cohort becomes available.

      Reviewer #2 (Public Review):

      Summary: In this manuscript, the authors examined the role of the bile acid receptor TGR5 in the bone marrow under steady-state and stress hematopoiesis. They initially showed the expression of TGR5 in hematopoietic compartments and that loss of TGR5 doesn't impair steady-state hematopoiesis. They further demonstrated that TGR5 knockout significantly decreases BMAT, increases the APC population, and accelerates the recovery upon bone marrow transplantation.

      Strengths: The manuscript is well-structured and well-written.

      We thank Reviewer #2 for this comment.

      Weaknesses: The mechanism is not clear, and additional studies need to be performed to support the authors' conclusion.

      We agree with Reviewer #2 that more studies are needed to understand what the role of TGR5 in the hematopoietic system is. We have been hampered in our studies of stress hematopoiesis because of frequent cases of swollen muzzles syndrome (Garrett et al., 2019), which has made it difficult to continue with experiments involving myelosuppression (see response to Reviewer #1 as well). Further studies are planned or ongoing, including determining the role of the microbiome in the observed TGR5 bone and hematopoiesis stress phenotypes, but will be the focus of a separate study.


      Craft, C.S., Robles, H., Lorenz, M.R., Hilker, E.D., Magee, K.L., Andersen, T.L., Cawthorn, W.P., MacDougald, O.A., Harris, C.A., Scheller, E.L., 2019. Bone marrow adipose tissue does not express UCP1 during development or adrenergic-induced remodeling. Sci Rep 9, 17427. https://doi.org/10.1038/s41598-019-54036-x

      Garrett, J., Sampson, C.H., Plett, P.A., Crisler, R., Parker, J., Venezia, R., Chua, H.L., Hickman, D.L., Booth, C., MacVittie, T., Orschell, C.M., Dynlacht, J.R., 2019. Characterization and Etiology of Swollen Muzzles in Irradiated Mice. Radiat Res 191, 31–42. https://doi.org/10.1667/RR14724.1

      Trivedi, M., Martinez, S., Corringham, S., Medley, K., Ball, E.D., 2009. Optimal use of G-CSF administration after hematopoietic SCT. Bone Marrow Transplant 43, 895–908. https://doi.org/10.1038/bmt.2009.75

    1. Author Response

      The following is the authors’ response to the original reviews.

      Answers to reviewers’ comments

      Peer Reviewers 2 and 3 criticized the name of the antibody – hvCADab - and the lack of proof that it recognized a classic cadherin. These criticisms were justified and in the intervening months the issue has been resolved. hvCADab does not recognize the cadherin protein, although it was made to an 18 amino acid sequence from the intracellular domain of the H. vulgaris cadherin protein. Newly available genome sequences from two other species, Hydra oligactis and Hydra viridissima, now show that the 18 amino acid antigen sequence is not present in these species.

      Nonetheless, the nerve net in both species is strongly stained by the antibody. Hence we have renamed the antibody PNab (pan-neuronal antibody). The antigen is currently not known. Nevertheless the antibody is an excellent reagent for imaging the nerve net in Hydra.

      We have revised the section on antibody preparation in Materials and Methods to state explicitly that PNab does not recognize classic cadherin. To support this conclusion we have added a sequence comparison (Suppl Fig 3) of the intracellular domains of classic cadherins from H. vulgaris, H. oligactis and H. viridissima, which show that the 18aa antigen sequence is only present in the H. vulgaris classic cadherin and not in the cadherin sequences from H. oligactis and H. viridissima. All three sequences have highly conserved p120/delta-catenin and beta-catenin binding domains. The sequence between these domains is highly variable and the 18aa antigen sequence used for antibody production is clearly not present in the H. oligactis and H. viridissima sequences.

      Both reviewers also criticized our evidence for pan-neuronal staining as inadequate. Hence we have now included additional data. We have stained a transgenic strain expressing NeonGreen under the control of a pan-neuronal alpha-tubulin promoter (Primak et al 2023). 684/684 transgenic nerve cells were stained with PNab. We consider this convincing evidence, in addition to the evidence presented previously, that PNab stains all nerve cells in Hydra. The first paragraph of Results has been revised to include these data.
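
      As a rough guide to how strongly a 684/684 count constrains the true fraction of stained nerve cells, one can compute an exact one-sided binomial lower bound. The sketch below is illustrative only and is not an analysis reported in the manuscript; the function name is hypothetical, and the calculation assumes that the counted cells behave as independent samples, which is a simplification.

```python
def exact_lower_bound_all_successes(n: int, confidence: float = 0.95) -> float:
    """One-sided exact (Clopper-Pearson) lower bound on a proportion
    when all n of n observations are successes.

    With n successes in n trials the bound p solves p**n = alpha,
    where alpha = 1 - confidence.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    alpha = 1.0 - confidence
    return alpha ** (1.0 / n)


# 684 of 684 transgenic nerve cells were stained with PNab (count from the text above).
lower = exact_lower_bound_all_successes(684)
print(f"95% lower bound on the stained fraction: {lower:.4f}")
```

      Under these assumptions the lower bound is about 99.6%, which is consistent with describing the staining as pan-neuronal while leaving room for a very small unstained fraction.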

      Reviewer 2 suggested moving gap junction/innexin data (Suppl Fig 3 and 4) from the Discussion to Results. These are indeed new results and we have followed this suggestion. Fig 12 (new) clearly shows gap junctions between neurites in bundles. It also shows that nerve cells in bundles express cell-type-specific innexins and hence can form cell-type-specific gap junctions. We have also added new images (Fig 11) of a transgenic Hym176B strain stained with PNab. These show that neurite bundles in the ectoderm contain neurites from different nerve cell types (i.e. different neural circuits) and hence that neurite links must be specific, e.g. gap junctions.

      As suggested by Reviewer 2 we have now provided a 3D interactive version of the block face SEM reconstruction (Suppl Fig 4). This shows that connections between neurites in bundles consist of thin overlapping fingers rather than “conventional” terminal contacts. It also shows that the purple neurite extends past the green nerve cell body and does not end on it.

      Reviewer 2 suggested deleting discussion of possible functions for the endodermal nerve net (Discussion). We disagree with this suggestion. Our imaging results showed no connections between ectodermal and endodermal nerve nets. We also presented quantitative data for the absence of contact between the nerve nets in the gastric region. Consistent with our observations, Dupre and Yuste (2017) found no functional connection between the ectodermal and endodermal nerve nets based on neural activity measurements. Nevertheless, Giez et al (2023) in a recent preprint have described contact between specific endodermal and ectodermal nerve cells in the hypostome involved in the mouth opening response to glutathione. Both their observation and ours may be correct. The issue is not resolved. Hence we have included a discussion of possible functions for ectodermal and endodermal nerve nets. Importantly, our conclusions incorporate the difference in connectivity between muscle processes and nerve cells in the two nerve nets.

      Specific comments / Recommendations

      Reviewer 2

      Novelty: two preprints (Giez et al 2023) became available after the submission of our preprint. These include the results cited by the reviewer. These were not available to us at the time of submission.

      hvCADab has been re-named (see above). The differentiating nerve cell in Fig 11B is indeed stained by PNab. We have adjusted the intensities of red and green channels to show this more clearly.

      We consider the very clear black space between ectoderm and endoderm e.g. Fig 2B or Fig 4A to be an adequate marker for mesoglea. Use of an anti-mesoglea antibody would reduce the clarity of the image.

      It is always possible to look at more parts of Hydra tissue for possible nerve connections between ectoderm/endoderm. Nevertheless we provide the first quantitative data on the lack of contacts between 133 nerve cells (57 ectodermal and 76 endodermal) in the body column. Such data has not been previously available. And the EM result (Westfall 1973) cited by the reviewer is anecdotal at best. In later serial sectioning results on the hypostome/tentacle region from the Westfall lab no mention is made of nerve connections between the ectoderm and the endoderm. However, based on the results in the cited preprints (Giez et al) a closer examination of the hypostome/tentacle region in particular is warranted.

      To strengthen our conclusion that there are no contacts between the ectodermal and endodermal nerve nets, we now explicitly cite results from Dupre and Yuste (2017) on a calcium reporter strain demonstrating the absence of any crosscorrelation between the firing patterns of ectodermal RP1 network and the endodermal RP2 network. There was also no correlation between the activity of the second ectodermal nerve net CB and the endodermal RP2 network. These results demonstrate the absence of functional contacts between ectodermal and endodermal nerve nets.
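
      As a rough illustration of what "absence of cross-correlation" between two activity traces means, the sketch below computes a lagged Pearson correlation between two binarized firing traces. It is a generic, standard-library example and not the analysis used by Dupre and Yuste (2017); the function names, the trace format, and the lag window are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (sx * sy)


def cross_correlation(x, y, max_lag):
    """Return {lag: r}; a positive lag pairs x[t] with y[t + lag]."""
    out = {}
    n = len(x)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = x[:n - lag], y[lag:]
        else:
            xs, ys = x[-lag:], y[:n + lag]
        out[lag] = pearson(xs, ys)
    return out


# Hypothetical traces: 1 = network fired in that time bin, 0 = silent.
trace_a = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
trace_b = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]  # trace_a delayed by one bin
correlations = cross_correlation(trace_a, trace_b, max_lag=2)
```

      Two functionally coupled networks would be expected to show a clear peak at some lag, as in this toy example; traces from independent networks would give values near zero at all lags.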

      The reviewer criticizes the absence of trans-mesoglea links between ectodermal and endodermal epithelial cells in our EM images, e.g. Fig 9A. We can assure the reviewer that such links are frequently observed, although not in the image we chose for Fig 9A. This image, however, clearly documents two neurite bundles next to ectodermal muscle fibers.

      We agree with the reviewer that neurite bundles are an important discovery. And they raise the question of synaptic connections between neurites in bundles. Unfortunately, it is not possible to scan along the block face reconstruction (Fig 10) and count synapses. The resolution is not sufficient. Although scattered dense core vesicles (DCV) are observed in neurites, clustered DCV described by Westfall et al (1971) as synapses were not observed. We did, however, observe gap junctions between neurites in bundles (noted in Suppl Fig 3). These data have now been moved to the main body of the paper as Fig 12 together with the scRNAseq results on innexin gene expression in nerve cells. These results make it clear that neurites in bundles are connected via gap junctions and that these gap junctions are specific for neural circuits.

      The reviewer suggests that neurite bundles are an artifact of their interaction with muscle processes at the base of epithelial cells. We disagree with this statement. Muscle processes are temporary structures. They are withdrawn and reformed during every epithelial cell division, which occur approximately every three days. Bundles are almost certainly more stable structures. Furthermore, neurite bundles in the endoderm are distant from endodermal muscle fibers (Fig 4B and Fig 9D) and their polygonal pattern (Fig 2D) is completely different from the circumferential bands of endodermal muscle fibers.

      Reviewer 3

      Specific comments and suggestions have been answered above. Importantly, we show that the PNab antibody does not recognize cadherin and that it clearly stains all nerve cells in Hydra.

    2. eLife assessment

      This work presents important findings on the cellular and ultrastructural organization of the nervous system in the freshwater polyp Hydra. The authors present outstanding imaging data with convincing evidence to support their claims. The manuscript provides a starting point for further functional in vivo studies. The work will be of interest to developmental biologists and neurobiologists.

    3. Reviewer #2 (Public Review):

      In their manuscript, Keramidioti and co-authors investigate the cellular architecture of the nervous system in the freshwater polyp Hydra. Specifically, the authors attempt to improve the resolution, which is lacking in the previous studies, and to generate a comprehensive overview of the entire nervous system's spatial organization and to infer communication between cells. To this end, Keramidioti et al. use state-of-the-art imaging approaches, such as confocal microscopy combined with the use of transgenic animals, transmission electron microscopy, and block face scanning electron microscopy. The authors present three major observations: i) A novel PNab antibody may be used to detect the entire nervous system of Hydra; ii) Nerve cells in the ectoderm and in the endoderm are organized in two separate nerve nets, which do not interact; iii) Both nerve nets are composed of bundles of overlapping nerve processes.

      The manuscript addresses a long-standing and currently intensively studied question in developmental neurobiology: it attempts to reveal structural properties and principles that govern the function of the nervous systems in non-bilaterian animals. Hence, this study contributes to understanding the evolutionary trajectories of the nervous system. Therefore, the manuscript may be of interest to researchers in evolutionary and developmental neurobiology.

      The manuscript reports a remarkably meticulous study and presents stunning imaging results.

    4. Reviewer #3 (Public Review):

      In this paper by Keramidioti et al, the authors have characterized a polyclonal antibody from rabbit, which was raised against a peptide of the intracellular domain of the Hydra Cadherin. This antibody unexpectedly recognizes presumably all neurons in the Hydra polyp but the specificity of the antibody was not investigated. Regardless, the antibody can be used to visualize and study the nerve net under a variety of conditions. The authors find that the endodermal and ectodermal nerve nets do not make any contacts through the mesoglea, in contrast to earlier assumptions and data. They show that ectodermal neurons make close contacts to the myoepithelial muscles, in contrast to the endodermal muscles. Furthermore, they show that tentacle endoderm surprisingly does not have any neurons. Finally, a very nice tool to visualize the connections between the neurons is the staining of mosaic nGreen transgenic lines. This showed that the neurites align in parallel, forming bundles of neurites over longer stretches, in particular in the ectoderm, which offers a mechanism for how new neurons are added laterally to the existing nerve net. This has important implications for the way the neurons might communicate with each other.

      Taken together, this paper adds to our knowledge of the Hydra nerve net and provides a new experimental tool. Although most of the study is rather descriptive the pictures are of spectacular quality, providing fascinating new insights into the arrangement and topology of the nerve net.

    1. Author Response

      Reviewer #1 (Public Review):


      The manuscript by Dubicka and co-workers on calcification in miliolid foraminifera presents an interesting piece of work. The study uses confocal and electron microscopy to show that the traditional picture of calcification in porcelaneous foraminifera is incorrect.


      The authors present high-quality images and an original approach to a relatively solid (so I thought) model of calcification.


      There are several major shortcomings. Despite the interesting subject and the wonderful images, the conclusions of this manuscript are simply not supported at all by the results. The fluorescent images may not have any relation to the process of calcification and should therefore not be part of this manuscript. The SEM images, however, do point to an outdated idea of miliolid calcification. I think the manuscript would be much stronger with the focus on the SEM images and with the speculation of the physiological processes greatly reduced.

      Reply: We thank the reviewer for all of the highly valuable comments. Prior to our study, we were also convinced that the calcification model of Miliolid (porcelaneous) foraminifera was relatively solid. Nevertheless, our SEM imaging results surprisingly contradicted the old model. The main difference is the in situ biomineralization of calcitic needles that precipitate within the chamber wall after deposition of ACC-bearing vesicles. We agree that our fluorescence studies presented in the paper are not conclusive evidence for the calcification model used by the studied Miliolid species. However, our fluorescent results show that “the old model” (sensu Hemleben et al., 1986) is not completely outdated. Most of the fluorescent imaging data show a vesicular transport of substrates necessary for calcification. This transport is shown by Calcein labelling experiments (Movie 1), which reveal a high number of dynamic endocytic vesicles of sea water circulating within the cytoplasm. These very fine Calcein-labelled vesicles are most likely responsible for transport and deposition of Ca2+ ions. This is partly consistent with the model presented by Hemleben et al. (1986). We may speculate that calcite nucleation is already occurring within the transported vesicles, but at this stage of research we have no evidence for this phenomenon.

      Further live imaging fluorescence data show autofluorescence of vesicles upon excitation at 405 nm (emission 420–480 nm) associated with acidic vesicles marked by pH-sensitive LysoGlow84, which may hint at an association of ACC-bearing vesicles with acidic vesicles. Such spatial association of these vesicles may indicate a mechanism of pH elevation in the vesicles transporting Ca2+-rich gel to the calcifying wall of the new chamber.

      We will do our best to limit the physiological interpretation based on fluorescence studies in the revised version of the manuscript. We are convinced that our fluorescent live imaging experiments provide important observations in biomineralizing Miliolid foraminifera, which are still missing in the existing literature. It should be stressed that all the fluorescent experiments and SEM observations were based on specimens constructing and biomineralizing new chambers. All of them belong to the same species and come from the same culture. For these reasons, it is worthwhile presenting these complementary results of our study. In the future they may be helpful in further exploration and understanding of all aspects of calcification in foraminifera.

      Reviewer #2 (Public Review):


      Dubicka et al. in their paper entitled "Biocalcification in porcelaneous foraminifera" suggest that in contrast to the traditionally claimed two different modes of test calcification by rotaliid and porcelaneous miliolid foraminifera, both groups produce calcareous tests via the intravesicular mineral precursors (Mg-rich amorphous calcium carbonate). These precursors are proposed to be supplied by endocytosed seawater and deposited in situ as mesocrystals formed at the site of new wall formation within the organic matrix. The authors did not observe the calcification of the needles within the transported vesicles, which challenges the previous model of miliolid mineralization. Although the authors argue that these two groups of foraminifera utilize the same calcification mechanism, they also suggest that these calcification pathways evolved independently in the Paleozoic.

      Reply: We would like to acknowledge the review and all valuable comments. We do not argue that Miliolida and Rotaliida utilize an identical calcification mechanism, but that both groups utilize less divergent crystallization pathways, where mesocrystalline chamber walls are created by accumulating and assembling particles of a pre-formed liquid amorphous mineral phase.


      The authors document various unknown aspects of calcification of Pseudolachlanella eburnea and elucidate some poorly explained phenomena (e.g., translucent properties of the freshly formed test) however there are several problematic observations/interpretations which in my opinion should be carefully addressed.


      1) The authors (line 122) suggest that "characteristic autofluorescence indicates the carbonate content of the vesicles (Fig. S2), which are considered to be Mg-ACCs (amorphous MgCaCO3) (Fig. 2, Movies S4 and S5)". Figure S2 which the authors refer to shows only broken sections of organic sheath at different stages of mineralization. Movie S4 shows that only in a few regions some vesicles exhibit red autofluorescence interpreted as Mg-ACC (S5 is missing but probably the authors were referring to S3). In their previous paper (Dubicka et al 2023: Heliyon), the authors used exactly the same methodology to suggest that these are intracellularly formed Mg-rich amorphous calcium carbonate particles that transform into a stable mineral phase in rotaliid Amphistegina lessonii. However, in Figure 1D (Dubicka et al 2023) the apparently carbonate-loaded vesicles show the same red autofluorescence as the test, whereas in their current paper, no evidence of autofluorescence of Mg-ACC grains accumulated within the "gel-like" organic matrix is given. The S3 and S4 movies show circulation of various fluorescing components, but no initial phase of test formation is observable (numerous mineral grains embedded within the organic matrix - Figures 3A and B - should be clearly observed also as autofluorescence of the whole layer). Thus the crucial argument supporting the calcification model (Figure 5) is missing. There is no support for the following interpretation (lines 199-203) "The existence of intracellular, vesicular intermediate amorphous phase (Mg-ACC pools), which supply successive doses of carbonate material to shell production, was supported by autofluorescence (excitation at 405 nm; Fig. 2; Movies S3 and S4; see Dubicka et al., 2023) and a high content of Ca and Mg quantified from the area of cytoplasm by SEM-EDS analysis (Fig. S6)."

      Reply: We used the 405 nm laser line and multiphoton excitation to detect ACCs. These wavelengths (partly) permeate the shell to excite ACC autofluorescence. The autofluorescence of the shell is present as well, but it is not clearly visible in Movie S4 as the fluorescence of ACCs is stronger. This may be related to the plane/section of the cell which is shown. The laser permeates the shell above the ACCs (short distance), but to excite the shell CaCO3 around the foraminifera in the same three-dimensional section where ACCs are shown, the light must pass through a thick CaCO3 area due to the three-dimensional structure of the foraminiferal shell. Therefore, the laser light intensity is reduced. In the revised version a movie/image with a reduced threshold will be shown.

      2) The authors suggest that "no organic matter was detected between the needles of the porcelain structures (Figures 3E; 3E; S4C, and S5A)". Such a suggestion, which is highly unusual considering that biogenic minerals almost by definition contain various organic components, was made based only on FE-SEM observation. The authors should either provide clearcut evidence of the lack of organic matter (unlikely) or may suggest that intense calcium carbonate precipitation within organic matrix gel ultimately results in a decrease of the amount of the organic phase (but not its complete elimination), just as pure calcium carbonate crystals are separated from the remaining liquid with impurities ("mother liquor"). On the other hand, if (249-250) "organic matrix involved in the biomineralization of foraminiferal shells may contain collagen-like networks", such "laminar" organization of the organic matrix may partly explain the arrangement of carbonate fibers parallel to the surface as observed in Fig. 3E1.

      Reply: We agree with the reviewer that biogenic minerals should, by definition, contain some organic components. We wrote that "no organic matter was detected between the needles of the porcelain structures” as we did not detect any organic structures based only on our FE-SEM observations. We are convinced that the shell incorporates a limited amount of organic matrix. We will rephrase this part of the text to avoid further confusion.

      3) The authors' observations indeed do not show the formation of individual skeletal crystallites within intracellular vesicles; however, they also do not explain what the structure of individual skeletal crystallites is or how they are formed. In particular, what are the structures observed in polarized light (and interpreted as calcite crystallites) by De Nooijer et al. 2009? The authors' explanation of the process (lines 213-216) is not particularly convincing: "we suspect that the OM was removed from the test wall and recycled by the cell itself".

      Reply: Thank you for this comment. We will do our best to supplement our explanations. We are aware of the structures observed in polarized light by De Nooijer et al. (2009). However, Goleń et al. (2022, Protist, https://doi.org/10.1016/j.protis.2022.125886) showed that organic polymers may also exhibit light polarization. Additional experimental studies are needed to distinguish these types of polarization. We will aim to investigate this issue in our future research.

      4) The following passage (lines 296-304) which deals with the concept of mesocrystals is not supported by the authors' methodology or observations. The authors state that miliolid needles "assembled with calcite nanoparticles, are unique examples of biogenic mesocrystals (see Cölfen and Antonietti, 2005), forming distinct geometric shapes limited by planar crystalline faces"; later in the same passage the authors say that "mesocrystals are common biogenic components in the skeletons of marine organisms" (are they thus unique or are they common?). It is my suggestion to completely eliminate this concept here until various crystallographic details of the miliolid test formation are well documented.

      Reply: Our intention was to express that mesocrystals are common biogenic components in the skeletons of marine organisms; however, Miliolid needles that form distinct geometric shapes limited by planar crystalline faces are a unique type of mesocrystal.

    2. eLife assessment

      This manuscript provides important information on the calcification process, especially the properties and formation of freshly formed tests (the foraminiferan shells), in the miliolid foraminiferan species Pseudolachlanella eburnea. The evidence from the high-quality SEM images is solid, but the evidence is incomplete when it comes to the specificity of the auto-fluorescent signals for calcified structures, or the presence of photosynthetic (living) symbionts, which are not verified experimentally. The conclusions based on fluorescent imagery therefore do not have strong support.

    3. Reviewer #1 (Public Review):

      Summary:<br /> The manuscript by Dubicka and co-workers on calcification in miliolid foraminifera presents an interesting piece of work. The study uses confocal and electron microscopy to show that the traditional picture of calcification in porcelaneous foraminifera is incorrect.

      Strengths:<br /> The authors present high-quality images and an original approach to a relatively solid (so I thought) model of calcification.

      Weaknesses:<br /> There are several major shortcomings. Despite the interesting subject and the wonderful images, the conclusions of this manuscript are simply not supported at all by the results. The fluorescent images may not have any relation to the process of calcification and should therefore not be part of this manuscript. The SEM images, however, do point to an outdated idea of miliolid calcification. I think the manuscript would be much stronger with the focus on the SEM images and with the speculation of the physiological processes greatly reduced.

    4. Reviewer #2 (Public Review):

      Summary:<br /> Dubicka et al. in their paper entitled "Biocalcification in porcelaneous foraminifera" suggest that in contrast to the traditionally claimed two different modes of test calcification by rotaliid and porcelaneous miliolid foraminifera, both groups produce calcareous tests via the intravesicular mineral precursors (Mg-rich amorphous calcium carbonate). These precursors are proposed to be supplied by endocytosed seawater and deposited in situ as mesocrystals formed at the site of new wall formation within the organic matrix. The authors did not observe the calcification of the needles within the transported vesicles, which challenges the previous model of miliolid mineralization. Although the authors argue that these two groups of foraminifera utilize the same calcification mechanism, they also suggest that these calcification pathways evolved independently in the Paleozoic.

      Strengths:<br /> The authors document various unknown aspects of calcification of Pseudolachlanella eburnea and elucidate some poorly explained phenomena (e.g., translucent properties of the freshly formed test) however there are several problematic observations/interpretations which in my opinion should be carefully addressed.

      Weaknesses:<br /> 1. The authors (line 122) suggest that "characteristic autofluorescence indicates the carbonate content of the vesicles (Fig. S2), which are considered to be Mg-ACCs (amorphous MgCaCO3) (Fig. 2, Movies S4 and S5)". Figure S2 which the authors refer to shows only broken sections of organic sheath at different stages of mineralization. Movie S4 shows that only in a few regions some vesicles exhibit red autofluorescence interpreted as Mg-ACC (S5 is missing but probably the authors were referring to S3). In their previous paper (Dubicka et al 2023: Heliyon), the authors used exactly the same methodology to suggest that these are intracellularly formed Mg-rich amorphous calcium carbonate particles that transform into a stable mineral phase in rotaliid Amphistegina lessonii. However, in Figure 1D (Dubicka et al 2023) the apparently carbonate-loaded vesicles show the same red autofluorescence as the test, whereas in their current paper, no evidence of autofluorescence of Mg-ACC grains accumulated within the "gel-like" organic matrix is given. The S3 and S4 movies show circulation of various fluorescing components, but no initial phase of test formation is observable (numerous mineral grains embedded within the organic matrix - Figures 3A and B - should be clearly observed also as autofluorescence of the whole layer). Thus the crucial argument supporting the calcification model (Figure 5) is missing. There is no support for the following interpretation (lines 199-203) "The existence of intracellular, vesicular intermediate amorphous phase (Mg-ACC pools), which supply successive doses of carbonate material to shell production, was supported by autofluorescence (excitation at 405 nm; Fig. 2; Movies S3 and S4; see Dubicka et al., 2023) and a high content of Ca and Mg quantified from the area of cytoplasm by SEM-EDS analysis (Fig. S6)."

      2. The authors suggest that "no organic matter was detected between the needles of the porcelain structures (Figures 3E; 3E; S4C, and S5A)". Such a suggestion, which is highly unusual considering that biogenic minerals almost by definition contain various organic components, was made based only on FE-SEM observation. The authors should either provide clearcut evidence of the lack of organic matter (unlikely) or may suggest that intense calcium carbonate precipitation within organic matrix gel ultimately results in a decrease of the amount of the organic phase (but not its complete elimination), just as pure calcium carbonate crystals are separated from the remaining liquid with impurities ("mother liquor"). On the other hand, if (249-250) "organic matrix involved in the biomineralization of foraminiferal shells may contain collagen-like networks", such "laminar" organization of the organic matrix may partly explain the arrangement of carbonate fibers parallel to the surface as observed in Fig. 3E1.

      3. The authors' observations indeed do not show the formation of individual skeletal crystallites within intracellular vesicles; however, they also do not explain what the structure of individual skeletal crystallites is or how they are formed. In particular, what are the structures observed in polarized light (and interpreted as calcite crystallites) by De Nooijer et al. 2009? The authors' explanation of the process (lines 213-216) is not particularly convincing: "we suspect that the OM was removed from the test wall and recycled by the cell itself".

      4. The following passage (lines 296-304) which deals with the concept of mesocrystals is not supported by the authors' methodology or observations. The authors state that miliolid needles "assembled with calcite nanoparticles, are unique examples of biogenic mesocrystals (see Cölfen and Antonietti, 2005), forming distinct geometric shapes limited by planar crystalline faces"; later in the same passage the authors say that "mesocrystals are common biogenic components in the skeletons of marine organisms" (are they thus unique or are they common?). It is my suggestion to completely eliminate this concept here until various crystallographic details of the miliolid test formation are well documented.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the editor and reviewers for their valuable feedback and comments. Below we have addressed all points carefully and have, when needed, revised the manuscript accordingly.

      Note that we have taken the opportunity to correct minor typos and unclear text in the revised manuscript.

      Of importance to the editors and reviewers, we detected a few minor factual errors in the method section, which we have now corrected. The first error was that we wrongly stated that our final dataset had 6358 unique TCRs, whereas it in fact had 6353 unique TCRs. The second error was that we stated that the maximum length of CDR1β was 5, whereas it was in fact 6. The last error was that we stated that we used a Levenshtein distance of at least 3 to discard similar peptides when swapping the TCRs to generate negatives. This should have been a Levenshtein distance greater than 3, to match the script we used to generate negatives (though no peptides had a Levenshtein distance of exactly 3).
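
      The difference between "at least 3" and "greater than 3" can be made concrete with a small sketch. The code below is an illustration only and is not the script used for the manuscript: the function names and the `min_exclusive` parameter are hypothetical, and the peptide list is a toy example (the second entry is an invented single-residue variant).

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via row-wise dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def allowed_donor_peptides(target, peptides, min_exclusive=3):
    """Peptides whose TCRs may be swapped in as negatives for `target`.

    A peptide qualifies only if its distance to `target` is strictly
    greater than `min_exclusive` (hypothetical parameter name).
    """
    return [p for p in peptides
            if p != target and levenshtein(target, p) > min_exclusive]


# Toy peptide list for illustration.
peptides = ["GILGFVFTL", "GILGFVFTA", "NLVPMVATV"]
donors = allowed_donor_peptides("GILGFVFTL", peptides)
```

      With a strict inequality, a peptide at distance exactly 3 would be excluded as a donor, whereas "at least 3" would have admitted it.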

      eLife assessment

      This important study reports on an improved deep-learning-based method for predicting TCR specificity. The evidence supporting the overall method is compelling, although the inclusion of real-world applications and clear comparisons with the previous version would have further strengthened the study. This work will be of broad interest to immunologists and computational biologists.

      It is not fully clear to us what is meant by “clear comparisons with the previous version”. In the manuscript we consistently compare the performance of each novel approach introduced to that of the ancestor NetTCR-2.1. Further, we concluded the manuscript with a performance comparison to a large set of current state-of-the-art methods by training and evaluating the novel modeling framework on the IMMREP22 benchmark data.

      We agree that the manuscript can be improved by including a brief discussion of real-life applications of models for prediction of TCR specificity, and have included a brief text in the introduction.

      Reviewer #1 (Recommendations For The Authors):

      It was a great pleasure to read this article. All the concepts and motivations are clearly defined. I have just a few questions.

      What was the motivation behind employing a 1:5 positive-negative ratio? Could it be the cause of worse performance in the case of outliers?

      The ratio 1:5 is based on results from earlier work [36561755]. In this work, negatives were constructed as a mix of swapped and true (i.e. measured) negatives with a ratio 1:5 for each. This work demonstrated a slight gain when including both types of negatives compared to only using swapped. In a subsequent publication [https://doi.org/10.1016/j.immuno.2023.100024], it was demonstrated that optimal performance was obtained when only including swapped negatives (again in a ratio 1:5). Given this, we maintained this approach in the current work. It is clear that this choice is somewhat arbitrary, and that further work is needed to fully address this issue and the general issue of how to best generate negatives for ML of TCR specificity. Such work is in our view however beyond the scope of the current manuscript.
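
      For readers unfamiliar with swapped negatives, the following sketch shows one way a 1:5 positive-to-negative ratio could be produced: each positive (TCR, peptide) pair keeps its peptide, and five TCRs observed with other peptides are drawn as non-binders. This is a simplified illustration under stated assumptions, not the script used for the manuscript; the function name, data layout, and TCR labels are hypothetical, and the peptide-similarity filter is omitted.

```python
import random


def swapped_negatives(positives, n_neg=5, seed=0):
    """Build swapped negatives for a list of positive (tcr, peptide) pairs.

    For every positive pair, up to `n_neg` TCRs that were observed with a
    different peptide are sampled and paired with this peptide, labelled 0.
    """
    rng = random.Random(seed)
    known = set(positives)
    by_peptide = {}
    for tcr, pep in positives:
        by_peptide.setdefault(pep, []).append(tcr)

    negatives = []
    for _, pep in positives:
        # Candidate donors: TCRs of other peptides not known to bind `pep`.
        pool = [t for other, tcrs in by_peptide.items() if other != pep
                for t in tcrs if (t, pep) not in known]
        for donor in rng.sample(pool, min(n_neg, len(pool))):
            negatives.append((donor, pep, 0))
    return negatives


# Hypothetical toy data: 3 peptides with 4 TCRs each.
positives = [(f"tcr_{p}{i}", f"pep{p}") for p in "ABC" for i in range(4)]
negatives = swapped_negatives(positives)
```

      In this simplified version the same swapped pair can be drawn for more than one positive; a production script would likely deduplicate and apply the peptide-distance filter described in the Methods.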

      Why is the patience of 200 epochs for peptide-specific models and 100 epochs for pan-specific and pre-trained models used in the context of the early stopping mechanism?

      We observed that the loss curve was overall very stable in the case of pan-specific training, likely due to the large amount of data included in this training. Therefore, these models were less likely to become stuck in a local minimum during training, meaning that a lower patience for early stopping would not prevent the model from learning optimally. In contrast, we found for some peptides that the loss curve was very erratic, and would sometimes become stuck in a local minimum for an extended time. To resolve this, the patience was increased from 100 to 200, which resulted in a better chance to escape these minima, as well as a better overall performance.
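
      The effect of the patience setting can be illustrated with a minimal early-stopping sketch. The class and function names below are hypothetical, the loss values are invented, and the patience values are scaled down (3 versus 4 instead of 100 versus 200) so the behaviour is visible on a short curve; this is not the training code used for the manuscript.

```python
class EarlyStopping:
    """Stop training after `patience` consecutive epochs without improvement."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("inf")
        self.best_epoch = -1
        self.waited = 0

    def step(self, epoch, val_loss):
        """Record a validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best, self.best_epoch, self.waited = val_loss, epoch, 0
            return False
        self.waited += 1
        return self.waited >= self.patience


def run(losses, patience):
    """Replay a validation-loss curve and return (best_loss, best_epoch)."""
    stopper = EarlyStopping(patience)
    for epoch, loss in enumerate(losses):
        if stopper.step(epoch, loss):
            break
    return stopper.best, stopper.best_epoch


# Invented erratic curve: a plateau after epoch 1, then a late improvement.
erratic = [1.0, 0.9, 0.95, 0.96, 0.97, 0.5]
short_patience = run(erratic, patience=3)
long_patience = run(erratic, patience=4)
```

      On this toy curve the shorter patience halts during the plateau and never sees the late improvement, while the longer patience reaches it, which mirrors the reasoning given above for the peptide-specific models.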

      Why is weight 3.8 used in the weighted loss function in the pan-specific model?

      The weighted loss was scaled with a division factor (c) of 3.8, in order to get an overall loss that was comparable to training without sample weights. This was primarily done to better compare the two approaches (scaling and no scaling) in terms of loss, and not so much to improve the training itself, as we already use a relatively conservative sample weight scaling based on log2. We have added a brief sentence to clarify this in the manuscript.
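
      A sketch of how a log2-based sample weight with a division factor might enter the loss is given below. The exact weighting formula is not stated in this response, so the form used here, log2(total / count) / c, is an assumption made purely for illustration, as are the function names and the toy counts; only the division factor c = 3.8 and the use of log2 come from the text above.

```python
import math


def sample_weights(counts, c=3.8):
    """Per-peptide weights: rarer peptides get larger weights.

    Assumed form (not confirmed by the source): log2(total / count) / c.
    """
    total = sum(counts.values())
    return {pep: math.log2(total / n) / c for pep, n in counts.items()}


def weighted_bce(y_true, y_pred, weights, eps=1e-7):
    """Mean binary cross-entropy with one multiplicative weight per sample."""
    losses = []
    for y, p, w in zip(y_true, y_pred, weights):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        losses.append(-w * (y * math.log(p) + (1 - y) * math.log(1 - p)))
    return sum(losses) / len(losses)


# Toy counts for two hypothetical peptides.
weights = sample_weights({"pepA": 800, "pepB": 200})
unweighted = weighted_bce([1, 0], [0.9, 0.1], [1.0, 1.0])
weighted = weighted_bce([1, 0], [0.9, 0.1], [weights["pepA"], weights["pepB"]])
```

      Dividing by a constant does not change the relative weighting between peptides; it only rescales the total loss, which is consistent with the stated aim of making the weighted and unweighted losses comparable.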

      Reviewer #2 (Recommendations For The Authors):

      This work is the evolution of previous studies that developed the NetTCR platform, and in a previous paper cited in this study, the authors explore the paired dataset approach with "paired α/β TCR sequence data". In this manuscript, the authors should make clear what advances were made when compared to the previous study. This is not clear, although extensive reference is made to NetTCR 2.0 and 2.1. Differences are scattered throughout the manuscript, so I would suggest a section or paragraph clearly delineating the advances in model architecture and training when compared to previous versions recently published.

      It is not clear to us what the reviewer is referring to when stating “the authors should make clear what advances were made when compared to the previous study”. Throughout the manuscript we consistently compare the performance of each novel approach introduced to that of the ancestor NetTCR-2.1. In addition, we briefly discuss all of the changes to the architecture and training at the start of the discussion section. Further, we concluded the manuscript with a performance comparison to a large set of current state-of-the-art methods by training and evaluating the novel modeling framework on the IMMREP22 benchmark data. It is correct that the advances are described progressively by introducing each novel approach one by one (i.e. refining the machine learning model architecture and training setup, data denoising in terms of outlier identification in the training data, new model architectures combining the properties of a pan- and peptide-specific model, and integration of a similarity-based approach to boost model performance). We believe this helps better justify the relevance of each of the novel approaches introduced.

      In Figure 3, the colors have labels, but they are not explained in the legend or in the text. This makes it very difficult to understand the data in the various columns. Also, since it represents the Mean AUC, the data would be best displayed with a boxplot or a mean and bars for variance.

      We agree, and have changed Figure 3 and its corresponding AUC 0.1 figure (Supplementary Figure 1) into a boxplot. We also further clarified what the different models were in the figure text.

      Given the potential impact of this work on bioengineering and biotechnology, I would suggest adding a paragraph or section to the discussion where potential applications of the current model, or examples of applications of previous (or competing) models have been used to further biological research.

      We agree and have added a brief sentence in the introduction to outline biotechnological applications of models for prediction of TCR specificity.

    2. eLife assessment

      This study presents a useful tool for predicting TCR specificity with compelling evidence for improvements over prior art. This work/tool will be broadly relevant to computational biologists and immunologists.

    3. Reviewer #1 (Public Review):

      In this article, different machine learning models (pan-specific, peptide-specific, pre-trained, and ensemble models) are tested to predict TCR-specificity from a paired-chain peptide-TCR dataset. The data consists of 6,358 positive observations across 26 peptides (as compared to six peptides in NetTCR version 2.1) after several pre-processing steps (filtering and redundancy reduction). For each positive sample, five negative samples were generated by swapping TCRs of a given peptide with TCRs binding to other peptides. The weighted loss function is used to deal with the imbalanced dataset in pan-specific models.
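
      The negative-generation step summarized above can be illustrated with a short sketch. It assumes the simplest reading of the description: each positive peptide is paired with five TCRs sampled from the positives of other peptides. The function name, the toy identifiers, and the fixed seed are hypothetical, and any further filtering used in the manuscript is not shown.

```python
import random


def swap_negatives(positives, n_neg=5, seed=0):
    """Pair each positive's peptide with TCRs drawn from other peptides."""
    rng = random.Random(seed)
    negatives = []
    for peptide, _tcr in positives:
        # Candidate TCRs are those observed binding a different peptide.
        pool = [tcr for pep, tcr in positives if pep != peptide]
        for tcr in rng.sample(pool, n_neg):
            negatives.append((peptide, tcr))
    return negatives


# Toy positives: 3 hypothetical peptides with 4 hypothetical TCRs each.
positives = [(f"PEP{i}", f"TCR{i}_{j}") for i in range(3) for j in range(4)]
negatives = swap_negatives(positives)
```

      This yields a 1:5 ratio of positives to negatives, which is the imbalance that the weighted loss is meant to address.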

      The results demonstrate that the redundant data introduced during training did not lead to performance gain; rather, a decrease in performance was observed for the pan-specific model. The removal of outliers leads to better performance.

      To further improve the peptide-specific model performance, an architecture is created to combine pan-specific and peptide-specific models, where the pan-specific model is trained on pan-specific data while keeping the peptide-specific part of the model frozen, and the peptide-specific model is trained on a peptide-specific dataset while keeping the pan-specific part of the model frozen. This model surpassed the performance of individual pan-specific and peptide-specific models. Finally, sequence similarity-based predictions of TCRbase are integrated into the pre-trained CNN model, which further improved the model performance (mostly due to the better discrimination of binders and non-binders).
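
      The alternating freezing scheme can be illustrated with a deliberately small toy model. The sketch below is not the CNN described in the manuscript: it assumes two linear parts whose outputs are summed before a sigmoid, and it updates only one part per training phase. All names and the synthetic data are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_step(x, y, params, trainable, lr=0.1):
    """One gradient step on the log-loss that updates only the `trainable` part."""
    pred = sigmoid(x @ params["pan"] + x @ params["pep"])
    grad = x.T @ (pred - y) / len(y)  # the logits are additive, so both parts share this gradient
    params[trainable] = params[trainable] - lr * grad
    return params


rng = np.random.default_rng(0)
x_pan = rng.normal(size=(60, 4))        # hypothetical pooled (pan-specific) data
y_pan = (x_pan[:, 0] > 0).astype(float)
x_pep, y_pep = x_pan[:15], y_pan[:15]   # hypothetical single-peptide subset

params = {"pan": np.zeros(4), "pep": np.zeros(4)}
for _ in range(50):                      # phase 1: peptide-specific part frozen
    params = train_step(x_pan, y_pan, params, trainable="pan")
pan_after_phase1 = params["pan"].copy()
for _ in range(50):                      # phase 2: pan-specific part frozen
    params = train_step(x_pep, y_pep, params, trainable="pep")
```

      In a deep-learning framework the same effect is usually obtained by marking the frozen layers as non-trainable for the duration of each phase.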

      Prediction performance for unseen peptides remains low for the pan-specific model; however, an improvement in prediction is observed for peptides with high similarity to the ones in the training dataset. Furthermore, it is shown that 15 observations yield satisfactory performance, compared with the ~150 recommended in the literature.

      Models are evaluated on the external dataset (IMMREP benchmark). Peptide-specific models performed competitively with the best models in the benchmark. The pre-trained model performed worst, which the authors suggested could be because of positive and negative sample swapping across training and testing sets. To resolve this issue, they applied the redundancy removal technique to the IMMREP dataset. The results agreed with the earlier conclusion that the pre-trained models surpassed peptide-specific models and that the integration of similarity-based methods leads to a performance boost. This highlights the need for the creation of a new benchmark without data redundancy or leakage problems.

      The manuscript is well written, clear and easy to understand. The data is effectively presented. The results validate the drawn conclusions.

    4. Reviewer #2 (Public Review):

      Summary:<br /> The authors describe a novel ML approach to predict binding between MHC-bound peptides and T-Cell receptors. Such approaches are particularly useful for predicting the binding of peptide sequences with low similarity when compared to existing data sets. The authors focus on improving dataset quality and optimizing model architecture to achieve a pan-specific predictive model in hopes of achieving a high precision model for novel peptide sequences.

      Strengths:<br /> Since assuring the quality of training datasets is the first major step in any ML training project, the extensive human curation and computational analysis and enhancements made in this manuscript represent a major contribution to the field. Moreover, the systematic approach to testing redundancy reduction and data augmentation is exemplary, and will significantly help future research in the field.

      The authors also highlight how their model can identify outliers and how that can be used to improve the model around known sequences, which can help the creation and optimization of future datasets for peptide binding.

      The new models presented here are novel and built using paired α/β TCR sequence data to predict peptide-specific TCR binding, and have been extensively and rigorously tested.

      Weaknesses:<br /> Achieving an accurate pan-specific model is an ambitious goal, and the authors have significant difficulties when trying to achieve non-random performance for prediction of TCR binding to novel peptides. This is the most challenging task for this kind of model, but also the most desirable when applying such models to biotechnological and bioengineering projects.

      The manuscript is a highly technical and extremely detailed computational work, which can make the achievements and impact of the work hard to parse for application-oriented researchers, and still hard to translate to real-world use-cases for TCR specificity predictions.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Trenker et al. report cryo-EM structures of HER4/HER2 heterodimers and HER4 homodimers bound to Neuregulin-1b (Nrg1b) and Betacellulin (BTC). As observed for prior cryo-EM structures of full-length or near full-length HER-family receptors only the extracellular regions are visualized, presumably owing to flexibility in the relative orientation of extra- and intra-cellular regions. The authors observe no appreciable differences between Nrg1b and BTC bound heterodimers, both ligands in this case being high-affinity ligands, and modest "scissor-like" differences in the subunit relationships in HER4 homodimers with Nrg1b and BTC bound.

      The authors also show that, as they showed for HER3, the HER4 dimerization arm is not indispensable for forming heterodimers with HER2 despite the HER4 dimerization arm forming a more canonical interaction with HER2. Perhaps most interestingly, the authors observe glycan interactions that appear to stabilize intra- and inter-subunit interactions in HER4 homodimers but that inter-subunit glycans are not present in HER2/HER4 heterodimers. The authors speculate that these glycan interactions may contribute to the apparent propensity of HER4 to homodimerize vs. heterodimerize with HER2.

      I realize that an important role of reviewers is to provide authors with informed and critical comments, but I found this manuscript a well-written, thoughtful, and important contribution. My only note is that I am not an electron microscopist so have assumed the microscopy has been carried out expertly and rely on other reviewers to vet structure determinations.

      We thank the reviewer for sharing our enthusiasm and for the positive assessment of our manuscript. We have carefully reviewed all the microscopy-related concerns while responding to the assessment of reviewer #2.

      Reviewer #2 (Public Review):

      With the data presented in this manuscript, the authors help complete the set of high-resolution HER2-associated complex heterodimer structures as well as HER4 homodimer structures in the presence of NRG1b and BTC. Purification of HER2-HER4 heterodimers appears to be inherently challenging due to the propensity of HER4 to form homodimers. The authors have used an effective scheme to isolate these HER2-HER4 heterodimers and have employed graphene-oxide grid chemistry to presumably overcome the issues of low sample yield for solving cryo-EM structures of these complexes. The authors conclude HER2-HER4 heterodimers with either ligand are conformationally homogeneous relative to the HER4 homodimers. The HER2-HER4 heterodimers also appear to be better stabilized compared to other published HER2 heterodimers. The ability to model glycans in the context of HER4 homodimers is exciting to see and provides a strong rationale for the stability of these structures. Overall, the work is of great interest and the methods described in this work would benefit a wide variety of structural biology projects.

      We thank the reviewer for their positive assessment of our manuscript.

      Major comments:

      1) The HER2-HER4 heterodimer with BTC appears to be the lowest resolution of the reported structures. Although the authors claim the overall structure is similar to the HER2-HER4 heterodimer with NRG1b, it is therefore unclear whether the lower resolution of the BTC is due to challenging data collection conditions, sample preparation, or conformational dynamics not discernible due to the lower resolution. The authors should minimally clarify where they see the possible issues arising for the lower resolution as this is a key aspect of the work.

      The most likely reason for the lower resolution of the HER2/HER4/BTC reconstruction is not the underlying fundamental biology but a certain degree of preferred orientations in the sample, as can be seen from the directional FSC curves in the supplemental materials (Figure S3). We would like to note that while the overall resolution of the HER2/HER4/BTC reconstruction may be comparatively lower than other reconstructions presented in the manuscript, it remains of sufficiently high quality to substantiate our key claims. Specifically, our analysis indicates a close resemblance between the HER2/HER4/BTC reconstruction and the HER2/HER4/NRG reconstruction. For example, individual beta strands can still be well resolved allowing their accurate placement. There may be differences in features at higher resolution than 4.5Å between these two reconstructions which we cannot observe due to the lower resolution of HER2/HER4/BTC map, but these would amount to side chain motions rather than larger secondary structure movement. In the manuscript, we only draw comparisons between domain movements in different heterodimer structures and do not see any conformational variability in the final reconstructions, nor in their 3D classification analyses. Thus, we do not attribute the lower resolution of HER2/HER4/BTC reconstruction to increased dynamics at resolution scales that are discussed in the manuscript. What is more likely, is that variability in data quality, which we commonly observe between different GO grids, contributes to differences in resolution between different samples and potentially to the different orientation distributions. To comment on these possibilities, we added the following text to the manuscript (italic, underlined):

      Page 8 top paragraph:

      “Despite the diverse sequences of the NRG1β and BTC ligands, the larger-scale domain conformation of the HER2/HER4 heterodimers stabilized by each ligand is identical with only small differences in the ligand binding pockets (Figure 1d). Due to the lower resolution of the HER2/HER4/BTC complex, we cannot exclude the possibility of differences in side-chain arrangements between the two structures. However, we attribute the lower resolution to variability in data collection on GO grids, which we frequently observe, rather than differences in conformational heterogeneity of HER2/HER4/BTC.”

      Page 10, second paragraph:

      “Our cryo-EM structures of the full-length HER2/HER4 complexes bound to either NRG1β or BTC, did not reveal discernible differences at the receptor dimerization interface and larger-scale domain arrangements (Figure 1d).”

      2) For all maps, authors should display Euler angle plots from their final refinements to assess the degree of preferred orientation. Judging by the sphericity, it appears all the structures, except HER2-HER4-BTC, have well-sampled projection distributions. However, a formal clarification would be useful to the reader.

      We thank the reviewer for pointing this out. We regarded the 3DFSC curves included in our original submission as a sufficient measure of projection distributions. In the revised manuscript, we now also include Euler angle plots from the respective CryoSPARC refinements in the supplemental Figures.

      3) The authors should also include map-model FSCs to ascertain the quality of the map with respect to model building, as this is currently missing in the submission.

      We included map-model FSCs from Phenix validation runs in our supplemental material.

      Minor comments:

      1) With respect to complex formation, is there a reason why HER2 expression is dramatically lower than HER4?

      The expression of HER2 and HER4 in Expi293F cells, and consequently the amount of HER2 and HER4 receptors at the beginning of our first purification step, which is the NRG1b-mediated pulldown of HER4, is not noticeably different. After this initial purification step, a significant portion of HER2 is lost because HER2/HER4 complexes constitute only a small fraction of the total HER complexes, as HER4 homodimers form preferentially. This is the reason why HER4 levels after the first purification step shown on the gel in Figure S1b are significantly higher than those of HER2. In the revised manuscript, in Figure S1d, we now show that both receptors are expressed at comparable levels at the beginning of purification. In this experiment, levels of HER2-MBP-TS and HER4-TS purified separately from equivalent volumes of transfected Expi293F cell culture via their shared TS-tags (MBP=Maltose Binding Protein, TS=Twin-Strep) are evaluated on a Coomassie-stained gel. When equal volumes of these elutions are then mixed and either subjected to HER4-directed pulldown using NRG1b-coated Flag-resin (lane 3, Figure S1d of the revised manuscript) or HER2-MBP-directed pulldown using amylose resin in the presence of NRG1b (lane 4, Figure S1d of the revised manuscript), neither of these pulldowns reveals substantial HER2/HER4 heterodimerization, indicating that HER4 homodimerization is favored.

      2) Figure S1e: the authors should clarify whether the HER2 substitutions are VR alone or whether they include GD substitutions as well. This should be suitably clarified in the main text.

      The HER2 constructs used in all cellular assays do not include the G778D mutation. We clarified this in Figure S1e, in the Materials and Methods section and in the main text on page 6.

      3) The validation reports for all 4 reported structures suggest the user-provided FSC-derived resolutions are different from those calculated by the deposition server. Are the masks deposited significantly different compared to the ones generated within cryoSPARC?

      The user-provided FSC-derived resolutions are different from those calculated by the server because the server only calculates resolution of unmasked curves from half maps while we provide the resolution derived from masked FSCs. These were all calculated using masks generated within the respective refinement job in cryoSPARC. However, we did notice that our author-provided FSC curves were from unmasked maps and we replaced the provided unmasked FSCs with masked FSCs as generated in cryoSPARC. These FSC plots in the validation reports now reflect the author-provided resolution in our validation reports and the plots generated by cryoSPARC shown in Figures S2, S3, S9 and S10.
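
      The difference between masked and unmasked curves can be illustrated with a minimal Fourier shell correlation sketch. This is not the cryoSPARC or deposition-server implementation: it assumes cubic half maps, uses a hard spherical mask for brevity, and omits the soft mask edges and mask-correction steps that production software applies. The synthetic maps and all names are hypothetical.

```python
import numpy as np


def fsc(half1, half2, mask=None):
    """Fourier shell correlation between two cubic half maps.

    If `mask` is given, both maps are multiplied by it first (masked FSC).
    """
    if mask is not None:
        half1, half2 = half1 * mask, half2 * mask
    f1, f2 = np.fft.fftn(half1), np.fft.fftn(half2)
    n = half1.shape[0]
    freq = np.fft.fftfreq(n) * n  # integer frequency indices
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)
    curve = []
    for r in range(1, n // 2):
        sel = shell == r
        num = np.real(np.sum(f1[sel] * np.conj(f2[sel])))
        den = np.sqrt(np.sum(np.abs(f1[sel]) ** 2) * np.sum(np.abs(f2[sel]) ** 2))
        curve.append(num / den)
    return np.array(curve)


rng = np.random.default_rng(0)
signal = rng.normal(size=(16, 16, 16))            # synthetic stand-in for a map
half_a = signal + 0.5 * rng.normal(size=signal.shape)
half_b = signal + 0.5 * rng.normal(size=signal.shape)
# Hard spherical mask for illustration; real workflows use soft-edged masks.
grid = np.indices(signal.shape) - 8
mask = (np.sqrt((grid**2).sum(axis=0)) < 6).astype(float)
unmasked, masked = fsc(half_a, half_b), fsc(half_a, half_b, mask)
```

      Because the mask changes which voxels contribute to each shell, the two curves generally cross a given threshold at different spatial frequencies, which is why masked and unmasked resolution estimates differ.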

      4) For interpretation regarding activation through phosphorylation in Figure 2e, have the authors considered HER4 could homodimerize as well? It appears from the data presented in Figure 4 and S12 that the propensity to form homodimers is greater for HER4 than to heterodimerize with HER2, despite the VR/IQ substitutions. This also appears to be supported by the reasonable amount of signal for pERK in lanes with HER4-IQ alone in the presence of NRG1b. It is recommended that the authors comment on this possibility.

      The IQ mutation, originally engineered to disrupt the receiver interface in EGFR, has been shown to retain residual activity, which is greater than that of the mutation on the opposite side of the asymmetric dimer interface (VR) (PMID:16777603). This might be because this mutation partially destabilizes an inactive state of HER kinases by disrupting the hydrophobic interactions, which are important both for kinase inhibition and for stabilization of the active dimer. While the IQ mutation is significantly inhibitory, as evidenced by the fact that we do not detect NRG1b-dependent HER4 phosphorylation in cells expressing HER4-IQ alone, it is possible that undetectable levels of phosphorylated HER4 cause the small increase in pERK signal. To acknowledge this possibility, we added the following sentence to the appropriate paragraph on page 10 in the main text:

      “Small increases in pERK levels in cells expressing the HER4-IQ construct are consistent with previous observations that the IQ mutation in HER kinase domains has small residual activity through homodimerization (PMID:16777603).”

      5) In the following line, "NRG1b-induced phosphorylation of HER2, HER4, ERK and AKT was not notably affected by substitution of the HER4 dimerization arm to a GS-arm relative to wild type receptors", it is unclear what the authors mean by wild-type receptors? There is presently no wildtype HER2 and/or HER4 tested in this blot.

      We thank the reviewer for pointing this out. Wild type receptors here refer to WT dimerization arm sequences in contrast to GS-arm mutants. We corrected the language in the appropriate place in the main text:

      “NRG1b-induced phosphorylation of HER2, HER4, ERK and AKT was not notably affected by substitution of the HER4 dimerization arm to a GS-arm relative to receptors featuring wild type dimerization arm sequences, indicating that the HER4 dimerization arm is not required for assembly and activation of HER2/HER4 heterodimers (Figure 2e).”

      6) Considering the asparagine residues can potentially mediate stabilization of HER2-HER4 dimers through glycosylation, the authors should include western blot data for receptor-activation for mutants where glycosylation can be disrupted. This could minimally instruct the reader on how functionally relevant the identified interactions like N576-N358 are.

      We agree with the Reviewer that this is a very interesting and important point, and it is the subject of our future investigations. The different spectra of glycosylation that we observe between HER4 homodimers and HER2/HER4 heterodimers suggest that glycans will modulate these interactions differently. We speculate that glycans will likely be more important for HER4 homodimerization, where glycosylation is more pronounced in our reconstructions. Investigating how these interactions change in the absence of single glycan modifications or their combinations will also require taking into consideration how glycan mutations alter the equilibrium between HER4 homodimerization and HER2/HER4 heterodimerization. Such studies will require months of mutagenesis and optimization of controlled expression of such mutants, ideally the generation of stable cell lines, and likely structural follow-up studies. We respectfully argue that this undertaking is beyond the main scope of the current manuscript, and conceptually constitutes a separate, very important question that we are working on.

      Reviewer #1 (Recommendations For The Authors):

      The structural coordinates should be deposited in the RCSB.

      The coordinates will be released upon publication of the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      1) Figure S1b: the authors should ideally include a silver stain gel to assess the purity of the heterodimer-ligand complex. Although HER subunits are discernible, there is no clear band for NRG1b.

      Given its small size (9.7 kDa) our NRG1b construct is typically difficult to detect in our samples, but we would like to respectfully argue that the fact that we can resolve it at high resolution in our cryo-EM reconstructions provides sufficient evidence that it is present. Likewise, we argue that the Coomassie-stained gel we present in the manuscript is sufficient. It demonstrates that our purifications yield a stoichiometric complex of enough purity to obtain a high resolution cryo-EM reconstruction. Since we are not making any other claims about these preparations, we respectfully argue that providing a silver stain gel is not necessary to support conclusions of our study.

      We thank the reviewer for pointing this out. To best reflect what we wanted to convey, we changed it to: “and is the same as observed in structures of an isolated HER2 ectodomain.”

      3) Page 8 first paragraph line 3, although one can deduce where the ligand binding pocket is, it would be clearer if this is marked in Figure 1d.

      We added arrows in the figure to indicate the ligand-binding pocket.

      4) Figure 2b inset A needs to be labeled 'A'.

      The inset was already labelled but in a different corner. We rearranged the label to make it clearer.

      5) Figure S5c will benefit from inset images zooming into the dimerization arm. It is hard to visualize the subtleties of the structural changes in the current format.

      Figure S5c predominantly shows side-views of various heterodimer overlays to highlight subtle differences in larger-scale assembly that correlate with differences in dimerization arm engagement. This side-orientation is not suitable for zooming into the dimerization arm regions, which can only be effectively visualized in front views (the view of the heart-shaped dimer illustrated in Figure 1a). We show a zoomed-in view of this representation in main Figure 2c, which is what we understand the Reviewer is requesting.

      6) Fig 3e: is it A102 or A202 in the bottom-most panel?

      This is now corrected, thank you.

      7) Fig S9: revisit the color code for NRG1b; it appears there is no blue subunit of NRG1b. Also revisit the RMSD in the figure legend, since the text appears to suggest a different set of RMSDs for the 3 overlays.

      We fixed the color code in the Figure, thank you.

      In reference to Figure S9 (Figure S11 in the revised manuscript) we discuss two types of RMSDs:

      1) RMSDs between our cryo-EM homodimers and the crystal structure homodimers. The structure overlays are shown in Figure S9a and RMSD values were mentioned in the Figure legends. However, in the original manuscript we did not explicitly mention these values in the main text but have now added them to the main text of the revised version of the manuscript.

      2) RMSDs between monomers within our cryo-EM structures and within monomers of the crystal structure. Figure S11b and Figure S11c of the revised manuscript show these overlays for the cryo-EM structures only and the values are present in the Figure legend. We do not show the respective overlay for the crystal structures, which is why the values are not mentioned in the Figure legends, but we discuss the values in the main text.

      We recognize that this is confusing. We have added the RMSD values for point 1) to the main text and now discuss this more carefully:

      “Our cryo-EM structure of the HER4/NRG1b homodimer differs slightly from the three HER4/NRG1b homodimers per asymmetric unit in the 3U7U crystal structure in which each monomer adopts a different orientation of the domain IV relative to the rest of the ectodomain (Figure S9a, RMSD: 5.438 Å, 5.435 Å and 3.662 Å). Notably, our two cryo-EM HER4 homodimer structures are more symmetric than the crystal structures of the HER4/NRG1β ectodomain homodimer. RMSDs for monomers within our cryo-EM structures are 1.42 Å in the cryo-EM HER4/NRG1b homodimer and 1.58 Å in the HER4/BTC homodimer (Figure S9b+c) compared to the monomers in the crystal structures which align with RMSDs of 1.67 Å, 5.76 Å and 2.38 Å.”
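
      For readers less familiar with how such RMSD values are obtained, the sketch below computes the RMSD between two matched coordinate sets after optimal rigid-body superposition (the Kabsch algorithm). It is an illustration only: the response does not state which atoms or which software were used, so the choice of matched atoms (for example Cα atoms) and the function name are assumptions.

```python
import numpy as np


def kabsch_rmsd(p, q):
    """RMSD between matched N x 3 coordinate sets after optimal superposition."""
    p = p - p.mean(axis=0)  # remove translation by centering both sets
    q = q - q.mean(axis=0)
    h = p.T @ q             # 3 x 3 covariance matrix
    u, _s, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))     # guard against reflections
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T  # proper rotation mapping p onto q
    diff = (rot @ p.T).T - q
    return float(np.sqrt((diff**2).sum() / len(p)))
```

      Two copies of the same rigid structure give an RMSD of zero regardless of their relative placement, so the values quoted above reflect genuine differences in domain arrangement rather than differences in position or orientation.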

      8) Page 12 paragraph 2 last line, expand on the abbreviation NAG.

      It is now expanded.

      9) What is the slit width used for the energy filter during data collection?

      The slit width was 20 eV. We added this information to the Methods section.

      10) The crosslinking conditions of 0.2% glutaraldehyde for 40 min on ice, with no quenching seems rather harsh. Have the authors attempted other crosslinking conditions? Do milder conditions or GraFix not help with complex stabilization?

      We thank the Reviewer for pointing this out. The reaction was quenched after 40 min by addition of 40 µl of 1M Tris pH 7.4 buffer. This information is now included in the Methods section. We have screened ideal crosslinking conditions for HER4 homodimers, and previously for HER2/HER3 heterodimers, and found that these crosslinking conditions were the mildest conditions that achieved complete crosslinking as assessed by SDS-PAGE.

      11) Have the authors used default parameters for all their data processing steps? Were additional steps like local per-particle CTF refinement and global defocus refinement employed during refinement?

      We did not perform any per-particle CTF refinements, as we have previously not observed any improvement from running such refinements on particles of this size on top of per-patch CTF estimation, which already takes into account local CTF differences per micrograph. To make the manuscript clearer in this regard we added the following statement to the Methods section: “Unless specifically mentioned here or in the processing workflow, default parameters in CryoSPARC were used for each processing step.”

    2. eLife assessment

      This manuscript describes structures of HER4 homo- and HER4/HER2 hetero-dimer complexes using single particle cryo-EM. This important work describes convincingly new structural details of these complexes that expand our understanding of their function. This work will be of interest to researchers working on cell surface signalling and kinase activity.

    3. Reviewer #1 (Public Review):

      Trenker et al. report cryo-EM structures of HER4/HER2 heterodimers and HER4 homodimers bound to Neuregulin-1β (Nrg1β) and Betacellulin (BTC). As observed for prior cryo-EM structures of full-length or near full-length HER-family receptors only the extracellular regions are visualized, presumably owing to flexibility in the relative orientation of extra- and intra-cellular regions. The authors observe no appreciable differences between Nrg1β and BTC bound heterodimers, both ligands in this case being high-affinity ligands, and modest "scissor-like" differences in the subunit relationships in HER4 homodimers with Nrg1β and BTC bound.

      The authors also show that, as they showed for HER3, the HER4 dimerization arm is not indispensable for forming heterodimers with HER2 despite the HER4 dimerization arm forming a more canonical interaction with HER2. Perhaps most interestingly, the authors observe glycan interactions that appear to stabilize intra- and inter-subunit interactions in HER4 homodimers but that inter-subunit glycans are not present in HER2/HER4 heterodimers. The authors speculate that these glycan interactions may contribute to the apparent propensity of HER4 to homodimerize vs. heterodimerize with HER2.

    4. Reviewer #2 (Public Review):

      With the data presented in this manuscript, the authors help complete the set of high-resolution HER2-associated complex heterodimer structures as well as HER4 homodimer structures in the presence of NRG1b and BTC. Purification of HER2-HER4 heterodimers appears to be inherently challenging due to the propensity of HER4 to form homodimers. The authors have used an effective scheme to isolate these HER2-HER4 heterodimers and have employed graphene-oxide grid chemistry to presumably overcome the issues of low sample yield for solving cryo-EM structures of these complexes. The authors conclude HER2-HER4 heterodimers with either ligand are conformationally homogeneous relative to the HER4 homodimers. The HER2-HER4 heterodimers also appear to be better stabilized compared to other published HER2 heterodimers. The ability to model glycans in the context of HER4 homodimers is exciting to see and provides a strong rationale for the stability of these structures. Overall, the work is of great interest and the methods described in this work would benefit a wide variety of structural biology projects.

    Author Response

      Yamanaka et al.'s research investigates the impact of volatile organic compounds (VOCs), particularly diacetyl, on gene expression changes. By inhibiting histone deacetylase (HDAC) enzymes, the authors were able to observe changes in the transcriptome of various models, including cell lines, flies, and mice. The study reveals that HDAC inhibitors not only reduce cancer cell proliferation but also provide relief from neurodegeneration in fly Huntington's disease models. Although the findings are intriguing, the research falls short in providing a thorough analysis of the underlying mechanisms.

      HDAC inhibitors have been previously shown to induce gene expression changes as well as control cell division and demonstrated to work on disease models. The authors demonstrate diacetyl as a prominent HDAC inhibitor. Though the demonstration of diacetyl is novel, several similar molecules have been used before.

      In this manuscript we are not trying to understand the mechanisms by which HDAC inhibitors affect Huntington’s disease or cancer, since these have been studied in detail before and are outside the scope of this manuscript. Our focus is to demonstrate that volatile odorants commonly found in the environment can inhibit HDACs, alter gene expression, and have downstream physiological effects. To the best of our knowledge this unusual effect of odorants has not been systematically described before.

      Reviewer #2 (Public Review):

      The study by Sachiko et al. presents strong evidence that implicates environmental volatile odorants, particularly diacetyl, in an alternate role as inhibitors of HDAC proteins and gene expression. HDACs are histone deacetylases that generally have a repressive role in gene expression. In this paper the authors test the hypothesis that diacetyl, which is a compound emitted by rotting food sources, can diffuse through the blood-brain barrier and cell membranes to directly modulate HDAC activity and alter gene expression in a neural activity-independent manner. This work is significant because the authors also link modulation of HDAC activity by diacetyl exposure to transcriptional and cellular responses to present it as a potential therapeutic agent for neurological diseases, such as inhibition of neuroblastoma and neurodegeneration.

      The authors first demonstrate that exposure to diacetyl, and some other odorants, inhibits deacetylation activity of specific HDAC proteins using in vitro assays, and increases acetylation of specific histones in cultured cells. Consistent with a role for diacetyl in HDAC inhibition, the authors find dose-dependent alterations in gene expression in different fly and mouse tissues in response to diacetyl exposure. In flies they first identify a decrease in the expression of chemosensory receptors in olfactory neurons after exposure to diacetyl. Subsequently, they also observe large gene expression changes in the lungs, brain, and airways in mice. In flies, some of the gene expression changes in response to diacetyl are partially reversible and show an overlap with genes that alter expression in response to treatment with other HDAC inhibitors. Given the use of HDAC inhibitors as chemotherapy agents and treatment methods for cancers and neurodegenerative diseases, the authors hypothesize that diacetyl as an HDAC inhibitor can also serve similar functions. Indeed, they find that exposure of mice to diacetyl leads to a decrease in the brain expression of many genes normally upregulated in neuroblastomas, and selectively inhibits proliferation of cell lines derived from neuroblastomas. To test the potential for diacetyl in treatment of neurodegenerative diseases, the authors use the fly Huntington's disease model, utilizing the overexpression of Huntingtin protein with expanded poly-Q repeats in the photoreceptor rhabdomeres which leads to their degeneration. Exposing these flies to diacetyl significantly decreases the loss of rhabdomeres, suggesting a potential for diacetyl as a therapeutic agent for neurodegeneration.

      The findings are very intriguing and highlight environmental chemicals as potent agents which can alter gene expression independent of their action through chemosensory receptors.

      We thank the reviewer for the encouraging comments.

      Reviewer #1 (Recommendations For The Authors):

      1) The results section for figure 1 seems poorly written with errors in figure citations. Please rewrite this section.

      We thank the reviewer for pointing it out and have now rewritten the results section as well as made concomitant changes in the introduction to address this comment.

      2) Discussion could be more focused and could speculate mechanistic details of HDAC inhibitors in rescue of neurodegeneration.

      We have added information about the mechanistic role of HDAC inhibition in the rescue of neurodegeneration. “Exposure to diacetyl volatiles in the fly model of Huntington’s disease reduces cell degeneration, as has been previously observed with orally administered HDAC inhibitors like sodium butyrate and SAHA in this genetic model (27). Previous studies indicate that the inhibition of HDACs counter the acetyltransferase inhibitory activity of the polyglutamine domain of the human Htt protein which binds to p300, P/CAF and CBP (27).”

      A few minor comments are:

      1) Figure 1 is not properly cited in the text (e.g., line 137: it is not relevant to Fig. 1B; it refers to Fig. 1C).

      We thank the referee for pointing out our error and have now corrected it.

      2) Some abbreviations were not expanded at first mention, which made the statements difficult to understand (e.g., line 51: VOC; line 111: Or).

      We have now defined abbreviations the first time they appear in the manuscript.

      3) Line 98- What was the unit when you mention 0.01%?

      We have added (v/v) in the text to represent the standard volume / total volume. We have also described it in the method section.

      4) Line 138- there is no comparative study done with b-HB, but the authors have claimed it was comparable. If it’s from a previous study, a relative comparative statement could be given.

      We apologize for the confusion. We have added the IC50 values previously reported for b-hydroxybutyrate, “IC50 for HDAC1: 5.3 mM and HDAC3 2.4 mM”, as shown in reference #21.

      5) In lines 146-150, more details of what are the compounds and how similar they are to diacetyl could be added

      We have added representative structures and names for the chemicals tested in Figure 1C.

      6) In line 160, Why specifically they increase H3K14 acetylation?

      The observed increase in H3K9 (not H3K14) acetylation is identical to what has previously been reported for b-hydroxybutyrate. We have added a sentence pointing out this similarity: “preferable acetylation of H3K9 was also observed in HEK293 cells with b-hydroxybutyrate (reference #21)”.

      7) In line 317, How HDAC inhibitors reverse the PolyQ disorder? What is its mechanism? Can at least discuss in the discussion section.

      Our assay is based on a previous publication that used the Drosophila model (Ref #27) and evaluated the mechanisms in detail. We have now added a section in the Discussion describing the past findings. “Exposure to diacetyl volatiles in the fly model of Huntington’s disease reduces cell degeneration, as has been previously observed with orally administered HDAC inhibitors like sodium butyrate and SAHA in this genetic model (27). Previous studies indicate that the inhibition of HDACs counter the acetyltransferase inhibitory activity of the polyglutamine-domain of the Htt protein which binds to p300, P/CAF and CBP (27).”

      8) In figures, 1C and 1D, proper labeling of drug molecules is missing. Check 1D- Could have included Diacetyl for comparison, Where is the uninhibited control (negative)?

      We have added the name of the chemical compounds to Figure 1C and 1D. Each compound tested has a separate blank control, which forms the basis for calculation of the percentage inhibition. The negative control is therefore part of each column.
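
      As a rough illustration of how a per-compound blank control can serve as the basis for a percentage-inhibition value, the sketch below assumes the common normalization 100 × (1 − treated signal / blank signal). The function name, argument names, and example readings are hypothetical and are not taken from the manuscript.

```python
def percent_inhibition(treated_signal, blank_signal):
    """Percentage inhibition relative to a matched uninhibited (blank) control.

    Assumed normalization: 100 * (1 - treated / blank). This is an
    illustrative convention, not necessarily the one used by the authors.
    """
    if blank_signal <= 0:
        raise ValueError("blank control signal must be positive")
    return 100.0 * (1.0 - treated_signal / blank_signal)


# Hypothetical deacetylase-activity readings (arbitrary units).
print(percent_inhibition(treated_signal=40.0, blank_signal=100.0))
```

      Under this convention the uninhibited control enters every column as the denominator, which is one way to read the statement that the negative control is part of each column.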

      Reviewer #2 (Recommendations For The Authors):

      As specific feedback for the authors, I have a few questions/recommendations about the main point of the paper:

      a. Throughout the manuscript, the authors demonstrate gene expression differences in different tissues in flies and mice in response to exposure to diacetyl using both transgenic reporter expression and RNAseq. The authors mention they were able to show that these gene expression changes are independent of neural activity, yet I am not sure which experiment specifically demonstrates this. How do the authors know that these changes in gene expression are due to diacetyl reaching the brain after passing the blood-brain barrier but not due to changes in gene expression with olfactory circuit activity? I acknowledge that proving that the gene expression differences are independent of neural activity is difficult, but one question is whether inhibiting neural activity results in changes in the expression of overlapping genes in the same direction. Or for example, if one inhibits neural activity in Gr21a neurons, do they reversibly shut down expression of the receptor after a few days? Is this true for other ORs or specific to Gr21a and Gr63a?

      While it is difficult to completely rule out contributions of the olfactory effects in the brain, we also report differential gene expression in the lungs of mice, where we do not expect olfactory circuit activity (Fig 3D-G). The overlap in DEGs is highly statistically significant between the organs, suggesting at least some commonality in mechanism (Fig 5D). We recently evaluated a Drosophila tissue that does not express odorant receptors or have olfactory connections, the ovaries, and also found substantial evidence of diacetyl-induced modulation of genes. While the data are intended for a different publication, we found up to 123 upregulated and 61 downregulated DEGs (FDR cutoff <0.05 and log2 fold change cutoff of 1 and -1). These data should also be viewed together with the in vitro HDAC inhibition data and the increased histone acetylation seen in cell lines.
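
      The cutoffs quoted above (FDR < 0.05, log2 fold change of 1 and -1) can be expressed as a simple filter. The sketch below is only an illustration: the function name, the gene labels, and the choice of inclusive fold-change bounds are assumptions, and the example values are invented.

```python
def classify_deg(log2_fc, fdr, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Label a gene as 'up', 'down', or 'ns' (not significant).

    Assumes FDR must be strictly below the cutoff and that the log2
    fold-change bounds are inclusive; both are illustrative choices.
    """
    if fdr >= fdr_cutoff:
        return "ns"
    if log2_fc >= lfc_cutoff:
        return "up"
    if log2_fc <= -lfc_cutoff:
        return "down"
    return "ns"


# Invented example values: gene label -> (log2 fold change, FDR).
example = {"geneA": (1.8, 0.001), "geneB": (-2.3, 0.02), "geneC": (3.0, 0.40)}
for gene, (lfc, fdr) in example.items():
    print(gene, classify_deg(lfc, fdr))
```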

      b. Is diacetyl detected by any chemosensory receptors in flies or mice? RNA profiles from these receptor mutants can be used to distinguish whether the gene expression changes are occurring due to neural activity or direct ability of diacetyl to alter HDAC activity. One relatively simple experiment would be to test whether differentially expressed genes in the orco mutant antennae overlap at all with antennal RNA profiles from diacetyl exposed flies.

      Diacetyl can be detected by multiple chemosensory receptors in flies and mice. In flies, the neurons expressing the Gr21a+Gr63a complex are inhibited by diacetyl as indicated, and Or9a, Or43b, Or59b, Or67a, and Or85b are activated by it (Hallem, Cell, 2006). It would be an extremely resource- and time-consuming process to create and evaluate single mutants or combinations of mutants as suggested. In response to the previous point, we noted examples of tissues without olfactory receptors or olfactory circuits showing DEGs upon diacetyl exposure.

      As suggested by the referee, we compared DEGs from RNASeq data of Orco mutant antennae (N=2 replicates) generated for another project. There is very little overlap between antennal DEGs from Orco mutants and from diacetyl-exposed flies (labelled in the chart as d4on_up and d4on_down). These data suggest that large-scale silencing of antennal neurons in Orco mutants does not alter expression of the same genes as those altered by exposure to diacetyl.

      Author response image 1.

      c. The comparison of DEGs from individuals exposed to diacetyl versus the other two HDAC inhibitors shows some overlap. The overlap is greater for DEGs shared between the two HDAC inhibitors. Yet, there is still a substantial number of genes that are unique to diacetyl exposure. For example, if you compare SB to VA exposure, each condition has about 150-200 genes uniquely misexpressed for each condition with about 55 genes shared. However, the number of uniquely misexpressed genes is over 600 for diacetyl exposed individuals, with only 30 and 100 genes shared with either SB and VA respectively. I would have expected a higher overlap in DEGs if these compounds all inhibit similar HDACs. Do they inhibit different HDACs? Can this explain the significant number of uniquely misexpressed genes in each condition?

      Because the genome has around 13,000 genes, it is difficult to judge the significance of overlap between DEG sets from the raw numbers alone, without the statistical analysis that we noted in the text. “A pairwise analysis using the Fisher’s exact test of each gene set revealed a statistically significant overlap of diacetyl-induced genes with SB-induced genes (p = 6 × 10⁻¹¹) and with VA-induced genes (p = 2 × 10⁻⁶⁵) (Figure 4F).”

      We have also further clarified in the text “This highly significant overlap among upregulated genes lends further support to our model that diacetyl vapors act as an HDAC inhibitor in vivo. As expected, each of the 3 treatments also modulated a substantial number of unique genes (Figure 4G,H), suggesting that differences in delivery format (oral vs vapor delivery), molecular structure and inhibition profile across the repertoire of HDACs may contribute to differences in gene regulation.”
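
      To make the overlap argument concrete, the following sketch computes a one-sided Fisher's exact test for two gene sets drawn from a common background, which is equivalent to the upper tail of a hypergeometric distribution. It is not the authors' analysis code: the function name is hypothetical, the gene-set sizes in the example are illustrative numbers chosen only to be of the same order as those mentioned by the reviewer, and the result is not expected to reproduce the reported p-values.

```python
from math import comb


def overlap_p_value(n_genome, n_a, n_b, n_shared):
    """P(overlap >= n_shared) for two gene sets of sizes n_a and n_b
    drawn at random from a background of n_genome genes.

    This is the upper hypergeometric tail, i.e. a one-sided Fisher's
    exact test on the 2x2 table of set membership.
    """
    if not (0 <= n_a <= n_genome and 0 <= n_b <= n_genome):
        raise ValueError("set sizes must lie between 0 and n_genome")
    largest_possible = min(n_a, n_b)
    total = comb(n_genome, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_genome - n_a, n_b - k)
        for k in range(n_shared, largest_possible + 1)
    )
    return tail / total


# Illustrative sizes only (assumed, not taken from Figure 4).
print(overlap_p_value(n_genome=13000, n_a=730, n_b=250, n_shared=100))
```

      With a background of roughly 13,000 genes, the overlap expected by chance between sets of a few hundred genes is only about a dozen, which is why an overlap that looks modest in absolute terms can still be highly significant.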

      d. The authors show changes in RNA profiles in response to diacetyl exposure in different tissues and suggest that these are due to changes in histone acetylation without direct comparison of genes that show up or down regulation with acetylation patterns. They do show in the beginning that diacetyl inhibits HDAC function in vitro and in cell culture. Yet it is critical that they also show a general increase in acetylation levels within tissues profiled for RNA. Additional experiments profiling chromatin and histone acetylation patterns in the tissues where RNA is profiled from would strengthen the argument of the paper.

      We agree with the referee’s suggestion and appreciate it. However, given the heterogeneity of the cell types, and therefore of histone marks in chromatin, within the tissues that we analyzed, we estimate that it will require substantial effort to purify or enrich specific cell populations before performing ChIP-Seq. Future studies will examine correlations between up- and down-regulated genes and histone acetylation patterns in these cells. This effort will require significant resources and time, which we feel are outside the scope of this manuscript.

      e. The rhabdomere experiments might benefit from a negative control. Can the authors expose the flies to another volatile and show neurodegeneration is not affected?

      We exposed the negative control group to headspace odorants of paraffin oil which is a mixture of hydrocarbons.

      f. The same is true for the initial HDAC activity profiles from Figure 1. Can the authors show an HDAC activity that is not affected by diacetyl exposure?

      We exposed the negative control group to headspace odorants of paraffin oil which is a mixture of hydrocarbons. Diacetyl shows very little inhibition (Average inhibition = 7.69%; N=2) in purified human HDAC4 when tested at the 15mM concentration.

      g. One point that might require some explanation in the discussion is why diacetyl exposure only increases acetylation of certain histones but not others in Figure 2, especially given that many HDACs are inhibited by diacetyl in Figure 1.

      Please see response to comment #6, Reviewer 1.

      h. Figure S1C is missing descriptions of what different histogram colors signify.

      We apologize for the oversight and have now indicated it in the Figure legend.

    2. eLife assessment

      This interesting and important work shows that diacetyl, a volatile organic compound released by yeast in fermenting fruit, can act as a histone deacetylase (HDAC) inhibitor and trigger wide changes in gene expression, together with suppression of neurotoxicity in a Drosophila model of Huntington's disease. While the effects on gene expression changes and degenerative phenotypes are convincingly shown, further studies are required to determine whether and how olfactory sensory neurons and odorant receptors mediate the effects of diacetyl described by the authors.

      To the Senior Editor and the Reviewing Editor:

      We sincerely appreciate the valuable comments provided by the reviewers, the reviewing editor, and the senior editor. After carefully reviewing and considering the comments, we have addressed the key concerns raised by the reviewers and made appropriate modifications to the article in the revised manuscript.

      The main revisions made to the manuscript are as follows:

      1) We have added comparison experiments with TNDM (see Fig. 2 and Fig. S2).

      2) We conducted new synthetic experiments to demonstrate that our conclusions are not a by-product of d-VAE (see Fig. S2 and Fig. S11).

      3) We have provided a detailed explanation of how our proposed criteria, especially the second criterion, can effectively exclude the selection of unsuitable signals.

      4) We have included a semantic overview figure of d-VAE (Fig. S1) and a visualization plot of latent variables (Fig. S13).

      5) We have elaborated on the model details of d-VAE, as well as the hyperparameter selection and experimental settings of other comparison models.

      We believe these revisions have significantly improved the clarity and comprehensibility of the manuscript. Thank you for the opportunity to address these important points.

      Q1: “First, the model in the paper is almost identical to an existing VAE model (TNDM) that makes use of weak supervision with behaviour in the same way [1]. This paper should at least be referenced. If the authors wish they could compare their model to TNDM, which combines a state space model with smoothing similar to LFADS. Given that TNDM achieves very good behaviour reconstructions, it may be on par with this model without the need for a Kalman filter (and hence may achieve better separation of behaviour-related and unrelated dynamics).”

      Our model significantly differs from TNDM in several aspects. While TNDM also constrains latent variables to decode behavioral information, it does not impose constraints to maximize behavioral information in the generated relevant signals. The trade-off between the decoding and reconstruction capabilities of generated relevant signals is the most significant contribution of our approach, which is not reflected in TNDM. In addition, the backbone network of signal extraction and the prior distribution of the two models are also different.

      It's worth noting that our method does not require a Kalman filter. The Kalman filter is used only for post hoc assessment of the linear decoding ability of the generated signals. Please note that extracting and evaluating relevant signals are two distinct stages.

      Heeding your suggestion, we have incorporated comparison experiments involving TNDM into the revised manuscript. Detailed information on model hyperparameters and training settings can be found in the Methods section of the revised manuscript.

      Thank you for your valuable feedback.

      Q2: “Second, in my opinion, the claims regarding identifiability are overstated - this matters as the results depend on this to some extent. Recent work shows that VAEs generally suffer from identifiability problems due to the Gaussian latent space [2]. This paper also hints that weak supervision may help to resolve such issues, so this model as well as TNDM and CEBRA may indeed benefit from this. In addition however, it appears that the relative weight of the KL Divergence in the VAE objective is chosen very small compared to the likelihood (0.1%), so the influence of the prior is weak and the model may essentially learn the average neural trajectories while underestimating the noise in the latent variables. This, in turn, could mean that the model will not autoencode neural activity as well as it should, note that an average R2 in this case will still be high (I could not see how this is actually computed). At the same time, the behaviour R2 will be large simply because the different movement trajectories are very distinct. Since the paper makes claims about the roles of different neurons, it would be important to understand how well their single trial activities are reconstructed, which can perhaps best be investigated by comparing the Poisson likelihood (LFADS is a good baseline model). Taken together, while it certainly makes sense that well-tuned neurons contribute more to behaviour decoding, I worry that the very interesting claim that neurons with weak tuning contain behavioural signals is not well supported.”

      We don’t think our distilled signals are average neural trajectories without variability. The quality of single-trial reconstruction can be observed in Fig. 3i and Fig. S4, where the neural trajectories show that our distilled signals are not average neural trajectories. Furthermore, if each trial's activity closely matched the average neural trajectory, the Fano Factor (FF) should theoretically approach 0. However, our distilled signals exhibit a notable departure from this expectation, as evident in Figure 3c, d, g, and f.

      Regarding the diminished influence of the KL divergence: Given that the ground truth of the latent variable distribution is unknown, even a learned prior distribution might not accurately reflect the true distribution. We found that a pronounced KL divergence term is detrimental to decoding and reconstruction performance. As a result, we opted to reduce the weight of the KL divergence term. Even so, the KL divergence can still effectively align the distribution of latent variables with the prior distribution, as illustrated in Fig. S13. Notably, our goal is to extract behaviorally-relevant signals from given raw signals rather than to generate diverse samples from the prior distribution. When the aim is to separate relevant signals, we recommend reducing the influence of the KL divergence.

      Regarding comparing the Poisson likelihood: We compared the Poisson log-likelihood among different methods (except PSID, since its obtained signals have negative values), and the results show that d-VAE outperforms the other methods.

      Author response image 1.
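
      For readers unfamiliar with the metric, the Poisson log-likelihood of observed spike counts under predicted firing rates can be sketched as follows. This is a generic textbook formulation, not the authors' evaluation code; the function name and the toy numbers are assumptions. The check on non-positive rates also shows why a method whose output contains negative values cannot be scored this way.

```python
from math import lgamma, log


def poisson_log_likelihood(counts, rates):
    """Sum of log P(count | rate) over time bins for a Poisson model.

    counts: observed non-negative integer spike counts per bin.
    rates:  predicted expected counts per bin; must be strictly positive.
    """
    total = 0.0
    for k, r in zip(counts, rates):
        if r <= 0:
            raise ValueError("Poisson rates must be strictly positive")
        # log of r**k * exp(-r) / k!, using lgamma(k + 1) = log(k!)
        total += k * log(r) - r - lgamma(k + 1)
    return total


# Toy example: rates that track the counts score higher than flat rates.
counts = [3, 0, 5]
print(poisson_log_likelihood(counts, [3.0, 0.1, 5.0]))
print(poisson_log_likelihood(counts, [1.0, 1.0, 1.0]))
```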

      Regarding how R2 is computed: $R^2 = 1 - \frac{\sum_i (x_i - \hat{x}_i)^2}{\sum_i (x_i - \bar{x})^2}$, where $x_i$, $\hat{x}_i$, and $\bar{x}$ denote the ith sample of raw signals, the ith sample of distilled relevant signals, and the mean of raw signals, respectively. If the distilled signals exactly match the raw signals, the sum of squared errors is zero, and thus R2 = 1. If the distilled signals always equal the mean of the raw signals, R2 = 0. If the distilled signals are worse than the mean estimate, R2 is negative; negative R2 values are set to zero.
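
      The three cases described for R2 can be checked with a few lines of code. The sketch assumes the standard coefficient-of-determination form (one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the raw signals), with negative values clipped to zero as stated; the function name and the toy signals are hypothetical.

```python
def r_squared(raw, distilled):
    """Clipped coefficient of determination between raw and distilled signals."""
    mean_raw = sum(raw) / len(raw)
    ss_res = sum((x - y) ** 2 for x, y in zip(raw, distilled))
    ss_tot = sum((x - mean_raw) ** 2 for x in raw)
    if ss_tot == 0:
        raise ValueError("raw signal has zero variance; R2 is undefined")
    # Negative values (worse than predicting the mean) are set to zero.
    return max(0.0, 1.0 - ss_res / ss_tot)


raw = [1.0, 2.0, 3.0]
print(r_squared(raw, [1.0, 2.0, 3.0]))  # perfect reconstruction
print(r_squared(raw, [2.0, 2.0, 2.0]))  # always predicts the mean
print(r_squared(raw, [3.0, 2.0, 1.0]))  # worse than the mean, clipped
```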

      Thank you for your valuable feedback.

      Q3: “Third, and relating to this issue, I could not entirely follow the reasoning in the section arguing that behavioural information can be inferred from neurons with weak selectivity, but that it is not linearly decodable. It is right to test if weak supervision signals bleed into the irrelevant subspace, but I could not follow the explanations. Why, for instance, is the ANN decoder on raw data (I assume this is a decoder trained fully supervised) not equal in performance to the relevant distilled signals? Should a well-trained non-linear decoder not simply yield a performance ceiling? Next, if I understand correctly, distilled signals were obtained from the full model. How does a model perform trained only on the weakly tuned neurons? Is it possible that the subspaces obtained with the model are just not optimally aligned for decoding? This could be a result of limited identifiability or model specifics that bias reconstruction to averages (a well-known problem of VAEs). I, therefore, think this analysis should be complemented with tests that do not depend on the model.”

      Regarding “Why, for instance, is the ANN decoder on raw data (I assume this is a decoder trained fully supervised) not equal in performance to the relevant distilled signals? Should a well-trained non-linear decoder not simply yield a performance ceiling?”: In fact, the decoding performance of raw signals with ANN is quite close to the ceiling. However, due to the presence of significant irrelevant signals in raw signals, decoding models like deep neural networks are more prone to overfitting when trained on noisy raw signals compared to behaviorally-relevant signals. Consequently, we anticipate that the distilled signals will demonstrate superior decoding generalization. This phenomenon is evident in Fig. 2 and Fig. S1, where the decoding performance of the distilled signals surpasses that of the raw signals, albeit not by a substantial margin.

      Regarding “Next, if I understand correctly, distilled signals were obtained from the full model. How does a model perform trained only on the weakly tuned neurons? Is it possible that the subspaces obtained with the model are just not optimally aligned for decoding?”: Distilled signals (involving all neurons) are obtained by d-VAE. Subsequently, we use ANN to evaluate the performance of smaller and larger R2 neurons. Please note that separating and evaluating relevant signals are two distinct stages.

      Regarding the reasoning in the section arguing that smaller R2 neurons encode rich information, we would like to provide a detailed explanation:

      1) After extracting relevant signals through d-VAE, we specifically selected neurons characterized by smaller R2 values (Here, R2 signifies the proportion of neuronal activity variance explained by the linear encoding model, calculated using raw signals). Subsequently, we employed both KF and ANN to assess the decoding performance of these neurons. Remarkably, our findings revealed that smaller R2 neurons, previously believed to carry limited behavioral information, indeed encode rich information.

      2) In a subsequent step, we employed d-VAE to exclusively distill the raw signals of these smaller R2 neurons (distinct from the earlier experiment where d-VAE processed signals from all neurons). We then employed KF and ANN to evaluate the distilled smaller R2 neurons. Interestingly, we observed that we could not attain the same richness of information solely through the use of these smaller R2 neurons.

      3) Consequently, we put forth and tested two hypotheses: First, that larger R2 neurons introduce additional signals into the smaller R2 neurons that do not exist in the real smaller R2 neurons. Second, that larger R2 neurons aid in restoring the original appearance of impaired smaller R2 neurons. Our proposed criteria and synthetic experiments substantiate the latter scenario.

      Thank you for your valuable feedback.

      Q4: “Finally, a more technical issue to note is related to the choice to learn a non-parametric prior instead of using a conventional Gaussian prior. How is this implemented? Is just a single sample taken during a forward pass? I worry this may be insufficient as this would not sample the prior well, and some other strategy such as importance sampling may be required (unless the prior is not relevant as it weakly contributed to the ELBO, in which case this choice seems not very relevant). Generally, it would be useful to see visualisations of the latent variables to see how information about behaviour is represented by the model.”

      Regarding "how to implement the prior?": Please refer to Equation 7 in the revised manuscript; we have added detailed descriptions in the revised manuscript.

      Regarding "Generally, it would be useful to see visualizations of the latent variables to see how information about behavior is represented by the model.": Note that our focus is not on latent variables but on distilled relevant signals. Nonetheless, at your request, we have added the visualization of latent variables in the revised manuscript. Please see Fig. S13 for details.

      Thank you for your valuable feedback.

      Recommendations: “A minor point: the word 'distill' in the name of the model may be a little misleading - in machine learning the term refers to the construction of smaller models with the same capabilities.

      It should be useful to add a schematic picture of the model to ease comparison with related approaches.”

      In the context of our model's functions, it operates as a distillation process, eliminating irrelevant signals and retaining the relevant ones. Although the name of our model may be a little misleading, it faithfully reflects what our model does.

      We have added a schematic picture of d-VAE in the revised manuscript. Please see Fig. S1 for details.

      Thank you for your valuable feedback.

      Reviewer #2

      Q1: “Is the apparently increased complexity of encoding vs decoding so unexpected given the entropy, sparseness, and high dimensionality of neural signals (the "encoding") compared to the smoothness and low dimensionality of typical behavioural signals (the "decoding") recorded in neuroscience experiments? This is the title of the paper so it seems to be the main result on which the authors expect readers to focus. ”

      We use the term "unexpected" due to the disparity between our findings and the prior understanding concerning neural encoding and decoding. For neural encoding, as we said in the Introduction, in previous studies, weakly-tuned neurons are considered useless, and smaller variance PCs are considered noise, but we found they encode rich behavioral information. For neural decoding, the nonlinear decoding performance of raw signals is significantly superior to linear decoding. However, after eliminating the interference of irrelevant signals, we found the linear decoding performance is comparable to nonlinear decoding. Rooted in these findings, which counter previous thought, we employ the term "unexpected" to characterize our observations.

      Thank you for your valuable feedback.

      Q2: “I take issue with the premise that signals in the brain are "irrelevant" simply because they do not correlate with a fixed temporal lag with a particular behavioural feature hand-chosen by the experimenter. As an example, the presence of a reward signal in motor cortex [1] after the movement is likely to be of little use from the perspective of predicting kinematics from time-bin to time-bin using a fixed model across trials (the apparent definition of "relevant" for behaviour here), but an entire sub-field of neuroscience is dedicated to understanding the impact of these reward-related signals on future behaviour. Is there a method sophisticated enough to see the behavioural "relevance" of this brief, transient, post-movement signal? This may just be an issue of semantics, and perhaps I read too much into the choice of words here. Perhaps the authors truly treat "irrelevant" and "without a fixed temporal correlation" as synonymous phrases and the issue is easily resolved with a clarifying parenthetical the first time the word "irrelevant" is used. But I remain troubled by some claims in the paper which lead me to believe that they read more deeply into the "irrelevancy" of these components.”

      In this paper, we employ terms like ‘behaviorally-relevant’ and ‘behaviorally-irrelevant’ only regarding behavioral variables of interest measured within a given task, such as arm kinematics during a motor control task. A similar definition can be found in the PSID[1].

      Thank you for your valuable feedback.

      [1] Sani, Omid G., et al. "Modeling behaviorally relevant neural dynamics enabled by preferential subspace identification." Nature Neuroscience 24.1 (2021): 140-149.

      Q3: “The authors claim the "irrelevant" responses underpin an unprecedented neuronal redundancy and reveal that movement behaviors are distributed in a higher-dimensional neural space than previously thought." Perhaps I just missed the logic, but I fail to see the evidence for this. The neural space is a fixed dimensionality based on the number of neurons. A more sparse and nonlinear distribution across this set of neurons may mean that linear methods such as PCA are not effective ways to approximate the dimensionality. But ultimately the behaviourally relevant signals seem quite low-dimensional in this paper even if they show some nonlinearity may help.”

      The evidence that the “useless” responses underpin an unprecedented neuronal redundancy is shown in Fig. 5a, d and Fig. S9a. Specifically, the sum of the decoding performance of smaller R2 neurons and larger R2 neurons is significantly greater than that of all neurons for relevant signals (red bar), demonstrating that movement parameters are encoded very redundantly in the neuronal population. In contrast, we cannot find this degree of neural redundancy in raw signals (purple bar).

      The evidence that the “useless” responses reveal that movement behaviors are distributed in a higher-dimensional neural space than previously thought is shown in the left plot (involving KF decoding) of Fig. 6c, f and Fig. S9f. Specifically, the improvement of KF using secondary signals is significantly higher than using raw signals composed of the same number of dimensions as the secondary signals. These results demonstrate that these dimensions, spanning roughly from ten to thirty, encode much information, suggesting that behavioral information exists in a higher-dimensional subspace than anticipated from raw signals.

      Thank you for your valuable feedback.

      Q5: “there is an apparent logical fallacy that begins in the abstract and persists in the paper: "Surprisingly, when incorporating often-ignored neural dimensions, behavioral information can be decoded linearly as accurately as nonlinear decoding, suggesting linear readout is performed in motor cortex." Don't get me wrong: the equivalency of linear and nonlinear decoding approaches on this dataset is interesting, and useful for neuroscientists in a practical sense. However, the paper expends much effort trying to make fundamental scientific claims that do not feel very strongly supported. This reviewer fails to see what we can learn about a set of neurons in the brain which are presumed to "read out" from motor cortex. These neurons will not have access to the data analyzed here. That a linear model can be conceived by an experimenter does not imply that the brain must use a linear model. The claim may be true, and it may well be that a linear readout is implemented in the brain. Other work [2,3] has shown that linear readouts of nonlinear neural activity patterns can explain some behavioural features. The claim in this paper, however, is not given enough”

      Due to the limitations of current observational methods and our incomplete understanding of brain mechanisms, it is indeed challenging to ascertain the specific data the brain acquires to generate behavior and whether it employs a linear readout. Conventionally, the neural data recorded in the motor cortex do encode movement behaviors and can be used to analyze neural encoding and decoding. Based on these data, we found that the linear decoder KF achieves comparable performance to that of the nonlinear decoder ANN on distilled relevant signals. This finding has undergone validation across three widely used datasets, providing substantial evidence. Furthermore, we conducted experiments on synthetic data to show that this conclusion is not a by-product of our model. In the revised manuscript, we added a more detailed description of this conclusion.

      Thank you for your valuable feedback.

      Q6: “Relatedly, I would like to note that the exercise of arbitrarily dividing a continuous distribution of a statistic (the "R2") based on an arbitrary threshold is a conceptually flawed exercise. The authors read too much into the fact that neurons which have a low R2 w.r.t. PDs have behavioural information w.r.t. other methods. To this reviewer, it speaks more about the irrelevance, so to speak, of the preferred direction metric than anything fundamental about the brain.”

      We chose the R2 threshold in accordance with the guidelines provided in reference [1]. It's worth mentioning that this threshold does not exert any significant influence on the overall conclusions.

      Thank you for your valuable feedback.

      [1] Inoue, Y., Mao, H., Suway, S.B., Orellana, J. and Schwartz, A.B., 2018. Decoding arm speed during reaching. Nature communications, 9(1), p.5243.

      Q7: “I am afraid I may be missing something, as I did not understand the fano factor analysis of Figure 3. In a sense the behaviourally relevant signals must have lower FF given they are in effect tied to the temporally smooth (and consistent on average across trials) behavioural covariates. The point of the original Churchland paper was to show that producing a behaviour squelches the variance; naturally these must appear in the behaviourally relevant components. A control distribution or reference of some type would possibly help here.”

      We agree that including reference signals could provide more context. The Churchland paper showed that stimulus onset can reduce neural variability. However, our experiment focuses specifically on the reaching process, so we do not have comparative experiments involving different types of signals.

      Thank you for your valuable feedback.

      Q8: “The authors compare the method to LFADS. While this is a reasonable benchmark as a prominent method in the field, LFADS does not attempt to solve the same problem as d-VAE. A better and much more fair comparison would be TNDM [4], an extension of LFADS which is designed to identify behaviourally relevant dimensions.”

      We have added comparison experiments with TNDM in the revised manuscript (see Fig. 2 and Fig. S2). The model hyperparameters and training settings are detailed in the Methods section of the revised manuscript.

      Thank you for your valuable feedback.

      Reviewer #3

      Q1.1: “TNDM: LFADS is not the best baseline for comparison. The authors should have compared with TNDM (Hurwitz et al. 2021), which is an extension of LFADS that (unlike LFADS) actually attempts to extract behaviorally relevant factors by adding a behavior term to the loss. The code for TNDM is also available on Github. LFADS is not even supervised by behavior and does not aim to address the problem that d-VAE aims to address, so it is not the most appropriate comparison. ”

      We have added comparison experiments with TNDM in the revised manuscript (see Fig. 2 and Fig. S2). The model hyperparameters and training settings are detailed in the Methods section of the revised manuscript.

      Thank you for your valuable feedback.

      Q1.2: “LFADS: LFADS is a sequential autoencoder that processes sections of data (e.g. trials). No explanation is given in Methods for how the data was passed to LFADS. Was the moving averaged smoothed data passed to LFADS or the raw spiking data (at what bin size)? Was a gaussian loss used or a poisson loss? What are the trial lengths used in each dataset, from which part of trials? For dataset C that has back-to-back reaches, was data chopped into segments? How long were these segments? Were the edges of segments overlapped and averaged as in (Keshtkaran et al. 2022) to avoid noisy segment edges or not? These are all critical details that are not explained. The same details would also be needed for a TNDM comparison (comment 1.1) since it has largely the same architecture as LFADS.

      It is also critical to briefly discuss these fundamental differences between the inputs of methods in the main text. LFADS uses a segment of data whereas VAE methods just use one sample at a time. What does this imply in the results? I guess as long as VAEs outperform LFADS it is ok, but if LFADS outperforms VAEs in a given metric, could it be because it received more data as input (a whole segment)? Why was the factor dimension set to 50? I presume it was to match the latent dimension of the VAE methods, but is the LFADS factor dimension the correct match for that to make things comparable?

      I am also surprised by the results. How do the authors justify LFADS having lower neural similarity (fig 2d) than VAE methods that operate on single time steps? LFADS is not supervised by behavior, so of course I don't expect it to necessarily outperform methods on behavior decoding. But all LFADS aims to do is to reconstruct the neural data so at least in this metric it should be able to outperform VAEs that just operate on single time steps? Is it because LFADS smooths the data too much? This is important to discuss and show examples of. These are all critical nuances that need to be discussed to validate the results and interpret them.”

      Regarding “Was the moving averaged smoothed data passed to LFADS or the raw spiking data (at what bin size)? Was a gaussian loss used or a poisson loss?”: All models received data from the same preprocessing procedure, namely moving-average smoothing over three bins with a bin size of 100 ms. For all models except PSID, we used a Poisson loss.
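
      The smoothing step described above can be sketched as follows. This is a minimal illustration rather than our exact pipeline: the function name `moving_average` is hypothetical, and the sketch assumes a window that keeps only bins with a full three-bin history, which is one of several reasonable alignment choices.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def moving_average(counts, window=3):
    """Average binned spike counts over `window` consecutive bins.

    counts: array of shape (n_bins, n_neurons), one row per 100 ms bin.
    Returns an array of shape (n_bins - window + 1, n_neurons).
    Assumption: only bins with a complete window are kept.
    """
    counts = np.asarray(counts, dtype=float)
    # windows has shape (n_bins - window + 1, n_neurons, window)
    windows = sliding_window_view(counts, window, axis=0)
    return windows.mean(axis=-1)

# Toy example: one neuron, four 100 ms bins.
toy_counts = np.array([[0.0], [3.0], [6.0], [9.0]])
toy_smoothed = moving_average(toy_counts, window=3)
```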

      Regarding “What are the trial lengths used in each dataset, from which part of trials? For dataset C that has back-to-back reaches, was data chopped into segments? How long were these segments? Were the edges of segments overlapped and averaged as in (Keshtkaran et al. 2022) to avoid noisy segment edges or not?”:

      For datasets A and B, we set the trial length to eighteen. Trials shorter than this threshold are zero-padded, while trials exceeding it are truncated to the threshold length from their starting point. In dataset A, several trials are considerably longer than most others. We found that padding all trials with zeros to reach the maximum length (32) led to poor performance. Consequently, we chose a trial length of eighteen, which covers the durations of most trials and leads to the removal of approximately 9% of samples. For dataset B (center-out), the trial lengths are relatively consistent with small variation, and the maximum length across all trials is eighteen. For dataset C, we set the trial length to ten because we watched the video of this paradigm and found that completing a single trial took approximately one second. The segments do not overlap.
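
      The padding and truncation rule above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `fix_trial_length` is hypothetical, and each trial is assumed to be an array of shape (n_bins, n_neurons).

```python
import numpy as np

def fix_trial_length(trial, length):
    """Return a trial with exactly `length` bins.

    Longer trials are truncated from their starting point;
    shorter trials are zero-padded at the end.
    """
    trial = np.asarray(trial, dtype=float)
    n_bins, n_neurons = trial.shape
    if n_bins >= length:
        return trial[:length]
    padding = np.zeros((length - n_bins, n_neurons))
    return np.concatenate([trial, padding], axis=0)

# Toy trials with two neurons: one short (5 bins), one long (32 bins).
short_trial = np.ones((5, 2))
long_trial = np.arange(64, dtype=float).reshape(32, 2)

short_fixed = fix_trial_length(short_trial, 18)
long_fixed = fix_trial_length(long_trial, 18)
```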

      Regarding “Why was the factor dimension set to 50? I presume it was to match the latent dimension of the VAE methods, but is the LFADS factor dimension the correct match for that to make things comparable?”: We performed a grid search over latent dimensions in {10,20,50} and found that 50 performed best.

      Regarding “I am also surprised by the results. How do the authors justify LFADS having lower neural similarity (fig 2d) than VAE methods that operate on single time steps? LFADS is not supervised by behavior, so of course I don't expect it to necessarily outperform methods on behavior decoding. But all LFADS aims to do is to reconstruct the neural data so at least in this metric it should be able to outperform VAEs that just operate on single time steps? Is it because LFADS smooths the data too much?”: As you pointed out, we found that LFADS tends to produce excessively smooth and consistent data, which can lead to a reduction in neural similarity.

      Thank you for your valuable feedback.

      Q1.3: “PSID: PSID is linear and uses past input samples to predict the next sample in the output. Again, some setup choices are not well justified, and some details are left out in the 1-line explanation given in Methods.

      Why was a latent dimension of 6 chosen? Is this the behaviorally relevant latent dimension or the total latent dimension (for the use case here it would make sense to set all latent states to be behaviorally relevant)? Why was a horizon hyperparameter of 3 chosen? First, it is important to mention fundamental parameters such as latent dimension for each method in the main text (not just in methods) to make the results interpretable. Second, these hyperparameters should be chosen with a grid search in each dataset (within the training data, based on performance on the validation part of the training data), just as the authors do for their method (line 779). Given that PSID isn't a deep learning method, doing a thorough grid search in each fold should be quite feasible. It is important that high values for latent dimension and a wider range of other hyperparmeters are included in the search, because based on how well the residuals (x_i) for this method are shown predict behavior in Fig 2, the method seems to not have been used appropriately. I would expect ANN to improve decoding for PSID versus its KF decoding since PSID is fully linear, but I don't expect KF to be able to decode so well using the residuals of PSID if the method is used correctly to extract all behaviorally relevant information from neural data. The low neural reconstruction in Fid 2d could also partly be due to using too small of a latent dimension.

      Again, another import nuance is the input to this method and how differs with the input to VAE methods. The learned PSID model is a filter that operates on all past samples of input to predict the output in the "next" time step. To enable a fair comparison with VAE methods, the authors should make sure that the last sample "seen" by PSID is the same as then input sample seen by VAE methods. This is absolutely critical given how large the time steps are, otherwise PSID might underperform simply because it stopped receiving input 300ms earlier than the input received by VAE methods. To fix this, I think the authors can just shift the training and testing neural time series of PSID by 1 sample into the past (relative to the behavior), so that PSID's input would include the input of VAE methods. Otherwise, VAEs outperforming PSID is confounded by PSID's input not including the time step that was provided to VAE.”

      Thank you for suggesting that PSID see the current neural observations. We implemented this change and then performed a grid search over PSID's hyperparameters. Specifically, we searched the horizon hyperparameter over {2,3,4,5,6,7}. Because the relevant latent dimension should not exceed the horizon times the dimension of the behavior variables (two-dimensional velocity in this paper), and because performance saturates as the dimension increases, we directly set the relevant latent dimension to this maximum. The selected horizons for datasets A, B, and C and the synthetic dataset are 7, 6, 6, and 5, respectively.

      Accordingly, the latent dimensions for datasets A, B, and C and the synthetic dataset are 14, 12, 12, and 10, respectively.
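
      The relationship between the selected horizon and the latent dimension can be sketched as follows. This is an illustrative sketch only: `select_horizon` and its `score_fn` argument are hypothetical stand-ins for fitting PSID and scoring it on validation data, and no PSID library call is shown.

```python
BEHAVIOR_DIM = 2  # two-dimensional velocity

def max_relevant_latent_dim(horizon, behavior_dim=BEHAVIOR_DIM):
    """Largest relevant latent dimension for a given horizon."""
    return horizon * behavior_dim

def select_horizon(candidates, score_fn):
    """Pick the horizon with the best validation score.

    score_fn is a hypothetical callable mapping a horizon to a
    validation score (higher is better).
    """
    return max(candidates, key=score_fn)

# Horizons reported above, and the latent dimensions they imply.
selected_horizons = {"A": 7, "B": 6, "C": 6, "synthetic": 5}
latent_dims = {
    name: max_relevant_latent_dim(h) for name, h in selected_horizons.items()
}

# Toy usage with a made-up score that peaks at a horizon of 6.
toy_best = select_horizon([2, 3, 4, 5, 6, 7], lambda h: -abs(h - 6))
```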

      Our experiments show that KF can decode information from irrelevant signals obtained by PSID. Although PSID extracts the linear part of raw signals, KF can still use the linear part of the residuals for decoding. The low reconstruction performance of PSID may be because the relationship between latent variables and neural signals is linear, and the relationship between latent variables and behaviors is also linear; this is equivalent to the linear relationship between behaviors and neural signals, and linear models can only explain a small fraction of neural signals.

      Thank you for your valuable feedback.

      Q1.4: “CEBRA: results for CEBRA are incomplete. Similarity to raw signals is not shown. Decoding of behaviorally irrelevant residuals for CEBRA is not shown. Per Fig. S2, CEBRA does better or similar ANN decoding in datasets A and C, is only slightly worse in Dataset B, so it is important to show the other key metrics otherwise it is unclear whether d-VAE has some tangible advantage over CEBRA in those 2 datasets or if they are similar in every metric. Finally, it would be better if the authors show the results for CEBRA on Fig. 2, just as is done for other methods because otherwise it is hard to compare all methods.”

      CEBRA is a non-generative model and therefore cannot generate behaviorally-relevant signals. For this reason, we compared only the decoding performance of the latent embeddings of CEBRA with that of the signals of d-VAE.

      Thank you for your valuable feedback.

      Q2: “Given the fact that d-VAE infers the latent (z) based on the population activity (x), claims about properties of the inferred behaviorally relevant signals (x_r) that attribute properties to individual neurons are confounded.

      The authors contrast their approach to population level approaches in that it infers behaviorally relevant signals for individual neurons. However, d-VAE is also a population method as it aggregates population information to infer the latent (z), from which behaviorally relevant part of the activity of each neuron (x_r) is inferred. The authors note this population level aggregation of information as a benefit of d-VAE, but only acknowledge it as a confound briefly in the context of one of their analyses (line 340): "The first is that the larger R2 neurons leak their information to the smaller R2 neurons, causing them contain too much behavioral information". They go on to dismiss this confounding possibility by showing that the inferred behaviorally relevant signal of each neuron is often most similar to its own raw signals (line 348-352) compared with all other neurons. They also provide another argument specific to that result section (i.e., residuals are not very behavior predictive), which is not general so I won't discuss it in depth here. These arguments however do not change the basic fact that d-VAE aggregates information from other neurons when extracting the behaviorally relevant activity of any given neuron, something that the authors note as a benefit of d-VAE in many instances. The fact that d-VAE aggregates population level info to give the inferred behaviorally relevant signal for each neuron confounds several key conclusions. For example, because information is aggregated across neurons, when trial to trial variability looks smoother after applying d-VAE (Fig 3i), or reveals better cosine tuning (Fig 3b), or when neurons that were not very predictive of behavior become more predictive of behavior (Fig 5), one cannot really attribute the new smoother single trial activity or the improved decoding to the same single neurons; rather these new signals/performances include information from other neurons. 
Unless the connections of the encoder network (z=f(x)) is zero for all other neurons, one cannot claim that the inferred rates for the neuron are truly solely associated with that neuron. I believe this a fundamental property of a population level VAE, and simply makes the architecture unsuitable for claims regarding inherent properties of single neurons. This confound is partly why the first claim in the abstract are not supported by data: observing that neurons that don't predict behavior very well would predict it much better after applying d-VAE does not prove that these neurons themselves "encode rich[er] behavioral information in complex nonlinear ways" (i.e., the first conclusion highlighted in the abstract) because information was also aggregated from other neurons. The other reason why this claim is not supported by data is the characterization of the encoding for smaller R2 neurons as "complex nonlinear", which the method is not well equipped to tease apart from linear mappings as I explain in my comment 3.”

      We acknowledge that we cannot obtain exact single-neuron activity that contains no information from other neurons. However, we believe our model can extract accurate approximations of the ground-truth relevant signals. These signals preserve the inherent properties of single neuronal activity to some extent and can be used for analysis at the single-neuron level.

      We believe d-VAE is a reasonable approach to extract effective relevant signals that preserve inherent properties of single neuronal activity for four key reasons:

      1) d-VAE is a latent variable model that adheres to the neural population doctrine. The neural population doctrine posits that information is encoded within interconnected groups of neurons, with the existence of latent variables (neural modes) responsible for generating observable neuronal activity [1, 2]. If we can perfectly obtain the true generative model from latent variables to neuronal activity, then we can generate the activity of each neuron from hidden variables without containing any information from other neurons. However, without a complete understanding of the brain’s encoding strategies (or generative model), we can only get the approximation signals of the ground truth signals.

      2) After the generative model is established, we need to infer the parameters of the generative model and the distribution of latent variables. During the inference process, inference algorithms such as variational inference or EM algorithms are used. Generally, the obtained latent variables are also approximations of the real latent variables. When inferring the latent variables, it is inevitable to aggregate the information of the neural population, and latent variables are derived through weighted combinations of neuronal populations [3].

      This inference process is consistent with that of d-VAE (or VAE-based models).

      3) Latent variables are derived from raw neural signals and used to explain raw neural signals. Considering the unknown ground truth of latent variables and behaviorally-relevant signals, it becomes evident that the only reliable reference at the signal level is the raw signals. A crucial criterion for evaluating the reliability of latent variable models (including latent variables and generated relevant signals) is their capability to effectively explain the raw signals [3]. Consequently, we firmly maintain the belief that if the generated signals closely resemble the raw signals to the greatest extent possible, in accordance with an equivalence principle, we can claim that these obtained signals faithfully retain the inherent properties of single neurons. d-VAE explicitly constrains the generated signal to closely resemble the raw signals. These results demonstrate that d-VAE can extract effective relevant signals that preserve inherent properties of single neuronal activity.

      Based on the above reasons, we hold that generating single neuronal activities with the VAE framework is a reasonable approach. The remaining question is whether our model can obtain accurate relevant signals in the absence of ground truth. To our knowledge, in cases where the ground truth of relevant signals is unknown, there are typically two approaches to verifying the reliability of extracted signals:

      1) Conducting synthetic experiments where the ground truth is known.

      2) Validation based on expert knowledge (three criteria were proposed in this paper).

      Both our extracted signals and key conclusions have been validated using these two approaches.

      Next, we will provide a detailed response to the concerns regarding our first key conclusion that smaller R2 neurons encode rich information.

      We acknowledge that larger R2 neurons play a role in aiding the reconstruction of signals in smaller R2 neurons through their neural activity. However, considering that neurons are correlated rather than independent entities, we maintain the belief that larger R2 neurons assist damaged smaller R2 neurons in restoring their original appearance. Taking image denoising as an example, when restoring noisy pixels to their original appearance, relying solely on the noisy pixels themselves is often impractical. Assistance from their correlated, clean neighboring pixels becomes necessary.

      The case we need to be cautious of is that the larger R2 neurons introduce additional signals (m) that contain substantial information to smaller R2 neurons, which they do not inherently possess. We believe this case does not hold for two reasons. Firstly, logically, adding extra signals decreases the reconstruction performance, and the information carried by these additional signals is redundant for larger R2 neurons, thus they do not introduce new information that can enhance the decoding performance of the neural population. Therefore, it seems unlikely and unnecessary for neural networks to engage in such counterproductive actions. Secondly, even if this occurs, our second criterion can effectively exclude the selection of these signals. To clarify, if we assume that x, y, and z denote the raw, relevant, and irrelevant signals of smaller R2 neurons, with x=y+z, and the extracted relevant signals become y+m, the irrelevant signals become z-m in this case. Consequently, the irrelevant signals contain a significant amount of information. It's essential to emphasize that this criterion holds significant importance in excluding undesirable signals.
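
      The role of the second criterion in this argument can be illustrated numerically. The following is a toy sketch, not our actual analysis: the signal scales, the one-dimensional behavior variable, and the leaked signal m are all assumptions chosen for illustration, and a squared correlation stands in for decoding performance.

```python
import numpy as np

def r2_with_behavior(signal, behavior):
    """Squared Pearson correlation, i.e. R^2 of a one-variable linear fit."""
    return float(np.corrcoef(signal, behavior)[0, 1] ** 2)

rng = np.random.default_rng(0)
n_samples = 2000
behavior = rng.standard_normal(n_samples)

y = 0.5 * behavior                   # true relevant part of a smaller R2 neuron
z = rng.standard_normal(n_samples)   # true irrelevant part
x = y + z                            # raw signal
m = 2.0 * behavior                   # hypothetical signal leaked from other neurons

honest_residual = x - y              # equals z
leaky_residual = x - (y + m)         # equals z - m

r2_honest = r2_with_behavior(honest_residual, behavior)
r2_leaky = r2_with_behavior(leaky_residual, behavior)
```

      In this toy setting the residual of the leaky extraction is strongly predictive of behavior, whereas the residual of the honest extraction is not, which is the pattern the second criterion is designed to detect.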

      Furthermore, we conducted a synthetic experiment to show that d-VAE can indeed restore the damaged information of smaller R2 neurons with the help of larger R2 neurons, and the restored neuronal activities are more similar to ground truth compared to damaged raw signals. Please see Fig. S11a,b for details.

      Thank you for your valuable feedback.

      [1] Saxena, S. and Cunningham, J.P., 2019. Towards the neural population doctrine. Current opinion in neurobiology, 55, pp.103-111.

      [2] Gallego, J.A., Perich, M.G., Miller, L.E. and Solla, S.A., 2017. Neural manifolds for the control of movement. Neuron, 94(5), pp.978-984.

      [3] Cunningham, J.P. and Yu, B.M., 2014. Dimensionality reduction for large-scale neural recordings. Nature neuroscience, 17(11), pp.1500-1509.

      Q3: “Given the nonlinear architecture of the VAE, claims about the linearity or nonlinearity of cortical readout are confounded and not supported by the results.

      The inference of behaviorally relevant signals from raw signals is a nonlinear operation, that is x_r=g(f(x)) is nonlinear function of x. So even when a linear KF is used to decode behavior from the inferred behaviorally relevant signals, the overall decoding from raw signals to predicted behavior (i.e., KF applied to g(f(x))) is nonlinear. Thus, the result that decoding of behavior from inferred behaviorally relevant signals (x_r) using a linear KF and a nonlinear ANN reaches similar accuracy (Fig 2), does not suggest that a "linear readout is performed in the motor cortex", as the authors claim (line 471). The authors acknowledge this confound (line 472) but fail to address it adequately. They perform a simulation analysis where the decoding gap between KF and ANN remains unchanged even when d-VAE is used to infer behaviorally relevant signals in the simulation. However, this analysis is not enough for "eliminating the doubt" regarding the confound. I'm sure the authors can also design simulations where the opposite happens and just like in the data, d-VAE can improve linear decoding to match ANN decoding. An adequate way to address this concern would be to use a fully linear version of the autoencoder where the f(.) and g(.) mappings are fully linear. They can simply replace these two networks in their model with affine mappings, redo the modeling and see if the model still helps the KF decoding accuracy reach that of the ANN decoding. In such a scenario, because the overall KF decoding from original raw signals to predicted behavior (linear d-VAE + KF) is linear, then they could move toward the claim that the readout is linear. Even though such a conclusion would still be impaired by the nonlinear reference (d-VAE + ANN decoding) because the achieved nonlinear decoding performance could always be limited by network design and fitting issues. 
Overall, the third conclusion highlighted in the abstract is a very difficult claim to prove and is unfortunately not supported by the results.”

      We aim to explore the readout mechanism of behaviorally-relevant signals, rather than raw signals. Theoretically, the process of removing irrelevant signals should not be considered part of the inherent decoding mechanisms of the relevant signals. Assuming that the relevant signals we extracted are accurate, the conclusion of linear readout is established. On the synthetic data where the ground truth is known, our distilled signals show a significant improvement in neural similarity to the ground truth when compared to raw signals (refer to Fig. S2l). This observation demonstrates that our distilled signals are accurate approximations of the ground truth. Furthermore, on the three widely-used real datasets, our distilled signals meet the stringent criteria we have proposed (see Fig. 2), also providing strong evidence for their accuracy.

      Regarding the assertion that we could create simulations in which d-VAE turns inherently nonlinearly decodable signals into linearly decodable ones: in practice, we cannot achieve this, because the second criterion rules out the selection of such signals. Specifically, let z = x + y = n^2 + y, where z, x, y, and n denote the raw signals, relevant signals, irrelevant signals, and latent variables, respectively. If the relevant signals obtained by d-VAE are n, then these signals can be linearly decoded accurately. However, the corresponding irrelevant signals are z - n = n^2 - n + y; thus, the irrelevant signals will contain a large amount of information, and these extracted relevant signals will not be selected. Furthermore, our synthetic experiments offer additional evidence supporting the conclusion that d-VAE does not make inherently nonlinearly decodable signals become linearly decodable ones. As depicted in Fig. S11c, there exists a significant performance gap between KF and ANN when decoding the ground truth signals of smaller R2 neurons. KF exhibits notably low performance, leaving substantial room for compensation by d-VAE. However, following processing by d-VAE, KF's performance on the distilled signals fails to surpass its already low ground truth performance and remains significantly inferior to ANN's performance. These results collectively confirm that our approach does not convert signals that are inherently nonlinearly decodable into linearly decodable ones, and that the conclusion of linear readout is not a by-product of d-VAE.
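
      The n^2 example can be checked with a small simulation. This is a toy sketch under stated assumptions: the behavior is taken to equal the latent variable n, all variables are standard normal, and a squared correlation stands in for linear decoding performance. With raw signals n^2 + y, extracting n as the relevant signal leaves the residual n^2 - n + y.

```python
import numpy as np

def linear_r2(signal, target):
    """Squared Pearson correlation, i.e. R^2 of a one-variable linear fit."""
    return float(np.corrcoef(signal, target)[0, 1] ** 2)

rng = np.random.default_rng(0)
n_samples = 5000
n = rng.standard_normal(n_samples)   # latent variable; behavior assumed equal to n
y = rng.standard_normal(n_samples)   # true irrelevant signals
x = n ** 2                           # true relevant signals (nonlinear in n)
raw = x + y                          # raw signals

# Honest extraction returns x; a linearizing extraction returns n.
r2_true_relevant = linear_r2(x, n)            # poor linear decoding
r2_linearized = linear_r2(n, n)               # perfect linear decoding
r2_honest_residual = linear_r2(raw - x, n)    # residual y carries no information
r2_linearized_residual = linear_r2(raw - n, n)  # residual n^2 - n + y does
```

      Under these assumptions the linearized extraction decodes perfectly, but its residual retains a substantial share of behavioral information, so the second criterion would reject it.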

      Regarding the suggestion of using linear d-VAE + KF: as noted in the Discussion section, removing the irrelevant signals requires a nonlinear operation, and a linear d-VAE cannot effectively separate relevant and irrelevant signals.

      Thank you for your valuable feedback.

      Q4: “The authors interpret several results as indications that "behavioral information is distributed in a higher-dimensional subspace than expected from raw signals", which is the second main conclusion highlighted in the abstract. However, several of these arguments do not convincingly support that conclusion.

      4.1) The authors observe that behaviorally relevant signals for neurons with small principal components (referred to as secondary) have worse decoding with KF but better decoding with ANN (Fig. 6b,e), which also outperforms ANN decoding from raw signals. This observation is taken to suggest that these secondary behaviorally relevant signals encode behavior information in highly nonlinear ways and in a higher dimensions neural space than expected (lines 424 and 428). These conclusions however are confounded by the fact that A) d-VAE uses nonlinear encoding, so one cannot conclude from ANN outperforming KF that behavior is encoded nonlinearly in the motor cortex (see comment 3 above), and B) d-VAE aggregates information across the population so one cannot conclude that these secondary neurons themselves had as much behavior information (see comment 2 above).

      4.2) The authors observe that the addition of the inferred behaviorally relevant signals for neurons with small principal components (referred to as secondary) improves the decoding of KF more than it improves the decoding of ANN (red curves in Fig 6c,f). This again is interpreted similarly as in 4.1, and is confounded for similar reasons (line 439): "These results demonstrate that irrelevant signals conceal the smaller variance PC signals, making their encoded information difficult to be linearly decoded, suggesting that behavioral information exists in a higher-dimensional subspace than anticipated from raw signals". This is confounded by because of the two reasons explained in 4.1. To conclude nonlinear encoding based on the difference in KF and ANN decoding, the authors would need to make the encoding/decoding in their VAE linear to have a fully linear decoder on one hand (with linear d-VAE + KF) and a nonlinear decoder on the other hand (with linear d-VAE + ANN), as explained in comment 3.

      4.3) From S Fig 8, where the authors compare cumulative variance of PCs for raw and inferred behaviorally relevant signals, the authors conclude that (line 554): "behaviorally-irrelevant signals can cause an overestimation of the neural dimensionality of behaviorally-relevant responses (Supplementary Fig. S8)." However, this analysis does not really say anything about overestimation of "behaviorally relevant" neural dimensionality since the comparison is done with the dimensionality of "raw" signals. The next sentence is ok though: "These findings highlight the need to filter out relevant signals when estimating the neural dimensionality.", because they use the phrase "neural dimensionality" not "neural dimensionality of behaviorally-relevant responses".”

      Questions 4.1 and 4.2 are a combination of Q2 and Q3. Please refer to our responses to Q2 and Q3.

      Regarding question 4.3 about “behaviorally-irrelevant signals can cause an overestimation of the neural dimensionality of behaviorally-relevant responses”: Previous studies usually used raw signals to estimate the neural dimensionality of specific behaviors. Our point is that using raw signals, which include many irrelevant signals, will cause an overestimation of the neural dimensionality. We have modified this sentence in the revised manuscript.

      Thank you for your valuable feedback.

      Q5: “Imprecise use of language in many places leads to inaccurate statements. I will list some of these statements

      5.1) In the abstract: "One solution is to accurately separate behaviorally-relevant and irrelevant signals, but this approach remains elusive due to the unknown ground truth of behaviorally-relevant signals". This statement is not accurate because it implies no prior work does this. The authors should make their statement more specific and also refer to some goal that existing linear (e.g., PSID) and nonlinear (e.g., TNDM) methods for extracting behaviorally relevant signals fail to achieve.

      5.2) In the abstract: "we found neural responses previously considered useless encode rich behavioral information" => what does "useless" mean operationally? Low behavior tuning? More precise use of language would be better.

      5.3) "... recent studies (Glaser 58 et al., 2020; Willsey et al., 2022) demonstrate nonlinear readout outperforms linear readout." => do these studies show that nonlinear "readout" outperforms linear "readout", or just that nonlinear models outperform linear models?

      5.4) Line 144: "The first criterion is that the decoding performance of the behaviorally-relevant signals (red bar, Fig.1) should surpass that of raw signals (the red dotted line, Fig.1).". Do the authors mean linear decoding here or decoding in general? If the latter, how can something extracted from neural surpass decoding of neural data, when the extraction itself can be thought of as part of decoding? The operational definition for this "decoding performance" should be clarified.

      5.5) Line 311: "we found that the dimensionality of primary subspace of raw signals (26, 64, and 45 for datasets A, B, and C) is significantly higher than that of behaviorally-relevant signals (7, 13, and 9), indicating that behaviorally-irrelevant signals lead to an overestimation of the neural dimensionality of behaviorally-relevant signals." => here the dimensionality of the total PC space (i.e., primary subspace of raw signals) is being compared with that of inferred behaviorally-relevant signals, so the former being higher does not indicate that neural dimensionality of behaviorally-relevant signals was overestimated. The former is simply not behavioral so this conclusion is not accurate.

      5.6) Section "Distilled behaviorally-relevant signals uncover that smaller R2 neurons encode rich behavioral information in complex nonlinear ways". Based on what kind of R2 are the neurons grouped? Behavior decoding R2 from raw signals? Using what mapping? Using KF? If KF is used, the result that small R2 neurons benefit a lot from d-VAE could be somewhat expected, given the nonlinearity of d-VAE: because only ANN would have the capacity to unwrap the nonlinear encoding of d-VAE as needed. If decoding performance that is used to group neurons is based on data, regression to the mean could also partially explain the result: the neurons with worst raw decoding are most likely to benefit from a change in decoder, than neurons that already had good decoding. In any case, the R2 used to partition and sort neurons should be more clearly stated and reminded throughout the text and in Fig 3.

      5.7) Line 346 "...it is impossible for our model to add the activity of larger R2 neurons to that of smaller R2 neurons" => Is it really impossible? The optimization can definitely add small-scale copies of behaviorally relevant information to all neurons with minimal increase in the overall optimization loss, so this statement seems inaccurate.

      5.8) Line 490: "we found that linear decoders can achieve comparable performance to that of nonlinear decoders, providing compelling evidence for the presence of linear readout in the motor cortex." => inaccurate because no d-VAE decoding is really linear, as explained in comment 3 above.

      5.9) Line 578: ". However, our results challenge this idea by showing that signals composed of smaller variance PCs nonlinearly encode a significant amount of behavioral information." => inaccurate as results are confounded by nonlinearity of d-VAE as explained in comment 3 above.

      5.10) Line 592: "By filtering out behaviorally-irrelevant signals, our study found that accurate decoding performance can be achieved through linear readout, suggesting that the motor cortex may perform linear readout to generate movement behaviors." => inaccurate because it is confounded by the nonlinearity of d-VAE as explained in comment 3 above.”

      Regarding “5.1) In the abstract: "One solution is to accurately separate behaviorally-relevant and irrelevant signals, but this approach remains elusive due to the unknown ground truth of behaviorally-relevant signals". This statement is not accurate because it implies no prior work does this. The authors should make their statement more specific and also refer to some goal that existing linear (e.g., PSID) and nonlinear (e.g., TNDM) methods for extracting behaviorally relevant signals fail to achieve”:

      We believe our statement is accurate. Our primary objective is to extract accurate behaviorally-relevant signals that closely approximate the ground truth relevant signals. To achieve this, we strike a balance between the reconstruction and decoding performance of the generated signals, aiming to effectively capture the relevant signals. This crucial aspect of our approach sets it apart from other methods. In contrast, other methods tend to emphasize the extraction of valuable latent neural dynamics. We have provided elaboration on the distinctions between d-VAE and other approaches in the Introduction and Discussion sections.

      Thank you for your valuable feedback.

      Regarding “5.2) In the abstract: "we found neural responses previously considered useless encode rich behavioral information" => what does "useless" mean operationally? Low behavior tuning? More precise use of language would be better.”:

      In the analysis of neural signals, smaller variance PC signals are typically seen as noise and are often discarded. Similarly, smaller R2 neurons are commonly thought to be dominated by noise and are not further analyzed. Given these considerations, we believe that the term "considered useless" is appropriate in this context. Thank you for your valuable feedback.

      Regarding “5.3) "... recent studies (Glaser et al., 2020; Willsey et al., 2022) demonstrate nonlinear readout outperforms linear readout." => do these studies show that nonlinear "readout" outperforms linear "readout", or just that nonlinear models outperform linear models?”:

      In this paper, we consider the two statements to be equivalent. Thank you for your valuable feedback.

      Regarding “5.4) Line 144: "The first criterion is that the decoding performance of the behaviorally-relevant signals (red bar, Fig.1) should surpass that of raw signals (the red dotted line, Fig.1).". Do the authors mean linear decoding here or decoding in general? If the latter, how can something extracted from neural surpass decoding of neural data, when the extraction itself can be thought of as part of decoding? The operational definition for this "decoding performance" should be clarified.”:

      We mean the latter. As stated in the section “Framework for defining, extracting, and separating behaviorally-relevant signals”, raw signals contain many behaviorally-irrelevant signals, so deep neural networks are more prone to overfitting on raw signals than on relevant signals. The decoding performance of relevant signals should therefore surpass that of raw signals. Thank you for your valuable feedback.

      Regarding “5.5) Line 311: "we found that the dimensionality of primary subspace of raw signals (26, 64, and 45 for datasets A, B, and C) is significantly higher than that of behaviorally-relevant signals (7, 13, and 9), indicating that behaviorally-irrelevant signals lead to an overestimation of the neural dimensionality of behaviorally-relevant signals." => here the dimensionality of the total PC space (i.e., primary subspace of raw signals) is being compared with that of inferred behaviorally-relevant signals, so the former being higher does not indicate that neural dimensionality of behaviorally-relevant signals was overestimated. The former is simply not behavioral so this conclusion is not accurate.”:

      In practice, researchers usually estimate neural dimensionality from raw signals. We mean that estimating it from raw signals would overestimate the neural dimensionality. Thank you for your valuable feedback.

      Regarding “5.6) Section "Distilled behaviorally-relevant signals uncover that smaller R2 neurons encode rich behavioral information in complex nonlinear ways". Based on what kind of R2 are the neurons grouped? Behavior decoding R2 from raw signals? Using what mapping? Using KF? If KF is used, the result that small R2 neurons benefit a lot from d-VAE could be somewhat expected, given the nonlinearity of d-VAE: because only ANN would have the capacity to unwrap the nonlinear encoding of d-VAE as needed. If decoding performance that is used to group neurons is based on data, regression to the mean could also partially explain the result: the neurons with worst raw decoding are most likely to benefit from a change in decoder, than neurons that already had good decoding. In any case, the R2 used to partition and sort neurons should be more clearly stated and reminded throughout the text and in Fig 3.”:

      The R2 used to characterize neurons indicates the extent to which a neuron's activity is explained by a linear encoding model [1-3]. Smaller R2 neurons are less linearly tuned to behavior, whereas larger R2 neurons are more linearly tuned to behavior. Specifically, we first fit a linear encoding model from velocity to the neural signal, i.e., y=f(x), where f is a linear regression model, x denotes velocity, and y denotes the neural signal. R2 then quantifies how well this linear encoding model explains the neural activity. We have provided a comprehensive explanation in the revised manuscript. Thank you for your valuable feedback.

      [1] Collinger, J.L., Wodlinger, B., Downey, J.E., Wang, W., Tyler-Kabara, E.C., Weber, D.J., McMorland, A.J., Velliste, M., Boninger, M.L. and Schwartz, A.B., 2013. High-performance neuroprosthetic control by an individual with tetraplegia. The Lancet, 381(9866), pp.557-564.

      [2] Wodlinger, B., et al., 2014. Ten-dimensional anthropomorphic arm control in a human brain-machine interface: difficulties, solutions, and limitations. Journal of Neural Engineering, 12(1), p.016011.

      [3] Inoue, Y., Mao, H., Suway, S.B., Orellana, J. and Schwartz, A.B., 2018. Decoding arm speed during reaching. Nature communications, 9(1), p.5243.
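      The R2 grouping described in this response can be illustrated with a minimal sketch. It is not the authors' code: it assumes a one-dimensional velocity for brevity, and the two synthetic "neurons" and all function and variable names are hypothetical. It fits y=f(x) by least squares for each neuron and reports the R2 of that linear encoding model, showing that a neuron whose activity depends on velocity nonlinearly can still receive a small linear R2.

```python
import random


def linear_encoding_r2(velocity, activity):
    """R^2 of the least-squares fit activity ~ slope * velocity + intercept."""
    n = len(velocity)
    mean_x = sum(velocity) / n
    mean_y = sum(activity) / n
    sxx = sum((x - mean_x) ** 2 for x in velocity)
    syy = sum((y - mean_y) ** 2 for y in activity)
    if sxx == 0 or syy == 0:
        return 0.0  # no variance to explain
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(velocity, activity))
    slope = sxy / sxx
    ss_res = sum(
        (y - mean_y - slope * (x - mean_x)) ** 2
        for x, y in zip(velocity, activity)
    )
    return 1.0 - ss_res / syy


# Synthetic data (assumption): not taken from any dataset in the paper.
rng = random.Random(0)
velocity = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
linear_neuron = [2.0 * x + rng.gauss(0.0, 0.2) for x in velocity]
nonlinear_neuron = [x * x + rng.gauss(0.0, 0.2) for x in velocity]

r2_linear_neuron = linear_encoding_r2(velocity, linear_neuron)
r2_nonlinear_neuron = linear_encoding_r2(velocity, nonlinear_neuron)
print(f"linear neuron R2: {r2_linear_neuron:.3f}")
print(f"nonlinear neuron R2: {r2_nonlinear_neuron:.3f}")
```

      In this toy setting the second neuron is driven entirely by velocity, yet its linear-encoding R2 is close to zero because the dependence is quadratic and symmetric about zero velocity.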

      Regarding Questions 5.7, 5.8, 5.9, and 5.10:

      We believe our conclusions are solid. The reasons can be found in our replies in Q2 and Q3. Thank you for your valuable feedback.

      Q6: “Imprecise use of language also sometimes is not inaccurate but just makes the text hard to follow.

      6.1) Line 41: "about neural encoding and decoding mechanisms" => what is the definition of encoding/decoding and how do these differ? The definitions given much later in line 77-79 is also not clear.

      6.2) Line 323: remind the reader about what R2 is being discussed, e.g., R2 of decoding behavior using KF. It is critical to know if linear or nonlinear decoding is being discussed.

      6.3) Line 488: "we found that neural responses previously considered trivial encode rich behavioral information in complex nonlinear ways" => "trivial" in what sense? These phrases would benefit from more precision, for example: "neurons that may seem to have little or no behavior information encoded". The same imprecise word ("trivial") is also used in many other places, for example in the caption of Fig S9.

      6.4) Line 611: "The same should be true for the brain." => Too strong of a statement for an unsupported claim suggesting the brain does something along the lines of nonlin VAE + linear readout.

      6.5) In Fig 1, legend: what is the operational definition of "generating performance"? Generating what? Neural reconstruction?”

      Regarding “6.1) Line 41: "about neural encoding and decoding mechanisms" => what is the definition of encoding/decoding and how do these differ? The definitions given much later in line 77-79 is also not clear.”:

      We would like to provide a detailed explanation of neural encoding and decoding. Neural encoding describes how neuronal activity encodes behavior, that is, y=f(x), where y denotes neural activity, x denotes behavior, and f is the encoding model. Neural decoding describes how the brain decodes behavior from neural activity, that is, x=g(y), where g is the decoding model. For further elaboration, please refer to [1]. We have included references that discuss the concepts of encoding and decoding in the revised manuscript. Thank you for your valuable feedback.

      [1] Kriegeskorte, Nikolaus, and Pamela K. Douglas. "Interpreting encoding and decoding models." Current opinion in neurobiology 55 (2019): 167-179.
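      To make the distinction between y=f(x) and x=g(y) concrete, the sketch below fits both directions with ordinary least squares on the same synthetic data. It is only an illustration under stated assumptions: scalar behavior, a single noisy neuron, and hypothetical names throughout. It shows that a fitted decoder is not simply the inverse of the fitted encoder: for simple linear regression, the product of the two slopes equals the squared correlation, which is below 1 whenever noise is present.

```python
import random


def fit_line(inputs, targets):
    """Least-squares slope and intercept for targets ~ slope * inputs + intercept."""
    n = len(inputs)
    mean_in = sum(inputs) / n
    mean_out = sum(targets) / n
    sxx = sum((u - mean_in) ** 2 for u in inputs)
    sxy = sum((u - mean_in) * (v - mean_out) for u, v in zip(inputs, targets))
    slope = sxy / sxx
    return slope, mean_out - slope * mean_in


# Synthetic data (assumption): behavior x drives neural activity y plus noise.
rng = random.Random(1)
behavior = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
neural = [1.5 * x + rng.gauss(0.0, 0.5) for x in behavior]

enc_slope, enc_intercept = fit_line(behavior, neural)  # encoding: y = f(x)
dec_slope, dec_intercept = fit_line(neural, behavior)  # decoding: x = g(y)

# An exact inverse would give a product of 1; regression gives r^2 instead.
slope_product = enc_slope * dec_slope
print(f"encoding slope: {enc_slope:.3f}, decoding slope: {dec_slope:.3f}")
print(f"product of slopes: {slope_product:.3f}")
```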

      Regarding “6.2) Line 323: remind the reader about what R2 is being discussed, e.g., R2 of decoding behavior using KF. It is critical to know if linear or nonlinear decoding is being discussed.”:

      This question is the same as Q5.6. Please refer to the response to Q5.6. Thank you for your valuable feedback.

      Regarding “6.3) Line 488: "we found that neural responses previously considered trivial encode rich behavioral information in complex nonlinear ways" => "trivial" in what sense? These phrases would benefit from more precision, for example: "neurons that may seem to have little or no behavior information encoded". The same imprecise word ("trivial") is also used in many other places, for example in the caption of Fig S9.”:

      We have revised this statement in the revised manuscript. Thanks for your recommendation.

      Regarding “6.4) Line 611: "The same should be true for the brain." => Too strong of a statement for an unsupported claim suggesting the brain does something along the lines of nonlin VAE + linear readout.”:

      We mean that removing the interference of irrelevant signals and decoding the relevant signals should logically be two stages. We have revised this statement in the revised manuscript. Thank you for your valuable feedback.

      Regarding “6.5) In Fig 1, legend: what is the operational definition of "generating performance"? Generating what? Neural reconstruction?”:

      We have replaced “generating performance” with “reconstruction performance” in the revised manuscript. Thanks for your recommendation.

      Q7: “In the analysis presented starting in line 449, the authors compare improvement gained for decoding various speed ranges by adding secondary (small PC) neurons to the KF decoder (Fig S11). Why is this done using the KF decoder, when earlier results suggest an ANN decoder is needed for accurate decoding from these small PC neurons? It makes sense to use the more accurate nonlinear ANN decoder to support the fundamental claim made here, that smaller variance PCs are involved in regulating precise control”

      We used the KF because its performance improves substantially when the secondary signal is superimposed on the primary signal, and we wanted to determine which aspect of behavior this improvement mainly reflects. In comparison, the secondary signal improves ANN performance only slightly, so the same analysis with the ANN would be uninformative. Thank you for your valuable feedback.

      Q8: “A key limitation of the VAE architecture is that it doesn't aggregate information over multiple time samples. This may be why the authors decided to use a very large bin size of 100ms and beyond that smooth the data with a moving average. This limitation should be clearly stated somewhere in contrast with methods that can aggregate information over time (e.g., TNDM, LFADS, PSID)”

      We have added this limitation in the Discussion in the revised manuscript. Thanks for your recommendation.

      Q9: “Fig 5c and parts of the text explore the decoding when some neurons are dropped. These results should come with a reminder that dropping neurons from behaviorally relevant signals is not technically possible since the extraction of behaviorally relevant signals with d-VAE is a population level aggregation that requires the raw signal from all neurons as an input. This is also important to remind in some places in the text for example:

      • Line 498: "...when one of the neurons is destroyed."

      • Line 572: "In contrast, our results show that decoders maintain high performance on distilled signals even when many neurons drop out."”

      We want to explore the robustness of real relevant signals in the face of neuron drop-out. The signals our model extracted are an approximation of the ground truth relevant signals and thus serve as a substitute for ground truth to study this problem. Thank you for your valuable feedback.

      Q10: “Besides the confounded conclusions regarding the readout being linear (see comment 3 and items related to it in comment 5), the authors also don't adequately discuss prior works that suggest nonlinearity helps decoding of behavior from the motor cortex. Around line 594, a few works are discussed as support for the idea of a linear readout. This should be accompanied by a discussion of works that support a nonlinear encoding of behavior in the motor cortex, for example (Naufel et al. 2019; Glaser et al. 2020), some of which the authors cite elsewhere but don't discuss here.”

      We have added this discussion in the revised manuscript. Thanks for your recommendation.

      Q11: “Selection of hyperparameters is not clearly explained. Starting line 791, the authors give some explanation for one hyperparameter, but not others. How are the other hyperparameters determined? What is the search space for the grid search of each hyperparameter? Importantly, if hyperparameters are determined only based on the training data of each fold, why is only one value given for the hyperparameter selected in each dataset (line 814)? Did all 5 folds for each dataset happen to select exactly the same hyperparameter based on their 5 different training/validation data splits? That seems unlikely.”

      We performed a grid search over {0.001, 0.01, 0.1, 1} for the hyperparameter beta and found that 0.001 was the best for all datasets. As for the model parameters, such as the number of hidden neurons, the model capacity we used had already reached saturated decoding performance, so these parameters do not influence the results.

      Regarding “Importantly, if hyperparameters are determined only based on the training data of each fold, why is only one value given for the hyperparameter selected in each dataset (line 814)? Did all 5 folds for each dataset happen to select exactly the same hyperparameter based on their 5 different training/validation data splits”: We selected the hyperparameter based on performance on the validation sets averaged across the 5 folds; the reported value is the one that yields the highest average.

      Thank you for your valuable feedback.
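      The selection rule described in this response (one beta per dataset, chosen by validation performance averaged over the 5 folds) can be sketched as follows. This is a hedged illustration rather than the authors' pipeline: `select_beta` and `toy_score` are hypothetical names, and the toy scores are placeholders, not results. The toy is built so that one fold on its own would prefer a different value, which is how a single reported value can coexist with fold-to-fold disagreement.

```python
from statistics import mean

BETA_GRID = [0.001, 0.01, 0.1, 1]  # grid quoted in the response
N_FOLDS = 5


def select_beta(validation_score, grid=BETA_GRID, n_folds=N_FOLDS):
    """Pick the grid value with the highest validation score averaged over folds."""
    mean_scores = {
        beta: mean(validation_score(beta, fold) for fold in range(n_folds))
        for beta in grid
    }
    best = max(mean_scores, key=mean_scores.get)
    return best, mean_scores


def toy_score(beta, fold):
    # Placeholder scores (assumption), NOT results from the paper:
    # fold 2 alone slightly prefers beta = 0.01.
    bump = 0.002 if (fold == 2 and beta == 0.01) else 0.0
    return 0.8 - 0.1 * beta + bump


best_beta, mean_scores = select_beta(toy_score)
fold_winners = [
    max(BETA_GRID, key=lambda b, f=fold: toy_score(b, f)) for fold in range(N_FOLDS)
]
print("selected beta:", best_beta)
print("per-fold winners:", fold_winners)
```

      Note that averaging over validation folds yields a single value per dataset by construction; whether each fold's test data stayed out of that average is a separate question the sketch does not address.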

      Q12: “d-VAE itself should also be explained more clearly in the main text. Currently, only the high-level idea of the objective is explained. The explanation should be more precise and include the idea of encoding to latent state, explain the relation to pip-VAE, explain inputs and outputs, linearity/nonlinearity of various mappings, etc. Also see comment 1 above, where I suggest adding more details about other methods in the main text.”

      Our primary objective is to delve into the encoding and decoding mechanisms using the separated relevant signals. Therefore, providing an excessive amount of model details could potentially distract from the main focus of the paper. In response to your suggestion, we have included a visual representation of d-VAE's structure, input, and output (see Fig. S1) in the revised manuscript, which offers a comprehensive and intuitive overview. Additionally, we have expanded on the details of d-VAE and other methods in the Methods section.

      Thank you for your valuable feedback.

      Q13: “In Fig 1f and g, shouldn't the performance plots be swapped? The current plots seem counterintuitive. If there is bias toward decoding (panel g), why is the irrelevant residual so good at decoding?”

      The placement of the performance plots in Fig. 1f and 1g is accurate. When the model exhibits a bias toward decoding, it prioritizes extracting the most relevant features (latent variables) for decoding purposes. As a consequence, the model predominantly generates signals that are closely associated with these extracted features. This selective signal extraction and generation process may result in the exclusion of other potentially useful information, which will be left in the residuals. To illustrate this concept, consider the example of face recognition: if a model can accurately identify an individual using only the person's eyes (assuming these are the most useful features), other valuable information, such as details of the nose or mouth, will be left in the residuals, which could also be used to identify the individual.

      Thank you for your valuable feedback.

      In this interesting work, the authors investigated an important topical question: when we see travelling waves in cortical activity, is this due to true wave-like spread, or due to sequentially activated sources? In simulations, it is shown that sequential brain module activation can show up as a travelling wave - even in improved methods such as phase delay maps - and a variety of parameters is investigated. Then, in ex-vivo turtle eye-brain preparations, the authors show that visual cortex waves observable in local field potentials are in fact often better explained as areas D1 and D2 being sequentially activated. This has implications for how we think about travelling wave methodology and relevant analytical tools.


      I enjoyed reading the discussion. The authors are careful in their claims, and point out that some phenomena may still indeed be genuine travelling waves, but we should have a higher evidence bar to claim this for a particular process in light of this paper and Zhigalov & Jensen (2023) (ref 44). Given this careful discussion, the claims made are well-supported by the experimental results. The discussion also gives a nice overview of potential options in light of this and future directions.

      The illustration of different gaussian covariances leading to very different latency maps was interesting to see.

      Furthermore, the methods are detailed and clearly structured and the Supplementary Figures, particularly single trial results, are useful and convincing.

      We are glad the reviewer found our manuscript “interesting”, the questions we raise “important”, our claims “well-supported by the experimental results”, and our methods “detailed and clearly structured”.

      The details of the sequentially activated Gaussian simulations give some useful results, but the fundamental idea still appears to be "sequential activation is often indistinguishable from a travelling wave", an idea advanced e.g. by Zhigalov & Jensen (2023). It takes a while until the (in my opinion) more intriguing experimental results.

      To emphasize the experimental results, we swapped the order of the analytical and experimental results. Correspondingly, Figure 2 now illustrates the more intriguing experimental results and Figure 3 the analytical results. In addition, we added subtitles to the different sections of the Results to ease navigation through the paper and to let readers access the different sections more easily.

      One of the key claims is that the spikes are more consistent with two sequentially activated modules rather than a continuous wave (with Fig 3k and 3l key to support this). Whilst this is more consistent, it is worth mentioning that there seems to be stochasticity to this and between-trial variability, especially for spikes.

      In the revised manuscript we added the reviewer’s comment about stochasticity, and we discuss its possible origins:

      "The transition was also not clear when examining spiking responses in some of the trials (as indicated by high DIP scores, Figure 2K). However, the observation that temporal grouping became more pronounced when using ALSA (a more robust estimate of local excitability) (Figure 2L,N), suggests that high DIP values may result from variability in the spike times of single neurons, and not necessarily from the lack of modular activation. Such issues can be resolved by denser sampling of spiking activity in the tissue."

      Recommendations For The Authors:

      The eye-cortex turtle preparation is not the most common. I would add more context about how specific the results are to this preparation vs how comparable it is to human data.

      We added a sentence explaining the relevance of our preparation: “Finally, while the layered organization of turtle cortex is different than that of mammalian cortex, the basic excitability features of both tissues are similar (Connors and Kriegstein, 1986; Hemberger et al., 2019; Kriegstein and Connors, 1986; Larkum et al., 2008; Shein-Idelson et al., 2017b), and substantial differences in the manner by which field potentials and spikes spread through the tissue are not to be expected.”

      Philosophical question: when does a 'module' become small enough for it to count as a travelling wave? More on this could be added to the discussion. I think we are in the very early days for a true understanding of travelling waves, and I wonder if these sequentially activated modules will functionally correspond to the known cortical segregation, or if it varies by area/task.

      We agree with the reviewer that macroscopic waves could be composed of smaller modules (or single neurons at the smallest scale). Our results suggest that modular patterns can be classified as wave patterns both at large scales (of brain areas) and smaller scales of local neural circuits. Therefore, we believe it is necessary to make this distinction across different scales. We sharpened this point in the first paragraph of the discussion:

      "…We showed that LFP measurements indicative of waves propagating across turtle cortex are underlined by discrete and consecutively activated neuronal populations, and not by a continuously propagating wavefront of spikes (Figure 2). Similarly, activation profiles that resemble continuous travelling waves in EEG simulations can be underlined by consecutive activation of two discrete cortical regions (Figure 1). We replicated these results using an analytical model and demonstrated that a simple scenario of sequentially activated Gaussians can exhibit WLPs with a rich diversity of spatiotemporal profiles (Figure 3). Our results offer insight into the scenarios and conditions for WLP detection by identifying failure points that should be considered when identifying travelling waves and therefore suggest caution when interpreting continuous phase latency maps as microscopically propagating wave patterns. Such failure points may exist both when examining activity at the scale of brain regions (Figure 1) and smaller neural circuits (Figure 2). Therefore, our results suggest that the discrepancy between modular and wave activation should be examined across spatial scales. Specifically, it is not necessarily the case that at the fine grained (single neuron) scale activation patterns are modular, but, following coarse graining, smooth wave patterns emerge. Rather, modular activation may hierarchically exist across scales (Kaiser and Hilgetag, 2010; Meunier et al., 2010) and may be masked by smeared spatial supra-threshold excitability boundaries. Below we discuss these limitations across techniques and their implications.”

      I would advise the authors to focus on the experimental data, perhaps by putting the simulations second, and by putting some of the equation details that are in Methods into the Supplementary Information. Whilst the simulation parameter space is well-explored, the fundamental idea of spreading Gaussians is relatively simple, and the current manuscript organization detracted from the main message for me a little bit.

      Following the referee’s suggestion, we swapped the order of the section with experimental data and the one with the analytic model (see response to comment 1). In addition, to make the Methods easier to read, we moved the mathematical derivation and related equations to Appendix 1.

      Things I thought about that you may also enjoy thinking about: Could we tell something about sequential sources vs travelling waves by the nature of the wave - e.g. shape or dispersion? If some wave properties are conserved whilst travelling, this could be evidence for travelling vs two sources.

      This is a wonderful suggestion. We are currently working on a follow-up publication with a new approach to do exactly that. We think that this new body of work is outside the scope of this paper.

      Could synaptic potentials spread like waves, but spikes more in modular bursts? This would also explain the LFP vs spikes difference - maybe travelling waves of EPSPs are there priming the network, 'looking' for suitable modules to activate, which then activate sequentially. The current discussion is quite spike-focused - could some information be in synaptic potentials after all?

      This is an interesting idea with intriguing functional implications. We added this idea to our discussion (see paragraph below). In addition, to emphasize our discussion of synaptic potentials, we reorganized the paragraphs in the discussion to separate sub-threshold excitability (which is mostly synaptic) from supra-threshold excitability, which is the focus of the second part of the discussion.

      “Variability in responses may also be explained by differences in propagation mechanisms (Ermentrout and Kleinfeld, 2001; Muller et al., 2018; Wu et al., 2008). Several reports suggest that waves are underlined by propagation along axonal collaterals (Muller et al., 2018, 2014). Both the transmembrane voltage-gated currents excited during action potentials as well as the post-synaptic currents along axonal boutons can potentially contribute to measured signals. However, such waves travel at high propagation speeds and are not compatible with the wide diversity of wave velocities and mechanisms of local neuronal interactions (Ermentrout and Kleinfeld, 2001; Feller et al., 1996). An intriguing possibility is that such axonal waves prime neuronal excitability by sub-threshold inputs that later result in modular supra-threshold activation. The ability to experimentally discriminate between axonal inputs and local spiking excitability (e.g. by reporters with different wavelengths) can potentially resolve such discrepancies.

      Our turtle cortex results (Figure 2) exemplify how contrasting sub-threshold LFP measurements with supra-threshold spiking measurements can yield different conclusions about the nature of activity spread….”

    1. eLife assessment

      This paper provides a simple example of a neural-like system that displays criticality, but not for any deep reason; it's just because a population of neurons is driven (independently!) by a slowly varying latent variable, something that is common in the brain. Moreover, criticality does not imply optimal information transmission (one of its proposed functions). The work is likely to have an important impact on the study of criticality in neural systems and is convincingly supported by the experiments presented.

    2. Joint Public Review

      This paper shows that networks of binary neurons can exhibit power law behavior (including "crackling", which refers to a particular relationship among the power law exponents) without fine tuning. If, as is standard, we equate power law behavior to criticality, then criticality can arise in networks of neurons without fine tuning. The network model used to show this was extremely simple: a population of completely uncoupled neurons was driven by a small number of slowly varying "hidden" variables (either 1 or 5). This caused the firing rate of every neuron to change slowly over time, in a correlated fashion. Criticality was observed over a large range of couplings, time constants, and average firing rates.
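      For readers who want a feel for the kind of model summarized above, the following is a minimal sketch, not the authors' implementation. It assumes a single slowly varying latent variable (a discretized Ornstein-Uhlenbeck process), a logistic firing probability shared by all neurons, and illustrative parameter values (`tau`, `coupling`, `bias`) that are not taken from the paper. Neurons are conditionally independent given the latent variable, and avalanches are defined here as runs of consecutive time bins with at least one spike.

```python
import math
import random


def simulate_latent_driven_population(n_neurons=50, n_steps=10000, tau=500.0,
                                      coupling=2.0, bias=-5.0, seed=0):
    """Population spike counts of uncoupled binary neurons sharing one slow latent."""
    rng = random.Random(seed)
    latent = 0.0
    counts = []
    for _ in range(n_steps):
        # Discretized Ornstein-Uhlenbeck step; stationary variance is close to 1.
        latent += -latent / tau + math.sqrt(2.0 / tau) * rng.gauss(0.0, 1.0)
        p_fire = 1.0 / (1.0 + math.exp(-(bias + coupling * latent)))
        # Each neuron fires independently given the latent (no coupling).
        counts.append(sum(rng.random() < p_fire for _ in range(n_neurons)))
    return counts


def avalanche_sizes(counts):
    """Total spikes in each run of consecutive non-silent time bins."""
    sizes, current = [], 0
    for c in counts:
        if c > 0:
            current += c
        elif current > 0:
            sizes.append(current)
            current = 0
    if current > 0:
        sizes.append(current)  # avalanche still running at the end
    return sizes


counts = simulate_latent_driven_population()
sizes = avalanche_sizes(counts)
print("number of avalanches:", len(sizes))
print("largest avalanche:", max(sizes))
```

      The sketch only produces avalanche sizes; checking for power laws or the crackling relationship would require fitting the size and duration distributions, which is not attempted here.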

      This paper is extremely important in light of the hypothesis that criticality in the brain is both special, in the sense that it requires fine tuning, and that it leads to optimal information processing. As mentioned above, this paper shows that fine tuning is not required. It also shows that criticality does not imply optimal information transmission. This does not, of course, rule out the above "critical brain" hypothesis. But it does show that simply observing power law behavior is not enough to draw conclusions about either fine tuning or function.

      These authors are not the first to show that slowly varying firing rates can give rise to power law behavior (see, for example, Touboul and Destexhe, 2017; Priesemann and Shriki, 2018). However, to our knowledge they are the first to show crackling, and to compute information transmission in, and out of, the critical state.


      Touboul and Destexhe, 2017: Touboul J, Destexhe A. Power-law statistics and universal scaling in the absence of criticality. Phys Rev E. 95:012413, 2017.

      Priesemann and Shriki, 2018: Priesemann V, Shriki O. PLOS Comp. Bio. 14:1-29, 2018.

    3. Author Response:

      […] While this does not rule out criticality in the brain, it decidedly weakens the evidence for it, which was based on the following logic: critical systems give rise to power law behavior; power law behavior is observed in cortical networks; therefore, cortical networks operate near a critical point. Given, as shown in this paper, that power laws can arise from noncritical processes, the logic breaks. Moreover, the authors show that criticality does not imply optimal information transmission (one of its proposed functions). This highlights the necessity for more rigorous analyses to affirm criticality in the brain. In particular, it suggests that attention should be focused on the question "does the brain implement a dynamical latent variable model?".

      These authors are not the first to show that slowly varying firing rates can give rise to power law behavior (see, for example, Touboul and Destexhe, 2017; Priesemann and Shriki, 2018). However, to our knowledge they are the first to show crackling, and to compute information transmission in the critical state.

      We thank the reviewers for their thoughtful assessment of our paper.

      We would push back on the assessment that our model ‘has nothing to do with criticality,’ and that we observed ‘signatures of criticality [that] emerge through fundamentally non-critical mechanisms.’ This assessment partially stems from the definition of criticality provided in the Public Comment, that ‘criticality is a very specific set of phenomena in physics in which fundamentally local interactions produce unexpected long-range behavior.’

      Our disagreement is largely focused on this definition, which we do not think is a standard definition. Taking the favorite textbook example, the Ising model, criticality is characterized by a set of power-law divergences in thermodynamic quantities (e.g., susceptibility, specific heat, magnetization) at the critical temperature, with exponents of these power laws governed by scaling laws. It is not defined by local interactions. The all-to-all Ising model is generally viewed as showing critical behavior at a certain temperature, even though interactions there are manifestly non-local. It is possible that, by “local” in the definition, the Public Comment meant that interactions are “collective” and among microscopic degrees of freedom. However, that same all-to-all Ising model is mathematically equivalent to the mean-field model, where criticality is achieved through large fluctuations of the mean field, but not through microscopic interactions.

      More commonly, criticality is defined by power laws and scaling relationships that emerge at a critical value of a parameter(s) of the system. That is, criticality is defined by its signatures. What is crucial in all such definitions is that this atypical, critical state requires fine tuning. For example, in the textbook example of the Ising model, a parameter (the temperature) must be tuned to a critical value for critical behavior to appear. In the branching process model that generates avalanche criticality, criticality requires tuning m=1. The key result of our paper is that all signatures expected for avalanche criticality (power laws, crackling, and, as shown below, estimates of the branching rate m), and hence the criticality itself, appear without fine-tuning.

      As we discussed in our introduction, there are a few other instances of signatures of criticality (and hence of criticality itself) emerging without fine-tuning. The first we are aware of was the demonstration of Zipf’s Law (by Schwab, et al. 2014, and Aitchison et al. 2016), a power-law relationship between rank and frequency of states, which was shown to emerge generically in systems driven by a broadly distributed latent variable. A second example, arising from applications of coarse-graining analysis to neural data (cf., Meshulam et al. 2019; also, Morales et al., 2023), was demonstrated in our earlier paper (Morrell et al. 2021). Thus, here we have a third example: the model in this paper generates signatures of criticality in the statistics of avalanches of activity, and it does so without fine-tuning (cf., Fig. 2-3).

      The rate at which these ‘criticality without fine-tuning’ examples are piling up may inspire revisiting the requirement of fine-tuning in the definition of criticality, and our ongoing work (Ngampruetikorn et al. 2023) suggests that criticality may be more accurately defined through large fluctuations (variance > 1/N) rather than through fine-tuning or scaling relations.


      • Schwab DJ, Nemenman I, Mehta P. “Zipf’s Law and Criticality in Multivariate Data without Fine-Tuning.” Phys Rev Lett. 2014 Aug; doi: 10.1103/PhysRevLett.113.068102.

      • Aitchison L, Corradi N, Latham PE. “Zipf’s Law Arising Naturally When There Are Underlying, Unobserved Variables.” PLOS Computational Biology. 2016 Dec; 12(12):1-32. doi: 10.1371/journal.pcbi.1005110.

      • Meshulam L, Gauthier JL, Brody CD, Tank DW, Bialek W. “Coarse Graining, Fixed Points, and Scaling in a Large Population of Neurons.” Phys Rev Lett. 2019 Oct; doi: 10.1103/PhysRevLett.123.178103.

      • Morales GB, di Santo S, Muñoz MA. “Quasiuniversal scaling in mouse-brain neuronal activity stems from edge-of-instability critical dynamics.” Proceedings of the National Academy of Sciences. 2023; 120(9):e2208998120.

      • Morrell MC, Sederberg AJ, Nemenman I. “Latent Dynamical Variables Produce Signatures of Spatiotemporal Criticality in Large Biological Systems.” Phys Rev Lett. 2021 Mar; doi: 10.1103/PhysRevLett.126.118302.

      • Ngampruetikorn V, Nemenman I, Schwab D. “Extrinsic vs Intrinsic Criticality in Systems with Many Components.” arXiv:2309.13898 [physics.bio-ph].

      Major comments:

      1) For many readers, the essential messages of the paper may not be immediately clear. For example, is the paper criticizing the criticality hypothesis of cortical networks, or does the criticism extend deeper, to the theoretical predictions of "crackling" relationships in physical systems as they can emerge without criticality? Statements like "We show that a system coupled to one or many dynamical latent variables can generate avalanche criticality ..." could be misinterpreted as affirming criticality. More accurate language is needed; for instance, the paper could state that the model generates relationships observed in critical systems. The paper should provide a clearer conclusion and interpretation of the findings in the context of the criticality hypothesis of cortical dynamics.

      Please see the response to the Public Review, above. To clarify the essential message that the dynamical latent variable model produces avalanche criticality without fine-tuning, we have made revisions to the abstract and introduction. This point was already made in the discussion (first sentence).

      Key sentences changed in the abstract:

      "… We find that populations coupled to multiple latent variables produce critical behavior across a broader parameter range than those coupled to a single, quasi-static latent variable, but in both cases, avalanche criticality is observed without fine-tuning of model parameters. … Our results suggest that avalanche criticality arises in neural systems in which activity is effectively modeled as a population driven by a few dynamical variables and these variables can be inferred from the population activity."

      In the introduction, we changed the final sentence to read:

      "These results demonstrate how criticality in neural recordings can arise from latent dynamics in neural activity, without need for fine-tuning of network parameters."

      2) On lines 97-99, the authors state that "We are agnostic as to the origin of these inputs: they may be externally driven from other brain areas, or they may arise from recurrent dynamics locally". This idea is also repeated at the beginning of the Summary section. Perhaps being agnostic isn't such a good idea: it's possible that the recurrent dynamics is in a critical regime, which would just push the problem upstream. Presumably you're thinking of recurrent dynamics with slow timescales that's not critical? Or are you happy if it's in the critical regime? This should be clarified.

      We have amended this sentence to clarify that any latent dynamics with large fluctuations would suffice:

      “We are agnostic as to the origin of these inputs: they may be externally driven from other brain areas, or they may arise from large fluctuations in local recurrent dynamics.”

      3) Even though the model in Equation 2 has been described in a previous publication and the Methods section, more details regarding the origin and justification of this model in the context of cortical networks would be helpful in the Results section. Was it chosen just for simplicity, or was there a deeper reason?

      This model was chosen for its simplicity: there are no direct interactions between neurons, coupling between neurons and latent variables is random, and simulation is straightforward. More complex latent dynamics or non-random structure in the coupling matrices could have been used, but our aim was to explore this model in the simplest setting possible.
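
      To make the structure of this model class concrete, the following is a minimal Python sketch, not the authors' code. The logistic form of the firing probability, the unit-variance discretization of the latent dynamics, the default parameter values, and the function names `simulate_latent_population` and `avalanche_sizes_durations` are assumptions made for illustration. What is taken from the text is only the overall design: neurons do not interact, couplings to the latent variables are random and normally distributed, the latent variables follow a slow Ornstein-Uhlenbeck process, and larger epsilon lowers activity. Avalanches are extracted using the conventional definition (runs of consecutive time bins with nonzero summed activity), which is also an assumption here.

```python
import numpy as np


def simulate_latent_population(n_neurons=128, n_latent=5, n_steps=20000,
                               tau=5000.0, eta=3.0, epsilon=8.0, seed=0):
    """Binary, non-interacting neurons driven by slow latent variables.

    All functional forms and defaults are illustrative assumptions.
    Returns an array of shape (n_steps, n_neurons) with entries 0 or 1.
    """
    rng = np.random.default_rng(seed)
    # Random Gaussian coupling of every neuron to every latent variable.
    coupling = rng.standard_normal((n_neurons, n_latent))
    # Ornstein-Uhlenbeck latents with unit stationary variance, written
    # as an AR(1) update with correlation time tau (in time bins).
    decay = np.exp(-1.0 / tau)
    noise_scale = np.sqrt(1.0 - decay ** 2)
    latent = rng.standard_normal(n_latent)
    spikes = np.empty((n_steps, n_neurons), dtype=np.uint8)
    for t in range(n_steps):
        latent = decay * latent + noise_scale * rng.standard_normal(n_latent)
        # Assumed logistic form: larger epsilon lowers the firing probability.
        drive = eta * (coupling @ latent) - epsilon
        p_spike = 1.0 / (1.0 + np.exp(-drive))
        # Conditioned on the latents, each neuron fires independently.
        spikes[t] = rng.random(n_neurons) < p_spike
    return spikes


def avalanche_sizes_durations(population_activity):
    """Split summed activity into avalanches (runs of nonzero bins)."""
    activity = np.asarray(population_activity)
    active = activity > 0
    # Pad with silent bins so every run has a detectable start and end.
    padded = np.concatenate(([False], active, [False]))
    changes = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(changes == 1)
    ends = np.flatnonzero(changes == -1)
    sizes = np.array([activity[s:e].sum() for s, e in zip(starts, ends)])
    durations = ends - starts
    return sizes, durations
```

      In such a sketch, the summed activity `spikes.sum(axis=1)` would be passed to `avalanche_sizes_durations`, and the resulting size and duration distributions could then be inspected for power-law behavior.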

      We have revised the Results (“Avalanche scaling in a dynamical latent variable model,” first paragraph) to justify the choice of the model:

      "We study a model of a population of neurons that are not coupled to each other directly but are driven by a small number of dynamical latent variables -- that is, slowly changing inputs that are not themselves measured (Fig.~\ref{fig:fig1}A). We are agnostic as to the origin of these inputs: they may be externally driven from other brain areas, or they may arise from large fluctuations in local recurrent dynamics. The model was chosen for its simplicity, and because we have previously shown that this model with at least about five latent variables can produce power laws under the coarse-graining analysis \citep{Morrell2021}."

      We have added the following to the beginning of the Methods section expanding on the reasons for this choice:

      "We study a model from Morrell 2021, originally constructed as a model of large populations of neurons in mouse hippocampus. Neurons are non-interacting, receiving inputs reflective of place-field selectivity as well as input current arising from a random projection from a small number of dynamical latent variables, representing inputs shared across the population of neurons that are not directly measured or controlled. In the current paper, we incorporate only the latent variables (no place variables), and we assume that every cell is coupled to every latent variable with some randomly drawn coupling strength."

      4) The Methods section (paragraph starting on line 340) connects the time scale to actual time scales in neuronal systems, stating that "The timescales of latent variables examined range from about 3 seconds to 3000 seconds, assuming 3-ms bins". While bins of 3 ms are relevant for electrophysiological data from LFPs or high-density EEG/MEG, time scales above 10 seconds are difficult to generate through biophysically clear processes like ionic channels and synaptic transmission. The paper suggests that slow time scales of the latent variables are crucial for obtaining power law behavior resembling criticality. Yet, one way to generate such slow time scales is via critical slowing down, implying that some brain areas providing input to the network under study may operate near criticality. This pushes the problem toward explaining the criticality of those external networks. Hence, discussing potential sources for slow time scales in latent variables is crucial. One possibility you might want to consider is sources external to the organism, which could easily have time scales in the 1-24 hour range.

      As the reviewers note, it is a possibility that slow timescales arise from some other brain area in which dynamics are slow due to critical dynamics, but many other plausible sources exist. These include slowly varying sensory stimuli or external sources, as suggested by the reviewers. It is also possible to generate “effective” slow dynamics from non-critical internal sources. One example, from recordings in awake mice, is the slow change in the level of arousal that occurs on the scale of many seconds to minutes. These changes arise from release of neuromodulators that have broad effects on neural populations and correlations in activity (for a focused review, see Poulet and Crochet, 2019).

      We have added the following sentence to the Methods section where timescales of latent variables were discussed:

      "The timescales of latent variables examined range from about $3$ seconds to $3000$ seconds, assuming $3$-ms bins. Inputs with such timescales may arise from external sources, such as sensory stimuli, or from internal sources, such as changes in physiological state."

      5) It is common in neuronal avalanche analysis to calculate the branching parameter using the ratio of events in consecutive bins. Near-critical systems should display values close to 1, especially in simulations without subsampling. Including the estimated values of the branching parameter for the different cases investigated in this study could provide more comprehensive data. While the paper acknowledges that the obtained exponents in the model differ from those in a critical branching process, it would still be beneficial to offer the branching parameter of the observed avalanches for comparison.

      The reviewers requested that the branching parameter be computed in our model. We point out that, for the quasi-stationary latent variables (as in Fig. 3), a branching parameter of 1 is expected because the summed activity at time t+k is, on average, equal to the summed activity at time t, regardless of k. Numerics are consistent with this expectation. Following the methodology for an unbiased estimate of the branching parameter from Wilting and Priesemann (2018), we checked an example set of parameters (epsilon = 8, eta = 3) for quasi-stationary latent fields. We found that the naïve (biased) estimate of the branching parameter was 0.94, and that the unbiased estimator was exp(−1.4⋅10^{−8}) ≈ 0.999999986.

      For faster time scales, it is no longer true that summed activity is constant over time, as the temporal correlations in activity decay exponentially. Using the five-field simulation from Figure 2, we calculated the branching parameter for several values of tau. The biased estimates of m are 0.76 (𝜏=50), 0.79 (𝜏=500), and 0.79 (𝜏=5000). The corrected estimates are 0.98 (𝜏=50), 0.998 (𝜏=500), and 0.9998 (𝜏=5000).
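
      For readers unfamiliar with the two kinds of estimate mentioned above, the following Python sketch illustrates the idea; it is not the authors' analysis code. The naive estimate is taken here to be the slope of a linear regression of summed activity at time t+1 on activity at time t. The multistep estimate follows the general idea attributed to Wilting and Priesemann (2018): regression slopes r_k at several lags k are assumed to decay as b·m^k, so m can be read off from the decay across lags. The log-linear fit used below is a simplification chosen for brevity, and the function names and the default `k_max` are hypothetical.

```python
import numpy as np


def naive_branching_ratio(activity):
    """Slope of the linear regression of A(t+1) on A(t)."""
    a = np.asarray(activity, dtype=float)
    past, future = a[:-1], a[1:]
    return float(np.cov(past, future)[0, 1] / np.var(past, ddof=1))


def multistep_branching_ratio(activity, k_max=20):
    """Estimate m from how regression slopes decay with the lag k.

    Assumes r_k = b * m**k and fits log r_k = log b + k * log m.
    Lags with non-positive slopes are skipped (simplifying assumption).
    """
    a = np.asarray(activity, dtype=float)
    lags = np.arange(1, k_max + 1)
    slopes = np.array([
        np.cov(a[:-k], a[k:])[0, 1] / np.var(a[:-k], ddof=1) for k in lags
    ])
    keep = slopes > 0
    log_m, _log_b = np.polyfit(lags[keep], np.log(slopes[keep]), 1)
    return float(np.exp(log_m))
```

      The reason for fitting across lags is that a multiplicative bias in the slopes (the factor b) drops out of the fitted decay rate, whereas the single-lag estimate absorbs it.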

      6) In the Discussion (l 269), the paper suggests potential differences between networks cultured in vitro and in vivo. While significant differences indeed exist, it's worth noting that exponents consistent with a critical branching process have also been observed in vivo (Petermann et al 2009; Hahn et al. 2010), as well as in large-scale human data.

      We thank the reviewers for pointing out these studies, and we have added the missing one (Hahn et al. 2010) to our reference list. The following was added to the discussion, in the section “Explaining Experimental Exponents:”

      "A subset of the in vivo recordings analyzed from anesthetized cat (Hahn et al. 2010) and macaque monkeys (Petermann et al. 2009) exhibited a size distribution exponent close to 1.5."

      Along these lines, we noted two additional studies of high relevance that have been published since our initial submission (Capek et al. 2023, Lombardi et al. 2023), and we have added these references to the discussion of experimental exponents.

      Minor comments:

      1) The term 'latent variable' should be rigorously explained, as it is likely to be unfamiliar to some readers.

      Sentences and clauses have been added to the Introduction, Results and the Methods to clarify the term:

      Intro: “Numerous studies have reported relatively low-dimensional structure in the activity of large populations of neurons [refs], which can be modeled by a population of neurons that are broadly and heterogeneously coupled to multiple dynamical latent (i.e., unobserved) variables.”

      Results: “We studied a population of neurons that are not coupled to each other directly but are driven by a small number of dynamical latent variables -- that is, slowly changing inputs that are not themselves measured.”

      Methods: “Neurons are non-interacting, receiving inputs reflective of place-field selectivity as well as input current reflecting a random projection from a small number of dynamical latent variables, representing inputs shared across the population of neurons that are not directly measured.”

      2) There's a relatively important typo in the equations: Eq. 2 and Eq. 6 differ by a minus sign in the exponent. Eqs. 3 and 4 use the plus sign, but epsilon_0 on line 198 uses the minus sign. All very confusing until we figured out what was going on. But easy to fix.

      Thank you for catching this. We have made the following corrections:

      1) Figures adopted the sign convention that epsilon > 0, with larger values of epsilon decreasing the activity level. Signs in Eqs. 3 and 4 have been corrected to match.

      2) Equation 5 was missing a minus sign in front of the Hamiltonian. Restoring this minus sign fixed the discrepancy between 2 and 6.

      3) In Eq. 7, the left hand side is zeta'/zeta', which is equal to 1. Maybe it should be zeta'/zeta?

      Fixed, thank you.

      Additional comments:

      The authors are free to ignore these; they are meant to improve the paper.

      We are extremely grateful for the close reading of our paper and note the actions taken below.

      1) We personally would not use the abbreviation DLV; we find abbreviations extremely hard to remember. And DLV is not used that often.

      Done, thank you for the suggestion.

      2) l 198: epsilon_0 = -log(2^{1/N}-1) was kind of hard to picture -- we had to do a little algebra to make sense of it. Why not write e^{-epsilon_0} = 2^{1/N}-1 \approx log(2)/N, which in turn implies that epsilon_0 ~ log(N)?

      Thank you, good point. We have added a sentence now to better explain:

      "...which is maximized at $\epsilon_0 = - \log (2^{1/N} - 1)$, independent of $J_i$ and $\eta$. After some algebra, we find that $\epsilon_0 \sim \log N$ for large $N$."

      3) Typo on l 202: "We plot P_ava as a function of epsilon in Fig. 4B". 4B --> 4D.


      4) It would be easier on the reader if the tables were all in one place. It would be even nicer to put the parameters in the figure captions. Or at least N; that one is kind of important.

      Table placement was a Latex issue, which we have now fixed. We also have included links between tables and relevant figures and indicated network size.

      5) What's x_i in Eqs. 7 and 8?

      We added a sentence of explanation. These are the individual observations of avalanche sizes or durations, depending on what is being fit.

      6) The latent variables evolve according to an Ornstein-Uhlenbeck process. But we might equally expect oscillations or non-normal behavior coupling dynamical modes, and these are likely to give different behavior with respect to avalanches. It might be worth commenting on this.

      7) The model assumes a normal distribution of the coupling strengths between the latent variables and the binary units. Discussing the potential effects of different types of random coupling could provide interesting insights.

      Both 6 and 7 are interesting questions. At this point, we could speculate that the main results would be qualitatively unchanged, provided dynamics are sufficiently slow and that the distribution of coupling strengths is sufficiently broad (that is, there is variance in the coupling matrix across individual neurons). Further studies would be needed to make these statements more precise.

      8) In Fig 1, tau_f = 1E4 whereas in Fig 2 tau_f = 5E3. Why the difference?

      For Figure 1, we chose a set of parameters that gave clear scaling. In Figure 2, we saw some value in showing more than one example of scaling, hence different parameters for the examples in Fig 2 than Fig 1. Note that the Fig 1 simulations are represented in Fig. 2 G-J, as the 5-field simulation with tau_F = 1e4.

    4. eLife assessment

      This paper provides a simple example of a neural-like system that displays signatures of criticality, but not for any deep reason; it's just because a population of neurons is driven (independently!) by a slowly varying latent variable, something that could easily arise in neural systems. In this model, criticality does not imply optimal information transmission (one of its proposed functions). This is an important finding backed up by compelling evidence, and it should be influential in how people think about criticality in the brain.

    5. Joint Public Review:

      This paper shows that signatures of criticality -- in particular, power law behavior and "crackling" (the latter referring to a particular relationship among critical exponents) -- emerge from a biologically reasonable model that has nothing to do with criticality. Instead, the firing rate of a population of "neurons" (taken to be binary units) varies slowly in time. Importantly, conditioned on firing rate, the activity of each neuron (whether or not it emits a "spike") is independent of the activity of all the other neurons.

      To put this result in broader context, we need to be clear what criticality is and is not. Criticality is a very specific set of phenomena in physics in which fundamentally local interactions produce unexpected long-range behavior. The model in this paper has no such local interactions. Instead, each neuron is coupled to a small number of latent dynamical modes (which in turn produce slowly varying firing rates). Thus, signatures of criticality emerge through fundamentally non-critical mechanisms. Consequently, such signatures of criticality observed in the brain can be misleading: they might not be evidence that the brain is critical at all; instead, they might just be evidence that neural activity is mirroring a small number of dynamical latent variables.

      While this does not rule out criticality in the brain, it decidedly weakens the evidence for it, which was based on the following logic: critical systems give rise to power law behavior; power law behavior is observed in cortical networks; therefore, cortical networks operate near a critical point. Given, as shown in this paper, that power laws can arise from non-critical processes, the logic breaks. Moreover, the authors show that criticality does not imply optimal information transmission (one of its proposed functions). This highlights the necessity for more rigorous analyses to affirm criticality in the brain. In particular, it suggests that attention should be focused on the question "does the brain implement a dynamical latent variable model?".

      These authors are not the first to show that slowly varying firing rates can give rise to power law behavior (see, for example, Touboul and Destexhe, 2017; Priesemann and Shriki, 2018). However, to our knowledge they are the first to show crackling, and to compute information transmission in the critical state.

      Major comments:

      1) For many readers, the essential messages of the paper may not be immediately clear. For example, is the paper criticizing the criticality hypothesis of cortical networks, or does the criticism extend deeper, to the theoretical predictions of "crackling" relationships in physical systems as they can emerge without criticality? Statements like "We show that a system coupled to one or many dynamical latent variables can generate avalanche criticality ..." could be misinterpreted as affirming criticality. More accurate language is needed; for instance, the paper could state that the model generates relationships observed in critical systems. The paper should provide a clearer conclusion and interpretation of the findings in the context of the criticality hypothesis of cortical dynamics.

      2) On lines 97-99, the authors state that "We are agnostic as to the origin of these inputs: they may be externally driven from other brain areas, or they may arise from recurrent dynamics locally". This idea is also repeated at the beginning of the Summary section. Perhaps being agnostic isn't such a good idea: it's possible that the recurrent dynamics is in a critical regime, which would just push the problem upstream. Presumably you're thinking of recurrent dynamics with slow timescales that's not critical? Or are you happy if it's in the critical regime? This should be clarified.

      3) Even though the model in Equation 2 has been described in a previous publication and the Methods section, more details regarding the origin and justification of this model in the context of cortical networks would be helpful in the Results section. Was it chosen just for simplicity, or was there a deeper reason?

      4) The Methods section (paragraph starting on line 340) connects the time scale to actual time scales in neuronal systems, stating that "The timescales of latent variables examined range from about 3 seconds to 3000 seconds, assuming 3-ms bins". While bins of 3 ms are relevant for electrophysiological data from LFPs or high-density EEG/MEG, time scales above 10 seconds are difficult to generate through biophysically clear processes like ionic channels and synaptic transmission. The paper suggests that slow time scales of the latent variables are crucial for obtaining power law behavior resembling criticality. Yet, one way to generate such slow time scales is via critical slowing down, implying that some brain areas providing input to the network under study may operate near criticality. This pushes the problem toward explaining the criticality of those external networks. Hence, discussing potential sources for slow time scales in latent variables is crucial. One possibility you might want to consider is sources external to the organism, which could easily have time scales in the 1-24 hour range.

      5) It is common in neuronal avalanche analysis to calculate the branching parameter using the ratio of events in consecutive bins. Near-critical systems should display values close to 1, especially in simulations without subsampling. Including the estimated values of the branching parameter for the different cases investigated in this study could provide more comprehensive data. While the paper acknowledges that the obtained exponents in the model differ from those in a critical branching process, it would still be beneficial to offer the branching parameter of the observed avalanches for comparison.

      6) In the Discussion (l 269), the paper suggests potential differences between networks cultured in vitro and in vivo. While significant differences indeed exist, it's worth noting that exponents consistent with a critical branching process have also been observed in vivo (Petermann et al 2009; Hahn et al. 2010), as well as in large-scale human data.


      Touboul and Destexhe, 2017: Touboul J, Destexhe A. Power-law statistics and universal scaling in the absence of criticality. Phys Rev E. 95:012413, 2017.

      Priesemann and Shriki, 2018: Priesemann V, Shriki O. PLOS Comp. Bio. 14:1-29, 2018.

      Petermann et al 2009: Petermann, T., Thiagarajan, T. C., Lebedev, M. A., Nicolelis, M. A., Chialvo, D. R., and Plenz, D. PNAS 106:15921-15926, 2009.

      Hahn et al. 2010: Hahn, G., Petermann, T., Havenith, M. N., Yu, S., Singer, W., Plenz, D., and Nikolic, D. J. Neurophys. 104:3312-3322, 2010.

      Minor comments:

      1) The term 'latent variable' should be rigorously explained, as it is likely to be unfamiliar to some readers.

      2) There's a relatively important typo in the equations: Eq. 2 and Eq. 6 differ by a minus sign in the exponent. Eqs. 3 and 4 use the plus sign, but epsilon_0 on line 198 uses the minus sign. All very confusing until we figured out what was going on. But easy to fix.

      3) In Eq. 7, the left hand side is zeta'/zeta', which is equal to 1. Maybe it should be zeta'/zeta?

    1. eLife assessment

      This study showing how the Circle of Willis acquires smooth muscle coverage during development in the zebrafish model is important, and the evidence provided is solid, with only minor weaknesses. The work is of interest to researchers working on cerebral circulation, angiogenesis, and developmental vascular stabilization.

    2. Reviewer #1 (Public Review):

      Cheng et al investigated how vascular cells of the zebrafish Circle of Willis arteries differentiate using live imaging of transgenic zebrafish embryos. They find an anterior-to-posterior gradient in the differentiation of pdgfrb+ progenitors into acta2+ smooth muscle cells (SMCs). Computational modeling suggests that blood flow velocity and wall shear stress are higher in the anterior Circle of Willis arteries. Using pharmacological manipulations, they show that blood flow is required for the differentiation of SMCs but not for the short-term maintenance of the SMC differentiation state. They provide evidence that the increased expression of the flow-responsive Klf2 transcription factor in endothelial cells predates SMC differentiation, with the same anterior-to-posterior gradient, and that Klf2 expression is required for SMC differentiation.

      Overall, the study is very well-conducted and the paper is well-written. These important data point to hemodynamics as an important driver of artery muscularization in the Circle of Willis.

    3. Reviewer #2 (Public Review):

      Summary:<br /> Cheng et al. explore the development of the arteries that form the Circle of Willis and investigate how blood flow pulsatility influences vascular smooth muscle cell (VSMC) differentiation. Using live confocal imaging of the developing zebrafish, the authors show that endothelial cells in the Circle of Willis arteries transition from venous to arterial identity between 54 hours post-fertilization (hpf) and 3 days post-fertilization (dpf), and that this coincides with pdgfrb+ mural cell progenitor differentiation into acta2+ arterial VSMCs. They find that the anterior portions of the Circle of Willis, including the internal carotid arteries (CaDI), establish acta2 expression earlier than posterior aspects, likely due to faster flow rate and increased pulsatility through the CaDI. Then, using computational fluid dynamics, an in vitro co-culture assay, and genetic and drug manipulations of blood flow, the authors provide evidence that pdgfrb+ differentiation is dependent upon pulsatile blood flow and klf2a activation. The results add to our understanding of vascular development and suggest that deficits in pulsatile flow could be potential drivers of arteriopathies.

      Strengths:<br /> 1) Longitudinal confocal imaging of live developing zebrafish makes the timeline of arterial development in the Circle of Willis easy to understand. This is a strong approach to studying how vascular networks are altered with genetic and pharmacological manipulations.

      2) Rigorous use of multiple techniques to test the hypothesis that pulsatile blood flow is required for smooth muscle cell differentiation. The microangiography experiment, in vitro co-culture assay, and genetic and drug manipulations of heart rate at various developmental time points yield outcomes that are consistent with the hypothesis.

    4. Reviewer #3 (Public Review):

      Cheng et al. studied if and how blood flow regulates the differentiation of vascular smooth muscle cells (VSMC) in the Circle of Willis (CW) in zebrafish embryos. They show that CW vessels gradually acquire an arterial identity. VSMCs also undergo gradual differentiation, which correlates with blood flow velocity. Using cell culture they show that pulsatile blood flow promotes pericyte differentiation into smooth muscle cells. They further identify transcription factor klf2a as differentially regulated by blood flow, and show that klf2a inhibition impairs VSMC differentiation. The authors conclude that pulsatile flow promotes VSMC differentiation through klf2a activation.

      Overall this is an important study, because VSMC differentiation in CW has not been previously studied, although analogous observations regarding the role of blood flow and klf2 involvement have been previously made in other systems and other vascular beds, for example, mouse klf2 mutants, which have deficient VSMC coverage of the dorsal aorta (Wu et al., 2008, JBC 283: 3942-50). The results convincingly show that VSMC differentiation in CW depends on the blood flow and that klf2a flow-dependent function regulates VSMC differentiation.

    1. eLife assessment

      This paper marks a fundamental advance in our understanding of prokaryotic Type IV restriction systems. The authors provide an encyclopedic overview of a hitherto uncharacterized branch of these systems, which they name CoCoNuTs, for coiled-coil nuclease tandems. They provide compelling evidence that these nucleases target RNA and are part of an echeloned defense response following viral infection. This article will be of great interest to scientists studying prokaryotic immunity mechanisms, as well as broadly to protein scientists engaged in the analysis, classification, and functional annotation of the proteome of life.

    2. Reviewer #1 (Public Review):

      Summary:<br /> In this manuscript, Bell et al. provide an exhaustive and clear description of the diversity of a new class of predicted type IV restriction systems that the authors denote as CoCoNuTs, for their characteristic presence of coiled-coil segments and nuclease tandems. Along with a comprehensive analysis that includes phylogenetics, protein structure prediction, extensive protein domain annotations, and an in-depth investigation of encoding genomic contexts, they also provide detailed hypotheses about the biological activity and molecular functions of the members of this class of predicted systems. This work is highly relevant: it underscores the wide diversity of defence systems that are used by prokaryotes and demonstrates that there are still many systems to be discovered. The work is sound and backed up by a clear and reasonable bioinformatics approach. I do not have any major issues with the manuscript, only some minor comments.

      Strengths:<br /> The analysis provided by the authors is extensive and covers the three most important aspects that can be covered computationally when analysing a new family/superfamily: phylogenetics, genomic context analysis, and protein-structure-based domain content annotation. With this, one can directly have an idea about the superfamily of the predicted system and infer their biological role. The bioinformatics approach is sound and makes use of the most current advances in the fields of protein evolution and structural bioinformatics.

      Weaknesses:<br /> It is not clear how coiled-coil segments were assigned, that is, whether the assignment was based only on AF2-predicted models or was also backed by sequence analysis, as no description is provided in the methods. The structure prediction quality assessment is based solely on the average pLDDT of the obtained models (with a threshold of 80 or better). However, this is not enough, particularly when multimeric models are used. The PAE matrix should be used to evaluate relative orientations, particularly where parts of two proteins are predicted to interact. In the case of multimers, interface quality scores, such as the ipTM or pDockQ, should also be considered and, at minimum, reported.

    3. Reviewer #2 (Public Review):

      Summary:<br /> In this work, using in-depth computational analysis, Bell et al. explore the diverse repertoire of type IV McrBC modification-dependent restriction systems. The prototypical two-component McrBC system has been structurally and functionally characterised and is known to act as a defence by restricting phage and foreign DNA containing methylated cytosines. Here, the authors find previously unanticipated complexity and versatility of these systems and focus on detailed analysis and classification of a distinct branch, the so-called CoCoNuT, named after its composition of coiled-coil structures and tandem nucleases. These CoCoNuT systems are predicted to target RNA as well as DNA and to utilise defence mechanisms with some similarity to type III CRISPR-Cas systems.

      Strengths:<br /> This work is enriched with a plethora of ideas and a myriad of compelling hypotheses that now await experimental verification. The study comes from the group that was amongst the first to describe, characterize, and classify CRISPR-Cas systems. By analogy, the findings described here can similarly promote ingenious experimental and conceptual research that could further drive technological advances. It could also instigate vigorous scientific debates that will ultimately benefit the community.

      Weaknesses:<br /> The multi-component systems described here function in the context of large oligomeric complexes. Some of the single chain AF2 predictions shown in this work are not compatible, for example, with homohexameric complex formation due to incompatible orientation of domains. The recent advances in protein structure prediction, in particular AlphaFold2 (AF2) multimer, now allow us to confidently probe potential protein-protein interactions and protein complex formation. This predictive power could be exploited here to produce a better glimpse of these multimeric protein systems. It can also provide a more sound explanation for some of the observed differences amongst different McrBC types.

    1. eLife assessment

      This important study conducts genetic analyses utilizing zebrafish, mouse, and mouse embryonic stem cell models to elucidate the role of Rtf1, a component of the PAF1 complex, in early cardiac development. By combining marker gene expression analysis, single-cell transcriptomics, ChIP-seq, and chemical inhibition, the study provides convincing evidence that Rtf1-mediated RNAPII (Pol2) transcriptional pausing is required for early cardiac development and that attenuation of pause release by pharmacological inhibition of Cdk9, a component of the P-TEFb complex that regulates the transition between the pausing and elongation phases of transcription, can partially restore transcriptional pausing and cardiogenesis in zebrafish rtf1 mutants. The work will be of broad interest to developmental biologists.

    2. Reviewer #1 (Public Review):

      Summary:<br /> The manuscript submitted by Langenbacher et al., entitled "Rtf1-dependent transcriptional pausing regulates cardiogenesis", describes very interesting and highly impactful observations about the function of Rtf1 in cardiac development. Over the last few years, the Chen lab has published novel insights into the genes involved in cardiac morphogenesis. Here, they used the mouse model, the zebrafish model, cellular assays, single-cell transcriptomics, chemical inhibition, and pathway analysis to provide a comprehensive view of Rtf1 in RNAPII (Pol2) transcription pausing during cardiac development. They also conducted knockdown-rescue experiments to dissect the functions of Rtf1 domains.

      Strengths:<br /> The most interesting discovery is the connection between Rtf1 and CDK9 in regulating Pol2 pausing as an essential step in normal heart development. The design and execution of these experiments also demonstrate a thorough approach to revealing a previously underappreciated role of Pol2 transcription pausing in cardiac development. This study also highlights the potential amelioration of related cardiac deficiencies using small molecule inhibitors against cyclin dependent kinases, many of which are already clinically approved, while many other specific inhibitors are at various preclinical stages of development for the treatment of other human diseases. Thus, this work is impactful and highly significant.

    3. Reviewer #2 (Public Review):


      Langenbacher et al. examine the requirement in cardiac development of Rtf1, a component of the PAF1C that regulates transcriptional pausing. The authors first confirm their previous morphant study with newly generated rtf1 mutant alleles, which recapitulate the defects in cardiac progenitor and differentiation gene expression observed previously in morphants. They then examine the conservation of Rtf1 function in mouse embryos and embryonic stem cell-derived cardiomyocytes. Conditional loss of Rtf1 in mesodermal lineages and depletion in murine ESCs result in a failure to turn on cardiac progenitor and differentiation marker genes, supporting a conserved role for Rtf1 in promoting cardiac development. The authors subsequently employ bulk RNA-seq on flow-sorted hand2:GFP+ cells and multiomic single-cell RNA-seq on whole Rtf1-depleted embryos at the 10-12 somite stage. These experiments corroborate that the expression of genes associated with cardiac and muscle development is lost. Furthermore, the differentiation trajectories suggest that the expression of genes associated with cardiac maturation is not initiated. Structure-function analysis supports that the Plus3 domain is necessary for the function of Rtf1 in promoting cardiac progenitor formation. ChIP-seq for RNA Pol II on 10-12 somite stage embryos suggests that Rtf1 is required for proper promoter pausing. This defect can be partially rescued in rtf1 mutants with a pharmacological inhibitor of Cdk9, which inhibits elongation.


      Many aspects of the data are strong, which support the basic conclusions of the authors that Rtf1 is required for transcriptional pausing and has a conserved requirement in vertebrate cardiac development. Areas of strength include the genetic data supporting the conserved requirement for Rtf1 in promoting cardiac development, the complementary bulk and single-cell RNA-sequencing approaches providing some insight into the gene expression changes of the cardiac progenitors, the structure-function analysis supporting the requirement of the Plus3 domain, and the pharmacological epistasis combined with the RNA Pol II ChIP-seq, supporting the mechanism implicating Cdk9 in the Rtf1 dependent mechanism of RNA Pol II pausing.


      While most of the basic conclusions are supported by the data, for a number of analyses it is unclear why the authors chose to perform the experiments the way they did, and in some places the data presently do not support the interpretations. One of the conclusions is that the phenotype affects the maturation of the cardiomyocytes and that they arrest in an immature state. However, this seems to be mostly derived from picking a few candidates from the single cell data in Fig. 6. If that were the case, wouldn't the expectation be to observe relatively normal expression of earlier marker genes required for specification, such as Nkx2.5 and Gata5/6? The in situ expression analysis from fish and mice (Fig. 2 and Fig. 3) and bulk RNA-seq (Fig. 5) seem to suggest that there are quite early specification and differentiation defects. While some genes associated with cardiac development are not changed, many of these are not specific to cardiomyocyte progenitors and are expressed broadly throughout the ALPM. Similarly, it is not clear why a consistent set of cardiac progenitor genes (for instance mef2ca, nkx2.5, and tbx20) was not analyzed across all the experiments, in particular in the single cell analysis.

      The point of the multiomic analysis is confusing. RNA- and ATAC-seq were apparently done at the same time. Yet, the focus of the analysis that is presented is on a small part of the RNA-seq data. This data set could have been more thoroughly analyzed, particularly in light of how chromatin changes may be associated with the transcriptional pausing. This seems to be a lost opportunity. Additionally, how the single cell data is covered in Supplemental Fig. 2 and 3 is confusing. There is no indication of what the different clusters are in the Figure or the legend.

      While the effect of Rtf1 loss on cardiomyocyte markers is certainly dramatic, it is not clear how well the mutant fish have been analyzed and how specific the effect is to this population. It is interpreted that the effects on cardiomyocytes are not due to "transfating" of other cell fates, yet supplemental Fig. 4 shows numerous effects on potentially adjacent cell populations. Minimally, additional data needs to be provided showing the live fish at these stages and marker analysis to support these statements. In some images, it is not clear the embryos are the same stage (one can see pigmentation in the eyes of controls that is not in the mutants/morphants), causing some concern about developmental delay in the mutants.

      With respect to the transcriptional pausing defects in the Rtf1 deficient embryos, it is not clear from the data how this effect relates to the expression of the cardiac markers. This could have been directly analyzed with some additional sequencing, such as PRO-seq, which would provide a direct analysis of transcriptional elongation.

      Some additional minor issues include the rationale that sequence conservation suggests an important requirement of a gene (line 137), for which there are many counterexamples; referencing figure panels out of order relative to the text (in Figs. 4, 7, and 8); and using the morphants for some experiments, such as the rescue, that could have been done in a blinded manner with the mutants.

    1. eLife assessment

      The current manuscript re-examines an established claim in the literature that human PANX-1 is regulated by Src kinase phosphorylation at two tyrosine residues, Y199 and Y309. This issue is important for our understanding of Pannexin channel regulation. The authors present an extensive series of experiments that fail to detect PANX-1 phosphorylation at these sites. Although the authors' approach is more rigorous than the previous studies, this work relies primarily on negative results that are not unambiguously definitive; the work nonetheless provides a compelling reason for the field to reexamine conclusions drawn in earlier studies.

    2. Reviewer #1 (Public Review):

      The current manuscript revisits previous reports in the literature that the human Pannexin 1 channel is regulated by Src kinase phosphorylation at two residues. From this series of experiments, the authors conclude that PANX-1 is not phosphorylated at these residues.

      Strengths of the manuscript:<br /> The biggest strength of the manuscript is the comprehensiveness of the approach. The authors recapitulate prior experiments in the literature, and also add a series of new, orthogonal experiments that all examine the claim of PANX-1 phosphorylation. The breadth of the reported experiments extends over multiple cell lines and protein constructs, in vitro purified proteins, mass spec, different phosphorylation detection reagents and antibodies, and functional electrophysiology assays that show that the addition of Src does not impact gating. The combined weight of all these data strongly suggests that the field should re-examine the claim that PANX-1 is regulated by phosphorylation at Y199 and Y309.

      Another strength is that the authors go beyond simply showing that the antibodies do not recognize phosphorylated PANX-1. They also provide potential mechanisms for how the antibodies may be misleading. Both antibodies recognize phosphorylated Src-1. In the case of anti-PANX1-pY308, the authors provide solid mutagenesis evidence that the antibody also weakly recognizes a non-phosphorylated epitope of PANX1 in the same region as the tyrosine. This helps make a convincing case.

      Such experiments, while not glamorous, have great practical importance for developing an accurate understanding of how Pannexin channels are regulated.

    3. Reviewer #2 (Public Review):

      Summary:<br /> The widely distributed pannexin 1 (PANX1) is an ATP-permeable channel that plays an important role in intercellular communication and has been implicated in various pathophysiological processes and diseases. Previous studies have reported that PANX1 can be phosphorylated at two sites by the non-receptor kinase Src, thereby leading to channel opening and ATP release. In this paper, the authors used a variety of methods to detect tyrosine phosphorylation of the PANX1 channel protein. However, their results showed that the commercially available antibodies against the two phosphorylation sites used in previous studies did not work well; in other words, phosphorylation changes in PANX1 could not be detected by those antibodies. Therefore, the authors call for the re-examination and evaluation of previous research results.

      Strengths:<br /> In general, this is a meticulous study, using different detection methods and different expression systems.

    4. Reviewer #3 (Public Review):

      Summary:<br /> It has been proposed in the literature that the ATP release channel Panx1 can be activated in various ways, including by tyrosine phosphorylation of the Panx1 protein. The present study reexamines the commercial antibodies used previously in support of the phosphorylation hypothesis, and the presented data indicate that the antibodies may recognize proteins unrelated to Panx1. Consequently, the authors caution about the use and interpretation of results obtained with these antibodies.

      Strengths:<br /> The manuscript by Ruan et al. addresses an important issue in Panx1 research, i.e. the activation of the channel formed by Panx1 via protein phosphorylation. If the authors' conclusions are correct, the previous claims for Panx1 phosphorylation on the basis of the commercial anti-phospho-Panx1 antibodies would be in question.

      This is a very detailed and comprehensive analysis making use of state-of-the-art techniques, including mass spectrometry and phos-tag gel electrophoresis.

      In general, the study is well-controlled as relating to negative controls.

      The value of this manuscript is that it could spawn new, more function-oriented studies on the activation of Panx1 channels.

      Weaknesses:<br /> Although the manuscript addresses an important issue, the activation of the ATP-release channel Panx1 by protein phosphorylation, the data provided do not support the firm conclusion that such activation does not exist. The failure to reproduce published data obtained with commercial anti-phospho Panx1 antibodies can only be of limited interest for a subfield.

      1. The title claiming that "Panx1 is NOT phosphorylated..." is not justified by the failure to reproduce previously published data obtained with these antibodies. If, as claimed, the antibodies do not recognize Panx1, their failure cannot be used to exclude tyrosine phosphorylation of the Panx1 protein. There is no positive control for the antibodies.

      2. The authors claim that exogenous SRC expression does not phosphorylate Y198. DeLalio et al. 2019 show that Panx1 is constitutively phosphorylated at Y198, so an effect of exogenous SRC expression is not necessarily expected.

      3. The authors argue that the GFP tag at the COOH terminus of Panx1 does not interfere with folding since the COOH-modified (thrombin cleavage site) Panx1 folds properly, forming an amorphous glob in the cryo-EM structure. However, they do not show that the COOH-modified Panx1 folds properly. It may not, because functional data strongly suggest that the terminal cysteine dives deep into the pore. For example, the terminal cysteine, C426, can form a disulfide bond with an engineered cysteine at position F54 (Sandilos et al. 2012).

      4. The authors dismiss the additional arguments for tyrosine phosphorylation of Panx1 given by the various previous studies on Panx1 phosphorylation. These studies did not, as implied, rely solely on the commercial anti-phospho-Panx1 antibodies, but also presented a wealth of independent supporting data. Contrary to the authors' assertion, in the previous papers the pY198 and pY308 antibodies recognized two protein bands in the size range of glycosylated and partially glycosylated Panx1.

      5. A phosphorylation step triggering channel activity of Panx1 would be expected to occur exclusively on proteins embedded in the plasma membrane. The membrane-bound fraction is small in relation to the total protein, which is particularly true for exogenously expressed proteins. Thus, any phosphorylated protein may escape detection when total protein is analyzed. Furthermore, to be of functional consequence, only a small fraction of the channels present in the plasma membrane need to be in the open state. Consequently, only a fraction of the Panx1 protein in the plasma membrane may need to be phosphorylated. Even the high resolution of mass spectrometry may not be sufficient to detect phosphorylated Panx1 in the absence of enrichment processes.

      6. In the electrophysiology experiments described in Figure 7, there is no evidence that the GFP-tagged Panx1 is in the plasma membrane. Instead, the image in Figure 7a shows prominent fluorescence in the cytoplasm. In addition, there is no evidence that the CBX-sensitive currents in Figure 7b are mediated by Panx1-GFP rather than by endogenous Panx1. Previous literature suggests that the hPanx1 protein needs to be cleaved (Chiu et al. 2014) or mutated at the amino terminus (Michalski et al. 2018) to see voltage-activated currents, so it is not clear that the currents represent hPANX1 voltage-activated currents.

    1. eLife assessment

      The manuscript constitutes an important contribution to antimalarial drug discovery, employing diverse systems biology methodologies; with a focus on an improved M1 metalloprotease inhibitor, the study provides convincing evidence of the utility of chemoproteomics in elucidating the preferential targeting of PfA-M1. Additionally, metabolomic analysis effectively documents specific alterations in the final steps of hemoglobin breakdown. These findings underscore the potential of the developed methodology, not only in understanding PfA-M1 targeting but also in its broader applicability to diverse malarial proteins or pathways. Revisions are needed to further enhance overall clarity and detail the scope of these implications.

    2. Reviewer #1 (Public Review):

      Summary:<br /> The article "Chemoproteomics validates selective targeting of Plasmodium M1 alanyl aminopeptidase as a cross-species strategy to treat malaria" presents a series of biochemical methods based on proteomics and metabolomics, as a means to:<br /> (1) validate the specific targeting of a biologically active molecule (MIPS2673) towards a defined (unique) protein target within a parasite<br /> and<br /> (2) explore whether, by quantifying the perturbations generated at the level of the parasite metabolome, it is possible to infer which metabolic pathway has been disrupted by this biologically active molecule, and whether this further confirms selective targeting in parasites of the expected (or in vitro targeted) enzyme (here PfA-M1).

      The inhibitor used in this work by the authors (MIPS2673) is to my knowledge a novel one, although it belongs to a chemical series previously explored by the authors, which recently enabled them to discover a specific PfA-M17 inhibitor, MIPS2571 (Edgard et al., 2022, ref 11 of this current work). Indeed, inhibitors specifically targeting either PfA-M1 or PfA-M17 (and not both, as was commonly the case in the past) are scarce today, and highly needed to functionally characterize these two zinc-aminopeptidases. MIPS2673 blocks the development of erythrocytic stages of Plasmodium falciparum with an EC50 of 324 nM, blocks parasite development at the young trophozoite stage at 5x EC50 (but at ring stages at 10x EC50, Figure 1E), and inhibits the enzymatic activity of PfA-M1 (and its ortholog Pv-M1) but not of the related malarial metallo-aminopeptidases (M17 and M18 families) nor the human metalloenzymes from closely related enzymatic families, supporting its selective targeting of PfA-M1 (and Pv-M1).

      All experiments are carried out in vitro (e.g. biochemical studies such as enzymology, proteomics, metabolomics) and on cultured parasites (erythrocyte stages of Plasmodium falciparum and several gametocytes stages obtained in vitro); there are no in vivo manipulations. The work related to Plasmodium vivax, which justifies the "cross-species" indication in the title of the article, is restricted to using a recombinant form of the M1-family aminopeptidase in enzymatic assays. The rest of the work concerns only Plasmodium falciparum. While I found globally that this work is original and brings new data and above all proposes chemical validation approaches that could be used for other target validations under similar limiting conditions (impossibility of KO of the gene), I have some specific questions to address to the authors.

      Strengths and weaknesses:<br /> -The chemoproteomic approach explores the ability of MIPS2673 to more significantly "protect" the putative target (PfA-M1) against thermal degradation or enzymatic attack (by proteinase K), in order to document its selective targeting of PfA-M1 (the inhibitor, once associated with its target, is expected to stabilize its structure or prevent the action of end proteases). It uses several concentrations of MIPS2673 and provides convincing results. My main criticism is that these tests are carried out with parasite extracts enriched in 30-38 hours old forms, and restricted to the fraction of soluble proteins isolated from these parasitic forms, which limits the scope of the analysis. This methodological approach is a choice that can be argued both biologically (PfA-M1 is well expressed in these stages of parasite development) and biochemically (it is difficult to do proteomic analyses on insoluble proteins), but I regret that the authors do not discuss these limitations further. Notably, I would have expected (from Figure 1E) some targets to be also present at ring stages.

      -The metabolomic approach, by documenting the ability of MIPS2673 to selectively increase the number of non-hydrolyzed dipeptides in treated versus untreated parasites is another argument in favor of the selective targeting of PfA-M1 by MIPS2673, in particular by its broad-spectrum aminopeptidase action preferentially targeting peptides resulting from the degradation of hemoglobin by the parasite. The relative contribution of peptides derived from host hemoglobin versus other parasite proteins is, however, little discussed.

      The work as a whole remains highly interesting, both for the specific topic of PfA-M1's role in parasite biology and for the method, applicable to other malarial drug contexts.

    3. Reviewer #2 (Public Review):

      In this manuscript, the authors first developed a new small-molecule inhibitor that could specifically target the M1 metalloproteases of both important malaria parasite species, Plasmodium falciparum and P. vivax. This was done by chemical modification of a previously developed molecule that targets PfM1 as well as PfM17 and possibly other Plasmodial metalloproteases. After the successful chemical synthesis, the authors showed that the derived inhibitor, named MIPS2673, has strong antiparasitic activity with an IC50 of 342 nM and is highly specific for M1. With this in mind, the authors first carried out two large-scale proteomics experiments to confirm the MIPS2673 interaction with PfM1 in the context of the total P. falciparum protein lysate. This was done first by thermal shift profiling and subsequently by limited proteolysis. While the former demonstrated overall interaction, the latter (limited proteolysis) could map more specifically the site of MIPS2673-PfM1 interaction, presumably the active site. Subsequent metabolomics analysis showed that the inhibitory effect of MIPS2673 leads to the accumulation of short peptides, many of which originate from hemoglobin. Based on that, the authors argue that the MIPS2673 mode of action (MOA) involves inhibition of hemoglobin digestion, which in turn inhibits parasite growth and development.

    4. Reviewer #3 (Public Review):

      This is a manuscript that attempts to validate Plasmodium M1 alanyl aminopeptidase as a target for antimalarial drug development. The authors provide evidence that MIPS2673 inhibits recombinant enzymes from both Pf and Pv and is selective over other proteases. There is in vitro antimalarial activity. Chemoproteomic experiments demonstrate selective targeting of the PfA-M1 protease.

      This is a continuation of previous work focused on designing inhibitors for aminopeptidases by a subset of these authors. Medicinal chemistry explorations resulted in the synthesis of MIPS2673, which has improved properties including potent inhibition of PfA-M1 and PvA-M1 with selectivity over a closely related peptidase. The compound also demonstrated selectivity over several human aminopeptidases and was not toxic to HEK293 cells at 40 uM. The activity against P. falciparum blood-stage parasites was about 300 nM.

      Thermal stability studies confirmed that PfA-M1 was a binding target, however, there were other proteins consistently identified in the thermal stability studies. This raises the question as to their potential role as additional targets of this inhibitor. The authors dismiss these because they are not metalloproteases, but further analysis is warranted. This is particularly important as the authors were not able to generate mutants using in vitro evolution of resistance strategies. This often indicates that the inhibitor has more than one target.

      The next set of experiments focused on a limited proteolysis approach. Again several proteins were identified as interacting with MIPS2673 including metalloproteases. The authors go on to analyze the LiP-MS data to identify the peptide from PfA-M1 which putatively interacts with MIPS2673. The authors are clearly focused on PfA-M1 as the target, but a further analysis of the other proteins identified by this method would be warranted and would provide evidence to either support or refute the authors' conclusions.

      The final set of experiments was an untargeted metabolomics analysis. They identified 97 peptides as significantly dysregulated after MIPS2673 treatment of infected cells and most of these peptides were derived from one of the hemoglobin chains. The accumulation of peptides was consistent with a block in hemoglobin digestion. This experiment does reveal a potential functional confirmation, but questions remain as to specificity.

      Overall, this is an interesting series of experiments that have identified a putative inhibitor of PfA-M1 and PvA-M1. The work would be significantly strengthened by structure-aided analysis. It is unclear why putative binding sites cannot be analyzed via specific mutagenesis of the recombinant enzyme. In the thermal stability and LiP-MS analyses, other proteins were consistently identified in addition to PfA-M1, and yet no additional analysis was undertaken to explore these as potential targets. The metabolomics experiments were potentially interesting, but without significant additional work, including different lengths of treatment and different stages of the parasite, the conclusions drawn are overstated. Many treatments disrupt hemoglobin digestion, either directly or indirectly, and from the data presented here it is premature to conclude that treatment with MIPS2673 directly inhibits hemoglobin digestion. Finally, the potency of this compound on parasites grown in vitro is 300 nM; improvements in potency and a demonstration of in vivo efficacy in the SCID mouse model would be needed to consider this a drug candidate.

      Summary:<br /> Overall, this is an interesting series of experiments that have identified a putative inhibitor of the Plasmodium M1 alanyl aminopeptidases, PfA-M1 and PvA-M1.

      Strengths:<br /> The main strengths include the synthesis of MIPS2673 which is selectively active against the enzymes and in whole-cell assay.

      Weaknesses:<br /> The weaknesses include the lack of additional analysis of additional targets identified in the chemoproteomic approaches.