    1. eLife Assessment

      This valuable study uses fiber photometry, implantable lenses, and optogenetics to show that a subset of subthalamic nucleus neurons is active during movement, and that active but not passive avoidance depends in part on STN projections to substantia nigra. The strength of the evidence for these claims is solid, whereas evidence supporting the claims that STN is involved in cautious responding is unclear as presented. This paper may be of interest to basic and applied behavioural neuroscientists working on movement or avoidance.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript presents a robust set of experiments that provide new insights into the role of STN neurons during active and passive avoidance tasks. These forms of avoidance have received comparatively less attention in the literature than the more extensively studied escape or freezing responses, despite being extremely relevant to human behaviour and more strongly influenced by cognitive control.

      Strengths:

      Understanding the neural infrastructure supporting avoidance behaviour would be a fundamental milestone in neuroscience. The authors employ sophisticated methods to delineate the role of STN neurons during avoidance behaviours. The work is thorough and the evidence presented is compelling. Experiments are carefully constructed, well-controlled, and the statistical analyses are appropriate.

      Weaknesses:

      One possible remaining conceptual concern that might require future work is determining whether STN primarily mediates higher-level cognitive avoidance or if its activation primarily modulates motor tone.

    3. Reviewer #2 (Public review):

      Summary:

      Zhou, Sajid et al. present a study investigating the STN involvement in signaled movement. They use fiber photometry, implantable lenses, and optogenetics during active avoidance experiments to evaluate this. The data are useful for the scientific community and the overall evidence for their claims is solid, but many aspects of the findings are confusing. The authors present a huge collection of data, and it is somewhat difficult to extract the key information and the meaningful implications resulting from these data.

      Strengths:

      The study is comprehensive, using many techniques and a wide range of stimulation powers, frequencies, and configurations.

      Weaknesses - re-review:

      All previous weaknesses have been addressed. The authors should explain how inhibition of the STN impairing active avoidance is consistent with the STN encoding cautious action. If 'caution' is related to avoid latency, why does STN lesion or inhibition increase avoid latency, and therefore increase caution? Wouldn't the opposite be more consistent with the statement that the STN 'encodes cautious action'?

    4. Reviewer #3 (Public review):

      Summary:

      The authors use calcium recordings from STN to measure STN activity during spontaneous movement and in a multi-stage avoidance paradigm. They also use optogenetic inhibition and lesion approaches to test the role of STN during the avoidance paradigm. The paper reports a large amount of data and makes many claims; some seem well supported to this Reviewer, others less so.

      Strengths:

      Well-supported claims include data showing that during spontaneous movements, especially contraversive ones, STN calcium activity is increased using bulk photometry measurements. Single-cell measures back this claim but also show that it is only a minority of STN cells that respond strongly, with most showing no response during movement, and a similar number showing smaller inhibitions during movement.

      Photometry data during cued active avoidance procedures show that STN calcium activity sharply increases in response to auditory cues, and during cued movements to avoid a footshock. Optogenetic and lesion experiments are consistent with an important role for STN in generating cue-evoked avoidance. And a strength of these results is that multiple approaches were used.

      Original Weaknesses:

      I found the experimental design and presentation convoluted and some of the results over-interpreted.

      As presented, I don't understand this idea that delayed movement is necessarily indicative of cautious movements. Is the distribution of responses multi-modal in a way that might support this idea, or do the authors simply take a normal distribution and assert that the slower responses represent 'caution'? Even if responses are multi-modal and clearly distinguished by 'type', why should readers think that delayed responses imply cautious responding instead of, say, habituation or sensitization to cue/shock; variability in attention, motivation, or stress; or merely uncertainty, which seems plausible given what I understand of the task design, where the same mice are repeatedly tested in changing conditions? This relates to a major claim (i.e., in the title).

      Related to the last, I'm struggling to understand the rationale for dividing cells into 'types' based on their physiological responses in some experiments.

      In several figures the number of subjects used was not described. This is necessary. Also necessary is some assessment of the variability across subjects. The only measure of error shown in many figures relates trial-to-trial or event variability, which is minimal because in many cases it appears that hundreds of trials may have been averaged per animal, but this doesn't provide a strong view of biological variability (i.e., are results consistent across animals?).
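The distinction drawn here between trial-to-trial variability and variability across animals can be illustrated with a minimal sketch. The numbers below are invented for illustration (5 hypothetical animals, 200 simulated trials each) and are not taken from the manuscript; the point is only that pooling trials as if they were independent yields a much smaller standard error than computing it across per-animal means.

```python
import random
import statistics as st


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return st.stdev(values) / len(values) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical data (assumption): each animal has its own true mean
# response, and every trial adds unit-variance Gaussian noise.
animal_true_means = [0.5, 1.0, 1.5, 2.0, 2.5]
trials_per_animal = 200
trials = {
    animal: [rng.gauss(mu, 1.0) for _ in range(trials_per_animal)]
    for animal, mu in enumerate(animal_true_means)
}

# (a) Pool every trial and treat each one as an independent sample.
pooled = [x for animal_trials in trials.values() for x in animal_trials]
sem_pooled = sem(pooled)

# (b) Average within each animal first, then take the SEM across animals.
animal_means = [st.mean(animal_trials) for animal_trials in trials.values()]
sem_across_animals = sem(animal_means)

print(f"SEM over {len(pooled)} pooled trials: {sem_pooled:.3f}")
print(f"SEM over {len(animal_means)} animal means: {sem_across_animals:.3f}")
```

In this toy setting the across-animal SEM is several times larger than the pooled-trial SEM, which is why error bars based on trials alone say little about consistency across subjects.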

      It is not clear if or how spread of expression outside of target STN was evaluated, and if or how many mice were excluded due to spread or fiber placements. Inadequate histological validation is presented, and neighboring regions that would be difficult to completely avoid, such as paraSTN, may be contributing to some of the effects.

      Raw example traces are not provided.

      The timeline of the spontaneous movement and avoidance sessions was not clear, nor was the number of events or sessions per animal or how this was set. It is not clear if there was pre-training or habituation, if many or variable sessions were combined per animal, or what the time gaps between sessions were, or if or how any of these parameters might influence interpretation of the results.

      Comments on revised version:

      The authors removed the optogenetic stimulation experiments, but then also added a lot of new analyses. Overall, the scope of their conclusions is essentially unchanged.

      Part of the eLife model is to leave it to the authors' discretion how they choose to present their work. But my overall view of it is unchanged. There are elements that I found clear, well executed, and compelling. But there are other elements that I found difficult to understand, and where I could not follow or concur with their conclusions.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #2 (Public review):

      (1) Vglut2 isn't a very selective promoter for the STN. Did the authors verify every injection across brain slices to ensure the para-subthalamic nucleus, thalamus, lateral hypothalamus, and other Vglut2-positive structures were never infected?

      The STN is anatomically well-confined, with its borders and the overlying zona incerta (composed of GABAergic neurons) providing protection against off-target expression in most neighboring forebrain regions. All viral injections were histologically verified and did not extend into thalamic or hypothalamic areas. As described in the Methods, we employed an app we developed (Brain Atlas Analyzer, available on OriginLab) that aligns serial histological sections with the Allen Brain Atlas to precisely assess viral spread and confirm targeting accuracy. The experiments included in the revised manuscript now focus on optogenetic inhibition and irreversible (viral and electrolytic) lesion approaches, three complementary methods that consistently targeted the STN and yielded similar behavioral effects.

      (2) The authors say in the methods that the high vs low power laser activation for optogenetic experiments was defined by the behavioral output. This is misleading, and the high vs low power should be objectively stated and the behavioral results divided according to the power used, not according to the behavioral outcome.

      Optogenetic excitation is no longer part of the study.

      (3) In the fiber photometry experiments exposing mice to the range of tones, it is impossible to separate the STN response to the tone from the STN response to the movement evoked by the tone. The authors should expose the mouse to the tones in a condition that prevents movement, such as anesthetized or restrained, to separate out the two components.

      The new mixed-effects modeling approach clearly differentiates sensory (auditory) from motor contributions during tone-evoked STN activation. In prior work (see Hormigo et al, 2023, eLife), we explored experimental methods such as head restraint or anesthesia to reduce movement, but we concluded that these approaches are unsuitable for addressing this question. Mice exhibit substantial residual movement even when head-fixed, and anesthesia profoundly alters neural excitability and behavioral state, introducing major confounds. To fully eliminate movement would require paralysis and artificial ventilation, which would again disrupt physiological network dynamics and raise ethical concerns. Therefore, the current modeling approach—incorporating window-specific covariates for movement—is the most appropriate and rigorous way to dissociate tone-evoked sensory activity from motor activity in behaving animals.

      (4) The claim 'STN activation is ideally suited to drive active avoids' needs more explanation. This claim comes after the fiber photometry experiments during active avoidance tasks, so there has been no causality established yet.

      Text adjusted. 

      (5) The statistical comparisons in Figure 7E need some justification and/or clarification. The 9 neuron types are originally categorized based on their response during avoids, then statistics are run showing that they respond differently during avoids. It is no surprise that they would have significantly different responses, since that is how they were classified in the first place. The authors must explain this further and show that this is not a case of circular reasoning.

      Statistically verifying the clustering is useful to ensure that the selected number of clusters reflects distinct classes. It is also necessary when different measurements are used to classify (movement time series classified the avoids) and to compare neuronal types within each avoid mode/class (now called “mode”). Moreover, the new modeling approach goes beyond the prior statistical limitations related to considering movement and neuronal variables separately.
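The general circularity concern raised in point (5) can be illustrated with a small sketch. It uses simulated noise only and makes no claim about the manuscript's actual analysis: hypothetical "neurons" with no true differences are grouped by their response in one half of the trials, and the groups are then compared on that same half and on a held-out half. All names and sizes are assumptions chosen for illustration.

```python
import random
import statistics as st

rng = random.Random(1)  # fixed seed so the sketch is reproducible

n_neurons = 300
n_trials = 40  # trials in each half of the data

# Hypothetical "neurons" (assumption): every trial is pure Gaussian noise,
# so there are no real response classes to discover.
half_a = [[rng.gauss(0.0, 1.0) for _ in range(n_trials)] for _ in range(n_neurons)]
half_b = [[rng.gauss(0.0, 1.0) for _ in range(n_trials)] for _ in range(n_neurons)]

mean_a = [st.mean(trials) for trials in half_a]
mean_b = [st.mean(trials) for trials in half_b]

# "Classify" neurons into low and high responders using half A only.
order = sorted(range(n_neurons), key=lambda i: mean_a[i])
third = n_neurons // 3
low, high = order[:third], order[-third:]


def group_gap(means):
    """Difference between the high-group and low-group average response."""
    high_avg = st.mean([means[i] for i in high])
    low_avg = st.mean([means[i] for i in low])
    return high_avg - low_avg


gap_same_data = group_gap(mean_a)  # compared on the data used to classify
gap_held_out = group_gap(mean_b)   # compared on independent data

print(f"gap on classification data: {gap_same_data:.3f}")
print(f"gap on held-out data:       {gap_held_out:.3f}")
```

The groups differ clearly on the data used to define them and hardly at all on held-out data, which is why comparing classes on an independent measurement (or an independent subset of trials) is the usual safeguard against this kind of circularity.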

      (6) The authors show that neurons that have strong responses to orientation show reduced activity during avoidance. What are the implications of this? The author should explain why this is interesting and important.

      The new modeling approach goes beyond the prior analysis limitations. For instance, it shows that most of the prior orienting-related activations closely reflect the orienting movement, and only in a few cases (noted and discussed in the results) are orienting activations related to the behavioral contingencies or behavioral outcomes in the task.

      (7) It is not clear which conditions each mouse experienced in which order. This is critical to the interpretation of Figure 9 and the reduction of passive avoids during STN stimulation. Did these mice have the CS1+STN stimulation pairing or the STN+US pairing prior to this experiment? If they did, the stimulation of the STN could be strongly associated with either punishment or with the CS1 that predicts punishment. If that is the case, stimulating the STN during CS2 could be like presenting CS1+CS2 at the same time and could be confusing.

      Optogenetic excitation is no longer part of the study. 

      (8) The experiments in Figure 10 are used to say that STN stimulation is not aversive, but they only show that STN stimulation cannot be used as punishment in place of a shock. This doesn't mean that it is not aversive; it just means it is not as aversive as a shock. The authors should do a simpler aversion test, such as conditioned or real-time place preference, to claim that STN stimulation is not aversive. This is particularly surprising as previous work (Serra et al., 2023) does show that STN stimulation is aversive.

      Optogenetic excitation is no longer part of the study.

      (9) In the discussion, the idea that the STN encodes 'moving away' from contralateral space is pretty vague and unsupported. It is puzzling that the STN activates more strongly to contraversive turns, but when stimulated, it evokes ipsiversive turns; however, it seems a stretch to speculate that this is related to avoidance. In the last experiments of the paper, the axons from the STN to the GPe and to the midbrain are selectively stimulated. Do these evoke ipsiversive turns similarly?

      Optogenetic excitation is no longer part of the study. 

      (10) In the discussion, the authors claim that the STN is essential for modulating action timing in response to demands, but their data really only show this in one direction. The STN stimulation reliably increases the speed of response in all conditions (except maximum speed conditions such as escapes). It seems to be over-interpreting the data to say this is an inability to modulate the speed of the task, especially as clear learning and speed modulation do occur under STN lesion conditions, as shown in Figure 12B. The mice learn to avoid and increase their latency in AA2 vs AA1, though the overall avoids and latency are different from controls. The more parsimonious conclusion would be that STN stimulation biases movement speed (increasing it) and that this is true in many different conditions.

      Optogenetic excitation is no longer part of the study.

      (11)  In the discussion, the authors claim that the STN projections to the midbrain tegmentum directly affect the active avoidance behavior, while the STN projections to the SNr do not affect it. This seems counter to their results, which show STN projections to either area can alter active avoidance behavior. What is the laser power used in these terminal experiments? If it is high (3mW), the authors may be causing antidromic action potentials in the STN somas, resulting in glutamate release in many brain areas, even when terminals are only stimulated in one area. The authors could use low (0.25mW) laser power in the terminals to reduce the chance of antidromic activation and spatially restrict the optical stimulation.

      Optogenetic excitation is no longer part of the study. 

      (12) Was normality tested for data prior to statistical testing?

      Yes, although we now use mixed models.

      (13) Why are there no error bars on Figure 5B, black circles and orange triangles?

      When error bars are not visible, they are smaller than the trace thickness or bar line—for example, in Figure 5B, the black circles and orange triangles include error bars, but they are smaller than the symbol size.

      Reviewer #3 (Public review):

      (1) I really don't understand or accept this idea that delayed movement is necessarily indicative of cautious movements. Is the distribution of responses multi-modal in a way that might support this idea, or do the authors simply take a normal distribution and assert that the slower responses represent 'caution'? Even if responses are multi-modal and clearly distinguished by 'type', why should readers think that delayed responses imply cautious responding instead of, say, habituation or sensitization to cue/shock; variability in attention, motivation, or stress; or merely uncertainty, which seems plausible given what I understand of the task design, where the same mice are repeatedly tested in changing conditions? This relates to a major claim (i.e., in the work's title).

      In our study, “caution” is defined operationally as the tendency to delay initiation of an avoidance response in demanding situations (e.g., taking more time or care before crossing a busy street). The increase in avoidance latency with task difficulty is highly robust, as we have shown previously through detailed analyses of timing distributions and direct comparisons with appetitive behaviors (e.g., Zhou et al., 2022 JNeurosci). Moreover, we used the tracked movement time series to statistically classify responses into cautious modes, which is likely novel. This definition can dissociate cautious responding from broader constructs listed by a reviewer, such as attention, motivation, or stress, which must be explicitly defined to be rigorously considered in this context, including the likelihood that they covary with caution without being equivalent to it. 

      Cue-evoked orienting responses at CS onset are directly measured, and their habituation and sensitization have been characterized in our prior work (e.g., Zhou et al., 2023 JNeurosci). US-evoked escapes are also measured in the present study and directly compared with avoidance responses. Together, these analyses provide a rigorous and consistent framework for defining and quantifying caution within our behavioral procedures.

      Importantly, mice exhibit cautious responding as defined here across different tasks, making it more informative to classify avoidance responses by behavioral mode rather than by task alone. Accordingly, in the miniscope, single-neuron, and mixed-effects model analyses, we classified active avoids into distinct modes reflecting varying levels of caution. Although these modes covary with task contingencies, their explicit classification improves model predictability and interpretability with respect to cautious responding.

      (2) Related to the last, I'm struggling to understand the rationale for dividing cells into 'types' based on their physiological responses in some experiments (e.g., Figure 7).

      This section has now been expanded into 3 figures (Fig. 7-9) with new modeling approaches that should make the rationale more straightforward.

      By emphasizing the mixed-effects modeling results and integrating these analyses directly into the figures, the revised manuscript now more clearly delineates what is encoded at the population and single-neuron levels. Including movement and baseline covariates allowed us to dissociate motor-related modulation from other neural signals, substantially clarifying the distinction between movement encoding and other task-related variables, which we focus on in the paper. These analyses confirm the strong role of the STN in representing movement while revealing additional signals related to aversive stimulation and cautious responding that persist after accounting for motor effects. These signals arise from distinct neuronal populations that can be differentiated by their movement sensitivity and activation patterns across avoidance modes, reflecting varying levels of caution. At the same time, several effects that initially reflected orienting-related activity at CS-onset (note that our movement tracking captures both head position and orientation as a directional vector) dissipated once movement and baseline covariates were included in the models, emphasizing the utility of the analytical improvements in the revision.

      (3) The description and discussion of orienting head movements were not well supported, but were much discussed in the avoidance datasets. The initial speed peaks to cue seem to be the supporting data upon which these claims rest, but nothing here suggests head movement or orientation responses.

      As described in the methods (and noted above), we track the head and decompose the movement into rotational and translational components. With the new approach, several effects that initially reflected orienting-related activity at CS-onset (note that our movement tracking captures both head position and orientation as a directional vector) dissipated once movement and baseline covariates were included in the models, emphasizing the utility of the analytical improvements in the revision.

      (4) Similar to the last, the authors note in several places, including abstract, the importance of STN in response timing, i.e., particularly when there must be careful or precise timing, but I don't think their data or task design provides a strong basis for this claim.

      The avoidance modes and the measured latencies directly support the relation to action timing. The portion of the previous manuscript on optogenetic excitation, apparently the main source of this criticism, is no longer part of the present study.

      (5) I think that other reports show that STN calcium activity is recruited by inescapable foot shock as well. What do these authors see? Is shock, independent of movement, contributing to sharp signals during escapes?

      The question, “Is shock, independent of movement, contributing to sharp signals during escapes?” is now directly addressed in the revised analyses. By incorporating movement and baseline covariates into the mixed-effects models, we dissociate STN activity related to aversive stimulation from that associated with motor output. The results show that shock-evoked STN activation persists even after controlling for movement within defined neuronal populations, supporting a specific nociceptive contribution independent of motor dynamics—a dissociation that appears to be new in this field.

      (6) In particular, and related to the last point, the following work is very relevant and should be cited:  Note that the focus of this other paper is on a subset of VGLUT2+ Tac1 neurons in paraSTN, but using VGLUT2-Cre to target STN will target both STN and paraSTN.

      We appreciate the reviewer’s reference to the recent preprint highlighting the role of the para-subthalamic nucleus in avoidance learning. However, our study focused specifically on performance in well-trained mice rather than on learning processes. Behavioral learning is inherently more variable and can be disrupted by less specific manipulations, whereas our experiments targeted the stable execution of learned avoidance behaviors. Future work will extend these findings to the learning phase and examine potential contributions of subthalamic subdivisions, which our current Vglut2-based manipulations do not dissociate. We will consider this and related work more closely in those studies.

      (7) In multiple other instances, claims that were more tangential to the main claims were made without clearly supporting data or statistics. E.g., claim that STN activation is related to translational more than rotational movement; claim that GCaMP and movement responses to auditory cues were small; claims that 'some animals' responded differently without showing individual data.

      We have adjusted the text accordingly.

      (8) In several figures, the number of subjects used was not described. This is necessary. Also necessary is some assessment of the variability across subjects. The only measure of error shown in many figures relates to trial-to-trial or event variability, which is minimal because, in many cases, it appears that hundreds of trials may have been averaged per animal, but this doesn't provide a strong view of biological variability. When bar/line plots are used to display data, I recommend showing individual animals where feasible.

      All experiments report number of mice and sessions. Wherever feasible, we display individual data points (e.g., Figures 1 and 2) to convey variability directly. However, in cases where figures depict hundreds of paired (repeated-measures) data points, showing all points without connecting them would not be appropriate, while linking them would make the figures visually cluttered and uninterpretable. All plots and traces include measures of variability (SEM), and the raw data will be shared on Dryad. When error bars are not visible, they are smaller than the trace thickness or bar line—for example, in Figure 5B, the black circles and orange triangles include error bars, but they are smaller than the symbol size.

      Also, to minimize visual clutter, only a subset of relevant comparisons is highlighted with asterisks, whereas all relevant statistical results, comparisons, and mouse/session numbers are fully reported in the Results section, with statistical analyses accounting for the clustering of data within subjects and sessions.

      (9) Can the authors consider the extent to which calcium imaging may be better suited to identify increases compared to decreases and how this may affect the results, particularly related to the GRIN data when similar numbers of cells show responses in both directions (e.g., Figure 3)?

      This is an interesting issue related to a widely used technique beyond the scope of our study.

      (10) Raw example traces are not provided.

      We do not think raw traces are useful here. All figures contain average traces to reflect the activity of the estimated population.

      (11) The timeline of the spontaneous movement and avoidance sessions was not clear, nor was the number of events or sessions per animal nor how this was set. It is not clear if there was pre-training or habituation, if many or variable sessions were combined per animal, or what the time gaps between sessions were, or if or how any of these parameters might influence interpretation of the results.

      We have enhanced the description of the sessions, including the number of animals and sessions; sessions are daily and always equal in number per animal within each group of experiments. As noted, the sessions are part of the random effects in the model.

      (12) It is not clear if or how the spread of expression outside of the target STN was evaluated, and if or how many mice were excluded due to spread or fiber placements.

      The STN is anatomically well-confined, with its borders and the overlying zona incerta (composed of GABAergic neurons) providing protection against off-target expression in most neighboring forebrain regions. All viral injections were histologically verified and did not extend into thalamic or hypothalamic areas. As described in the Methods, we employed an app we developed (Brain Atlas Analyzer, available on OriginLab) that aligns serial histological sections with the Allen Brain Atlas to precisely assess viral spread and confirm targeting accuracy. The experiments included in the revised manuscript now focus on optogenetic inhibition and irreversible (viral and electrolytic) lesion approaches, three complementary methods that consistently targeted the STN and yielded similar behavioral effects.

      Recommendations for the authors:

      Reviewing Editor Comments:

      The primary feedback agreed upon by all the reviewers was that the manuscript requires significant streamlining as it is currently overly long and convoluted.

      We thank the reviewers and editors for their thoughtful and constructive feedback. In response to the primary comment that “the manuscript requires significant streamlining as it is currently overly long and convoluted,” we have substantially revised and refocused the paper. Specifically, we streamlined the included data and enhanced the analyses to emphasize the central findings: the encoding of movement, cautious responding, and punishment in the STN during avoidance behavior. We also focused the causal component of the study by including only the loss-of-function experiments—both optogenetic inhibition and irreversible viral/electrolytic lesions—that establish the critical role of STN circuits in generating active avoidance. Together, these revisions enhance clarity, tighten the narrative focus, and align the manuscript more closely with the reviewers’ recommendations.

      Major revisions include the addition of mixed-effects modeling to dissociate the contributions of movement from other STN-encoded signals related to caution and punishment. This modeling approach allowed us to reveal that these components are statistically separable, demonstrating that movement, cautious responding, and aversive input are encoded by neuronal subsets. To streamline the manuscript and address reviewer concerns, we removed the optogenetic excitation experiments. As revised, the paper presents a more concise and cohesive narrative showing that STN neurons differentially encode movement, caution, and aversive stimuli, and that this circuitry is essential for generating active avoidance behavior.

      Many of the specific points raised by reviewers now fall outside the scope of the revised manuscript. This is primarily because the revised version omits data and analyses related to optogenetic excitation and associated control experiments. By removing these components, the paper now presents a streamlined and internally consistent dataset focused on how the STN encodes movement, cautious responding, and aversive outcomes during avoidance behavior, as well as on loss-of-function experiments demonstrating its necessity for generating active avoidance. Below, we address the points that remain relevant across reviews.

      Following extensive revisions, the current manuscript differs in several important ways from what the assessment describes:

      The description that the study “uses fiber photometry, implantable lenses, and optogenetics” is more accurately represented as using both fiber photometry and single-neuron calcium imaging with miniscopes, combined with optogenetic and irreversible lesion approaches.

      The phrase stating that “active but not passive avoidance depends in part on STN projections to substantia nigra” is better characterized as “STN projections to the midbrain,” since our data show that optogenetic inhibition of STN terminals in the mesencephalic reticular tegmentum (MRT) and in the substantia nigra pars reticulata (SNr) produces equivalent effects, and thus these sites are combined in the study.

      Finally, the original concern that evidence for STN involvement in cautious responding or avoidance speed was incomplete no longer applies. The revised focus on encoding, through the inclusion of mixed-effects modeling, now dissociates movement-related, cautious, and aversive components of STN activity. By removing the optogenetic excitation data, we no longer claim that the STN controls caution but rather that it encodes cautious responding, alongside movement and punishment signals. Furthermore, loss-of-function experiments demonstrate that silencing STN output abolishes active avoidance entirely, supporting an essential role for the STN in generating goal-directed avoidance behavior—a behavioral domain that, unlike appetitive responding, is fundamentally defined by caution and the need to balance action timing under threat.

      Reviewer #2 (Recommendations for the authors):

      (1) Show individual data points on bar plots.

      Wherever feasible, we display individual data points (e.g., Figures 1 and 2) to convey variability directly. However, in cases where figures depict hundreds of paired (repeated-measures) data points, showing all points without connecting them would not be appropriate, while linking them would make the figures visually cluttered and uninterpretable. All plots and traces include measures of variability (SEM), and the raw data will be shared on Dryad. When error bars are not visible, they are smaller than the trace thickness or bar line. For example, in Figure 5B, the black circles and orange triangles include error bars, but they are smaller than the symbol size.

      Also, to minimize visual clutter, only a subset of relevant comparisons is highlighted with asterisks, whereas all relevant statistical results, comparisons, and mouse/session numbers are fully reported in the Results section, with statistical analyses accounting for the clustering of data within subjects and sessions.

      (2) The active avoidance experiments are confusing when they are introduced in the results section. More explanation of what paradigms were used and what each CS means at the time these are introduced would add clarity. For example, AA1, AA2, etc, are explained only with references to other papers, but a brief description of each protocol and a schematic figure would really help.

      The avoidance protocols (AA1–4) are now described briefly but clearly in the Results section (second paragraph of “STN neurons activate during goal-directed avoidance contingencies”) and in greater detail in the Methods section. As stated, these tasks were conducted sequentially, and mice underwent the same number of sessions per procedure, which are indicated. All relevant procedural information has been included in these sections. Mice underwent daily sessions and learnt these tasks within 1-2 sessions, progressing sequentially across tasks with an equal number of sessions per task (7 per task), and the resulting data were combined and clustered by mouse/session in the statistical models.

      (3) How do the Class 1, 2, 3 avoids relate to Class 1, 2, 3 neural types established in Figure 3? It seems like they are not related, and if that is the case, they should be named something different from each other to avoid confusion. (4) Similarly, having 3 different cell types (a,b,c) in the active avoidance seems unrelated to the original classification of cell types (1,2,3), and these are different for each class of avoid. This is very confusing, and it is unclear how any of these types relate to each other. Presumably, the same mouse has all three classes of avoids, so there are recordings from each cell during each type of avoid.

      The terms class, mode, and type are now clearly distinguished throughout the manuscript. Modes refer to distinct patterns of avoidance behavior that differ in the level of cautious responding (Mode 3 is most cautious). Within each mode, types denote subgroups of neurons identified based on their ΔF/F activity profiles. In contrast, classes categorize neurons according to their relationship to movement, determined by cross-correlation analyses between ΔF/F and head speed (Classes 1-4; Fig. 7 is a new analysis) or head turns (Classes A-C, renamed from 1-3). This updated terminology clarifies the analytic structure, highlighting distinct neuronal populations within each analysis. For example, during avoidance behaviors, these classifications distinguish neurons encoding movement-, caution-, and outcome-related signals. Comparisons are conducted within each analytical set, within classes (A-C or 1-4 separately), within avoidance modes, or within mode-specific neuronal types.
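
      To make the idea of classifying neurons by cross-correlation with movement concrete, the following is a minimal sketch and not the authors' pipeline. It computes a lagged Pearson correlation between one neuron's ΔF/F trace and head speed and reports the lag with the strongest coupling. The function names (`pearson`, `cross_correlation`, `peak_lag`), the lag window, and the toy traces are all hypothetical; the response does not state the lags or thresholds used to define Classes 1-4 or A-C.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def cross_correlation(dff, speed, max_lag):
    """Return {lag: r}, correlating dff[t + lag] with speed[t] over the overlapping samples.

    A positive lag means the activity trace follows the movement trace.
    max_lag is an illustrative parameter, in samples.
    """
    n = len(dff)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = dff[lag:], speed[:n - lag]
        else:
            x, y = dff[:n + lag], speed[-lag:]
        result[lag] = pearson(x, y)
    return result


def peak_lag(xcorr):
    """Lag (in samples) at which the absolute correlation is largest."""
    return max(xcorr, key=lambda lag: abs(xcorr[lag]))


# Toy data (made up): the activity trace is the speed trace delayed by 2 samples.
speed = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
dff = [0, 0, 0, 0, 1, 0, 0, 0, 1, 0]
xc = cross_correlation(dff, speed, max_lag=3)
print(peak_lag(xc))
```

      In a scheme of this kind, neurons could then be grouped by the sign and timing of the peak, but any such grouping rule would be an assumption layered on top of this sketch.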

      …So the authors could compare one cell during each avoid and determine whether it relates to movement or sound, or something else. It is interesting that types a,b, and c have the exact same proportions in each class of avoid, and makes it important to investigate if these are the exact same cells or not.

      That previous table with the a,b,c % in the three figure panels was a placeholder, which was not updated in the included figure. It has now been correctly updated. They do not have the same proportions as shown in Fig. 9, although they are similar.

      Also, these mice could be recorded during the open field, so the original neural classification (class 1, 2,3) could be applied to these same cells, and then the authors can see whether each cell type defined in the open field has a different response to the different avoid types. As it stands, the paper simply finds that during movement and during avoidance behaviors, different cells in the STN do different things.

      We included a new analysis in Fig. 7 that classifies neurons based on the cross-correlation with movement. The inclusion of the models now clearly assigns variance to movement versus the other factors, and this analysis leads to the classification based on avoid modes. 

      (5) The use of the same colors to mean two different things in Figure 9 is confusing. AA1 vs AA2 shouldn't be the same colors as light-naïve vs light signaling CS.

      Optogenetic excitation is no longer part of the study.

      (6) The exact timeline of the optogenetics experiments should be presented as a schematic for understanding. It is not clear which conditions each mouse experienced in which order. This is critical to the interpretation of Figure 9 and the reduction of passive avoids during STN stimulation. Did these mice have the CS1+STN stimulation pairing or the STN+US pairing prior to this experiment? If they did, the stimulation of the STN could be strongly associated with either punishment or with the CS1 that predicts punishment. If that is the case, stimulating the STN during CS2 could be like presenting CS1+CS2 at the same time and could be confusing. The authors should make it clear whether the mice were naïve during this passive avoid experiment or whether they had experienced STN stimulation paired with anything prior to this experiment.

      Optogenetic excitation is no longer part of the study.

      (20) Similarly, the duration of the STN stimulation should be made clear on the plots that show behavior over time (e.g., Figure 9E).

      Optogenetic excitation is no longer part of the study.

      (21) There is just so much data and so many conditions for each experiment here. The paper is dense and difficult to read. It would really benefit readability if the authors put only the key experiments and key figure panels in the main text and moved much of the repetitive figure panels to supplemental figures. The addition of schematic drawings for behavioral experiment timing and for the different AA1, AA2, and AA3 conditions would also really improve clarity.

      By focusing the study, we believe it has substantially improved clarity and readability. 

      Reviewer #3 (Recommendations for the authors):

      (1) Minor error in results 'Cre-AAV in the STN of Vglut2-Cre'

      Fixed.

      (2) In some Figure 2 panels, the peaks appear to be cut off, and blue traces are obscured by red.

      In Fig. 2, the peaks of movement (speed) traces are intentionally truncated to emphasize the rising phase of the turn, which would otherwise be obscured if the full y-axis range were displayed (peaks and other measures are statistically compared). This adjustment enhances clarity without omitting essential detail and is now noted in the legend.

    1. eLife Assessment

      This valuable study provides a 3D standardised anatomical atlas of the brain of an orb-weaving spider. The authors describe the brain's shape and its inner compartments (the neuropils) and add information on the distribution of a number of neuroactive substances such as neurotransmitters and neuropeptides. Through the use of histological and microscopy methods, the authors provide a more complete view of an arachnid brain than previous studies and also present convincing evidence about the organisation and homology of brain regions. The work will serve as a reference for future studies on spider brains and will enable comparisons of brain regions with insects so that the evolution of these structures can be inferred across arthropods.

    2. Reviewer #1 (Public review):

      Summary:

      Artiushin et al. establish a comprehensive 3D atlas of the brain of the orb-web building spider Uloborus diversus. First, they use immunohistochemistry detection of synapsin to mark and reconstruct the neuropils of the brain of six specimens and they generate a standard brain by averaging these brains. Onto this standard 3D brain, they plot immunohistochemical stainings of major transmitters to detect cholinergic, serotonergic, octopaminergic/tyraminergic and GABAergic neurons, respectively. Further, they add information on the expression of a number of neuropeptides (Proctolin, AllatostatinA, CCAP and FMRFamide). Based on this data and 3D reconstructions, they extensively describe the morphology of the entire synganglion, the discernable neuropils and their neurotransmitter/neuromodulator content.

      Strengths:

      While 3D reconstruction of spider brains and the detection of some neuroactive substances have been published before, this seems to be the most comprehensive analysis so far both in terms of number of substances tested and the ambition to analyze the entire synganglion. Interestingly, besides the previously described neuropils, they detect a novel brain structure, which they call the tonsillar neuropil.

      Immunohistochemistry, imaging and 3D reconstruction are convincingly done and the data is extensively visualized in figures, schemes and very useful films, which allow the reader to work with the data. Due to its comprehensiveness, this dataset will be a valuable reference for researchers working on spider brains or on the evolution of arthropod brains.

      Weaknesses:

      As expected for such a descriptive groundwork, new insights or hypotheses are limited while the first description of the tonsillar neuropil is interesting. The reconstruction of the main tracts of the brain would be a very valuable complementary piece of data.

    3. Reviewer #2 (Public review):

      Summary

      Artiushin et al. created the first three-dimensional atlas of a synganglion in the hackled orb-weaver spider, which is becoming a popular model for web-building behavior. Immunohistochemical analysis with an impressive array of antisera reveals subcompartments of neuroanatomical structures described in other spider species as well as two previously undescribed arachnid structures, the protocerebral bridge, hagstone, and paired tonsillar neuropils. The authors describe the spider's neuroanatomy in detail and discuss similarities and differences from other spider species. The final section of the discussion examines the homology between onychophoran and chelicerate arcuate bodies and mandibulate central bodies.

      Strengths

      The authors set out to create a detailed 3D atlas and accomplished this goal.

      Exceptional tissue clearing and imaging of the nervous system reveal the three-dimensional relationships between neuropils and some connectivity that would not be apparent in sectioned brains.

      Detailed anatomical description makes it easy to reference structures described between the text and figures.

      The authors used a large palette of antisera which may each be investigated in future studies for function in the spider nervous system and may be compared across species.

      Weaknesses addressed in the revision

      The added information about spider-specific neuropils helps orient a non-expert reader. While the function and connectivity of many of these structures are currently unknown, this study will be foundational in future investigations of function.

    4. Reviewer #3 (Public review):

      Summary:

      This is an impressive paper that offers a much-needed 3D standardized brain atlas for the hackled-orb weaving spider Uloborus diversus, an emerging organism of study in neuroethology. The authors used a detailed immunohistological whole-mount staining method that allowed them to localize a wide range of common neurotransmitters and neuropeptides and map them on a common brain atlas. Through this approach, they discovered groups of cells that may form parts of neuropils that had not previously been described, such as the 'tonsillar neuropil', which might be part of a larger insect-like central complex. Further, this work provides unique insights into previously underappreciated complexity of higher-order neuropils in spiders, particularly the arcuate body, and hints at a potentially important role for the mushroom bodies in vibratory processing for web-building spiders.

      Strengths:

      To understand brain function, data from many experiments on brain structure must be compiled to serve as a reference and foundation for future work. As demonstrated by the overwhelming success in genetically tractable laboratory animals, 3D standardized brain atlases are invaluable tools, especially as increasing amounts of data are obtained at the gross morphological, synaptic, and genetic levels, and as functional data from electrophysiology and imaging are integrated. Among 'non-model' organisms, such approaches have included global silver staining and confocal microscopy, MRI, and more recently, micro-computed tomography (X-ray) scans used to image multiple brains and average them into a composite reference. In this study, the authors used synapsin immunoreactivity to generate an averaged spider brain as a scaffold for mapping immunoreactivity to other neuromodulators. Using this framework, they describe many previously known spider brain structures and also identify some previously undescribed regions. They argue that the arcuate body, a midline neuropil thought to have diverged evolutionarily from the insect central complex, shows structural similarities that may support its role in path integration and navigation.

      Having diverged from insects such as the fruit fly Drosophila melanogaster over 400 million years ago, spiders are an important group for study, particularly due to their elegant web-building behavior, which is thought to have contributed to their remarkable evolutionary success. How such exquisitely complex behavior is supported by a relatively small brain remains unclear. A rich tradition of spider neuroanatomy emerged in the previous century through the work of comparative zoologists, who used reduced silver and Golgi stains to reveal remarkable detail about gross neuroanatomy. Yet, these techniques cannot uncover the brain's neurochemical landscape, highlighting the need for more modern approaches such as those employed in the present study.

      A key insight from this study involves two prominent higher-order neuropils of the protocerebrum: the arcuate body and the mushroom bodies. The authors show that the arcuate body has a more complex structure and lamination than previously recognized, suggesting it is insect central complex-like and may support functions such as path integration and navigation, which are critical during web building. They also report strong synapsin immunoreactivity in the mushroom bodies and speculate that these structures contribute to vibratory processing during sensory feedback, particularly in the context of web building and prey localization. These findings align with prior work that noted the complex architecture of both neuropils in spiders and their resemblance (and in some cases greater complexity) compared to their insect counterparts. Additionally, the authors describe previously unrecognized neuropils, such as the 'tonsillar neuropil,' whose function remains unknown but may belong to a larger central complex. The diverse patterns of neuromodulator immunoreactivity further suggest that plasticity plays a substantial role in central circuits.

      Weaknesses:

      My major concern, however, is some of the authors' neuroanatomical descriptions rely too heavily on inference rather than what is currently resolvable from their immunohistochemistry stains alone.

      Comments on revisions:

      I thought that the authors did an excellent job responding to the reviews, and I have no further comments.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Artiushin et al. establish a comprehensive 3D atlas of the brain of the orb-web building spider Uloborus diversus. First, they use immunohistochemistry detection of synapsin to mark and reconstruct the neuropils of the brain of six specimens and they generate a standard brain by averaging these brains. Onto this standard 3D brain, they plot immunohistochemical stainings of major transmitters to detect cholinergic, serotonergic, octopaminergic/tyraminergic and GABAergic neurons, respectively. Further, they add information on the expression of a number of neuropeptides (Proctolin, AllatostatinA, CCAP, and FMRFamide). Based on this data and 3D reconstructions, they extensively describe the morphology of the entire synganglion, the discernible neuropils, and their neurotransmitter/neuromodulator content.

      Strengths:

      While 3D reconstruction of spider brains and the detection of some neuroactive substances have been published before, this seems to be the most comprehensive analysis so far, both in terms of the number of substances tested and the ambition to analyze the entire synganglion. Interestingly, besides the previously described neuropils, they detect a novel brain structure, which they call the tonsillar neuropil.

      Immunohistochemistry, imaging, and 3D reconstruction are convincingly done, and the data are extensively visualized in figures, schemes, and very useful films, which allow the reader to work with the data. Due to its comprehensiveness, this dataset will be a valuable reference for researchers working on spider brains or on the evolution of arthropod brains.

      Weaknesses:

      As expected for such a descriptive groundwork, new insights or hypotheses are limited, apart from the first description of the tonsillar neuropil. A more comprehensive labeling in the panels of the mentioned structures would help to follow the descriptions. The reconstruction of the main tracts of the brain would be a very valuable complementary piece of data.

      Reviewer #2 (Public review):

      Summary

      Artiushin et al. created the first three-dimensional atlas of a synganglion in the hackled orb-weaver spider, which is becoming a popular model for web-building behavior. Immunohistochemical analysis with an impressive array of antisera reveals subcompartments of neuroanatomical structures described in other spider species as well as two previously undescribed arachnid structures, the protocerebral bridge, hagstone, and paired tonsillar neuropils. The authors describe the spider's neuroanatomy in detail and discuss similarities and differences from other spider species. The final section of the discussion examines the homology between onychophoran and chelicerate arcuate bodies and mandibulate central bodies.

      Strengths

      The authors set out to create a detailed 3D atlas and accomplished this goal.

      Exceptional tissue clearing and imaging of the nervous system reveal the three-dimensional relationships between neuropils and some connectivity that would not be apparent in sectioned brains.

      A detailed anatomical description makes it easy to reference structures described between the text and figures.

      The authors used a large palette of antisera which may be investigated in future studies for function in the spider nervous system and may be compared across species.

      Weaknesses

      It would be useful for non-specialists if the authors would introduce each neuropil with some orientation about its function or what kind of input/output it receives, if this is known for other species. Especially those structures that are not described in other arthropods, like the opisthosomal neuropil. Are there implications for neuroanatomical findings in this paper on the understanding of how web-building behaviors are mediated by the brain?

      Likewise, where possible, it would be helpful to have some discussion of the implications of certain neurotransmitters/neuropeptides being enriched in different areas. For example, GABA would signal areas of inhibitory connections, such as inhibitory input to mushroom bodies, as described in other arthropods. In the discussion section on relationships between spider and insect midline neuropils, are there similarities in expression patterns between those described here and in insects?

      Reviewer #3 (Public review):

      Summary:

      This is an impressive paper that offers a much-needed 3D standardized brain atlas for the hackled-orb weaving spider Uloborus diversus, an emerging organism of study in neuroethology. The authors used a detailed immunohistological whole-mount staining method that allowed them to localize a wide range of common neurotransmitters and neuropeptides and map them on a common brain atlas. Through this approach, they discovered groups of cells that may form parts of neuropils that had not previously been described, such as the 'tonsillar neuropil', which might be part of a larger insect-like central complex. Further, this work provides unique insights into the previously underappreciated complexity of higher-order neuropils in spiders, particularly the arcuate body, and hints at a potentially important role for the mushroom bodies in vibratory processing for web-building spiders.

      Strengths:

      To understand brain function, data from many experiments on brain structure must be compiled to serve as a reference and foundation for future work. As demonstrated by the overwhelming success in genetically tractable laboratory animals, 3D standardized brain atlases are invaluable tools - especially as increasing amounts of data are obtained at the gross morphological, synaptic, and genetic levels, and as functional data from electrophysiology and imaging are integrated. Among 'non-model' organisms, such approaches have included global silver staining and confocal microscopy, MRI, and, more recently, micro-computed tomography (X-ray) scans used to image multiple brains and average them into a composite reference. In this study, the authors used synapsin immunoreactivity to generate an averaged spider brain as a scaffold for mapping immunoreactivity to other neuromodulators. Using this framework, they describe many previously known spider brain structures and also identify some previously undescribed regions. They argue that the arcuate body - a midline neuropil thought to have diverged evolutionarily from the insect central complex - shows structural similarities that may support its role in path integration and navigation.

      Having diverged from insects such as the fruit fly Drosophila melanogaster over 400 million years ago, spiders are an important group for study - particularly due to their elegant web-building behavior, which is thought to have contributed to their remarkable evolutionary success. How such exquisitely complex behavior is supported by a relatively small brain remains unclear. A rich tradition of spider neuroanatomy emerged in the previous century through the work of comparative zoologists, who used reduced silver and Golgi stains to reveal remarkable detail about gross neuroanatomy. Yet, these techniques cannot uncover the brain's neurochemical landscape, highlighting the need for more modern approaches - such as those employed in the present study.

      A key insight from this study involves two prominent higher-order neuropils of the protocerebrum: the arcuate body and the mushroom bodies. The authors show that the arcuate body has a more complex structure and lamination than previously recognized, suggesting it is insect central complex-like and may support functions such as path integration and navigation, which are critical during web building. They also report strong synapsin immunoreactivity in the mushroom bodies and speculate that these structures contribute to vibratory processing during sensory feedback, particularly in the context of web building and prey localization. These findings align with prior work that noted the complex architecture of both neuropils in spiders and their resemblance (and in some cases greater complexity) compared to their insect counterparts. Additionally, the authors describe previously unrecognized neuropils, such as the 'tonsillar neuropil,' whose function remains unknown but may belong to a larger central complex. The diverse patterns of neuromodulator immunoreactivity further suggest that plasticity plays a substantial role in central circuits.

      Weaknesses:

      My major concern, however, is that some of the authors' neuroanatomical descriptions rely too heavily on inference rather than what is currently resolvable from their immunohistochemistry stains alone.

      We would like to thank the reviewers for their time and effort in carefully reading our manuscript and providing helpful feedback, and particularly for their appreciation and realistic understanding of the scope of this study and its context within the existing spider neuroanatomical literature.

      Regarding the limitations and potential additions to this study, we believe these to be well-reasoned and are in agreement. We plan to address some of these shortcomings in future publications.

      As multiple reviewers remarked, a mapping of the major tracts of the brain would be a welcome addition to understanding the neuroanatomy of U. diversus. This is something which we are actively working on and hope to provide in a forthcoming publication. Given the length of this paper as is, we considered that a treatment of the tracts would be better served as an additional paper. Likewise, mapping of the immunoreactive somata of the currently investigated targets is a component which we would like to describe as part of a separate paper, keeping the focus of the current one on neuropils, in order to leverage our aligned volumes to describe co-expression patterns, which is not as useful for the more widely dispersed somata. Furthermore, while we often see somata through immunostaining, the presence and intensity of the signal are variable among immunoreactive populations. We are finding that these populations are more consistently and comprehensively revealed through fluorescent in situ hybridization.

      We appreciate the desire of the reviewers for further information regarding the connectivity and function of the described neuropils, and where possible we have added additional statements and references. That being said, where this context remains sparse is largely a reflection of the lack of information in the literature. This is particularly the case for functional roles for spider neuropils, especially higher order ones of the protocerebrum, which are essentially unexamined. As summarized in the quite recent update to Foelix’s Spider Neuroanatomy, a functional understanding for protocerebral neuropil is really only available for the visual pathway. Consequently, it is also difficult to speak of the implications of the presence or absence of particular signaling elements in these neuropils if no further information about the circuitry or behavioral correlates is available. Finally, multiple reviewers suggested that it might be worthwhile to explore a comparison of the arcuate body layer innervation to that of the central bodies of insects, of which there is a richer literature. This is an idea which we were also initially attracted to, and have now added some lines to the discussion section. Our position on this is a cautious one, as a series of more recent comparative studies spanning many insect species using the same antibody reveals a considerable amount of variation in central body layering even within this clade, which has given us pause in interpreting how substantive similarities and differences to the far more distant spiders would be. Still, this is an interesting avenue which merits an eventual comprehensive analysis, one which would certainly benefit from having additional examples from more spider species, in order to not overstate conclusions based on the currently limited neuroanatomical representation.

      Given our framing for the impetus to advance neuroanatomical knowledge in orb-web builders, the question of whether the present findings inform the circuitry controlling web-building is one that naturally follows. While we are unable with this dataset alone to define which brain areas mediate web-building (something which would likely be beyond any anatomical dataset lacking complementary functional data), the process of assembling the atlas has revealed structures and defined innervation patterns in previously ambiguous sectors of the spider brain, particularly in the protocerebrum. A simplistic proposal is that such regions, which are more conspicuous by our techniques and in this model species, would be good candidates for further inquiries into web-building circuitry, as their absence or oversight in past work could be attributable to the different behavioral styles of those model species. Regardless, the fact that such a hypothesis cannot be readily refuted by the existing neuroanatomical literature underscores the need for more finely refined models of the spider brain, to which we hope we have positively contributed. We are gratified by the reviewer’s enthusiasm for the strengths of this study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Brenneis 2022 has done a very nice and comprehensive study focused on the visual system - this might be worth including.

      Thank you, we have included this reference on Line 34.

      (2) L 29: When talking about "connectivity maps", the emerging connectomes based on EM data could be mentioned.

      Additional references have been added, thank you. Line 35.

      (3) L 99: Please mention that you are going to describe the brain from ventral to dorsal.

      Thank you, we have added a comment to Line 99.

      (4) L 13: is found at the posterior.

      Thank you, revised.

      (5) L 168: How did you pick those two proctolin+ somata, given that there is a lot of additional punctate signal?

      Although not visible in this image, if you scroll through the stack there is a neurite which extends from these neurons directly to this area of pronounced immunoreactivity.

      (6) Figure 1: Please add the names of the neuropils you go through afterwards.

      We have added labels for neuropils which are recognizable externally.

      (7) Figure 1 and Figure 5: Please mark the esophagus.

      Label has now been added to Figure 1. In Figure 5, the esophagus should not really be visible because these planes are just ventral to its closure.

      (8) Figure 5A: I did not see any CCAP signal where the arrow points to; same for 5B (ChAT).

      In hindsight, the CCAP point is probably too minor to be worth mentioning, so we have removed it.

      The ChAT signal pattern in 5B has been reinforced by adding a dashed circle to show its location as well.

      (9) L 249: Could the circular spot also be a tract (many tracts lack synapsin - at least in insects)?

      Yes, thank you for pointing this out – the sentence is revised (L274). We are currently further analyzing anti-tubulin volumes and it seems that indeed there are tracts which occupy these synapsin-negative spaces, although interestingly they do not tend to account for the entire space.

      (10) L 302: Help me see the "conspicuous" thing.

      Brace added to Fig. 8B, note in caption.

      (11) L 315: Please first introduce the number of the eyes and how these relate to 1° and 2° pathway. Are these separate pathways from separate eyes or two relay stations of one visual pathway?

      We have expanded the introduction to this section (L336). Yes, these are considered as two separate visual pathways, with a typical segregation of which eyes contribute to which pathway – although there is evidence for species-specific differences in these contributions. In the context of this atlas, we are not currently able to follow which eyes are innervating which pathway.

      (12) L 343: It seems that the tonsillar neuropil could be midline spanning (at least this is how I interpret the signal across the midline). Would it make sense to re-formulate from a paired structure to midline-spanning? Would that make it another option for being a central complex homolog?

      In the spectrum from totally midline spanning and unpaired (e.g., arcuate body (at least in adults)) to almost fully distinct and paired (e.g., mushroom bodies (although even here there is a midline spanning ‘bridge’)), we view the tonsillar to be more paired due to the oval components, although it does have a midline spanning section, particularly unambiguous just posterior to the oval sections.

      Regarding central complex homology, if the suggestion is that the tonsillar with its midline spanning component could represent the entire central complex, then this is a possibility, but it would neglect the highly innervated and layered arcuate body, which we think represents a stronger contender – at least as a component of the central complex. For this reason, we would still be partial to the possibility that the tonsillar is a part of the central complex, but not the entire complex.

      (13) L 407: ...and dorsal (..) lobe...

      Added the word ‘lobe’ to this sentence (L429).

      (14) L 620ff: Maybe mention the role of MBs in learning and memory.

      A reference has been added at L661.

      (15) L 644: In the context of arcuate body homology with the central body, I was missing a discussion of the neurotransmitters expressed in the respective parts in insects. Would that provide additional arguments?

      This is an interesting comparison to explore, and is one that we initially considered making as well. There are certainly commonalities that one could point to, particularly in trying to build the case of whether particular lobes of the arcuate body are similar to the fan-shaped or ellipsoid bodies in insects. Nevertheless, something which has given us pause is studying the more recent comparative works between insect species (Timm et al., 2021, J Comp Neuro, Homberg et al., 2023, J Comp Neuro), which also reveal a fair degree of heterogeneity in expression patterns between species – and this is despite the fact that the neuropils are unambiguously homologous. When comparing to a much more evolutionarily distant organism such as the spider, it becomes less clear which extant species should serve as the best point of comparison, and therefore we fear making specious arguments by focusing on similarities when there are also many differences. We have added some of these comments to the discussion (L699-725).

      Throughout the text, I frequently had difficulties in finding the panels right away in the structures mentioned in the text. It would help to number the panels (e.g., 6Ai, Aii, Aiii, etc.) and refer to those in the text. Further, all structures mentioned in the text should be labelled with arrows/arrowheads unless they are unequivocally identified in the panel.

      Thank you for the suggestion. We have adopted the additional numbering scheme for panels, and added additional markers where suggested.

      Reviewer #2 (Recommendations for the authors):

      (1) L 18: "neurotransmitter" should be pluralized.

      Thank you, revised (L18).

      (2) L 55: Missing the word "the" before "U. diversus".

      Thank you, revised (L57).

      (3) L 179: Change synaptic dense to "synapse-dense".

      Thank you, revised (L189).

      (4) L 570: "present in" would be clearer than "presented on in".

      Our intention here was to say that Loesel et al did not show slices from the subesophageal mass for CCAP, so it was ambiguous as to whether it had immunoreactivity there but they simply did not present it, or if it indeed doesn’t show signal in the subesophageal. But agreed, this is awkward phrasing which has been revised (L606-608), thank you.

      (5) L 641: It would be worth noting that the upper and lower central bodies are referred to as the fan-shaped and ellipsoid bodies in many insects.

      Thank you, this has been added in L694.

      (6) L 642: Although cited here regarding insect central body layers, Strausfeld et al. 2006 mainly describe the onychophoran brain and the evolutionary relationship between the onychophoran and chelicerate arcuate bodies. The phylogenetic relationships described here would strengthen the discussion in the section titled "A spider central complex?"

      The phylogenetic relationship of onychophorans and chelicerates remains controversial and therefore we find it tricky to use this point to advance the argument in that discussion section, as one could make opposing arguments. The homology of the arcuate body (between chelicerates, onychophorans, and mandibulates) has likewise been argued over, with this Strausfeld et al paper offering one perspective, while others are more permissive (good summary at end of Doeffinger et al., 2010). Our thought was simply to draw attention to grossly similar protocerebral neuropils in examples from distantly related arthropods, without taking a stance, as our data doesn’t really deeply advance one view over the other.

      (7) L 701- Noduli have been described in stomatopods (Thoen et al., Front. Behav. Neurosci., 2017).

      This is an important addition, thank you – it has been incorporated and cited (L766).

      (8) Antisera against DC0 (PKA-C alpha) may distinguish globuli cells from other soma surrounding the mushroom bodies, but this may be accomplished in future studies.

      Agreed, this is something we have been interested in, but have not yet acquired the antibody.

      Reviewer #3 (Recommendations for the authors):

      Overall, this paper is both timely and important. However, it may face some resistance from classically trained arthropod neuroanatomists due to the authors' reliance on immunohistochemistry alone. A method to visualize fiber tracts and neuropil morphology would have been a valuable and grounding complement to the dataset and can be added in future publications. Tract-tracing methods (e.g., dextran injections) would strengthen certain claims about connectivity - particularly those concerning the mushroom bodies. For delineating putative cell populations across regions, fluorescence in situ hybridization for key transcripts would offer convincing evidence, especially in the context of the arcuate body, the tonsillar neuropil, and proposed homologies to the insect central complex.

      That said, the dataset remains rich and valuable. Outlined below are a number of issues the authors may wish to address. Most are relatively minor, but a few require further clarification.

      (1) Abstract

      (a) L 12-14: The authors should frame their work as a novel contribution to our understanding of the spider brain, rather than solely as a tool or stepping stone for future studies. The opening sentences currently undersell the significance of the study.

      Thank you for your encouragement! We have revised the abstract.

      (b) Rather than touting "first of its kind" in the abstract, state what was learned from this.

      Thank you, we have revised the abstract.

      (c) The abstract does not mention the major results of the study. It should state which brain regions were found. It should list all of the peptides and transmitters that were tested so that they can be discoverable in searches.

      Thank you, revised.

      (2) Introduction

      (a) L 38: There's a more updated reference for Long (2016): Long, S. M. (2021). Variations on a theme: Morphological variation in the secondary eye visual pathway across the order of Araneae. Journal of Comparative Neurology, 529(2), 259-280.

      Thank you, this has been updated (L41 and elsewhere).

      (b) L 47: While whole-mount imaging offers some benefits, a downside is the need for complete brain dissection from the cuticle, which in spiders likely damages superficial structures (such as the secondary eye pathways).

      True – we have added this caveat to the section (L48-51).

      (c) L 49-52: If making this claim, more explicit comparisons with non-web building C. salei in terms of neuropil presence, volume, or density later in the paper would be useful.

      We do not have the data on hand to make measured comparisons of C. salei structures, and the neuropils identified in this study are not clearly identifiable in the slices provided in the literature, so would likely require new sample preparations. We’ve removed the reference to proportionality and softened this sentence slightly – we are not trying to make a strong claim, but simply state that this is a possibility.

      (3) Results

      (a) The authors should state how they accounted for autofluorescence.

      While we did not explicitly test for autofluorescence, the long process of establishing a working whole-mount immuno protocol and testing antibodies produced many examples of treated brains which did not show any substantial signal.  We have added a note to the methods section (L866).

      (b) L 69: There is some controversy in delineating the subesophageal and supraesophageal mass as the two major divisions despite its ubiquity in the literature. It might be safer to delineate the protocerebrum, deutocerebrum, and fused postoral ganglia (including the pedipalp ganglion) instead.

      Thank you for this insight, we have modified the section, section headings and Figure 1 to account for this delineation as well. We have chosen to include both ways of describing the synganglion, in order to maintain a parallel with the past literature, and to be further accessible to non-specialist readers. L73-77

      (c) L 90: It might be useful to include a justification for the use of these particular neuropeptides.

      Thank you, revised. L97-99.

      (d) L 106 - 108: It is stated that the innervation pattern of the leg neuropils is generally consistent, but from Figure 2, it seems that there are differences. The density of 5HT, Proctolin, ChAT, and FMRFamide seems to be higher in the posterior legs. AstA seems to have a broader distribution in L1 and is absent in L4.

      We would still stand by the generalization that the innervation pattern is fairly similar for each leg. The L1 neuropils tend to be bigger than the posterior legs, which might explain the difference in density. Another important aspect to keep in mind is that not all of the leg neuropils appear at the exact same imaging plane as we move from ventral to dorsal. If you scroll through the synapsin stack (ventral to dorsal), you will see that L2 and L3 appear first, followed shortly by L1, and then L4, and at the dorsal end of the subesophageal they disappear in the opposite order. The observations listed here are true for the single z-plane in Figure 2, but the fact that they don’t appear at the same time seems to mainly account for these differences. For example, if you scroll further ventrally in the AstA volume, you will see a very similar innervation appear in L4 as well, even though it is absent in the Fig. 2 plane. We plan to have these individual volumes available from a repository so that they can be individually examined to better see the signal at all levels. At the moment, the entire repository can be accessed here: https://doi.org/10.35077/ace-moo-far.

      (e) Figure 1 and elsewhere: The axes for the posterior and lateral views show Lateral and Medial. It would be more accurate to label them Left and Right, because the current labelling does not define the medial-to-lateral axis. The medial direction is correct for only one hemiganglion, and it's the opposite for the contralateral side.

      Thank you, revised.

      (f) In Figures that show particular sections, it might be helpful to include a plane in the standard brain to illustrate where that section is.

      Yes, we agree and it was our original intention. It is something we can attempt to do, but there is not much room in the corners of many of the synapsin panels, making it harder to make the 3D representation big enough to be clear.

      (g) Figure 2, 3: Presenting the z-section stack separately in B and C is awkward because it makes it seem that they are unrelated. I think it would be better to display the z160-190 directly above its corresponding z230-260 for each of the exemplars in B and C. Since there's no left-right asymmetry, a hemibrain could be shown for all examples as was done for TH in D. It's not clear why TH was presented differently.

      Thank you for this suggestion. We rearranged the figure as described, but ultimately still found the original layout to be preferable, in part because the labelling becomes too cramped. We hope that the potential confusion of the continuity of the B and C sections will be mitigated by focusing on the z plane labels and overall shape – which should suggest that the planes are not far from each other. We trust that the form of the leg neuropils is recognizable in both B and C synapsin images, and so readers will make the connection.

      Regarding TH, this panel is apart from the rest because we were unable to register the TH volume to the standard brain because the variant of the protocol which produced good anti-TH staining conflicted with synapsin, and we could not simultaneously have adequate penetration of the synapsin signal. We did not want to align the TH panel with the others to avoid potential confusion that this was a view from the same z-plane of a registered volume, as the others are. We have added a note to the figure caption.

      (h) The locations of the labels should be consistent. The antisera are below the images in Figure 2, above in Figure 3, and to the bottom left in Figure 5. The slices are shown above in Figure 2 and below in Figure 3.

      Thank you, this has been revised for better consistency.

      (i) It is surprising to me that there is no mention of the neuronal somata visible in Figure 2 and Figure 3. A typical mapping of the brain would map the locations of the neurons, not just the neuropils.

      Our first arrangement of this paper described each immunostain individually from ventral to dorsal, including locations of the immunoreactive somata which could be observed. To aid the flow of the paper and leverage the aligned volumes to emphasize co-expression in the functional divisions of the brain, we re-formulated to this current layout which is organized around neuropils. Somata locations are tricky to incorporate in this format of the paper which focuses on key z-planes or tight max projections, because the relevant immunoreactive somata are more dispersed throughout the synganglion, not always overlapping in neighboring z-planes. Further, since only a minority of the antisera we used can reveal traceable projections from the supplying somata in the whole-mount preparation, we would be quite limited in the degree to which we could integrate the specific somata mapping with expression patterns in the neuropil. Finally, compared to immuno, which can be variable in staining intensity between somata for the same target, we find that FISH reveals these locations more clearly and comprehensively – so while we agree that this mapping would also be useful for the atlas, we would like to better provide this information in a future publication using whole-mount FISH.

      (j) L 139: There is a reference to a "brace" in Figure 3B, which does not seem to exist. There's one in Figure 3C.

      There is a smaller brace near the bottom of the TDC2 panel in Fig. 3B.

      (k) L 151 should be "3D".

      Thank you, revised (L160).

      (l) Figure 4C: It is not mentioned in the legend that the bottom inset is Proctolin without synapsin.

      Thank you, revised (L1213).

      (m) L 199: Are the authors sure this subdivision is solely on the anterior-posterior axis? Could it also be dorsal ventral? (i.e., could this be an artifact of the protocerebrum and deutocerebrum?)

      Yes, this division can be appreciated to extend somewhat in the dorsal-ventral axis and it is possible that this is the protocerebrum emerging after the deutocerebrum, although this area is largely dorsal to the obvious part of the deutocerebrum. In the horizontal planes there appears to be a boundary line which we use for this subdivision in order to assist in better describing features within this generally ventral part of the protocerebrum – referred to as “stalk” because it is thinner before the protocerebrum expands in size, dorsally. Our intention was more organizational, and as stated in the text, this area is likely heterogenous and we are not suggesting that it has a unified function, so being a visual artifact would not be excluded.

      (n) L 249: Could it also indicate large tracts projecting elsewhere?

      Yes, definitely, we have evidence that part of the space is occupied by tracts. Revised, thank you (L262).

      (o) L 281: Several investigators, including Long (2021), noted very large and robust mushroom bodies of Nephila.

      Thank you – the point is well taken that there are examples of orb-web builders that do have appreciable mushroom bodies. We have added a note in this section (L295), giving the examples of Deinopis spinosa and Argiope trifasciata (Figure 4.20 and 4.22 in Long, 2016).

      It looks like these species make the point better than Nephila, as Long lists the mushroom body percentage of total protocerebral volume for D. spinosa as 4.18%, for A. trifasciata as 2.38%, but doesn’t give a percentage for Nephila clavipes (Figure 4.24) and only labels the mushroom bodies structures as “possible” in the figure.

      In Long (2021), Nephilidae is described as follows: “In Nephilidae, I found what could be greatly reduced medullae at the caudal end of the laminae, as well as a structure that has many physical hallmarks of reduced mushroom bodies”

      (p) L 324: If the authors were able to stain for histamine or supplement this work with a different dissection technique for the dorsal structures, the visual pathways might have been apparent, which seems like a very important set of neuropils to include in a complete brain atlas.

      Yes, for this reason histamine has been an interesting target which we have attempted to visualize, but unfortunately have not yet been able to successfully stain for in U. diversus. An additional complication is that the antibodies we have seen call for glutaraldehyde fixation, which may make them incompatible with our approach to producing robust synapsin staining throughout the brain. 

      We agree that the lack of the complete visual pathway is a substantial weakness of our preparation, and should be amended in future work, but this will likely require developing a modified approach in order to preserve these delicate structures in U. diversus.

      (q) L 331: Is this bulbous shape neuropil, or just the remains of neuropil that were not fully torn away during dissection?

      This certainly is a severed part of the primary pathway, although it seems more likely that the bulbous shape is indicative of a neuropil form, rather than just being a happenstance shape that occurred during the breakage. We have examples where the same bulbous shape appears on both sides, and in different brains. It is possible that this may be the principal eye lamina – although we did not see co-staining with expected markers in examples where it did appear, so cannot be sure.

      (r) L 354: Is tyraminergic co-staining with the protocerebral bridge enough evidence to speculate that inputs are being supplied?

      We agree that this is not compelling, and have removed the statement.

      (s) L 372: This whole structure appears to be a previously described structure in spiders, the 'protocerebral commissure'.

      We are reasonably sure that what we are calling the PCB is a distinct structure from the protocerebral commissure (PCC). In Babu and Barth’s (1984) horizontal slice (Fig. 11b), you can see the protocerebral commissure immediately adjacent to the mushroom body bridge. It is found similarly located in other species, as can be seen in the supplementary 3D files provided by Steinhoff et al., (2024).

      While not visible with synapsin in U. diversus, we likewise can make out a commissure in this area in close proximity to the mushroom body bridge using tubulin staining. What we are calling the protocerebral bridge is a structure which is much more dorsal to the protocerebral commissure, not appearing in the same planes as the MB bridge.

      (t) L 377: Do you have an intuition why the tonsillar neuropil and the protocerebral bridge would show limited immunoreactivity, while the arcuate body's is quite extensive?

      This is an interesting question. Given the degree of interconnection and the fact that multiple classes of neurons in insects will innervate both central body as well as PCB or noduli, perhaps it would be expected that expression in tonsillar and protocerebral bridge should be commensurate to the innervation by that particular neurotransmitter expressing population in the arcuate body. Apart from the fact that the arcuate body is just bigger, perhaps this points to a greater role of the arcuate body for integration, whereas the tonsillar and PCB may engage in more particular processing, or be limited to certain sensory modalities.

      Interestingly, it seems that this pattern of more limited immunoreactivity in the PCB and noduli compared with the central bodies (fan-shaped/ellipsoid) also appears in insects (Kahsai et al., 2010, J Comp Neuro, Timm et al., 2021, J Comp Neuro, Homberg et al., 2023, J Comp Neuro) – particularly, with almost every target having at least some layering in the fan-shaped body (Kahsai et al., 2010, J Comp Neuro).  For example, serotoninergic innervation is fairly consistently seen in the upper and lower central bodies across insects, but its presence in the PCB or noduli is more variable – appearing in one or the other in a species-dependent manner (Homberg et al., 2023, J Comp Neuro).

      (4) Discussion

      (a) L 556: But if confocal images from slices are aligned, is the 3D shape not preserved?

      Yes, fair enough – the point we wanted to make was that there is still a limitation in z resolution depending on the thickness of the slices used, which could obscure structures, but perhaps this is too minor of a comment.

      (b) L 597: This is a very interesting result. I agree it's likely to do with the processing of mechanosensory information relevant to web activities, and the mushroom body seems like the perfect candidate for this.

      (c) L 638: Worth noting that neuropil volume vs density of synapses might play a role in this, as the literature is currently a bit ambiguous with regards to the former.

      Thank you, noted (L689).

      (d) L 651: The latter seems far more plausible.

      Agreed, though the presence of mushroom bodies appears to be variable in spiders, so we didn’t want to take a strong stance, here.

    1. For this purpose, self-curation can be a better alternative to memorialization. In other words, the user can be responsible for the curation of their own legacy prior to death.

      Aww, respect the wishes of those who aim to become forgotten!

    2. As Haverinen explains, RPGs transform an avatar into a character which represents “both the story of the role-play and the personal interests of the player” (2014a, p. 157). She also states that “the communal spirit is usually strong among players who have played together for hundreds of hours and often even years. They have shared their personal lives with each other, and have ‘lived’ together in the story they have created for the game”

      I remember when my Disgaea save file got deleted... I cried!

    3. Sibilla and Mancini (2018) also highlight that identification is increased when players are given the ability to customize their avatars. Sibilla and Mancini (2018) list two types of user-avatar identification: actualization and idealization.

      Would say representation and projection, I find actualization and idealization a bit blurry. I represent how I currently feel like, and create projections of how I would envision myself under other circumstances.

    4. devices like phones and laptops (2014, p. 129). The data on these devices is frequently inaccessible, for instance due to password protection or because the information is scattered across multiple platforms. Nevertheless, this data holds material of strong emotional significance for the bereaved or the promise of uncovering new information, causing unnecessary frustration or hurt when they cannot be accessed. For this reason, there exists a need for designing hardware and software that account for a user’s death

      Services like Gmail have the option.

    5. Klass et al. (1996) propose the ‘continuing bonds’ model in which grief is not perceived in stages but instead it is seen as a renegotiation of the relationship the bereaved has to the deceased. This theory becomes relevant in the digital age due to an increase in digital death practices that facilitate remembrance and allow the bereaved to maintain their connection

      Much like you don't let go of a friend who you've lost touch with, you may not let go of a deceased one, and instead adopt part of them on your way of living, an item, a common friend, a habit, etc. a means to honour this person, to keep their legacy out of respect.

    Annotators

    1. After the death of Solomon, the Hebrew kingdom was split into Israel and Judah. Israel, with its capital at Samaria, was criticized for falling into the worship of calves and Baal and was conquered by the Assyrians in 722 BCE.

      After Solomon’s death, there were two kingdoms: Israel and Judah. Israel had its capital at Samaria, where they worshipped idols, and they were conquered by the Assyrians in 722 BCE.

    1. eLife Assessment

      This valuable study addresses T cell receptor activation during autoreactive T cell development and how the strength of T cell receptor engagement in naïve cells can predispose T cells to develop into effector/memory T cells. The authors lead with solid results that are largely consistent with data in the field suggesting that, in comparison to their counterparts with relatively lower basal self-reactivity, naive CD5hi CD8 T cells in non-obese diabetic (NOD) mice are poised for activation. They propose that diabetogenic T cells are preferentially found among the naive CD5hi CD8 T cell population. While the evidence does not fully support all the authors' conclusions, the data provide a foundation that sets up future studies.

    2. Reviewer #1 (Public review):

      Summary

      In their manuscript, Ho and colleagues investigate the importance of thymically-imprinted self-reactivity in determining CD8 T cell pathogenicity in non-obese diabetic (NOD) mice. The authors describe pre-existing functional biases associated with naive CD8 T cell self-reactivity based on CD5 levels, a well characterized proxy for T cell affinity to self-peptide. They find that naive CD5hi CD8 T cells are poised to respond to antigen challenge; these findings are largely consistent with previously published data on the C57Bl/6 background. The authors go on to suggest that naive CD5hi CD8 T cells are more diabetogenic as 1) the CD5hi naive CD8 T cell receptor repertoire has features associated with autoreactivity and contains a larger population of islet-specific T cells, and 2) the autoreactivity of "CD5hi" monoclonal islet-specific TCR transgenic T cells cannot be controlled by phosphatase over-expression. Thus, they implicate CD8 T cells with relatively higher levels of basal self-reactivity in autoimmunity. The data presented offers valuable insights and sets the foundation for future studies, but some conclusions are not yet fully supported.

      Specific comments

      There is value in presenting phenotypic differences between naive CD5lo and CD5hi CD8 T cells in the NOD background as most previous studies have used T cells harvested from C57Bl/6 mice or peripheral blood from healthy human donors.

      The comparison of a marker of self-reactivity, CD5 in this case, on broad thymocyte populations (DN/DP/CD8SP) is cautioned. CD5 is upregulated with signals associated with β-selection and positive selection; CD5 levels will thus vary even among subsets within these broad developmental intermediates. This is a particularly important consideration when comparing CD5 across thymic intermediates in polyclonal versus TCR transgenic thymocytes due to the striking differences in thymic selection efficiency, resulting in different developmental population profiles. The higher levels of CD5 noted in the DN population of NOD8.3 mice, for example, is likely due to the shift towards more mature DN4 post-β-selection cells. Similarly, in the DP population, the larger population of post-positive selection cells in the NOD8.3 transgenic thymus may also skew CD5 levels significantly. Overall, the reported differences between NOD and NOD8.3 thymocyte subsets could be due largely to differences in differentiation/maturation stage rather than affinity for self-antigen during T cell development. The authors have added some additional text to the revised manuscript that acknowledges some of these limitations.

      The lack of differences in CD5 levels of post-positive selection DP thymocytes, CD8 SP thymocytes, and CD8 T cells in the pancreas draining lymph nodes from NOD vs NOD8.3 mice also raises questions about the relevance of this model to address the question of basal self-reactivity and diabetogenicity and the authors' conclusion "that intrinsic high CD5-associated self-reactivity in NOD8.3 T cells overrides the transgenic Pep-mediated protection observed in dLPC/NOD mice"; the phenotype of the polyclonal and NOD8.3 TCR transgenic CD8 T cells that were analyzed in the (spleen and) pancreas draining lymph nodes is not clear (i.e., are these gated on naive T cells?). Furthermore, the rationale for the comparison with NOD-BDC2.5 mice that carry an MHC II-restricted TCR is unclear.

      In reference to the conclusion that transgenic Pep phosphatase does not inhibit the diabetogenic potential of "CD5hi" CD8 T cells, there is some concern that comparing diabetes development in mice receiving polyclonal versus TCR transgenic T cells specific for an islet antigen is not appropriate. The increased frequency and number of antigen specific T cells in the NOD8.3 mice may be responsible for some of the observed differences. Further justification for the comparison is suggested.

      The manuscript presents an interesting observation that TCR sequences from CD5hi CD8 T cells may share certain characteristics with diabetogenic T cells found in patients (e.g., CDR3 length), and that autoantigen-specific T cells may be enriched within the CD5hi naive CD8 T cell population. However, the percentage of tetramer-positive cells among naive CD8 T cells appears unusually high in the data presented, and caution is warranted when comparing additional T cell receptor features of self-reactivity/auto-reactivity between CD4 and CD8 T cells.

      The counts for the KEGG enrichment pathways presented are relatively low, and the robustness of the analysis should be carefully considered, particularly given that several significance values appear borderline. That said, the differentially expressed genes among CD5lo and CD5hi CD8 T cells are generally consistent with previously published datasets.

      The manuscript includes some imprecise wording that may be misleading. For example (not exhaustive): The strength of TCR reactivity to foreign antigen is not "contributed by basal TCR signal" per se but rather correlates with sub-threshold TCR signals necessary for T cell development and survival, CD5 is not broadly expressed on all B cells as the text might suggest but is restricted to a specific subset of B cells, some of the proximal signaling molecules downstream of the preTCR are different than for the mature TCR, upregulation of CD127 at early timepoints post T cell activation is not directly suggestive of their "heightened capabilities in memory T cell homeostasis", etc. The statement "Our study exclusively examined female mice because the disease modeled is relevant in females" should be reconsidered. While the use of female NOD mice can be justified by their higher incidence of diabetes than their male counterparts, the current wording could be misleading.

      For clarity and transparency, please note that while additional information is provided in the revised manuscript, gating strategies are not always clear (i.e., naive versus total CD8 T cells), and the age/status of the mice from which cells are harvested (i.e., prediabetic?) is not consistently provided, as far as this reviewer noted.

    3. Reviewer #2 (Public review):

      Summary:

      In this study Chia-Lo Ho et al. study the impact of CD5high CD8 T cells in the pathophysiology of type 1 diabetes (T1D) in NOD mice. The authors used high expression of CD5 as a surrogate of high TCR signaling and self-reactivity and compared the phenotype, transcriptome, TCR usage, function and pathogenic properties of CD5high vs. CD5low CD8 T cells extracted from the so-called naive T cell pool. The study shows that CD5high CD8 T cells resemble memory T cells poised for stronger response to TCR stimulation and that they exacerbate disease upon transfer in RAG-deficient NOD mice. The authors attempt to link these features to the thymic selection events of these CD5high CD8 T cells. Importantly, forced overexpression of the phosphatase PTPN22 in T cells attenuated TCR signaling and reduced pathogenicity of polyclonal CD8 T cells but not highly autoreactive 8.3-TCR CD8 T cells.

      Strengths:

      The study is nicely performed and the manuscript is clearly and well written. Interpretation of the data is careful and fair. The data are novel and likely important. However, some issues would need to be clarified through either text changes or addition of new data.

      Weaknesses:

      The definition of naïve T cells based solely on CD44low and CD62Lhigh staining may be oversimplistic. Indeed, even within this definition naïve CD5high CD8 T cells express much higher levels of CD44 than CD5low CD8 T cells.

      Comments on revisions:

      The authors addressed my previous comments thoughtfully and extensively.

    4. Reviewer #3 (Public review):

      Summary:

      In this study, Ho et al. hypothesised that autoreactive T cells receiving enhanced TCR signals during positive selection in the thymus are primed for generating effector and memory T cells. They used CD5 as a marker for TCR signal strength during their selection at the double positive stage. Supporting their hypothesis, naïve T cells with high CD5 proliferated better and expressed markers of T cell activation compared to naïve T cells with lower levels of CD5. Furthermore, results showed that autoimmune diabetes can be efficiently induced after the transfer of naïve CD5 hi T cells compared to CD5 lo T cells. This provided solid evidence in support of their hypothesis that T cells receiving higher basal TCR signaling are primed to develop into effector T cells. However, all functional characterisation was done on the cells in the periphery and CD5 hi cells in the peripheral lymphoid compartment can receive tonic TCR signaling. Hence, the function of CD5 hi T cells might not be related to development and programming in the thymus. This is a major hurdle in the interpretation of the results and justifying the title of the study. The evidence that transgenic PTPN22 expression could not regulate T cell activation in CD5 hi TCR transgenic autoreactive T cells was weak. Studying T cell development in TCR transgenic mice and looking at TCR downstream signaling could be misleading due to transgenic expression of TCR at all developmental stages.

      Strengths:

      (1) Demonstrating that CD5 hi cells in naïve CD8 T cell compartment express markers of T cell activation, proliferation and cytotoxicity at a higher level

      (2) Using gene expression analysis, study showed CD5 hi cells among naïve CD8 T cells are transcriptionally poised to develop into effector or memory T cells.

      (3) Study showed that CD5 hi cells have higher basal TCR signaling compared to CD5 lo T cells.

      (4) Key evidence of pathogenicity of autoreactive CD5 hi T cells was provided by doing the adoptive transfer of CD5 hi and CD5 lo CD8 T cells into NOD Rag1-/- mice and comparing them.

      Weaknesses:

      (1) Although CD5 can be used as a marker for self-reactivity and T cell signal strength during thymic development, it can also be regulated in the periphery by tonic TCR signaling or when T cells are activated by their cognate antigen. Hence, TCR signals in the periphery could also prime the T cells towards effector/memory differentiation. Therefore, from the evidence presented here, it cannot be concluded that this predisposition of T cells towards effector/memory differentiation is programmed due to higher reactivity towards self-MHC molecules in the thymus, as stated in the title.

      (2) Flow cytometry data needs to be revisited for the gating strategy, biological controls and interpretation.

      (3) Evidence linking CD5 hi cells to more effector phenotype using gene enrichment scores is very weak.

      (4) Experiments done in this study did not address why CD5 hi T cells could be negatively regulated in NOD mice when PTPN22 is overexpressed resulting in protection from diabetes but the same cannot be achieved in NOD8.3 mice.

      (5) Experimental evidence provided to show that PTPN22 overexpression does not regulate TCR signaling in NOD8.3 T cells is weak.

      (6) TCR sequencing analysis does not conclusively show that CD5 hi population is linked with autoreactive T cells. Doing single-cell RNAseq and TCR seq analysis would have helped address this question.

      (7) When analysing data from CD5 hi T cells from the pancreatic lymph node, it is difficult to discriminate if the phenotype is just because of T cells that would have just encountered the cognate antigen in the draining lymph node or if it is truly due to basal TCR signaling.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Review #1 (Public review):

      Figures 1 through 4 contain data that largely recapitulate published findings (Fulton et al., 2015; Lee et al., 2024; Swee et al., 2016; Dong et al., 2021); it is noted that there is value in confirming phenotypic differences between naive CD5lo and CD5hi CD8 T cells in the NOD background. It is important to contextualize the data while being wary of making parallels with results obtained from CD5lo and CD5hi CD4 T cells. There should also be additional attention paid to the wording in the text describing the data (e.g., the authors assert that, in Figure 4C, the “CD5hi group exhibited higher percentages of CD8+ T cells producing TNF-α, IFN-γ and IL-2” though there is no difference in IL-2 nor consistent differences in TNF-α between the CD5lo and CD5hi populations).

      CD5<sup>hi</sup>CD8<sup>+</sup> and CD5<sup>lo</sup>CD8<sup>+</sup> T cells have been previously characterized in other genetic backgrounds. In our study, we aimed to confirm and extend these observations specifically in the autoimmune-prone NOD background, which had not been systematically addressed. Additionally, we carefully reviewed the text describing Figure 4C and revised the wording to accurately reflect the observed data (lines 263–264). Specifically, we now state that the CD5<sup>hi</sup> group exhibited higher levels of IFN-γ and a trend toward increased TNF-α, while IL-2 production did not show a significant difference.

      The comparison of CD5 across thymocyte populations is cautioned due to variation in developmental stages, particularly in transgenic models. The reported differences may reflect maturation stages rather than self-reactivity.

      We appreciate the reviewer’s important point regarding the interpretation of CD5 levels across thymocyte subsets. In our revised manuscript (lines 455–471), we have added clarification that CD5 expression in DN and DP subsets reflects pre-TCR and TCR signaling events during thymic development. We also acknowledge that differences in maturation stages, especially in the NOD8.3 transgenic model, may influence CD5 expression. We now discuss this caveat and interpret our results with caution, particularly emphasizing that our data support but do not sufficiently define their differential self-reactivity.

      The conclusion that PTPN22 overexpression does not inhibit the diabetogenic potential of CD5<sup>hi</sup>CD8<sup>+</sup> T cells is potentially confounded by differences between polyclonal and TCR transgenic systems.

      We thank the reviewer for raising this concern. We acknowledge that this system introduces confounders due to differences in precursor frequencies and clonal expansion compared to polyclonal repertoires. These differences may affect the responsiveness to phosphatase-mediated attenuation of signaling. Therefore, while our results support that high-affinity autoreactive CD8<sup>+</sup> T cells may be less sensitive to PTPN22 overexpression, we do not claim that this finding generalizes to all autoreactive CD8<sup>+</sup> T cells. Rather, it highlights a potential limitation of peripheral tolerance mechanisms in T cells with strong intrinsic self-reactivity.

      TCR sequencing data shows variability; is this representative of the overall repertoire?

      We appreciate the reviewer’s comment. We acknowledge that data from bulk TCR sequencing has potential limitations, including variability across experiments and limited resolution at the clonotype level. To improve representativeness and reduce sampling bias, we performed TCR repertoire analysis in two independent experiments. In each experiment, naïve CD5<sup>hi</sup> CD8<sup>+</sup> and CD5<sup>lo</sup>CD8<sup>+</sup> T cells were sorted from pooled peripheral lymph nodes of at least 20 individual NOD mice per group. This approach allowed us to capture a broader range of clonotypes and ensured that the resulting repertoire profiles reflect the characteristics of the overall CD5<sup>hi</sup> and CD5<sup>lo</sup> populations, rather than isolated outliers. Despite some variability, we observed consistent trends in key features, such as shorter CDR3β length, altered TRAV/TRBV usage and reduced diversity in the CD5<sup>hi</sup> subset across both experiments. To enhance resolution and directly assess clonotype-specific reactivity, we plan to perform single-cell RNA and TCR sequencing in future studies, as noted in the revised Discussion (lines 466–471).

      Clarifications are requested regarding naive gating, controls, gMFI reporting, and missing methods.

      We thank the reviewer for these specific suggestions. We have revised figure legends to better describe gating strategies and included appropriate controls in Figures or Supplementary Figures. Regarding gMFI reporting, we have now shown in the figure legends whether values are reported as gMFI. Additionally, we have added the missing methods for cytokine staining, EdU incorporation, overlapped count matrix construction and TCR repertoire diversity metrics.

      Review #2 (Public review):

      Summary Comment:

      The study is nicely performed, but the definition of naive T cells using only CD44 and CD62L may be oversimplified. CD5hi naive T cells express higher CD44 than CD5lo cells.

      We thank the reviewer for the critical evaluation and thoughtful comment. As noted, we defined naïve CD8<sup>+</sup> T cells using a well-established gating strategy based on CD44<sup>lo</sup> and CD62L<sup>hi</sup> expression, consistent with previous studies (Immunity. 2010; 32(2):214–26; Nat Immunol. 2015; 16(1):107–17). We acknowledge that CD44 is expressed along a continuum, and indeed, within the naïve gate, CD5<sup>hi</sup> CD8<sup>+</sup> T cells exhibited slightly higher CD44 levels compared to their CD5<sup>lo</sup> counterparts. However, both subsets remained well below the CD44 expression observed in conventional effector/memory CD8<sup>+</sup> T cells, supporting their classification as naïve. To further validate this, we assessed additional markers associated with activation and memory differentiation, including CD69, PD-1, KLRG1 and CD25. These analyses confirmed that the sorted CD5<sup>hi</sup> and CD5<sup>lo</sup> populations retained a phenotypically naïve profile while exhibiting meaningful differences in baseline activation readiness (Figure 1F).

      Review #3 (Public review):

      CD5 can be regulated by peripheral signals. Therefore, it cannot be concluded that predisposition to effector/memory differentiation is solely programmed in the thymus.

      We thank the reviewer for this important point. We agree that CD5 expression can be dynamically regulated in the periphery by tonic TCR signals and antigen encounter, as also reflected in our own data showing that cells with high CD5 levels display elevated activation potential upon encountering antigen (e.g., Figure 3L). To minimize the confounding effects of pre-existing peripheral activation, we performed an adoptive T cell transfer experiment (Figure 4). In this experiment, naïve CD5<sup>hi</sup>CD8<sup>+</sup> and CD5<sup>lo</sup>CD8<sup>+</sup> T cells were sorted from the peripheral lymph nodes of young (6–8-week-old) prediabetic NOD mice and transferred into NOD Rag1<sup>–/–</sup> recipients. After 4 weeks, we compared the disease phenotypes and functional profiles of CD8<sup>+</sup> T cells from these two groups. This approach allowed us to evaluate the stability and differentiation capacity of CD5<sup>hi</sup> versus CD5<sup>lo</sup> cells in a lymphopenic environment, while excluding the possibility that the observed differences were due to already activated CD8<sup>+</sup> T cells at the time of isolation. We have revised the Discussion (lines 440–450) to acknowledge these experimental limitations and clarify that, while our findings demonstrate functional differences between CD5<sup>hi</sup>CD8<sup>+</sup> and CD5<sup>lo</sup>CD8<sup>+</sup> T cells, we cannot fully exclude contributions from peripheral influences.

      Experiments do not explain why PTPN22 overexpression protects in polyclonal T cells but not in NOD8.3 mice.

      We appreciate this critical comment. Our findings support that autoreactive T cells with high-affinity TCRs, as in NOD8.3 mice, receive signaling so strong that even PTPN22 overexpression is insufficient to attenuate their activation and effector function. We acknowledge that further mechanistic studies are needed to fully elucidate the differential effects of PTPN22 in polyclonal versus TCR-transgenic settings.

      Evidence that PTPN22 does not regulate TCR signaling in NOD8.3 T cells is weak.

      We thank the reviewer for this critical comment. Our data show that NOD8.3 T cells with an intrinsic high CD5-associated self-reactivity are more resistant to transgenic Pep-mediated change in the phosphorylation status of TCR signaling molecules CD3ζ and Erk and CD5 expression (Figure 6, B-D). However, we agree that additional functional assays would strengthen this conclusion.

      TCR sequencing does not conclusively link CD5hi cells with autoreactivity; single-cell analysis is needed.

      We agree with this critical comment. Bulk TCR sequencing revealed repertoire features associated with autoreactivity, but cannot definitively link specific TCRs to function. We have acknowledged this in the discussion (lines 466–471) and highlighted plans to perform single-cell analysis.

      CD5hi cells in the PLNs may reflect antigen exposure rather than basal signaling.

      We thank the reviewer for this insightful comment. As also noted in Figure 3L, CD5 expression can be influenced by peripheral tonic TCR signals and recent antigen exposure. To minimize the contribution of peripheral activation, we particularly characterized naïve CD8<sup>+</sup>T cells isolated from the peripheral lymph nodes of young (6–8-week-old) prediabetic NOD mice before the onset of overt autoimmunity. Furthermore, we performed an adoptive transfer experiment (Figure 4) using sorted naïve CD5<sup>hi</sup>CD8<sup>+</sup> and CD5<sup>lo</sup>CD8<sup>+</sup>T cells from these mice and characterized their disease phenotype after 4 weeks in lymphopenic NOD Rag1<sup>–/–</sup> recipients and evaluated the effector function of CD8<sup>+</sup>T cells. This approach allowed us to compare the differentiation potential of these subsets in a controlled setting, independent of their activation status at the time of isolation. We have revised the Discussion (lines 440–450) to emphasize that, while our data support functional differences between CD5<sup>hi</sup>CD8<sup>+</sup> and CD5<sup>lo</sup>CD8<sup>+</sup>T cells, we cannot fully exclude the role of peripheral cues in shaping CD5 expression.

      Provide proper gating controls and representative flow plots.

      We thank the reviewer for this comment. We have revised figure legends to better describe gating strategies and included representative flow cytometry plots and appropriate gating controls in Figures or Supplementary Figures.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The authors):

      (1) The figure presentation is inconsistent and the labels/font are often too small to read easily.

      As Reviewer suggested, the figure presentation has been revised for consistency. Labels and fonts have been adjusted for improved readability. Specific figures that were difficult to read have been reformatted with larger fonts and clearer legends.

      (2) A careful review of the text to ensure clarity of the content is suggested (e.g., “gratitude” at line 91, “were generally lied” at line 123).

      Thanks for Reviewer’s comments. The text has been carefully reviewed for clarity and grammatical accuracy. Corrections have been made, including changing “gratitude” to “magnitude” (line 47) and “were generally lied” to “fell between” (line 79).

      Reviewer #2 (Recommendations For The Authors):

      (1) The definition of naïve T cells based solely on CD44low and CD62Lhigh staining may be oversimplistic. Indeed, even within this definition, naïve CD5high CD8 T cells express much higher levels of CD44 than CD5low CD8 T cells.

      Thanks for Reviewer’s comments. We used a literature-supported gating strategy (Immunity. 2010; 32(2):214–26; Nat Immunol. 2015; 16(1):107–17) to define naïve T cells based on CD44<sup>low</sup> and CD62L<sup>high</sup> expression. It is important to note that CD44 expression exists along a continuum. While we were initially surprised to observe that CD5<sup>hi</sup>CD8<sup>+</sup> T cells expressed relatively higher levels of CD44 than CD5<sup>lo</sup>CD8<sup>+</sup> T cells within the naïve gate, both populations still exhibited significantly lower CD44 expression compared to conventional effector/memory CD8<sup>+</sup> T cells. To further validate the distinction between CD5<sup>hi</sup> and CD5<sup>lo</sup> subsets, we also examined additional markers such as CD69, PD-1, KLRG1 and CD25, which supported their phenotypic differences within the naïve compartment (Figure 1F).

      (2) Figure 1G should show the proportion of IGRP-tetramer+ in the three groups of CD8 T cells. Additionally, it would be useful to assess reactivity against a pool of other islet autoantigens using a similar strategy.

      As suggested by the reviewer, the revised manuscript now includes additional data showing the proportion of IGRP-tetramer<sup>+</sup> cells (Supplementary Figure 1D), as well as reactivity against another islet autoantigen, insulin-1/insulin-2 (Insulin B15–23) (Supplementary Figure 1E). The description of these results, including the proportions of IGRP-tetramer<sup>+</sup> and Insulin B15–23<sup>+</sup> CD8<sup>+</sup> T cells, has been added to lines 126–129 of the revised manuscript.

      (3) The resolution of Figure 2 is suboptimal and at places poorly visible. Figure 2D is stated to show “two significant pathways stand out.” In fact, the data are barely significant, and the authors may want to correct their statement.

      The resolution of Figure 2 has been improved. As Reviewer suggested, the text has been revised to state “two potential pathways stand out” (line 187) instead of “two significant pathways stand out”.

      (4) Figure 3C-F and 3H, showing fold change over baseline values would be much easier for the reader to grasp the data.

      As Reviewer suggested, data in Figures 3C-F and 3H are now shown as fold change over baseline values for clarity. Baseline gMFI is the mean of each group (total CD8<sup>+</sup>, CD5<sup>hi</sup>CD8<sup>+</sup> and CD5<sup>lo</sup>CD8<sup>+</sup>) at 0 μg/ml anti-CD3, with fold changes calculated for stimulation conditions (0.625–10 μg/ml anti-CD3). The figure legend has been updated accordingly.
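
      As a rough illustration of the normalization described above (not the authors' analysis code), the following minimal Python sketch divides the mean gMFI at each anti-CD3 dose by the mean gMFI of the same group at 0 μg/ml. The function name `fold_change_over_baseline`, the dictionary layout, and the replicate values are all hypothetical, and the sketch assumes group means are compared; a per-sample normalization would be an equally plausible reading of the response.

```python
from statistics import mean

def fold_change_over_baseline(gmfi_by_dose, baseline_dose=0.0):
    """Return {dose: mean gMFI / baseline mean gMFI} for every stimulated dose.

    gmfi_by_dose maps an anti-CD3 dose (ug/ml) to a list of replicate gMFI
    values for ONE group (e.g. CD5hi CD8+). Hypothetical data layout.
    """
    baseline = mean(gmfi_by_dose[baseline_dose])
    if baseline == 0:
        raise ValueError("baseline gMFI must be non-zero")
    return {
        dose: mean(values) / baseline
        for dose, values in gmfi_by_dose.items()
        if dose != baseline_dose  # the baseline itself is not reported as a fold change
    }

# Made-up replicate gMFI values for a single group, keyed by anti-CD3 dose.
example = {
    0.0: [100.0, 120.0, 110.0],   # unstimulated baseline, mean = 110
    0.625: [220.0, 220.0],
    10.0: [550.0, 550.0],
}
result = fold_change_over_baseline(example)
```

      Each group (total CD8<sup>+</sup>, CD5<sup>hi</sup>CD8<sup>+</sup>, CD5<sup>lo</sup>CD8<sup>+</sup>) would be passed through the function separately, so that every group is normalized to its own unstimulated baseline.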

      (5) Figure 4A, it would be much more valuable to show the diabetes frequency upon transfer of CD25- CD4 T cells alone and upon transfer of CD5high CD8 T cells alone. The word “spontaneous” in the Figure 4A legend seems inappropriate.

      Thanks for the Reviewer’s comment. We apologize for not including the data for the CD25<sup>–</sup>CD4<sup>+</sup> T cell transfer group in the original manuscript. While this group was part of our initial experimental design, we had considered it a control group and unintentionally omitted it from the figure. The revised manuscript now includes this group in Figure 4A. In addition, the term “spontaneous” has been replaced with “diabetes incidence” in the Figure 4A legend and manuscript (line 248). Regarding the suggestion to assess CD5<sup>hi</sup>CD8<sup>+</sup> T cell transfer alone, we appreciate the Reviewer’s point. However, previous studies have shown that CD8<sup>+</sup> T cells alone are not sufficient to induce diabetes in adoptive transfer models, and that effective β-cell destruction typically requires both CD4<sup>+</sup> and CD8<sup>+</sup> T cell subsets. For instance, Christianson et al. (1993) demonstrated that enriched CD8<sup>+</sup> T cells from NOD mice fail to transfer diabetes on their own, while CD4<sup>+</sup> T cells, particularly from diabetic donors, can induce disease only under specific conditions and are significantly potentiated by co-transfer of CD8<sup>+</sup> cells. These findings have contributed to the widely accepted standard of co-transferring both subsets when studying diabetogenic potential in NOD models (Diabetes. 1993;42(1):44–55).

      (6) Line 257-258, please remove “indicating superior in vivo proliferation by the CD5hi subset.” Indeed, several other possibilities may explain the phenotype, including survival, migration, etc.

      As Reviewer suggested, the phrase “indicating superior in vivo proliferation by the CD5<sup>hi</sup> subset” has been replaced with “implying increased expansion and activation/effector potential” (line 261).

      (7) Figure 5A, it is unclear to this referee what is the significance of CD5 and pCD3zeta expression on DN thymocytes. Do these cells express rearranged alpha/beta TCR? Is it signaling through pre-TCRalpha/TCRbeta pairs?

      Thanks a lot for this important question. In the revised manuscript, we have expanded the discussion (line 455–471) to address the developmental significance of CD5 and pCD3ζ expression on DN thymocytes. CD5 expression at this stage reflects pre-TCR signaling strength during early selection, which occurs following successful TCRβ rearrangement. The associated phosphorylation of CD3ζ indicates activation of downstream signaling through the pre-TCRα/TCRβ complex. As discussed in the revised text, these early signals play a critical role in determining lineage progression and self-reactivity tuning. We now acknowledge that signaling at the DN stage occurs through the pre-TCRα/TCRβ heterodimer, not a fully rearranged αβ TCR, and that CD5 expression serves as a marker of the strength of these initial pre-selection signals (Sci Signal. 2022;15(736):eabj9842.). These developmental checkpoints are essential for calibrating TCR sensitivity and ensuring proper thymocyte maturation. This has been clarified in the revised discussion (line 455–471).

      (8) Figure 5F, could the DP TCRbeta- CD69- thymocytes from 8.3-TCR NOD mice already express low levels of the self-reactive TCR at this stage to explain their high expression of CD5? Addressing the question experimentally would be useful.

      Thanks a lot for this useful comment. According to a review by Huseby et al. (2022), expression of a functional TCRβ chain begins at the DN3 stage, initiating progression through the β-selection checkpoint. This is followed by TRAV locus recombination, resulting in the generation of αβ TCR-expressing double-positive 1 (DP-1) thymocytes. At the DP-1 stage, the quality of TCR signaling driven by self-pMHC interactions governs both positive and negative selection, as well as the development of nonconventional T cell lineages. We hypothesize that in transgenic NOD8.3 mice, which express pre-rearranged Tcra and Tcrb transgenes derived from the islet-reactive CD8<sup>+</sup>T cell clone NY8.3, thymocytes undergo allelic exclusion and lack the clonal diversity seen in non-transgenic mice. As a result, NOD8.3 thymocytes may receive strong TCR signals from early developmental stages (DN3 and DP-1) even without undergoing normal selection checkpoints. While the elevated TCR signal observed in NOD8.3 is indeed artificial, this model provides a unique system to test our hypothesis—namely, whether a strongly self-reactive TCR can generate high basal signaling during thymic development that overrides the negative regulatory effects of phosphatases like Pep. This possibility has been acknowledged in the revised Discussion section, along with a plan to validate the hypothesis experimentally (line 455–471).

      (9) Figure 7, single-cell TCR-seq would be much more appropriate to tackle the question of self-reactivity of CD5hi vs. CD5low CD8 T cells.

      Thanks a lot for this useful comment. The limitations of bulk TCR-seq are acknowledged, and single-cell TCR-seq is proposed as a future direction (line 455–471).

      Note, for Reviewer #2 (Recommendations For The Authors) (7) (8) (9), the discussion paragraphs are included to address the reviewers’ questions (line 455–471).

      Reviewer #3 (Recommendations For The Authors):

      (1) Positive controls (activated T cells from PLN or spleen), gating controls (whole naïve T cells), and representative flow-cytometry plots are needed for T-bet, EOMES, GzmB, and cytokine staining in Figure 1.

      As Reviewer suggested, we added representative gating controls for T-bet, EOMES, GzmB and cytokine staining in Supplementary Figure 1 of revised manuscript.

      (2) For Figure 1F, MFI for activation markers for the CD44hiCD62Llo cells should be provided for the comparison of PLN data.

      As Reviewer suggested, MFI data for these markers have been included in Figure 1F of revised manuscript.

      (3) In many places and figure legends, it is not mentioned from which organ cells were collected, i.e., spleen or PLN.

      As Reviewer suggested, the origin of cells for each experiment has been explicitly indicated in the figure legends or figure content to ensure clarity.

      (4) In the pancreatic lymph node, autoreactive T cells might be upregulating CD5 because they are encountering antigens. This should be addressed in the discussion.

      As Reviewer suggested, this issue has been included in the discussion of revised manuscript (line 440-450).

      (5) It is not clear if T cells from the spleen and PLN were stimulated to detect the production of pro-inflammatory cytokines.

      Thanks for the critical comment. The stimulation protocol and cytokine staining method have been added to the “Cytokine staining” section of the Supplementary Methods in the revised manuscript.

      (6) Figure 4C-D: It is not clear if analysis was done on naïve T cells or if they were stimulated.

      Thanks for the comment. The stimulation and cytokine staining methods used in Figure 4C-D have been described in detail in the “Cytokine staining” section of the Supplementary Materials in the revised manuscript.

      (7) IGRP gating in Figure 4F should be revisited with negative controls.

      Thanks for the critical comment. Negative controls have been added and used to adjust IGRP gating, and this is now mentioned in the figure legend of revised manuscript.

      (8) Interpretation that only CD5hi cells form a central memory T cell population (Figure 4F) could be misleading.

      Thanks for this valuable comment. We agree that in conventional CD8<sup>+</sup> T cell immune responses, both CD5<sup>hi</sup> and CD5<sup>lo</sup> subsets have the potential to differentiate into central memory T cells. In our experimental approach, we adoptively transferred sorted CD5<sup>hi</sup>CD8<sup>+</sup> or CD5<sup>lo</sup>CD8<sup>+</sup> cells into Rag1<sup>-/-</sup> recipients and specifically analyzed PLNs four weeks after transfer. Using CD44 and CD62L expression as conventional markers for central memory T cells, we barely observed a CD44<sup>hi</sup>CD62L<sup>hi</sup> population in the CD5<sup>lo</sup>CD8<sup>+</sup> transferred group. Based on these results, we stated: “This analysis underscores that the central memory T cell population and the frequency of islet autoantigen-specific CD8<sup>+</sup>T cells are higher in the CD5<sup>hi</sup> transferred subset within the PLNs, implying more robust immune responses initiated by the CD5<sup>hi</sup>cells” (line 272–274). Importantly, we did not intend to imply that only CD5<sup>hi</sup> cells can form central memory T cells, but rather that they were more enriched for this phenotype under the specific conditions and time point analyzed.

      (9) IL-2 gating representative plot should be provided for Figure 5A.

      As Reviewer suggested, a representative IL-2 gating plot has been included in the revised Supplementary Figure 3B.

    1. In a sense, this chapter is a sort-of transition from moving very quickly and taking a very "50,000 foot view", to zooming in on the details. This made the chapter extra challenging to write. And longer.

      Indicates the change in the focus of the chapter from the big picture approach to examining events, which took longer to write.

    2. After the First Punic War’s losses, Carthage rebuilt its power in Spain. While Rome was a rising power, Carthage still controlled most of the western Mediterranean.

      Carthage, having lost the First Punic War, began to strengthen its position in Spain. Rome, although expanding, still had Carthage reigning over most of the West Mediterranean.

    3. In 264 BCE, Rome was a regional power. The republican city controlled most of the Italian peninsula and a population of about 300,000. Carthage was a trade-based empire that spanned the Mediterranean. When a group of mercenaries seized the city of Messina in Sicily and asked both Rome and Carthage for help, Rome decided the empire had expanded into Italy quite enough, and intervened.

      Carthage at this point was a strong trading empire that extended across the Mediterranean. When mercenaries occupied the city of Messina in Sicily and asked both powers for assistance, Rome decided to involve itself in the conflict, even though this meant entering areas outside Italy.

    4. Zeno of Citium founded Stoicism around 300 BCE, in Athens. Born about 334 BCE in Cyprus, Zeno was a wealthy Phoenician merchant who lost his fortune in a shipwreck.

      Zeno of Citium was born in 334 BCE in Cyprus. In about 300 BCE, he started the Stoic school in Athens. Zeno was a wealthy businessman whose money was lost during a shipwreck.

    5. This kingdom would last 335 years, until it was conquered by Rome after its last monarch, Cleopatra VII, got involved in the Roman Civil War

      The kingdom lasted 335 years and came to an end with the Roman conquest that followed Queen Cleopatra VII's involvement in a Roman civil war.

    6. Jesus was convicted of sedition against Rome for allowing himself to be called "King of the Jews" and executed by crucifixion, a method typically reserved for slaves, rebels, and bandits; but not uncommon. Historians have estimated that until the practice was abolished by Constantine in the third century, Rome executed tens of thousands and possibly up to 100,000 victims in this way. For example, after the slave revolt led by Spartacus, about 6,000 men were crucified along the Appian Way, the major road from Rome to southern Italy.

      Jesus was crucified for being known as the ‘King of the Jews’. Crucifixion was a Roman punishment for slaves or rebels, and it was common: thousands, and possibly as many as 100,000 people, were subjected to it, for instance the followers of Spartacus.

    7. Caesar's grand-nephew, Gaius Octavius, returned to Rome from Illyria (the Balkans) and took the name Gaius Julius Caesar Octavianus. Although he was only eighteen, Octavian quickly raised seven legions of veterans of Caesar's wars. This forced a power-sharing arrangement between the allies of Caesar, which became the Second Triumvirate.

      His grand-nephew, Octavian, came back to Rome at the age of 18, adopted the name Caesar, and soon amassed an army. This resulted in a power-sharing arrangement with some of Caesar’s followers, who were named the Second Triumvirate.

    8. by the Xiongnu and held for a decade, but he escaped and completed his mission, traveling over 12,000 miles through modern Xinjiang and Uzbekistan. He returned to China in 126 BCE with maps of thirty-six kingdoms, exotic goods, and credible stories of an interconnected world beyond the Pamir Mountains.

      He had been held captive by the Xiongnu for ten years, but he managed to escape and complete the journey, covering more than 12,000 miles. He returned to China in 126 BCE with maps, foreign products, and information about the territories beyond the Pamir Mountains.

    9. Chinese cultural traditions based on Confucianism, Daoism, and Chinese writing, continued in both south and north. The Eastern Jin continued Han administrative techniques including the Nine-Rank System of official appointments.

      Chinese culture, including Confucianism, Daoism, and writing, persisted in both the south and the north. The Eastern Jin dynasty continued Han administrative practices, such as the Nine-Rank System for selecting officials.

    10. The Sasanian Empire had begun in 224 CE, when a conqueror named Ardashir I, who claimed descent from Achaemenid emperors, overthrew the Parthians. Internal strife and perennial conflict with the Roman Empire had weakened the Parthians, and Ardashir organized a more centralized state, with a capital at Ctesiphon on the Tigris River in present-day Iraq, about twenty miles from Baghdad.

      The Sasanian Empire began in 224 CE, when Ardashir I overthrew the Parthian Empire, claiming descent from the earlier Achaemenid rulers. He founded a strong centralized state with its capital at Ctesiphon, near present-day Baghdad.

    11. The new religion focused on building communities that would support their members, and on embracing all believers regardless of their status as free or enslaved

      The above quote implies that the new religion was intended to form communities and welcomed all people, whether they were free or slaves.

    12. The prophet's continuing revelations established the Islamic requirements called the Five Pillars: a profession of faith, prayer five times a day, fasting during Ramadan, charity, and for those who can afford it

      This quote reveals that the five pillars of Islam came as a result of following the teachings of the prophet, belief in God, performing five prayers a day, fasting in Ramadan, giving charity, and if possible, going for a pilgrimage.

    13. The Sasanian Empire in Persia fell under Muslim control in 651, after nearly two decades of conflict. Muslim forces had begun raids into Mesopotamia under Abu Bakr, beginning in 633.

      After fighting for about 20 years, the Sasanian Empire in Persia was overrun by Muslim armies in the year 651. The raids into Mesopotamia had begun earlier, in the year 633, under Abu Bakr.

    14. In other cities such as Rome and Ravenna, plague killed up to 40%; while as many as 20% of country people died. These types of losses, added to the impact of the Gothic War, lend some credence to the claim that up to half the Italian population was wiped out in the 540s.

      In other cities like Rome and Ravenna, the plague killed up to 40% of the population, and rural areas lost up to 20%. Deaths on this scale, together with the effects of the war with the Goths, lend some credibility to the claim that Italy could have lost half its population in the 540s.

    15. The 10% tithe paid to the Church funded schools, libraries, and scriptoria where texts were copied. In day to day life, Carolingian customs and written law were taken up throughout western Europe.

      This quote indicates that the 10% tax to the Church was used to finance the construction of schools, libraries, and scriptoria, where the copies were made. The Carolingian tradition and legal system spread to Western Europe.

    16. In 788, Charlemagne annexed Bavaria, and in the early 790s he had extended his rule into territory along the Danube River near Vienna (Austria) that had been held by the Avar Khaganate, a northeastern Asian empire that extended from north of the Black Sea to the Danube between the middle of the sixth century and the early 800s.

      In 788, Charlemagne overran Bavaria, and by the early 790s, he had enlarged his empire to include territories around the Danube River, which were under the rulership of the Avar Khaganate, a great empire from northeastern Asia.

    17. In India, fragmentation in the north produced a variety of regional kingdoms frequently at war with each other. In contrast, the Chola Empire expanded from its origin in Tamil Nadu in southeastern India to Sri Lanka, the Malabar Coast, and the Maldive Islands.

      The regions of North India were split up into small kingdoms which frequently warred amongst each other, while the Chola Empire in the South expanded, dominating regions such as Sri Lanka and the surrounding islands.

    18. The arts thrived with masterpieces in painting such as landscape scrolls, poetry, ceramics, and literature. Pre-Qin bronzes and jades were collected and imitated, reflecting an ongoing reverence for the past.

      The phrase means that there was a great deal of art and culture, excellent paintings, poems, ceramics, and books, and people were appreciating the best works of the bygone era.

    19. The Magyars (Hungarians) had originated east of the Ural Mountains but had been pushing westward since the beginning of the Common Era.

      This quote indicates that the Magyars, or Hungarians, originally inhabited areas to the east of the Ural Mountains, from which they migrated westward over time.

    20. After John's death, the regency government of his young son, Henry III, reissued the document in 1216, stripped of some of its more "radical" content. It is still remembered as one of the earliest documents in the English traditions leading to developments such as the US Bill of Rights.

      When King John died, the new regime under his young son Henry III reissued the Magna Carta in 1216, without its most extreme clauses. It is still remembered as one of the earliest steps toward things such as the US Bill of Rights.

    21. Perhaps the most famous Muslim traveler was the Moroccan Ibn Battuta (1304-1369) who over a period of about thirty years is said to have traveled 73,000 miles across North Africa, the Middle East, India, China, Southeast Asia, the edge of Europe (Constantinople) and sub-Saharan Africa (Mali).

      Ibn Battuta, a Moroccan traveler, is renowned for traversing 73,000 miles over a 30-year period. His destinations included North Africa, the Middle East, India, China, the edge of Europe, and Mali, which lies in sub-Saharan Africa.

    22. One of the reasons written charters like Magna Carta and the [Charter of the Forest](Charter of the Forest) of 1217 were becoming more influential in England is that the level of education was beginning to rise. Before the eleventh century, education had been almost exclusively monastic.

      It implies that the significance of documents such as the Magna Carta and the Charter of the Forest rose in England because more people were being educated. Before the 11th century, education was almost entirely restricted to monasteries.

    23. In the late 1370s, Florentine Ciompi (wool-carders) briefly seized power and created a guild to protect their interests; but a counter-coup defeated them, dissolved their guild, and executed their leaders.

      By the late 1370s, the poor wool guild workers in Florence took control of power briefly, establishing an association that helped them secure their rights. Shortly after, the ruling class regained control of power and eliminated the guild organized by the workers by killing the guild leaders.

    24. After 1350, tenant farmers also began refusing to do the two to five days per week of work on the lord's behalf that had been traditional, and they began breaking the laws and customs that tied them to particular lands. Peasants fled either to towns or to manors that offered lower rents and more freedoms.

      In 1350, most farmers ceased doing the unremunerated services they owed to their lords and also disregarded some terms that tied them to their estates. This action led them to move to other locations, which included cities, in search of cheaper rentals.

    25. Zhu joined a rebel group called the Red Turbans. Zhu quickly became a successful rebel leader, showing tactical talent and building a strong personal network of refugees and orphans like himself.

      Zhu allied with a rebel group known as the Red Turbans, and later on, he demonstrated leadership qualities. He was very good at strategizing and could rally people with similar backgrounds, which were poorer refugees and orphans.

    26. Yongle was also very interested in resuming China's contact with the outside world, which had been deemphasized during his father's reign

      Yongle realized that it was necessary for China to interact with the rest of the world. His father cared only for what was going on in China, but Yongle wanted the situation to change. He promoted travel, trade, and interactions with the outside world.

    27. Men hunted big game, defended the band from predatory animals, and fought; women gathered, fished, trapped small animals, and grew the "three sisters" of corn, beans, and squash in garden plots they shifted when soil fertility began to wane.

      Men were more physically dominant than women, since they had to hunt, defend, and fight. The women were strong too, but their tasks, like growing and gathering food, were less physically demanding.

    28. the new availability of inexpensive paper spurred an explosion of notebooks called zibaldoni, in which regular people wrote down excerpts of books they had read, things they had heard, or discoveries they had made themselves.

      I like how regular people could participate in learning, not just elites.

    29. Hongwu designated his chosen heir's first son, who was sixteen at the time. Four years later when the twenty-year old grandson inherited the throne, one of his uncles took it away from him.

      Putting a teenager on the throne seems risky. Young emperors were easy targets for ambitious family members and officials.

    30. The failure of these social systems in China is often interpreted as a loss by rulers of the Mandate of Heaven.

      The Mandate of Heaven was how people explained why dynasties fell. When famine, plague, and chaos hit, it made sense to people that the Yuan rulers had lost divine approval. I think it's cool to see their thought process when there was no probable explanation for them at the time.

    31. A Jacquerie (peasant's revolt) in France was put down brutally in 1358. Heavily armored cavalry rode down crowds of farmers described by the authorities as a rebel army

      Shows how desperate peasants were. Even though they fought back, the revolt was crushed brutally. Authorities often used extreme violence to maintain control.

    1. remember events based on the most intense moment (peak) and the end (end), rather than the whole

      kinda like only remembering the fact that we won a competition and not that sth challenging happened that day

    1. Brutus faced Octavian while Antony's legions fought Cassius. Both Cassius and Brutus committed suicide. Antony is said to have covered Brutus' body with a purple cloth as a sign of respect. They had not been friends, but Brutus had insisted, as a condition of going along with the plot to kill Caesar, that Antony be spared.

      In the battle that ensued after the death of Caesar, Brutus fought Octavian, and Cassius battled Antony. Brutus and Cassius took their own lives. Antony showed respect to Brutus when he covered him with a purple cloth, considering that Brutus was one of those who ensured that Antony was not harmed when they conspired to kill Caesar.

    1. eLife Assessment

      This important study demonstrates that in Drosophila melanogaster, tachykinin (Tk) expression is regulated by the microbiota. The authors present convincing evidence that axenic flies raised with no microbiota are longer-lived than conventionally reared animals, and that Tk expression and Tk receptors in the nervous system are required for this effect. They further test individual bacterial strains for their role in these effects and connect the effect to loss of lipid stores and suggest that FOXO may be involved in the phenotype, results that are of interest to the fields of environmental perception, host microbiome interactions, and geroscience.

      [Editors' note: this paper was reviewed by Review Commons.]

    2. Reviewer #1 (Public review):

      Summary:

      In this study the authors use a Drosophila model to demonstrate that Tachykinin (Tk) expression is regulated by the microbiota. In Drosophila conventionally reared (CR) flies are typically shorter lived than those raised without a microbiota (axenic). Here, knockdown of Tk expression is found to prevent lifespan shortening by the microbiota and the reduction of lipid stores typically seen in CR flies when compared to axenic counterparts. It does so without reducing food intake or fecundity which are often seen as necessary trade-offs for lifespan extension. Further, the strength of the interaction between Tk and the microbiota is found to be bacteria specific and is stronger in Acetobacter pomorum (Ap) mono-associated flies compared to Levilactobacillus brevis (Lb) mono-association. The impact on lipid storage was also only apparent in Ap-flies.

      Building on these findings the authors show that gut specific knockdown is largely sufficient to explain these phenotypes. Knockdown of the Tk receptor, TkR99D, in neurons recapitulates the lifespan phenotype of intestinal Tk knockdown supporting a model whereby Tk from the gut signals to TkR99D expressing neurons to regulate lifespan. In addition, the authors show that FOXO may have a role in lifespan regulation by the Tk-microbiota interaction. However, they rule out a role for insulin producing cells or Akh-producing cells suggesting the microbiota-Tk interaction regulates lifespan through other, yet unidentified, mechanisms.

      Major comments:

      Overall, I find the key conclusions of the paper convincing. The authors present an extensive amount of experimental work, and their conclusions are well founded in the data. In particular, the impact of TkRNAi on lifespan and lipid levels, the central finding in this study, has been demonstrated multiple times in different experiments and using different genetic tools. As a result, I don't feel that additional experimental work is necessary to support the current conclusions.

      However, I find it hard to assess the robustness of the lifespan data from the other manipulations used (TkR99DRNAi, TkRNAi in dFoxo mutants etc.) because information on the population size and whether these experiments have been replicated is lacking. Can the authors state in the figure legends the numbers of flies used for each lifespan and whether replicates have been done? For all other data it is clear how many replicates have been done, and the methods give enough detail for all experiments to be reproduced.

      Significance:

      Overall, I find the key conclusions of the paper convincing. The authors present an extensive amount of experimental work, and their conclusions are well founded in the data. We have known that the microbiota influence lifespan for some time but the mechanisms by which they do so have remained elusive. This study identifies one such mechanism and as a result opens several avenues for further research. The Tk-microbiota interaction is shown to be important for both lifespan and lipid homeostasis, although it's clear these are independent phenotypes. The fact that the outcome of the Tk-microbiota interaction depends on the bacterial species is of particular interest because it supports the idea that manipulation of the microbiota, or specific aspects of the host-microbiota interaction, may have therapeutic potential.

      These findings will be of interest to a broad readership spanning host-microbiota interactions and their influence on host health. They move forward the study of microbial regulation of host longevity and have relevance to our understanding of microbial regulation of host lipid homeostasis. They will also be of significant interest to those studying the mechanisms of action and physiological roles of Tachykinins.

      Field of expertise: Drosophila, gut, ageing, microbiota, innate immunity

    3. Reviewer #2 (Public review):

      Summary:

      The main finding of this work is that microbiota impacts lifespan through regulating the expression of a gut hormone (Tk) which in turn acts on its receptor expressed on neurons. This conclusion is robust and based on a number of experimental observations, carefully using techniques in fly genetics and physiology: 1) microbiota regulates Tk expression, 2) lifespan reduction by microbiota is absent when Tk is knocked down in gut (specifically in the EEs), 3) Tk knockdown extends lifespan and this is recapitulated by knockdown of a Tk receptor in neurons. These key conclusions are very convincing. Additional data are presented detailing the relationship between Tk and insulin/IGF signalling and Akh in this context. These are two other important endocrine signalling pathways in flies. The presentation and analysis of the data are excellent.

      There are only a few experiments or edits that I would suggest as important to confirm or refine the conclusions of this manuscript. These are:

      (1) When comparing the effects of microbiota (or single bacterial species) in different genetic backgrounds or experimental conditions, I think it would be good to show that the bacterial levels are not impacted by the other intervention(s). For example, the lifespan results observed in Figure 2A are consistent with Tk acting downstream of the microbes but also with Tk RNAi having an impact on the microbiota itself. I think this simple, additional control could be done for a few key experiments. Similarly, the authors could compare the two bacterial species to see if the differences in their effects come from different ability to colonise the flies.

      (2) The effect of Tk RNAi on TAG is opposite in CR and Ax or CR and Ap flies, and the knockdown shows an effect in either case (Figure 2E, Figure 3D). Why is this? Better clarification is required.

      (3) With respect to insulin signalling, all the experiments bar one indicate that insulin is mediating the effects of Tk. The one experiment that does not is using dilpGS to knock down TkR99D. Is it possible that this driver is simply not resulting in an efficient KD of the receptor? I would be inclined to check this, but as a minimum I would be a bit more cautious with the interpretation of these data.

      (4) Is it possible to perform at least one lifespan repeat with the other Tk RNAi line mentioned? This would further clarify that there are no off-target effects that can account for the phenotypes.

      There are a few other experiments that I could suggest as I think they could enrich the current manuscript, but I do not believe they are essential for publication:

      (5) The manuscript could be extended with a little more biochemical/cell biology analysis. For example, is it possible to look at Tk protein levels, Tk levels in circulation, or even TkR receptor activation or activation of its downstream signalling pathways? Comparing Ax and CR or Ap and CR one would expect to find differences consistent with the model proposed. This would add depth to the genetic analysis already conducted. Similarly, for insulin signalling - would it be possible to use some readout of the pathway activity and compare between Ax and CR or Ap and CR?

      (6) The authors use a pan-acetyl-K antibody but are specifically interested in acetylated histones. Would it be possible to use antibodies for acetylated histones? This would have the added benefit that one can confirm the changes are not in the levels of histones themselves.

      (7) I think the presentation of the results could be tightened a bit, with fewer sections and one figure per section.

      Significance:

      The main contribution of this manuscript is the identification of a mechanism that links the microbiota to lifespan. This is very exciting and topical for several reasons:

      (1) The microbiota is very important for overall health but it is still unclear how. Studying the interaction between microbiota and health is an emerging, growing field, and one that has attracted a lot of interest, but one that is often lacking in mechanistic insight. Identifying mechanisms provides opportunities for therapies. The main impact of this study comes from using the fruit fly to identify a mechanism.

      (2) It is very interesting that the authors focus on an endocrine mechanism, especially with the clear clinical relevance of gut hormones to human health recently demonstrated with new, effective therapies (e.g. Wegovy).

      (3) Tk is emerging as an important fly hormone and this study adds a new and interesting dimension by placing Tk between microbiota and lifespan.

      I think the manuscript will be of great interest to researchers in ageing, human and animal physiology and in gut endocrinology and gut function.

    4. Reviewer #3 (Public review):

      Summary:

      Marcu et al. demonstrate a gut-neuron axis that is required for the lifespan-shortening effects mediated by gut bacteria. They show that the presence of commensal bacteria, particularly Acetobacter pomorum, promotes Tk expression in the gut, which then binds to neuronal tachykinin receptors to shorten lifespan. Tk has also recently been reported to extend lifespan via adipokinetic hormone (Akh) signaling (Ahrentløv et al., Nat Metab 7, 2025), but the mechanism here appears distinct. The lifespan shortening by Ap via Tk seems to be partially dependent on foxo and independent of both insulin signaling and Akh-mediated lipid mobilization.

      Although the detailed mechanistic link to lifespan is not fully resolved, the experiment and its results clearly show the involvement of the molecules tested. This work adds a valuable dimension to our growing understanding of how gut bacteria influence host longevity. However, there are some points that should be addressed.

      (1) Tk+ EEC activity should be assessed directly, rather than relying solely on transcript levels. Approaches such as CaLexA or GCaMP could be used.

      (2) In Line 243, the manuscript states that the reporter activity was not increased in the posterior midgut. However, based on the results presented in Fig4E, there seems to be no apparent regional specificity. A more detailed explanation is necessary.

      (3) If feasible, assessing foxo activation would add mechanistic depth. This could be done by monitoring foxo nuclear localization or measuring the expression levels of downstream target genes.

      (4) Fig1C uses Adh for normalization. Given the high variability of the result, the authors should (1) check whether Adh expression levels changed via bacterial association and/or (2) compare the results using different genes as internal standard.

      (5) While the difficulty of maintaining lifelong axenic conditions is understandable, it may still be feasible to assess the induction of Tk (i.e., Tk transcription or EE activity upregulation) by the microbiome in males.

      (6) We also had some concerns regarding the wording of the title. Fig6B and C suggest that TkR86C, in addition to TkR99D, may be involved in the A. pomorum-lifespan interaction. Consider revising the title to refer more generally to the "tachykinin receptor" rather than only TkR99D. The difference between "aging" and "lifespan" should also be addressed. While the study shows a role for Tk in lifespan, assessment of aging phenotypes (e.g. climbing assay, ISC proliferation) beyond the smurf assay is required to make conclusions about aging.

      (7) The statement in Line 82 that EEs express 14 peptide hormones should be supported with an appropriate reference, if available.

      Significance:

      General assessment: The main strength of this study is the careful and extensive lifespan analyses, which convincingly demonstrate the role of gut microbiota in regulating longevity. The authors clarify an important aspect of how microbial factors contribute to lifespan control. The main limitation is that the study primarily confirms the involvement of previously reported signaling pathways, without identifying novel molecular players or previously unrecognized mechanisms of lifespan regulation.

      Advance: The lifespan-shortening effect of Acetobacter pomorum (Ap) has been reported previously, as has the lifespan-shortening effect of Tachykinin (Tk). However, this study is the first to link these two factors mechanistically, which represents a significant and original contribution to the field. The advance is primarily mechanistic, providing new insight into how microbial cues converge on host signaling pathways to influence ageing.

      Audience: This work will be of particular interest to a specialized audience of basic researchers in ageing biology. It will also attract interest from microbiome researchers who are investigating host-microbe interactions and their physiological consequences. The findings will be useful as a conceptual framework for future mechanistic studies in this area.

      Field of expertise: Drosophila ageing, lifespan, microbiome, metabolism

    5. Author response:

      (1) General Statements

      The goal of our study was to mechanistically connect microbiota to host longevity. We have done so using a combination of genetic and physiological experiments, which outline a role for a neuroendocrine relay mediated by the intestinal neuropeptide Tachykinin, and its receptor TkR99D in neurons. We also show a requirement for these genes in metabolic and healthspan effects of microbiota.

      The referees' comments suggest they find the data novel and technically sound. We have added data in response to numerous points, which we feel enhance the manuscript further, and we have clarified text as requested. Reviewer #3 identified an error in Figure 4, which we have rectified. We felt that some specific experiments suggested in review would not add significant further depth, as we articulate below.

      Altogether our reviewers appear to agree that our manuscript makes a significant contribution to both the microbiome and ageing fields, using a large number of experiments to mechanistically outline the role(s) of various pathways and tissues. We thank the reviewers for their positive contributions to the publication process.

      (2) Description of the planned revisions

      Reviewer #2:

      Not…essential for publication…is it possible to look at Tk protein levels?

      We have acquired a small amount of anti-TK antibody and we will attempt to immunostain guts associated with A. pomorum and L. brevis. We are also attempting the equivalent experiment in mouse colon reared with/without a defined microbiota. These experiments are ongoing, but we note that the referee feels that the manuscript is a publishable unit whether these stainings succeed or not.

      (3) Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1:

      Can the authors state in the figure legends the numbers of flies used for each lifespan and whether replicates have been done?

      We have incorporated the requested information into legends for lifespan experiments.

      Do the interventions shorten lifespan relative to the axenic cohort? Or do they prevent lifespan extension by axenic conditions? Both statements are valid, and the authors need to be consistent in which one they use to avoid confusing the reader.

      We read these statements differently. The only experiment in which a genetic intervention prevented lifespan extension by axenic conditions is neuronal TkR86C knockdown (Figure 6B-C). Otherwise, microbiota shortened lifespan relative to axenic conditions, and genetic knockdowns blocked this effect (e.g. see lines 131-133). We have ensured that the framing is consistent throughout, with text edited at lines 198-199, 298-299, 311-312, 345-347, 407-408, 424-425, 450, 497-503.

      TkRNAi consistently reduces lipid levels in axenic flies (Figs 2E, 3D), essentially phenocopying the loss of lipid stores seen in control conventionally reared (CR) flies relative to control axenic. This suggests that the previously reported role of Tk in lipid storage - demonstrated through increased lipid levels in TkRNAi flies (Song et al (2014) Cell Rep 9(1): 40) - is dependent on the microbiota. In the absence of the microbiota TkRNAi reduces lipid levels. The lack of acknowledgement of this in the text is confusing

      We have added text at lines 219-222 to address this point. We agree that this effect is hard to interpret biologically, since expressing RNAi in axenics has no additional effect on Tk expression (Figure S7). Consequently, we can only interpret this unexpected effect as a possible off-target effect of RU feeding on TAG, specific to axenic flies. However, this possibility does not void our conclusion, because an off-target diminution of TAG cannot explain why CR flies accumulate TAG following Tk<sup>RNAi</sup> induction. We hope that our added text clarifies this point.

      I have struggled to follow the authors' logic in ablating the IPCs and feel a clear statement on what they expected the outcome to be would help the reader.

      We have added the requested statement at lines 423-424, explaining that we expected the IPC ablation to render flies constitutively long-lived and non-responsive to A. pomorum.

      Can the authors clarify their logic in concluding a role for insulin signalling, and qualify this conclusion with appropriate consideration of alternative hypotheses?

      We have added our logic at lines 449-454. In brief, we conclude that insulin signalling is involved because FoxO mutant lifespan does not respond to Tk<sup>RNAi</sup>, and the mutation diminishes the lifespan-shortening effect of A. pomorum. However, we cannot state that the effects are direct because we do not have data that mechanistically connect Tk/TkR99D signalling directly in insulin-producing cells. The current evidence is most consistent with insulin signalling priming responses to microbiota/Tk/TkR99D, as per the newly-added text.

      Typographical errors

      We have remedied the highlighted errors, at lines 128-140.

      Reviewer #2:

      it would be good to show that the bacterial levels are not impacted [by TkRNAi]

      We have quantified CFUs in CR flies upon ubiquitous TkRNAi (Figure S5), finding that the RNAi does not affect bacterial load. New text at lines 138-139 articulates this point.

      The effect of Tk RNAi on TAG is opposite in CR and Ax or CR and Ap flies, and the knockdown shows an effect in either case (Figure 2E, Figure 3D). Why is this?

      As per response to Reviewer #1, we have added text at lines 219-222 to address this point.

      Is it possible to perform at least one lifespan repeat with the other Tk RNAi line mentioned?

      We have added another experiment showing longevity upon knockdown in conventional flies, using an independent TkRNAi line (Figure S3).

      Reviewer #3:

      In Line243, the manuscript states that the reporter activity was not increased in the posterior midgut. However, based on the presented results in Fig4E, there is seemingly not apparent regional specificity. A more detailed explanation is necessary.

      We thank the reviewer sincerely for their keen eye, which has highlighted an error in the previous version of the figure. In revisiting this figure we have noticed, to our dismay, that the figures for GFP quantification were actually re-plots of the figures for (ac)K quantification. This error led to the discrepancy between statistics and graphics, which thankfully the reviewer noticed. We have revised the figure to remedy our error, and the statistics now match the boxplots and results text.

      Fig1C uses Adh for normalization. Given the high variability of the result, the authors should (1) check whether Adh expression levels changed via bacterial association

      We selected Adh on the basis of our RNAseq analysis, which showed it was not different between AX and CV guts, whereas many commonly-used “housekeeping” genes were. We have now added a plot to demonstrate (Figure S2).

      The statement in Line 82 that EEs express 14 peptide hormones should be supported with an appropriate reference

      We have added the requested reference (Hung et al, 2020) at line 86.

      (4) Description of analyses that authors prefer not to carry out

      Reviewer #1:

      I'd encourage the authors to provide lifespan plots that enable comparison between all conditions

      We have avoided this approach because the number of survival curves that would need to be presented on the same axis (e.g. 16 for Figure 5) is not legible. However we have ensured that axes on faceted plots are equivalent and with grid lines for comparison. Moreover, our approach using statistical coefficients (EMMs) enables direct quantitative comparison of the differences among conditions.

      Reviewer #2:

      Is it possible that this driver is simply not resulting in an efficient KD of the receptor? I would be inclined to check this

      This comment relates to Figure 7G. We do see an effect of the knockdown in this experiment, so we believe that the knockdown is effective. However the direction of response is not consistent with our hypothesis so the experiment is not informative about the role of these cells. We therefore feel there is little to be gained by testing efficacy of knockdown, which would also be technically challenging because the cells are a small population in a larger tissue which expresses the same transcripts elsewhere (i.e. necessitating FISH).

      Would it be possible to use antibodies for acetylated histones?

      The comment relates to Figure 4C-E. The proposed studies would be a significant amount of work because, to our knowledge, the specific histone marks which drive activation in TK+ cells remain unknown. On the other hand, we do not see how this information would enrich the present story, rather such experiments would appear to be the beginning of something new. We therefore agree with Reviewer #1 (in cross-commenting) that this additional work is not justified.

      Reviewer #3:

      Tk+ EEC activity should be assessed directly, rather than relying solely on transcript levels. Approaches such as CaLexA or GCaMP could be used.

      We agree with reviewers 1-2 (in cross-commenting) that this proposal is non-trivial and not justified by the additional insight that would be gained. As described above, we are attempting to immunostain Tk, which if successful will provide a third line of evidence for regulation of Tk+ cells. However we note that we already have the strongest possible evidence for a role of these cells via genetic analysis (Figure 5).

      While the difficulty of maintaining lifelong axenic conditions is understandable, it may still be feasible to assess the induction of Tk (ie. Tk transcription or EE activity upregulation) by the microbiome on males.

      As the reviewer recognises, maintaining axenic experiments for months on end is not trivial. Given the tendency for males either to simply mirror female responses to lifespan-extending interventions, or to not respond at all, we made the decision in our work to only study females. We have instead emphasised in the manuscript that results are from female flies.

      TkR86C, in addition to TkR99D, may be involved in the A. pomorum-lifespan interaction. Consider revising the title to refer more generally to the "tachykinin receptor" rather than only TkR99D.

      We disagree with this interpretation: the results do not show that TkR86C-RNAi recapitulates the effect of enteric Tk-RNAi. A potentially interesting interaction is apparent, but the data do not support a causal role for TkR86C. A causal role is supported only for TkR99D, knockdown of which recapitulates the longevity of axenic flies and Tk<sup>RNAi</sup> flies. We therefore feel that our current title is justified by the data, and a more generic version would misrepresent our findings.

      The difference between "aging" and "lifespan" should also be addressed.

      The smurf phenotype is a well-established metric of healthspan. Moreover, lifespan is the leading aggregate measure of ageing. We therefore feel that the use of “ageing” in the title is appropriate.

      If feasible, assessing foxo activation would add mechanistic depth. This could be done by monitoring foxo nuclear localization or measuring the expression levels of downstream target genes.

      Foxo nuclear localisation has already been shown in axenic flies (Shin et al, 2011). We have added text and citation at lines 401-402.

    1. Flavius Constantius, a Roman army officer, became one of four joint emperors in the Tetrarchy system, and his position passed to his son, Constantine, when he died in 306

      Flavius Constantius, a Roman military leader, was one of the four rulers who governed under the Tetrarchy. His son Constantine took over when Flavius Constantius died in the year 306.

    1. eLife Assessment

      In this important manuscript, the authors establish a vertebrate model for studying the development of circuits that control heart rate. This contribution uses a combination of experimental techniques to provide compelling information for scientists looking to understand how heart rate regulation emerges during development.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Hernandez-Nunez et al. provides a comprehensive characterization of how heart-brain circuits develop in a vertebrate brain, namely the zebrafish. The characterization is performed using a combination of modern and sophisticated imaging and neural manipulation techniques and achieves unprecedented clarity and detail in how the heart-brain communication develops early in life. The paper describes a three-stage program, where first an efferent-circuit from the motor vagus to the heart develops, followed by sympathetic innervation, and lastly sensory neurons innervate the heart.

      Strengths:

      The paper is very clearly and nicely written. The findings are novel and of high quality and relevance. The presentations are very clear and nicely interpreted. The analyses are well presented and applied.

      Weaknesses:

      From the heart rate traces, heart rate variability seems to be prominent and changes across days post-fertilization (dpf). That would be a useful dependent variable, considering that the variation captured by the models does not fully explain heart rate, both for sympathetic and parasympathetic efferents. Given the strong autorhythmicity of nodal tissue in neurogenic hearts, modulatory inputs could potentially predict heart rate variability with higher precision.

    3. Reviewer #2 (Public review):

      Hernandez-Nunez et al. investigate the development and function of neural circuits involved in the regulation of heart rate in larval zebrafish. Using conserved genetic markers, they identify neural pathways involved in the bidirectional control of heart rate and in providing sensory feedback, potentially enabling more precise tuning. The main observation is that the different elements of this circuit are laid down in a developmentally staggered manner.

      At 4 days old, the heart rate is invariant to a range of sensory stimuli, and the vagal motor or sympathetic pathways could not be seen to innervate the heart. Progressively through development, the heart is first innervated by the vagal motor pathway, whose axons are cholinergic, before the formation of phox2bb+ intracardiac neurons (ICNs). At this stage, before the first ICNs are observed, activation of the vagal motor pathway by optogenetic activation of a localized population of cholinergic hindbrain neurons leads to bradycardia. After the vagal motor innervation begins, the sympathetic pathway innervates the heart, which could be visualized in the form of TH+ fibers from the anterior paravertebral ganglia (APG). The activity of the TH+ APG neurons was diverse and showed proportional, integral, and derivative-like relationships to the heart rate, suggesting a role in more precise tuning of the rate than what could be achieved through the vagal pathway alone. The sensory vagus innervation of the heart was identified to be the last stage to develop; however, neurons in the nodose ganglion exhibited diverse responses tuned to the heart rate well before the innervation reached the heart. The authors attribute this to the fact that other indirect sensory cues from the gills or vasculature could be used to sense heart rate prior to innervation.

      This study identifies key components of the control loop required for the regulation of heart rate in zebrafish. The control mechanism appears to be independent of the cues that trigger heart rate changes, indicating that the circuit is indeed part of an interoceptive pathway for heart rate control. Evidence for the staggered development of the vagal-motor, sympathetic, and sensory pathways is conclusive, and as the authors discuss, this phenomenon progressively allows for finer-grained control of the heart rate. This could be achieved through proportional-integral-derivative-like control properties emerging in a diverse set of neurons in the APG and sensory feedback of the state of the heart. In line with these findings, the baseline variability of heart rate prior to innervation at 4 days old appears to be comparatively lower than the later stages (Figure 1C, D, Supplementary Figure 1C-F) and increases over development.

      Based on this observation and the time courses of the kernels identified by the GLMs, I would expect heart rate fluctuations of a finer time scale, ultimately limited by the time course of GCaMP6s, to be captured by the models in Figures 3, 5, and 7, in addition to the stimulus-locked changes that are highlighted. While the models yield valuable insight in the form of the activation kernels and their potential roles, in one instance, this captures the potential contribution of either the motor vagus or the APG to the change in heart rate. This makes it challenging to identify where it falls short and the potential functions of pathways that are yet to be discovered.

      Lastly, the proposed anatomical connectivity of the heart-brain circuit is based on tracts observed in this study as well as those inferred from function and from previous studies.

      (1) It is not clear from the images presented here whether the VSNs send feedback projections to the brainstem VPN.

      (2) Do the brainstem neurons identified by their functional roles send efferent projections via the motor vagus nerve? This is unclear from the results presented and needs to be clarified in the text.

      (3) Add appropriate clarifying annotations to Figure 9 and a section of text discussing the potential unknowns in the proposed circuit diagram.

    4. Author response:

      We thank the reviewers for their thoughtful, constructive, and generous evaluations of our manuscript. We are encouraged by their overall assessment of the clarity, novelty, and significance of the work, and we appreciate the opportunity to further strengthen the manuscript.

      Both reviewers highlight the central contribution of this study: a developmental, circuit-level dissection of how heart–brain signaling emerges in a vertebrate. We are pleased that the evidence supporting the staggered assembly of vagal motor, sympathetic, and sensory pathways was found to be compelling, and that the computational and experimental framework was viewed as appropriate and informative.

      Below, we briefly outline how we plan to address the main points raised in the reviews.

      Heart rate variability and temporal structure

      Both reviewers note that heart rate variability (HRV) changes across development and suggest that HRV may provide additional insight into the function of autonomic circuits. We agree that HRV is an important physiological readout and that its developmental changes are consistent with the progressive emergence of autonomic control.

      In the revised manuscript, we plan to (i) discuss heart rate variability more explicitly in the context of circuit maturation and (ii) clarify the temporal scales captured by our experiments and modeling framework. In particular, we will emphasize that our analyses focus on relationships between neural activity and heart-rate trajectories at timescales accessible given imaging rate and indicator kinetics, rather than beat-to-beat variability. We will also consider adding a supplementary analysis of the variability that can be reliably measured within these constraints, and, where appropriate, how neural activity predicts that measurable variation.

      Scope and interpretation of the computational models

      Reviewer #2 raises thoughtful points regarding what the generalized linear models can and cannot disambiguate, particularly when multiple efferent pathways may contribute to heart-rate dynamics. We will revise the text to more clearly distinguish between functional encoding relationships inferred from the models and anatomical connectivity that is directly demonstrated.

      Our intent is to frame the kernels identified in the motor and sympathetic pathways as computational motifs that capture distinct dynamical contributions, rather than as exclusive or complete explanations of heart-rate control. We will clarify these limitations explicitly in the Results and Discussion.

      Circuit diagram and anatomical interpretation

      We appreciate the reviewer’s careful reading of the proposed circuit schematic. In the revised manuscript, we will revise the figure and accompanying text to clearly annotate which connections are directly observed, which are functionally inferred, and which remain hypothetical. We will also expand the Discussion to explicitly address open questions, including unresolved feedback pathways and the potential for additional nodes in the circuit.

      We believe these revisions will improve clarity without altering the core conclusions of the study. We thank the reviewers again for their insightful feedback and look forward to submitting a revised version of the manuscript that addresses these points in detail.

    1. Although Justinian succeeded in reunifying much of the old Roman Empire, his victory was fleeting. The reduced population weakened the Mediterranean region's defenses against adversaries from remoter parts of Europe that had not been affected by plague.

      Even though Justinian was able to regain much of the old Roman Empire, his success was short-lived because this deterioration of population had also made the Mediterranean area vulnerable to attacks from rival European regions that had not been hit by the plague.

    1. eLife Assessment

      This paper presents an important advance in genetically encoded voltage imaging of the developing zebrafish spinal cord in vivo, capturing voltage dynamics in neuronal populations, single cells, and subcellular compartments inaccessible to patch clamp, and diverse spike waveforms and subthreshold voltage dynamics inaccessible to calcium imaging. The work identifies a developmental progression from irregular voltage fluctuations to coordinated contralateral and ipsilateral activity, providing insight into how electrical dynamics and cellular morphology evolve during circuit formation. The strength of evidence is solid, with imaging data supporting the main conclusions, although the manuscript would be strengthened by more complete methodological documentation and clearer context relative to earlier calcium imaging studies. Overall, this study provides a resource that is of importance for researchers investigating neural development and circuit assembly, illustrating the value of voltage imaging as a general tool for probing bioelectric mechanisms in morphogenesis and circuit development.

    2. Reviewer #1 (Public review):

      Summary:

      This paper demonstrates the first application of voltage imaging using a genetically encoded voltage indicator, ArcLight, for recording the spontaneous activity of the developing spinal cord in zebrafish. This technology enabled better temporal resolution compared to what has been demonstrated with calcium imaging in past studies (Muto et al., 2011; Warp et al., 2012; Wan et al., 2019), which led to the discovery of the maturation process of "firing" shapes in spinal neurons. This maturation process occurs simultaneously with axonal elongation and network integration. Thus, voltage imaging revealed new biological details of the development of the spinal circuits.

      Strengths:

      The use of voltage imaging instead of calcium imaging revealed biological details of the functional maturation of spinal cord neurons in developing zebrafish.

      Weaknesses:

      This manuscript lacks many basic components and explanations necessary for understanding the methodologies used in this study.

    3. Reviewer #2 (Public review):

      The authors present highly impressive in vivo voltage‐imaging data, demonstrating neuronal activity at subcellular, cellular, and population levels in a developing organism. The approach provides excellent spatial and temporal resolution, with sufficient signal-to-noise to detect hyperpolarizations and subthreshold events. The visualization of contralateral synchrony and its developmental loss over time is particularly compelling. The observation that ipsilateral synchrony persists despite contralateral desynchronization is a striking demonstration of the power of GEVIs in vivo. While I outline several points that should be addressed, I consider this among the strongest demonstrations of in vivo GEVI imaging to date.

      Major points:

      (1) Clarification of GEVI performance characteristics

      There is a widespread misconception in the GEVI field that response speed is the dominant or primary determinant of sensor performance. Although fast kinetics are certainly desirable, they are not the only (or even necessarily the limiting) factor for effective imaging. Kinetic speed specifies the time to reach ~63% of the maximal ΔF/F for a given voltage step (typically 100 mV, approximating the amplitude of a neuronal action potential), but in practical imaging, a slower sensor with a large ΔF/F can outperform a faster sensor with a small ΔF/F. In this context, the authors' use of ArcLight is actually instructive. ArcLight is one of the slower GEVIs in common use, yet Figures S1a-b clearly show that it still reports voltage transients in vivo very well. I therefore strongly recommend moving these panels into the main text to emphasize that robust in vivo imaging can be achieved even with a relatively slow GEVI, provided the signal amplitude and SNR are adequate. This will help counteract the common misunderstanding in the field.

      (2) ArcLight's voltage-response range

      ArcLight is shifted toward more negative potentials (V₁/₂ ≈ −30 mV). This improves subthreshold detection but makes distinguishing action potentials from subthreshold transients more challenging. The comparison with GCaMP is helpful because the Ca²⁺ signal largely reflects action potentials. Panels S1c-f show similar onset kinetics but a longer decay for GCaMP. Surprisingly, the ΔF/F amplitudes are comparable; typically, GCaMP changes are larger. To support lines 193-194, the authors should include a table summarizing the onset/offset kinetics and ΔF/F ranges for neurons expressing ArcLight versus GCaMP.

      Additionally, the expected action-potential amplitude in zebrafish neurons should be stated. In Figure S1b, a 40 mV change appears to produce ~0.5% ΔF/F, but this should be quantified and noted. Could this comparison to GCaMP help resolve action potentials from subthreshold bursts?

      (3) Axonal versus somatic amplitudes (Line 203)

      The manuscript states that voltage amplitudes are "slightly smaller" in axons than in somata; this requires quantitative values and statistical testing. More importantly, differences in optical amplitude reflect factors such as expression levels, background fluorescence, and optical geometry, not necessarily true differences in voltage amplitude. The axonal signals are clearly present, but their relative magnitude should not be interpreted without correction.

      (4) Figure 4C: need for an off-ROI control

      Figure 4C should include a control ROI located away from ROI3 to demonstrate that the axonal signal is not due to background fluctuations, similar to the control shown in Figure S3. Although the ΔF image suggests localization, showing the trace explicitly would strengthen the point. The fluorescence-change image in Figure 4c should also be fully explained in the legend.

      (5) Figure 5: hyperpolarization signals

      Figure 5 is particularly impressive. It appears that Cell 2 at 18.5 hpf and Cell 1 at 18 hpf exhibit hyperpolarizing events. The authors should confirm that these are true hyperpolarizations by giving some indication of how often they were observed.

      (6) SNR comparison (Lines 300-302)

      The claim that ArcLight and GCaMP exhibit comparable SNR requires statistical support across multiple cells.
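
      The argument in point (1), that a slow sensor with a large ΔF/F can outperform a fast sensor with a small ΔF/F, can be illustrated with a minimal sketch. It assumes a single-exponential (first-order) step response, so that the response reaches about 63% of its maximum after one time constant. The two sensors, their amplitudes, their time constants, and the event duration are all hypothetical values chosen for illustration; they are not measurements of ArcLight or of any other indicator.

```python
import math


def step_response(amplitude, tau_ms, t_ms):
    """Fractional fluorescence change t_ms after a voltage step.

    Assumes a single-exponential (first-order) sensor response:
    response(t) = amplitude * (1 - exp(-t / tau)).
    """
    return amplitude * (1.0 - math.exp(-t_ms / tau_ms))


# Hypothetical sensors: illustrative numbers only, not measured values.
slow_large = {"amplitude": 0.30, "tau_ms": 10.0}  # slow, large steady-state dF/F
fast_small = {"amplitude": 0.05, "tau_ms": 1.0}   # fast, small steady-state dF/F

# Assumed duration of a brief depolarisation (hypothetical).
event_ms = 2.0

# Response reached by each sensor by the end of the brief event.
slow_peak = step_response(slow_large["amplitude"], slow_large["tau_ms"], event_ms)
fast_peak = step_response(fast_small["amplitude"], fast_small["tau_ms"], event_ms)

# After exactly one time constant the response is 1 - 1/e of its maximum.
frac_at_tau = step_response(1.0, 10.0, 10.0)

print(f"fraction of maximum reached at t = tau: {frac_at_tau:.3f}")
print(f"slow, large-amplitude sensor after {event_ms} ms: {slow_peak:.4f}")
print(f"fast, small-amplitude sensor after {event_ms} ms: {fast_peak:.4f}")
```

      Under these assumed numbers the slow sensor still produces the larger optical change during the brief event, because its amplitude advantage outweighs its incomplete rise. Different assumed values could reverse the ordering, which is why both kinetics and ΔF/F need to be reported.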

    4. Reviewer #3 (Public review):

      Summary:

      The authors aimed to establish a long-term voltage imaging platform to investigate how coordinated neuronal activity emerges during spinal cord development in zebrafish embryos. Using the genetically encoded voltage indicator ArcLight, they tracked membrane potential dynamics in motor neurons at population, single-cell, and subcellular levels from 18 to 23 hours post-fertilization (hpf), revealing relationships between firing maturation, waveform characteristics, and axonal outgrowth.

      Strengths:

      (1) Technical advancement in developmental voltage imaging:

      This study demonstrates voltage imaging of motor neurons in the developing vertebrate spinal cord. The approach successfully captures voltage dynamics at multiple spatial scales: neuronal population, single-cell, and subcellular compartments.

      (2) Insights into the relationship between morphological and functional maturation:

      The work reveals important relationships between voltage dynamics maturation and morphological changes.

      (3) Kinetics analysis of membrane potential waveform enabled by voltage imaging:

      The characterization of "immature" versus "mature" firing based on quantitative waveform parameters provides insights into functional maturation that are inaccessible by calcium imaging. This analysis reveals a maturation process in the biophysical properties of developing neurons.

      (4) Matching of voltage indicator kinetics to biological signal:

      The authors' choice of ArcLight, despite its slow kinetics compared to newer GEVIs, proved well-suited to the low-frequency activity patterns in developing spinal neurons (frequency ~0.3 Hz).

      Weaknesses:

      (1) Insufficient comparison with prior calcium imaging studies:

      While the authors state that voltage imaging provides superior temporal resolution compared to calcium imaging (lines 192-196, 301), and this is generally true, the current manuscript does not adequately cite or discuss previous calcium imaging studies. Since neural activity occurs at low frequency in the developing spinal cord, calcium imaging is adequate for characterizing the emergence of coordinated activity patterns in the developing zebrafish spinal cord. Notably, Wan et al. (2019, Cell) performed a comprehensive single-cell reconstruction of emerging population activity in the entire developing zebrafish spinal cord using calcium imaging. This work should be properly acknowledged and compared. The specific advantages of voltage imaging over these prior studies need to be more clearly articulated, e.g. detection of subthreshold events and membrane potential waveform kinetics.

      (2) Considerations for generalizability of the ArcLight-based voltage imaging approach:

      While this study successfully demonstrates voltage imaging using ArcLight in the developing spinal cord, the generalizability of this approach to later developmental stages and other neural systems warrants discussion. ArcLight exhibits relatively slow kinetics (rise time ~100-200 ms, decay τ ~200-300 ms). In the current study, these kinetics are well-suited to the developmental activity patterns observed (firing frequency ~0.3 Hz), representing appropriate matching of indicator properties to biological timescales. However, the same approach may be less suitable for later developmental stages when neural activity occurs at higher frequencies.

      (3) Incomplete methodological descriptions:

      As a paper establishing a new imaging approach, several critical details are missing or unclear.

      (a) Imaging system specifications: The imaging setup description lacks essential information, including light source specifications, excitation wavelength/filter sets, and light power at the sample. The authors should also clarify whether wide-field optics was used rather than confocal or selective plane imaging.

      (b) Long-term imaging protocol: Whether neurons were imaged continuously or with breaks between imaging sessions is not explicitly stated. The current phrasing could be interpreted as a continuous 4.5-hour recording, which would be technically impressive but may not be what was actually done.

      (c) Image processing procedures: Denoising and bleach correction procedures are mentioned but not described, which is critical for a methods-focused paper.

      (d) The waveform classification (Supplementary Figure S6) shows overlapping kinetics between "immature" and "mature" firing, yet the classification method is not adequately justified.

      (e) Given that photostability and toxicity are critical considerations for long-term voltage imaging, these aspects warrant further clarification. While the figures suggest stable ArcLight fluorescence during the experiments, the manuscript lacks quantification of photobleaching, a discussion of potential toxicity concerns associated with the indicator, and information regarding the maximum duration over which the ArcLight signal can faithfully report physiological voltage dynamics.

      (4) Incomplete data representation and quantification:

      (a) The claim of "reduced variability" in calcium imaging (line 194) is not clearly demonstrated in Supplementary Figure S1.

      (b) Amplitude distributions for cell/subcellular compartments are not systematically quantified. Figure S3 shows ~5% changes in some axons versus ~2% in others, but it remains unclear whether these variabilities reflect differences between axonal compartments within the same cell, between individual cells, or between individual fish.
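
      The concern in weakness (2), that slow indicator kinetics suit ~0.3 Hz activity but may be less suitable at higher frequencies, can be made concrete with a rough sketch. It assumes the indicator behaves as a first-order low-pass filter, with gain 1/sqrt(1 + (2πfτ)²), and takes τ = 0.25 s as a mid-range value of the decay constant quoted above. Both the filter model and the 10 Hz comparison frequency are assumptions made for illustration, not properties reported in the manuscript.

```python
import math


def lowpass_gain(freq_hz, tau_s):
    """Amplitude gain of a first-order low-pass filter at freq_hz.

    gain = 1 / sqrt(1 + (2 * pi * f * tau)^2)
    """
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * freq_hz * tau_s) ** 2)


# Assumed time constant: mid-range of the ~200-300 ms decay quoted in the review.
tau_s = 0.25

# ~0.3 Hz is the firing frequency quoted in the review.
gain_early = lowpass_gain(0.3, tau_s)

# 10 Hz is a hypothetical higher-frequency regime, chosen only for comparison.
gain_later = lowpass_gain(10.0, tau_s)

print(f"relative amplitude retained at 0.3 Hz: {gain_early:.2f}")
print(f"relative amplitude retained at 10 Hz:  {gain_later:.2f}")
```

      Under this simplified model most of the signal amplitude is retained at 0.3 Hz, whereas activity at 10 Hz would be strongly attenuated. A real indicator has distinct rise and decay kinetics, so the numbers are indicative only.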

    1. The Vikings also sailed their ships into the unknown in the North Atlantic, and in 874 a settlement party reached Iceland, led by Ingólfur Arnarson, who is traditionally considered the founder of Reykjavik. There had been stories of a large island dating back to at least 330 BCE, when a Greek explorer named Pytheas had described his travels.

      The Vikings sailed west to the North Atlantic, and in 874, they, under the leadership of Ingólfur Arnarson, settled in Iceland, where Reykjavik was established. There were stories about Iceland, even dating back to 330 BCE.

    1. eLife Assessment

      This study presents a valuable and practical approach for one-photon imaging through GRIN lenses. By scanning a low numerical aperture (NA) beam and collecting fluorescence with a high NA, the method expands the usable field of view and yields clearer cellular signals. The evidence is solid overall, with strong qualitative demonstrations, but some claims would benefit from additional quantitative tests. The work will interest researchers who need simple, scalable tools for large‑area cellular imaging in the brain.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript reported a method for deep brain imaging with a GRIN lens that combines "low-NA telecentric scanning (LNTS) of laser excitation with high-NA fluorescence collection" to achieve a larger FOV than conventional approaches.

      Strengths:

      The manuscript presented in vivo structural images and calcium activity results in side-by-side comparison to wide-field epi fluorescence imaging through a GRIN lens and two-photon scanning imaging.

      Weaknesses:

      (1) Lack of sufficient technique information on the "high-NA (1.0) fluorescence collection". Is it custom-made or an off-the-shelf component? The only optical schematic, Figure 1, shows two lenses and a Si-PMT as the collection apparatus. There is no information about the lenses and the spacing between each component.

      (2) There is no discussion about the speed limitation of the LNTS method, which, as a scanning-based method, is limited by the scanner speed. At a 10 Hz frame rate, the LNTS, although it has a better FOV, is much slower than widefield fluorescence imaging. The 10 Hz speed is not sufficient for some fast calcium activities.

      (3) Supplementary Figure 5 is irrelevant to the main claim of the manuscript. This is a preliminary simulation related to the authors' proposed future work.

    3. Reviewer #2 (Public review):

      Summary:

      This study introduces a simple optical strategy for one-photon imaging through GRIN lenses that prioritizes coverage while maintaining practical signal quality. By using low-NA telecentric scanned excitation together with high-NA collection, the approach aims to convert nearly the full lens facet into a usable field of view (FOV) with uniform contrast and visible somata. The method is demonstrated in 4-µm fluorescent bead samples and mouse brain, with qualitative comparisons to widefield and two-photon (2P) imaging. Because the configuration relies on standard components and a minimalist optical layout, it may enable broader access to large-area cellular imaging in the deep brain across neuroscience laboratories.

      Strengths:

      (1) This method mitigates off-axis aberrations and enlarges the usable FOV. It achieves near full-facet usable FOV with consistent centre-to-edge contrast, as evidenced by 4-µm fluorescent bead samples (uniform visibility to the edge) and in vivo microglia imaging (resolvable somata across the field).

      (2) The optical design is simple and supports efficient photon collection, lowering the barrier to adoption relative to adaptive optics (AO) or lens design-based correction. Using standard components and treating the GRIN lens as a high-NA (~1.0) light pipe increases collection efficiency for ballistic and scattered fluorescence. Figure annotations report the illumination energy required to reach a fixed detected-photon target (e.g., ~1000 detected photons per bead/cell for the 500-µm FOV condition), and under this equal-output criterion, the LNTS configuration achieves comparable or better image quality at lower illumination energy than conventional wide-field imaging, supporting improved photon efficiency and implying reduced bleaching and heating for equivalent signal levels.

      (3) The in vivo functional recordings are stable and exhibit strong signals. In vivo calcium imaging shows high-SNR ΔF/F₀ traces that remain stable over ~30-minute sessions with only modest baseline drift reported, supporting physiological measurements without heavy denoising and enabling large-scale data collection.

      (4) The low-NA excitation provides an extended focal depth, enabling more neurons to be tracked concurrently within a single FOV while maintaining practical signal quality. It reduces sensitivity to axial motion and minor misalignment and enhances overall experimental efficiency.

      Weaknesses:

      (1) Quantitative characterization is limited. Resolution and contrast are not comprehensively mapped as functions of field position and depth, and a clear, operational definition of "usable FOV" is not specified with threshold criteria.

      (2) The claim of approximately 100% usable FOV is largely supported by qualitative images; standardized metrics (e.g., PSF/MTF maps, contrast-to-noise ratio profiles, cell-detection yield versus radius) are needed to calibrate expectations and enable comparison across systems.

      (3) The trade-off inherent to low NA excitation, namely a broader axial PSF and possible neuropil/background contamination, is acknowledged qualitatively but not quantified. Analyses that separate in-focus from out-of-focus signal would help readers judge single-cell fidelity across the field.

      (4) Generalizability remains to be established. Performance across multiple GRIN models (e.g., diameter, NA) and wavelengths is not yet demonstrated. Longer-session photobleaching, heating, and phototoxicity, particularly near the edge of the FOV, also require fuller evaluation.

      Readers should view it as a coverage-first strategy that enlarges the FOV while accepting a modest trade-off in resolution due to the low-NA excitation and the extended axial PSF.

    1. The new English king was killed in the battle and William moved on to London where he was greeted with the submission of the English nobles. He was crowned King William I of England on Christmas Day.

      When the English king was killed in battle, William moved on to London, where the nobles submitted to him. He was subsequently crowned King William I of England on Christmas Day.

    1. eLife Assessment

      This study provides a valuable advance in understanding how decision boundaries may change over time during simple choices by introducing a method that uses information about non-decision components to improve parameter estimates. The evidence supporting the main claims is convincing, with clear demonstrations on simulated and real data, although additional model comparison work would further strengthen confidence. The findings will be of interest to researchers studying human decision processes and the methods used to analyse them.

    2. Reviewer #1 (Public review):

      Summary:

      This paper proposes a non-decision time (NDT)-informed approach to estimating time-varying decision thresholds in diffusion models of decision making. The manuscript motivates the method well, outlines the identifiability issues it is intended to address, and evaluates it using simulations and two empirical datasets. The aim is clear, the scope is deliberately focused, and the manuscript is well written. The core idea is interesting, technically grounded, and a meaningful contribution to ongoing work on collapsing thresholds.

      Strengths:

      The manuscript is logically structured and easy to follow. The emphasis on parameter recovery is appropriate and appreciated. The finding that the exponential NDT-informed function produces substantially better recovery than the hyperbolic form is useful, given the importance placed on identifiability earlier in the paper. The threshold visualisations are also helpful for interpreting what the models are doing. Overall, the work offers a well-defined, methodologically oriented contribution that will interest researchers working on time-varying thresholds.

      Weaknesses / Areas for Clarification:

      A few points would benefit from clarification, additional analysis, or revised presentation:

      (1) It would help readers to see a concrete demonstration of the trade-off between NDT and collapsing thresholds, to give a sense of the scale of the identifiability problem motivating the work.

      (2) Before moving to the empirical datasets, the manuscript really needs a simulation-based model-recovery comparison, since all major conclusions of the empirical applications rely on model comparison. One approach might be to simulate from (a) an FT model with across-trial drift variability and (b) one of the CT models, then fit both models to each of the simulated data sets. This would address a longstanding issue: sometimes CT models are preferred even when the estimated collapse in the thresholds is close to zero. A recovery study would confirm that model selection behaves sensibly in the new framework.

      (3) An additional subtle point is that BIC is defined in terms of the maximised log-likelihood of the model for the data being modelled. In the joint model, the parameter estimates maximise the combined likelihood of behavioural and non-decision-time data. This means the behavioural log-likelihood evaluated at the joint MLEs is not the behavioural MLE. If BIC is being computed for the behavioural data only, this breaks the assumptions underlying BIC. The only valid BIC here would be one defined for the joint model using the joint likelihood.

      (4) Table 1 sets up the Study 1 comparisons, but there's no row for the FT model. Similarly, Figures 10 and 13 would be more informative if they included FT predictions. This matters because, in Study 1, the FT model appears to fit aggregate accuracy better than the BIC-preferred collapsing model, currently shown only in Appendix 5. Some discussion of why would strengthen the argument.

      (5) In Figure 7, the degree of decay underestimation is obscured by using a density plot rather than a scatterplot, consistent with the other panels of the same figure. Presenting it the same way would make the mis-recovery more transparent. The accompanying text may also need clarification: when data are generated from an FT model with across-trial drift variability, the NDT-informed model seems to infer FT boundaries essentially. If that's correct, the model must be misfitting the simulated data. This is actually a useful result as it suggests across-trial drift variability in FT models is discriminable from collapsing-threshold models. It would be good to make this explicit.

      (6) Given the large recovery advantage of the exponential NDT-informed function over the hyperbolic one, the authors may want to consider whether the results favour adopting the former more generally. Given these findings, I would consider recommending the exponential NDT-informed model for future use.

      (7) In Study 2 (Figure 13), all models qualitatively miss an interesting empirical pattern: under speed emphasis, errors are faster than corrects, while under accuracy emphasis, errors become slower. The error RT distribution in the speed condition is especially poorly captured. It would be helpful for the authors to comment, as it suggests that something theoretically relevant is missing from all models tested.

      (8) The threshold visualisations extend to 3 seconds, yet both datasets show decisions mostly finishing by ~1.5 seconds. Shortening the x-axis would better reflect the empirical RT distributions and avoid unintentionally overstating the timescale of the empirical decision processes.
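
      Point (3) above can be made concrete with a toy sketch. It is not the manuscript's model: it assumes a single shared mean parameter, unit-variance Gaussian likelihoods, and made-up numbers for the "behavioural" and "auxiliary" (NDT-like) data. It only illustrates that the behavioural log-likelihood evaluated at the joint maximum-likelihood estimate cannot exceed the behavioural maximum, so a behavioural-only BIC computed from joint estimates is not a BIC in the usual sense.

```python
import math

def normal_loglik(data, mu, sigma=1.0):
    """Log-likelihood of `data` under Normal(mu, sigma)."""
    n = len(data)
    sse = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - sse / (2 * sigma ** 2)

def bic(loglik, n_params, n_obs):
    """BIC = k * ln(n) - 2 * ln(L), with L the maximised likelihood."""
    return n_params * math.log(n_obs) - 2.0 * loglik

# Hypothetical data; both sets are assumed to share the single parameter mu.
behavioural = [0.9, 1.1, 1.3, 0.7, 1.0]
auxiliary = [1.6, 1.4, 1.5, 1.7, 1.3]
combined = behavioural + auxiliary

# For a Gaussian mean, the MLE is the sample mean of the data being fitted.
mu_behav = sum(behavioural) / len(behavioural)   # behavioural-only MLE
mu_joint = sum(combined) / len(combined)         # joint MLE

ll_behav_at_behav = normal_loglik(behavioural, mu_behav)
ll_behav_at_joint = normal_loglik(behavioural, mu_joint)
ll_joint = normal_loglik(combined, mu_joint)

# A valid behavioural BIC uses the behavioural MLE.
bic_behav_valid = bic(ll_behav_at_behav, 1, len(behavioural))
# Plugging joint estimates into the behavioural likelihood is not a maximum.
bic_behav_at_joint = bic(ll_behav_at_joint, 1, len(behavioural))
# A valid joint BIC uses the joint likelihood and all observations.
bic_joint = bic(ll_joint, 1, len(combined))

print(bic_behav_valid, bic_behav_at_joint, bic_joint)
```

      In this sketch the joint BIC and the behavioural BIC are defined on different data, so they are not directly comparable with each other either; comparisons would need to stay within one likelihood definition.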

    3. Reviewer #2 (Public review):

      Summary:

      The authors use simulations and empirical data fitting in order to demonstrate that informing a decision model on estimates of single-trial non-decision time can guide the model to more reliable parameter estimates, especially when the model has collapsing bounds.

      Strengths:

      The paper is well written and motivated, with clear depth of knowledge in the areas of neurophysiology of decision-making, sequential sampling models, and, in particular, the phenomenon of collapsing decision bounds.

      Two large-scale simulations are run to test parameter recovery, and two empirical datasets are fit and assessed; the fitting procedures themselves are state-of-the-art, and the study makes use of a very new and well-designed ERP decomposition algorithm that provides single-trial estimates of the duration of diffusion; the results provide inferences about the operation of decision bound collapse - all of this is impressive.

      Weaknesses:

      This is an interesting and promising idea, but a very important issue is not clear: it is an intuitive principle that information from an external empirical source can enhance the reliability of parameter estimates for a given model, but how can the overall BIC improve, unless it is in fact a different model? Unfortunately, it is not clear whether and how the model structure itself differs between the NDT-informed and non-NDT-informed cases. Ideally, they are the same actual model, but with one getting extra guidance on where to place the tau and/or sigma parameters from external measurements. The absence of sigma (non-decision time variance) estimates for the non-NDT-informed model, however, suggests it is different in structure, not just in its lack of constraints. If they were the same model, whether they do or do not possess non-decision time variability (which is not currently clear), the only possible reason that the NDT-informed model could achieve better BIC is because the non-NDT-informed model gets lost in the fitting procedure and fails to find the global optimum. If they are in fact different models - for example, if the NDT-informed model is endowed with NDT variability, while the non-NDT-informed model is not - then the fit superiority doesn't necessarily say anything about an NDT-informed reliability boost, but rather just that a model with NDT variability fits better than one without.

      One reason this is unclear is that Footnote 4 says that this study did not allow trial-to-trial variability in nondecision time, but the entire premise of using variable external single-trial estimates of nondecision times (illustrated in Figure 2) assumes there is nondecision time variability and that we have access to its distribution.

      It is good that there is an Intro section to explain how the tradeoff between NDT and collapsing bound parameters renders them difficult to simultaneously identify, but I think it needs more work to make it clear. First of all, it is not impossible to identify both, in the same way as, say, pre- and post-decisional nondecision time components cannot be resolved from behaviour alone - the intro had already talked about how collapsing bounds impact RT distribution shapes in specific ways, and obviously mean (or invariant) NDT can't do that - it can only translate the whole distribution earlier/later on the time axis. This is at odds with the phrasing "one CANNOT estimate these three parameters simultaneously." So it should be first clarified that this tradeoff is not absolute. Second, many readers will wonder if it is simply a matter of characterising the bound collapse time course as beginning at accumulation onset, instead of stimulus offset - does that not sidestep the issue? Third, assuming the above can be explained, and there is a reason to keep the collapse function aligned to stimulus onset, could the tradeoff be illustrated by picking two distinct sets of parameter values for non-decision time, starting threshold, and decay rate, which produce almost identical bound dynamics as a function of RT? It is not going to work for most readers to simply give the formula on line 211 and say "There is a tradeoff." Most readers will need more hand-holding.
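
      The kind of illustration requested in the previous paragraph could look like the sketch below. It assumes an exponential collapse that starts at accumulation onset, b(RT) = b0 · exp(−λ · (RT − τ)); this form and all numerical values are assumptions chosen for illustration and are not taken from the manuscript's formula. Under that assumption, two different pairs of starting threshold b0 and non-decision time τ give the same bound at every RT when the decay rate λ is held fixed.

```python
import math

def bound_at_rt(rt, b0, lam, tau):
    """Assumed exponential bound; the collapse starts at accumulation onset tau."""
    return b0 * math.exp(-lam * (rt - tau))

lam = 1.5  # decay rate, shared by both parameter sets (hypothetical value)

# Parameter set A (hypothetical values).
set_a = {"b0": 2.0, "tau": 0.30}

# Parameter set B: a shorter non-decision time, compensated by a higher
# starting threshold so that b0 * exp(lam * tau) is unchanged.
tau_b = 0.20
set_b = {"b0": set_a["b0"] * math.exp(lam * (set_a["tau"] - tau_b)), "tau": tau_b}

rts = [0.4, 0.6, 0.8, 1.0, 1.5]  # response times in seconds
bounds_a = [bound_at_rt(rt, lam=lam, **set_a) for rt in rts]
bounds_b = [bound_at_rt(rt, lam=lam, **set_b) for rt in rts]

max_diff = max(abs(a - b) for a, b in zip(bounds_a, bounds_b))
print(set_a, set_b, max_diff)
```

      The two sets still differ in how much accumulation time precedes each RT, so the predicted RT distributions need not be identical. That is consistent with the point above that the trade-off is not absolute.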

      A lognormal distribution is used as line 231 says it "must" produce a right-skew. Why? It is unusual for non-decision time distribution to be asymmetric in diffusion modeling, so this "must" statement must be fully explained and justified. Would I be right in saying that if either fixed or symmetrically distributed nondecision times were assumed, as in the majority of diffusion models, then the non-identifiability problem goes away? If the issue is one faced only by a special class of DDMs with lognormal NDT, this should be stated upfront.

      In the simulation study methods, is the only difference between NDT-informed and non-informed models that the non-NDT-informed must also estimate tau and sigma, whereas the NDT-informed model "knows" these two parameters and so only has the other three to estimate? And is it the exact same data that the two models are fit to, in each of the simulation runs? Why is sigma missing from the uninformed part of Figure 4? If it is nondecision time variability, shouldn't the model at least be aware of the existence of sigma and try to estimate it, in order for this to be a meaningful comparison?

      I am curious to know whether a linear bound collapse suffers from the same identifiability issues with NDT, or was it not considered here because it is so suboptimal next to the hyperbolic/exponential?

      The approach using HMP rests on the assumption that accumulation onset is marked by the peak of a certain neural event, but even if it is highly predictive of accumulation onset, depending on what it reflects, it could come systematically earlier or later than the actual accumulation onset. Could the authors comment on what implications this might have for the approach?

      Figure 7: for this simulation, it would be helpful to know the degree to which you can get away with not equipping the model to capture drift rate variability, when the degree of that d.r. variability actually produces appreciable slow error rates. The approach here is to sample uniformly from ranges of the parameters, but how many of these produce data that can be reasonably recognised as similar to human behaviour on typical perceptual decision tasks? The authors point out that only 5% of fits estimate an appreciable bound collapse but if there are only 10% of the parameter vectors that produce data in a typical RT range with typical error rates etc, and half of these produce an appreciable downturn in accuracy for slower RT, and all of the latter represent that 5%, then that's quite a different story. An easy fix would be to plot estimated decay as a scatter plot against the rate of decline of accuracy from the median RT to the slowest RT, to visualise the degree to which slow errors can be absorbed by the no-dr-var model without falsely estimating steep bound collapse. In general, I'm not so sure of the value of this section, since, in principle, there is no getting around the fact that if what is in truth a drift-variability source of slow errors is fit with a model that can only capture it with a collapsing bound, it will estimate a collapsing bound, or just fail to capture those slow errors.

    4. Reviewer #3 (Public review):

      The current paper addresses an important issue in evidence accumulation models: many modelers implement flat decision boundaries because the collapsing alternatives are hard to reliably estimate. Here, using simulations, the authors demonstrate that parameter recovery can be drastically improved by providing the model with additional data (specifically, an EEG-informed estimate of non-decision time). Moreover, in two empirical datasets, it is shown that those EEG-informed models provide a better fit to the data. The method seems sound and promising and might inform future work on the debate regarding flat vs collapsing choice boundaries. As an evidence-accumulation enthusiast, I am quite excited about this work, although for a broader audience, the immediate applicability of this approach seems limited because it does require EEG data (i.e. limiting widespread use of the method or e.g., answering questions about individual differences that require a very large N).

    1. eLife Assessment

      This study provides important evidence that myristate, a fatty acid commonly present in soil environments, is taken up by arbuscular mycorrhizal fungi during symbiosis with a plant host. The evidence presented is solid, with multiple experimental approaches including stable isotope tracing, transcriptional analysis, and physiological measurements across different plant species and phosphorus conditions. However, the main claims are only partially supported.

    2. Reviewer #1 (Public review):

      Summary:

      Two major breakthroughs in the field of arbuscular mycorrhiza (AM) were the discoveries that, first, AM fungi obtain lipids (not only carbohydrates) from their plant hosts (Bravo et al 2017; Jiang et al 2017; Keymer et al 2017; Luginbuehl et al 2017) and, second, that presumably obligate biotrophic AM fungi can produce spores in the absence of host plants when exposed to myristate (Sugiura et al 2020; Tanaka et al 2022).

      For this manuscript, Chen et al asked the question of whether myristate in the soil may also play a role in AM symbiosis when AM fungi live in symbiosis with their plant hosts. They show that myristate occurs in natural as well as agricultural soils, probably as a component of root exudates. Further, they treat AM fungi with myristate when grown in symbiosis in a Petri dish system with carrot hairy roots or in pots with alfalfa or rice to describe which effect the exogenous myristate has on symbiosis. Using 13C labelling, they show that myristate is taken up by AM fungi, although they can obtain sugars and lipids from the plant host. They also show that myristate leads to an increase in root colonization as well as expression of fungal genes involved in FA assimilation.

      Interestingly, the effect of myristate on colonization depends on the plant species and the level of phosphate fertilization provided to the plant. The reason for this remains unknown.

      Strengths:

      The findings are interesting and provide an advance in our understanding of lipid use by the extraradical mycelium of AM fungi.

      Weaknesses:

      However, there are some misconceptions in the writing, and some experimental results remain unclear because they are presented in a highly descriptive manner without interpretation or explanation.

    3. Reviewer #2 (Public review):

      Summary:

      Arbuscular mycorrhizal fungi (AMF) are among the most widely distributed soil microorganisms, forming symbiotic relationships (AM symbiosis) with approximately 70% of terrestrial vascular plants. AMF are considered obligate biotrophs that rely on host-derived symbiotic carbohydrates. However, it remains unclear whether symbiotic AMF can access exogenous non-symbiotic carbon sources. By conducting three interconnected and complementary experiments, Chen et al. investigated the direct uptake of exogenous 13C1-labeled myristate by symbiotic Rhizophagus irregularis, R. intraradices, and R. diaphanous, and assessed their growth responses using AMF-carrot hairy root co-culture systems (Experiments 1 and 2). They also explored the environmental distribution of myristate in plant and soil substrates, and evaluated the impact of exogenous myristate on the symbiotic carbon-phosphorus exchange between R. irregularis and alfalfa or rice in a greenhouse experiment (Experiment 3). Given that the AM symbiosis not only plays a significant role in the biogeochemical cycling of C and P but also acts as a key driver of plant community structure and productivity, the topic of this manuscript is relevant. The study is well-designed, and the manuscript is well-written. I find it easy and interesting to follow the entire narrative.

      Strengths:

      The manuscript provides evidence from 13C labeling and molecular analyses showing that symbiotic AMF can absorb non-symbiotic C sources like myristate in the presence of plant-derived symbiotic carbohydrates, challenging the traditional assumption that AMF exclusively rely on symbiotic carbon sources supplied from associated host plants. This finding advances our understanding of the nutritional interactions between AMF and host plants. Furthermore, the manuscript reveals that myristate is widely present in diverse soil and plant components; however, exogenous myristate disrupts the carbon-phosphorus exchange in arbuscular mycorrhizal symbiosis. These insights have significant implications for the application and regulation of the AM symbiosis in sustainable agriculture and ecological restoration.

      Weaknesses:

      The limitations of this study include:

      (1) The absorption of myristate by symbiotic AMF was observed only after exogenous application under artificial conditions, which may not accurately reflect natural environments.

      (2) The investigation into the mechanism by which myristate disrupts C-P exchange in AM symbiosis remains preliminary.

      Nevertheless, the authors have adequately discussed these limitations in the manuscript.

    4. Reviewer #3 (Public review):

      Summary:

      The authors have addressed a major question that has been open since the discovery of myristate uptake by AM fungi as a non-symbiotic C source. Myristate has been used to grow some AM fungi axenically, but the biological significance of this saprobic attitude in natural or agronomical environments remained unexplored. The results of this research soundly demonstrate that myristate-derived C is used by AM fungi, leading to improved development of both extraradical and intraradical mycelium (at least under low P conditions). However, this does not lead to obvious advantages for the plant, since symbiotic nutrient exchange (carbon and phosphorus) is reduced upon myristate application. Furthermore, myristate-treated plants quench their defence responses.

      Strengths:

      The study is extensive, based on a solid experimental setup and methodological approach, combining several state-of-the-art techniques. The conclusions are novel and of high relevance for the scientific community. The writing is fluent and clear.

      Weaknesses:

      Some of the figures should be improved for clarity. The conclusions do not express a conclusive remark that, in my opinion, emerges clearly from the results: myristate application in agriculture does not seem to be a very promising approach, since it unbalances the symbiosis nutritional equilibrium and may weaken plant immunity. This is a very important point (albeit rather unpleasant for applicative scientists) that should be stressed in the conclusions.

    1. In Mediterranean Africa, a new Muslim dynasty was established by Muhammad ibn Tumart (c. 1080–1130), a charismatic Berber scholar from the High Atlas Mountains in southern Morocco.

      Muhammad ibn Tumart, a renowned and charismatic scholar from southern Morocco, began a new Muslim ruling dynasty in North Africa.

    1. <center>

      Why does Ptolemy figure on the inscription?

      </center>

      <div style="background-color: bisque; color: black; font-weight: bold; padding: 12px;"> Ptolemy V needed a way to court the priests because the Ptolemaic dynasty was not Egyptian at all. The Ptolemies were Greek, and for generations these foreign kings had kept power over Egypt by engaging with the priests. This inscription is the Memphis Decree, which exempted the priestly class from taxes and from having to pay homage to him, a gesture the teenage king hoped would steady his throne and prevent another uprising. </div>

    2. <center>

      How did the Rosetta Stone unravel the history of ancient Egypt?

      </center>

      Where was the Rosetta Stone discovered?

      The Rosetta Stone, a pivotal artifact discovered in 1799, unlocked ancient Egyptian hieroglyphics through its trilingual inscription, sparking a race among scholars like Young and Champollion to decipher its secrets, ultimately revealing a vanished world. <center>

      Highlights

      </center>
      • The Rosetta Stone was created during the reign of King Ptolemy V in ancient Egypt in 196 BCE and was eventually discovered by French engineers in 1799. This discovery played a crucial role in deciphering Egyptian hieroglyphics.
      • An artisan inscribed the "Memphis Decree" on the stone. The decree grants tax exemptions to the priestly class, aiming to stabilize Ptolemy V's rule.
      • In 1799, French military engineer Pierre-François Bouchard discovered the stone while repairing a fort, unaware of its historical importance.
      The French Expedition included academics who recognized the significance of the stone's inscriptions, which would later be key to understanding ancient Egyptian writing. <center>

      How the scholars deciphered the script on Rosetta Stone

      </center>
      • After Napoleon abandoned the expedition, the scholars were left with the stone and a pressing need to disseminate its information despite military challenges.
      • The team devised a new method to capture the stone's inscriptions by using ink and paper, which proved successful.
      • Following the surrender of the French forces, the Rosetta Stone was claimed as a spoil of war by the British and eventually donated to the British Museum.
      • Despite initial expectations, matching the Greek text with hieroglyphics did not lead to immediate decipherment of ancient Egyptian.
      • The quest to decode the Rosetta Stone saw numerous attempts throughout history, culminating in significant breakthroughs by Thomas Young and Jean-François Champollion, who recognized the phonetic nature of hieroglyphics.
      • The misunderstanding of hieroglyphics persisted until the 1800s, despite efforts by medieval Muslim researchers.
      • Thomas Young made initial progress in translating the Rosetta Stone by focusing on the Demotic section and recognizing the phonetic writing of Greek names.
      • Jean-François Champollion, a talented linguist who understood Coptic, began his own translation efforts and ultimately surpassed Young's work. He utilized various sources, including artifacts and inscriptions from Egypt, to further his understanding of hieroglyphics.

      Champollion's groundbreaking work

      Jean-François Champollion's groundbreaking work on deciphering Egyptian hieroglyphics using the Rosetta Stone highlights his struggles, rivalries, and eventual success in unlocking the secrets of ancient Egypt. He utilizes his knowledge of Coptic and previous research to reconstruct Egyptian royal names in cartouches, aiming to decode hieroglyphics. Despite facing rivalries and political challenges, Champollion perseveres in his studies, leading to a significant breakthrough in understanding hieroglyphics. In 1822, Champollion successfully read the name Thutmose from an inscription, confirming his theories and dramatically celebrating his discovery. Champollion's journey to Egypt allowed him to read inscriptions and uncover the history of ancient kings and common people, further solidifying his achievements.

    1. A new type of science communicator has recently arisen - one that preys on and misleads scientifically curious audiences. We will identify and expose these influencers and their manipulative and corrosive rhetoric. These communicators are a product of the toxic waste of contemporary politics and broken online incentives leaking into science and are aligned - intentionally or unintentionally - with political projects that seek to undermine science. They are the science populists.

      Science populists

    1. In the next decades, during the fifty-year reign of Edward III, the Commons forced the king to redress grievances before they would raise revenue for him, and forced him to accept that no money could be raised through taxes or loans without Parliament's consent. At the end of the 1340s, the Commons started meeting separately from the Lords and the knights and burgesses began electing a Speaker for their body.

      This quote states that during the long reign of Edward III, the power of the Commons increased. They forced the king to address their grievances before granting him any money, and he could not raise funds through taxes or loans without Parliament's approval. In time, the Commons began meeting on their own and elected their own leader, known as the Speaker.

    1. Can you take the tools that people use to study memory, learning, goal directedness, problem solving in behavioral cognitive sciences, and can you apply them to the kinds of things I'm talking about cells, tissues, molecular networks. And the answer is yes.

      for - intelligence - take from behavior / cognitive science - apply to molecular networks & tissues - Michael Levin

    1. While the Portuguese were expanding down the African coast, trading for gold and hoping to find a route to Asia, they were also building a commercial empire on the "white gold", sugar.

      While the Portuguese were exploring the African coast for gold and a route to Asia, they were also bringing in considerable revenue from producing and trading sugar, which was very valuable at the time.

    2. In 1488, Bartolomeu Dias rounded the Cape of Good Hope and proved a sea route to India was possible.

      This was huge: it proved a sea route to Asia was possible. It opened the way for global trade and later Portuguese dominance in the Indian Ocean. Really cool!

    1. By 1455, Gutenberg was able to pull all these elements together and printed his most famous product, a Bible that was so perfect that its readers couldn't tell whether it had been hand-lettered or printed

      In 1455, Gutenberg managed to put all his ideas together to print the Bible. The printing quality was so good that people could not differentiate whether it was done manually or by a machine.

    1. This explosion of new knowledge and ideas into Europe sparked what we call the Renaissance, which literally means rebirth. The infusion of so much new material also democratized knowledge a bit. Young scholars at European universities were seeing these texts at the same time as their teachers, which challenged the scholasticism of the past, when interpretations formed generations or centuries ago carried so much weight.

      This tells us that as many new ideas and books poured into Europe, knowledge underwent a “rebirth” in which many people, not only senior experts, could share. Students and teachers encountered the new texts at the same time, so long-established interpretations were questioned rather than accepted on the authority of earlier experts.

    1. you can chop them this way. However you cut, the record is something like 275 pieces, if you chop them into pieces, every single piece regenerates to make a perfect little worm. So every piece knows exactly what the whole thing is supposed to look like, and they rebuild.

      for - planaria - regenerate

    1. smoke detector principle

      Even when you are in a room with no smoke or fire, if the smoke detector starts beeping you become agitated and ready for action.

    2. male provision hypothesis

      male provision hypothesis: long-term bonding increased because males who provided food and protection increased the chances of survival of their offspring, while females favored partners who provided care and protection.

    1. eLife Assessment

      This important study reports on the relationships between cerebral haemodynamics and a number of factors that relate to genetics, lifestyle, and medical history using data from a large cohort. Compelling evidence suggests that brief arterial spin labelling MRI acquisition can lead to both expected observations about brain health, as manifested in cerebral blood flow, and biomarkers for use in diagnosis and treatment monitoring. The results can be used as a starting point for hypothesis generation and further evaluation of conditions expected to affect haemodynamics in the brain.

    2. Reviewer #1 (Public review):

      Summary:

      In this work, Okell et al. describe the imaging protocol and analysis pipeline pertaining to the arterial spin labeling (ASL) MRI protocol acquired as part of the UK Biobank imaging study. In addition, they present preliminary analyses of the first 7000+ subjects in whom ASL data were acquired, and this represents the largest such study to date. Careful analyses revealed expected associations between ASL-based measures of cerebral hemodynamics and non-imaging-based markers, including heart and brain health, cognitive function, and lifestyle factors. As it measures physiology and not structure, ASL-based measures may be more sensitive to these factors compared with other imaging-based approaches.

      Strengths:

      This study represents the largest MRI study to date to include ASL data in a wide age range of adult participants. The ability to derive arterial transit time (ATT) information in addition to cerebral blood flow (CBF) is a considerable strength, as many studies focus only on the latter.

      Some of the results (e.g., relationships with cardiac output and hypertension) are known and expected, while others (e.g., lower CBF and longer ATT correlating with hearing difficulty in auditory processing regions) are more novel and intriguing. Overall, the authors present very interesting physiological results, and the analyses are conducted and presented in a methodical manner.

      The analyses regarding ATT distributions and the potential implications for selecting post-labeling delays (PLD) for single PLD ASL are highly relevant and well-presented.

      Weaknesses:

      At a total scan duration of 2 minutes, the ASL sequence utilized in this cohort is much shorter than that of a typical ASL sequence (closer to 5 minutes as mentioned by the authors). However, this implementation also included multiple (n=5) PLDs. As currently described, it is unclear how many repetitions were acquired at each PLD and whether these were acquired efficiently (i.e., with a Look-Locker readout) or whether individual repetitions within this acquisition were dedicated to a single PLD. If the latter, the number of repetitions per PLD (and consequently signal-to-noise-ratio, SNR) is likely to be very low. Have the authors performed any analyses to determine whether the signal in individual subjects generally lies above the noise threshold? This is particularly relevant for white matter, which is the focus of several findings discussed in the study.

      Hematocrit is one of the variables regressed out in order to reduce the effect of potential confounding factors on the image-derived phenotypes. The effect of this, however, may be more complex than accounting for other factors (such as age and sex). The authors acknowledge that hematocrit influences ASL signal through its effect on longitudinal blood relaxation rates. However, it is unclear how the authors handled the fact that the longitudinal relaxation of blood (T1Blood) is explicitly needed in the kinetic model for deriving CBF from the ASL data. In addition, while it may reduce false positives related to the relationships between dietary factors and hematocrit, it could also mask the effects of anemia present in the cohort. The concern, therefore, is two-fold: (1) Were individual hematocrit values used to compute T1Blood values? (2) What effect would the deconfounding process have on this?

      The authors leverage an observed inverse association between white matter hyperintensity volume and CBF as evidence that white matter perfusion can be sensitively measured using the imaging protocol utilized in this cohort. The relationship between white matter hyperintensities and perfusion, however, is not yet fully understood, and there is disagreement regarding whether this structural imaging marker necessarily represents impaired perfusion. Therefore, it may not be appropriate to use this finding as support for validation of the methodology.

    3. Reviewer #2 (Public review):

      Summary:

      Okell et al. report the incorporation of arterial spin-labeled (ASL) perfusion MRI into the UK Biobank study and preliminary observations of perfusion MRI correlates from over 7000 acquired datasets, which is the largest sample of human perfusion imaging data to date. Although a large literature already supports the value of ASL MRI as a biomarker of brain function, this important study provides compelling evidence that a brief ASL MRI acquisition may lead to both fundamental observations about brain health as manifested in CBF and valuable biomarkers for use in diagnosis and treatment monitoring.

      ASL MRI noninvasively quantifies regional cerebral blood flow (CBF), which reflects both cerebrovascular integrity and neural activity, hence serves as a measure of brain function and a potential biomarker for a variety of CNS disorders. Despite a highly abbreviated ASL MRI protocol, significant correlations with both expected and novel demographic, physiological, and medical factors are demonstrated. In many such cases, ASL was also more sensitive than other MRI-derived metrics. The ASL MRI protocol implemented also enables quantification of arterial transit time (ATT), which provides stronger clinical correlations than CBF in some factors. The results demonstrate both the feasibility and the efficacy of ASL MRI in the UK Biobank imaging study, which expects to complete ASL MRI in up to 60,000 richly phenotyped individuals.

      Strengths:

      A key strength of this study is the use of an ASL MRI protocol incorporating balanced pseudocontinuous labeling with a background-suppressed 3D readout, which is the current state-of-the-art. To compensate for the short scan time, voxel resolution was intentionally only moderate. The authors also elected to acquire these data across five post-labeling delays, enabling ATT and ATT-corrected CBF to be derived using the BASIL toolbox, which is based on a variational Bayesian framework. The resulting CBF and ATT maps shown in Figure 1 are quite good, especially when combined with such a large and deeply phenotyped sample.

      Another strength of the study is the rigorous image analysis approach, which included covariation for a number of known CBF confounds as well as correction for motion and scanner effects. In doing so, the authors were able to confirm expected effects of age, sex, hematocrit, and time of day on CBF values. These observations lend confidence in the veracity of novel observations, for example, significant correlations between regional ASL parameters and cardiovascular function, height, alcohol consumption, depression, and hearing, as well as with other MRI features such as regional diffusion properties and magnetic susceptibility. They also provide valuable observations about ATT and CBF distributions across a large cohort of middle-aged and older adults.

      Weaknesses:

      This study primarily serves to illustrate the efficacy and potential of ASL MRI as an imaging parameter in the UK Biobank study, but some of the preliminary observations will be hypothesis-generating for future analyses in larger sample sizes. However, a weakness of the manuscript is that some of the reported observations are difficult to follow. In particular, the associations between ASL and resting fMRI illustrated in Figure 7 and described in the accompanying Results text are difficult to understand. It could also be clearer whether the spatial maps showing ASL correlates of other image-derived phenotypes in Figure 6B are global correlations or confined to specific regions of interest. Finally, while addressing partial volume effects in gray matter regions by covarying for cortical thickness is a reasonable approach, the Methods section seems to imply that a global mean cortical thickness is used, which could be problematic given that cortical thickness changes may be localized.

    4. Reviewer #3 (Public review):

      Summary:

      This is an extremely important manuscript in the evolution of cerebral perfusion imaging using Arterial Spin Labelling (ASL). The number of subjects that were scanned has provided the authors with a unique opportunity to explore many potential associations between regional cerebral blood flow (CBF) and clinical and demographic variables.

      Strengths:

      The major strength of the manuscript is the access to an unprecedentedly large cohort of subjects. It demonstrates the sensitivity of regional tissue blood flow in the brain as an important marker of resting brain function. In addition, the authors have demonstrated a thorough analysis methodology and good statistical rigour.

      Weaknesses:

      This reviewer did not identify any major weaknesses in this work.

    5. Author response:

      We thank the editors and reviewers for their generally positive and thoughtful feedback on this work. Below are provisional responses to some of the concerns raised:

      Reviewer 1:

      At a total scan duration of 2 minutes, the ASL sequence utilized in this cohort is much shorter than that of a typical ASL sequence (closer to 5 minutes as mentioned by the authors). However, this implementation also included multiple (n=5) PLDs. As currently described, it is unclear how many repetitions were acquired at each PLD and whether these were acquired efficiently (i.e., with a Look-Locker readout) or whether individual repetitions within this acquisition were dedicated to a single PLD. If the latter, the number of repetitions per PLD (and consequently signal-to-noise-ratio, SNR) is likely to be very low. Have the authors performed any analyses to determine whether the signal in individual subjects generally lies above the noise threshold? This is particularly relevant for white matter, which is the focus of several findings discussed in the study.

      We agree that this was a short acquisition compared to most ASL protocols, necessitated by the strict time-keeping requirements for running such a large study. We apologise if this was not clear in the original manuscript, but due to this time constraint and the use of a segmented readout (which was not Look-Locker) there was only time available for a single average at each PLD. This does mean that the perfusion weighted images at each PLD are relatively noisy, although the image quality with this sequence was still reasonable, as demonstrated in Figure 1, with perfusion weighted images visibly above the noise floor. In addition, as has been demonstrated theoretically and experimentally in recent work (Woods et al., 2023, 2020), even though the SNR of each individual PLD image might be low in multi-PLD acquisitions, this is effectively recovered during the model fitting process, giving it comparable or greater accuracy than a protocol which collects many averages at a single (long) PLD. As also noted by the reviewers, this approach has the further benefit of allowing ATT estimation, which has proven to provide useful and complementary information to CBF. Finally, the fact that many of the findings in this study pass strict statistical thresholds for significance, despite the many multiple comparisons performed, and that the spatial patterns of these relationships are consistent with expectations, even in the white matter (e.g. Figure 6B), give us confidence that the perfusion estimation is robust. However, we will consider adding some additional metrics around SNR or fitting uncertainty in a revised manuscript, as well as clarifying details of the acquisition.

      Hematocrit is one of the variables regressed out in order to reduce the effect of potential confounding factors on the image-derived phenotypes. The effect of this, however, may be more complex than accounting for other factors (such as age and sex). The authors acknowledge that hematocrit influences ASL signal through its effect on longitudinal blood relaxation rates. However, it is unclear how the authors handled the fact that the longitudinal relaxation of blood (T1Blood) is explicitly needed in the kinetic model for deriving CBF from the ASL data. In addition, while it may reduce false positives related to the relationships between dietary factors and hematocrit, it could also mask the effects of anemia present in the cohort. The concern, therefore, is two-fold: (1) Were individual hematocrit values used to compute T1Blood values? (2) What effect would the deconfounding process have on this?

      We agree this is an important point to clarify. In this work we decided not to use the haematocrit to directly estimate the T1 of blood for each participant a) because this would result in slight differences in the model fitting for each subject, which could introduce bias (e.g. the kinetic model used assumes instantaneous exchange between blood water and tissue, so changing the T1 of blood for each subject could make us more sensitive to inaccuracies in this assumption); and b) because typically the haematocrit measures were quite some time (often years) prior to the imaging session, leading to an imperfect correction. We therefore took the pragmatic approach to simply regress each subject’s average haematocrit reading out of the IDP and voxelwise data to prevent it contributing to apparent correlations caused by indirect effects on blood T1. However, we agree with the reviewer that this certainly would mask the effects of anaemia in this cohort, so for researchers interested in this condition a different approach should be taken. We will update the revised manuscript to try to clarify these points.
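
      The deconfounding step described above can be illustrated with a minimal sketch. It assumes a single confound and ordinary least squares; the variable names and the simulated haematocrit and CBF values are hypothetical and are not taken from the UK Biobank pipeline, which regresses out many confounds at once.

```python
import random

def mean(values):
    return sum(values) / len(values)

def regress_out(y, confound):
    """Return y with the least-squares linear effect of `confound` removed."""
    y_bar, c_bar = mean(y), mean(confound)
    cov = sum((c - c_bar) * (v - y_bar) for c, v in zip(confound, y))
    var = sum((c - c_bar) ** 2 for c in confound)
    slope = cov / var
    # Residuals are centred on zero and orthogonal to the confound.
    return [(v - y_bar) - slope * (c - c_bar) for c, v in zip(confound, y)]

def pearson(a, b):
    a_bar, b_bar = mean(a), mean(b)
    num = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    den = (sum((x - a_bar) ** 2 for x in a)
           * sum((y - b_bar) ** 2 for y in b)) ** 0.5
    return num / den

# Hypothetical data: haematocrit fraction and a CBF-like IDP that depends on it.
rng = random.Random(0)
hct = [rng.gauss(0.42, 0.03) for _ in range(500)]
cbf = [60.0 - 80.0 * (h - 0.42) + rng.gauss(0.0, 5.0) for h in hct]

cbf_deconf = regress_out(cbf, hct)
r_before = pearson(cbf, hct)        # clearly negative in this simulation
r_after = pearson(cbf_deconf, hct)  # zero up to floating-point error
```

      Because the residuals are uncorrelated with haematocrit by construction, any genuine haematocrit-related effect, such as that of anaemia, is removed along with the indirect blood T1 effect, which is the trade-off acknowledged in the response.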

      The authors leverage an observed inverse association between white matter hyperintensity volume and CBF as evidence that white matter perfusion can be sensitively measured using the imaging protocol utilized in this cohort. The relationship between white matter hyperintensities and perfusion, however, is not yet fully understood, and there is disagreement regarding whether this structural imaging marker necessarily represents impaired perfusion. Therefore, it may not be appropriate to use this finding as support for validation of the methodology.

      We appreciate the reviewer’s point that there is still debate about the relationship between white matter hyperintensities and perfusion. We therefore agree that this observed relationship does not validate the methodology in the sense that it is an expected finding, but it does demonstrate that the data quality is sufficient to show significant correlations between white matter hyperintensity volume and perfusion, even in white matter regions, which would not be the case if the signal there were dominated by noise. Similarly, the clear spatial pattern of perfusion changes in the white matter that correlate with DTI measures in the same regions also suggests there is sensitivity to white matter perfusion. However, we will update the wording in the revised manuscript to try to clarify this point.

      Reviewer 2:

      This study primarily serves to illustrate the efficacy and potential of ASL MRI as an imaging parameter in the UK Biobank study, but some of the preliminary observations will be hypothesis-generating for future analyses in larger sample sizes. However, a weakness of the manuscript is that some of the reported observations are difficult to follow. In particular, the associations between ASL and resting fMRI illustrated in Figure 7 and described in the accompanying Results text are difficult to understand. It could also be clearer whether the spatial maps showing ASL correlates of other image-derived phenotypes in Figure 6B are global correlations or confined to specific regions of interest. Finally, while addressing partial volume effects in gray matter regions by covarying for cortical thickness is a reasonable approach, the Methods section seems to imply that a global mean cortical thickness is used, which could be problematic given that cortical thickness changes may be localized.

      We apologise if any of the presented information was unclear and will try to improve this in our revised manuscript. To clarify, the spatial maps associated with other (non-ASL) IDPs were generated by calculating the correlation between the ASL CBF or ATT in every voxel in standard space with the non-ASL IDP of interest, not the values of the other imaging modality in the same voxel. No region-based masking was used for this comparison. This allowed us to examine whether the correlation with this non-ASL IDP was only within the same brain region or if the correlations extended to other regions too.
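
      The voxelwise comparison described above can be sketched as follows. This is an illustrative toy under stated assumptions: `maps` is a hypothetical subjects-by-voxels list of ASL values already in standard space, `idp` is one non-ASL IDP value per subject, and deconfounding and multiple-comparison correction are omitted.

```python
import random

def pearson(a, b):
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    num = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    den = (sum((x - a_bar) ** 2 for x in a)
           * sum((y - b_bar) ** 2 for y in b)) ** 0.5
    return num / den

def voxelwise_correlation(maps, idp):
    """Correlate every voxel's across-subject values with a single IDP.

    maps[s][v] is the ASL value (CBF or ATT) of subject s at voxel v.
    No regional mask is applied, so every voxel gets a correlation value.
    """
    n_voxels = len(maps[0])
    return [pearson([subject[v] for subject in maps], idp)
            for v in range(n_voxels)]

# Hypothetical data: voxels 0 and 1 track the IDP, the rest are pure noise.
rng = random.Random(1)
n_subjects, n_voxels = 300, 6
idp = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
maps = [[(0.8 * idp[s] if v < 2 else 0.0) + rng.gauss(0.0, 1.0)
         for v in range(n_voxels)]
        for s in range(n_subjects)]

r_map = voxelwise_correlation(maps, idp)
```

      Because no mask is applied, the resulting map can show whether an association is confined to the region the IDP was derived from or extends elsewhere in the brain.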

      We also agree that the associations between ASL and resting fMRI are not easy to interpret. We therefore tried to be clear in the manuscript that these were preliminary findings that may be of interest to others, but clearly further study is required to explore this complex relationship. However, we will try to clarify how the results are presented in the revised manuscript.

      In relation to partial volume effects, we did indeed use only a global measure of cortical thickness in the deconfounding and we acknowledged that this could be improved in the discussion: [Partial volume effects were] “mitigated here by the inclusion of cortical thickness in the deconfounding process, although a region-specific correction approach that is aware of the through-slice blurring (Boscolo Galazzo et al., 2014) is desirable in future iterations of the ASL analysis pipeline.” As suggested here, although this is a coarse correction, we did not feel that a more comprehensive partial volume correction approach could be used without properly accounting for the through-slice blurring effects from the 3D-GRASE acquisition (that will vary across different brain regions), which is not currently available, although this is an area we are actively working on for future versions of the image analysis pipeline. We again will try to clarify this point further in the revised manuscript.

      References

      Woods JG, Achten E, Asllani I, Bolar DS, Dai W, Detre J, Fan AP, Fernández-Seara M, Golay X, Günther M, Guo J, Hernandez-Garcia L, Ho M-L, Juttukonda MR, Lu H, MacIntosh BJ, Madhuranthakam AJ, Mutsaerts HJ, Okell TW, Parkes LM, Pinter N, Pinto J, Qin Q, Smits M, Suzuki Y, Thomas DL, Van Osch MJP, Wang DJ, Warnert EAH, Zaharchuk G, Zelaya F, Zhao M, Chappell MA. 2023. Recommendations for Quantitative Cerebral Perfusion MRI using Multi-Timepoint Arterial Spin Labeling: Acquisition, Quantification, and Clinical Applications (preprint). Open Science Framework. doi:10.31219/osf.io/4tskr

      Woods JG, Chappell MA, Okell TW. 2020. Designing and comparing optimized pseudo-continuous Arterial Spin Labeling protocols for measurement of cerebral blood flow. NeuroImage 223:117246. doi:10.1016/j.neuroimage.2020.117246

    1. eLife Assessment

      This valuable study uses state-of-the-art neural encoding and video reconstruction methods to achieve a substantial improvement in video reconstruction quality from mouse neural data. It provides a convincing demonstration of how reconstruction performance can be improved by combining these methods. The goal of the study was improving reconstruction performance rather than advancing theoretical understanding of neural processing, so the results will be of practical interest to the brain decoding community.

    2. Reviewer #2 (Public review):

      Summary:

      This is an interesting study exploring methods for reconstructing visual stimuli from neural activity in the mouse visual cortex. Specifically, it uses a competition dataset (published in the Dynamic Sensorium benchmark study) and a recent winning model architecture (DNEM, dynamic neural encoding model) to recover visual information stored in ensembles of mouse visual cortex.

      Strengths:

      This is a great start for a project addressing visual reconstruction. It is based on physiological data obtained at a single-cell resolution, the stimulus movies were reasonably naturalistic and representative of the real world, the study did not ignore important correlates such as eye position and pupil diameter, and of course, the reconstruction quality exceeded anything achieved by previous studies. There appear to be no major technical flaws in the study, and some potential confounds were addressed upon revision. The study is an enjoyable read.

      Weaknesses:

      The study is technically competent and benchmark-focused, but without significant conceptual or theoretical advances. The inclusion of neuronal data broadens the study's appeal, but the work does not explore potential principles of neural coding, which limits its relevance for neuroscience and may disappoint some neuroscientists. The authors are transparent that their goal was methodological rather than explanatory, but this raises the question of why neuronal data were necessary at all, as more significant reconstruction improvements might be achievable using noise-less artificial video encoders alone (network-to-network decoding approaches have been done well by teams such as Han, Poggio, and Cheung, 2023, ICML). Yet, even within the methodological domain, the study does not articulate clear principles or heuristics that could guide future progress. The finding that more neurons improve reconstruction aligns with well-established results in the literature that show that higher neuronal numbers improve decoding in general (for example, Hung, Kreiman, Poggio, and DiCarlo, 2005) and thus may not constitute a novel insight.

      Specific issues:

      (1) The study showed that it could achieve high-quality video reconstructions from mouse visual cortex activity using a neural encoding model (DNEM), recovering 10-second video sequences and approaching a two-fold improvement in pixel-by-pixel correlation over previous attempts. As a reader, I was left with the question: okay, does this mean that we should all switch to DNEM for our investigations of mouse visual cortex? What makes this encoding model special? It is introduced as "a winning model of the Sensorium 2023 competition which achieved a score of 0.301...single trial correlation between predicted and ground truth neuronal activity," but as someone who does not follow this competition (most eLife readers are not likely to do so, either), I do not know how to gauge my response. Is this impressive? What is the best theoretical score, given noise and other limitations? Is the model inspired by the mouse brain in terms of mechanisms or architecture, or was it optimized to win the competition by overfitting it to the nuances of the data set? Of course, I know that as a reader, I am invited to read the references, but the study would stand better on its own, if it clarified how its findings depended on this model.

      The revision helpfully added context to the Methods about the range of scores achieved by other models, but this information remains absent from the Abstract and other important sections. For instance, the Abstract states, "We achieve a pixel-level correlation of 0.57 between the ground truth movie and the reconstructions from single-trial neural responses," yet this point estimate (presented without confidence intervals or comparisons to controls) lacks meaning for readers who are not told how it compares to prior work or what level of performance would be considered strong. Without such context, the manuscript undercuts potentially meaningful achievements.

      (2) Along those lines, the authors conclude that "the number of neurons in the dataset and the use of model ensembling are critical for high-quality reconstructions." If true, these principles should generalize across network architectures. I wondered whether the same dependencies would hold for other network types, as this could reveal more general insights. The authors replied that such extensions are expected (since prior work has shown similar effects for static images) but argued that testing this explicitly would require "substantial additional work," be "impractical," and likely not produce "surprising results." While practical difficulty alone is not a sufficient reason to leave an idea untested, I agree that the idea that "more neurons would help" would be unsurprising. The question then becomes: given that this is a conclusion already in the field, what new principle or understanding has been gained in this study?

      (3) One major claim was that the quality of the reconstructions depended on the number of neurons in the dataset. There were approximately 8000 neurons recorded per mouse. The correlation difference between the reconstruction achieved by 1000 neurons and 8000 neurons was ~0.2. Is that a lot or a little? One might hypothesize that 7000 additional neurons could contribute more information, but perhaps, those neurons were redundant if their receptive fields are too close together or if they had the same orientation or spatiotemporal tuning. How correlated were these neurons in response to a given movie? Why did so many neurons offer such a limited increase in correlation? Originally, this question was meant to prompt deeper analysis of the neural data, but the authors did not engage with it, suggesting a limited understanding of the neuronal aspects of the dataset.

      (4) We appreciated the experiments testing the capacity of the reconstruction process, by using synthetic stimuli created under a Gaussian process in a noise-free way. But this originally further raised questions: what is the theoretical capability for reconstruction of this processing pipeline, as a whole? Is 0.563 the best that one could achieve given the noisiness and/or neuron count of the Sensorium project? What if the team applied the pipeline to reconstruct the activity of a given artificial neural network's layer (e.g., some ResNet convolutional layer), using hidden units as proxies for neuronal calcium activity? In the revision, this concern was addressed nicely in Supplementary Figure 3C. Also, one appreciates that, as a follow-up, the team produced error maps (New Figure 6) that highlight where in the frames the reconstructions are likely to fail. But the maps were not analyzed further, and I am not sure if there was a systematic trend in the errors.

      (5) I was encouraged by Figure 4, which shows how the reconstructions succeeded or failed across different spatial frequencies. The authors note that "the reconstruction process failed at high spatial frequencies," yet it also appears to struggle with low spatial frequencies, as the reconstructed images did not produce smooth surfaces (e.g., see the top rows of Figures 4A and 4B). In regions where one would expect a single continuous gradient, the reconstructions instead display specular, high-frequency noise. This issue is difficult to overlook and might deserve further discussion.

    3. Reviewer #3 (Public review):

      Summary:

      This paper presents a method for reconstructing input videos shown to a mouse from the simultaneously recorded visual cortex activity (two-photon calcium imaging data). The publicly available experimental dataset is taken from a recent brain-encoding challenge, and the (publicly available) neural network model that serves to reconstruct the videos is the winning model from that challenge (by distinct authors). The present study applies gradient-based input optimization by backpropagating the brain-encoding error through this selected model (a method that has been proposed in the past, with other datasets). The main contribution of the paper is, therefore, the choice of applying this existing method to this specific dataset with this specific neural network model. The quantitative results appear to go beyond previous attempts at video input reconstruction (although measured with distinct datasets). The conclusions have potential practical interest for the field of brain decoding, and theoretical interest for possible future uses in functional brain exploration.
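
      The gradient-based input optimization summarized above can be illustrated with a toy sketch. It is not the authors' implementation: a small random linear encoder stands in for the DNEM, the gradient is written out analytically instead of being obtained by backpropagation, and all names, sizes and noise levels are hypothetical. The ensemble is imitated by averaging gradients over several slightly different encoders.

```python
import random

rng = random.Random(0)
N_PIXELS, N_NEURONS, N_MODELS = 8, 40, 3

def predict(weights, stimulus):
    """Toy linear encoder: one predicted response per neuron."""
    return [sum(w * s for w, s in zip(row, stimulus)) for row in weights]

def loss_and_grad(weights, stimulus, responses):
    """Mean squared response error and its gradient w.r.t. the stimulus."""
    errors = [p - r for p, r in zip(predict(weights, stimulus), responses)]
    loss = sum(e * e for e in errors) / len(errors)
    grad = [2.0 * sum(row[j] * e for row, e in zip(weights, errors)) / len(errors)
            for j in range(len(stimulus))]
    return loss, grad

def reconstruct(ensemble, responses, steps=300, lr=0.2):
    """Gradient descent on the stimulus, averaging gradients over the ensemble."""
    stimulus = [0.0] * N_PIXELS  # start from a blank input
    history = []                 # ensemble-mean loss before each update
    for _ in range(steps):
        results = [loss_and_grad(w, stimulus, responses) for w in ensemble]
        history.append(sum(loss for loss, _ in results) / len(results))
        for j in range(N_PIXELS):
            mean_grad = sum(grad[j] for _, grad in results) / len(results)
            stimulus[j] -= lr * mean_grad
    return stimulus, history

def pearson(a, b):
    a_bar, b_bar = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    den = (sum((x - a_bar) ** 2 for x in a)
           * sum((y - b_bar) ** 2 for y in b)) ** 0.5
    return num / den

# Hypothetical "ground truth" encoder and imperfect ensemble members around it.
true_weights = [[rng.gauss(0.0, 1.0) for _ in range(N_PIXELS)]
                for _ in range(N_NEURONS)]
ensemble = [[[w + rng.gauss(0.0, 0.1) for w in row] for row in true_weights]
            for _ in range(N_MODELS)]

# Simulated noisy single-trial responses to a hidden stimulus.
true_stimulus = [rng.gauss(0.0, 1.0) for _ in range(N_PIXELS)]
responses = [r + rng.gauss(0.0, 0.1)
             for r in predict(true_weights, true_stimulus)]

reconstruction, history = reconstruct(ensemble, responses)
pixel_correlation = pearson(reconstruction, true_stimulus)
```

      In this toy setting there are more neurons than pixels and the encoder is linear, so the stimulus is recovered almost exactly; with a deep nonlinear encoder and real recordings the problem is far less well-posed.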

      Strengths:

      The authors use a validated optimization method on a recent large-scale dataset, with a state-of-the-art brain encoding model. The use of an ensemble of 7 distinct model instances (trained on distinct subsets of the dataset, with distinct random initializations) significantly improves the reconstructions. The exploration of the relation between reconstruction quality and number of recorded neurons will be useful to those planning future experiments.

      Weaknesses:

      The main contribution is methodological, and the methodology combines pre-existing components without any new original component.

    4. Author response:

      The following is the authors’ response to the current reviews.

      Public Reviews: 

      Reviewer #2 (Public review): 

      Summary: 

      This is an interesting study exploring methods for reconstructing visual stimuli from neural activity in the mouse visual cortex. Specifically, it uses a competition dataset (published in the Dynamic Sensorium benchmark study) and a recent winning model architecture (DNEM, dynamic neural encoding model) to recover visual information stored in ensembles of mouse visual cortex. 

      Strengths: 

      This is a great start for a project addressing visual reconstruction. It is based on physiological data obtained at a single-cell resolution, the stimulus movies were reasonably naturalistic and representative of the real world, the study did not ignore important correlates such as eye position and pupil diameter, and of course, the reconstruction quality exceeded anything achieved by previous studies. There appear to be no major technical flaws in the study, and some potential confounds were addressed upon revision. The study is an enjoyable read. 

      Weaknesses: 

      The study is technically competent and benchmark-focused, but without significant conceptual or theoretical advances. The inclusion of neuronal data broadens the study's appeal, but the work does not explore potential principles of neural coding, which limits its relevance for neuroscience and may disappoint some neuroscientists. The authors are transparent that their goal was methodological rather than explanatory, but this raises the question of why neuronal data were necessary at all, as more significant reconstruction improvements might be achievable using noise-less artificial video encoders alone (network-to-network decoding approaches have been done well by teams such as Han, Poggio, and Cheung, 2023, ICML). Yet, even within the methodological domain, the study does not articulate clear principles or heuristics that could guide future progress. The finding that more neurons improve reconstruction aligns with well-established results in the literature that show that higher neuronal numbers improve decoding in general (for example, Hung, Kreiman, Poggio, and DiCarlo, 2005) and thus may not constitute a novel insight.

      We thank the reviewer for this second round of comments and hope we were able to address the remaining points below. 

      Indeed, using surrogate noiseless data is interesting and useful when developing such methods, or to demonstrate that they work in principle. But in order to evaluate if they really work in practice, we need to use real neuronal data. While we did not try movie reconstruction from layers within artificial neural networks as surrogate data, in Supplementary Figure 3C we provide the performance of our method using simulated/predicted neuronal responses from the dynamic neural encoding model alongside real neuronal responses.

      Specific issues: 

      (1) The study showed that it could achieve high-quality video reconstructions from mouse visual cortex activity using a neural encoding model (DNEM), recovering 10-second video sequences and approaching a two-fold improvement in pixel-by-pixel correlation over previous attempts. As a reader, I was left with the question: okay, does this mean that we should all switch to DNEM for our investigations of mouse visual cortex? What makes this encoding model special? It is introduced as "a winning model of the Sensorium 2023 competition which achieved a score of 0.301...single trial correlation between predicted and ground truth neuronal activity," but as someone who does not follow this competition (most eLife readers are not likely to do so, either), I do not know how to gauge my response. Is this impressive? What is the best theoretical score, given noise and other limitations? Is the model inspired by the mouse brain in terms of mechanisms or architecture, or was it optimized to win the competition by overfitting it to the nuances of the data set? Of course, I know that as a reader, I am invited to read the references, but the study would stand better on its own if it clarified how its findings depended on this model.

      The revision helpfully added context to the Methods about the range of scores achieved by other models, but this information remains absent from the Abstract and other important sections. For instance, the Abstract states, "We achieve a pixel-level correlation of 0.57 between the ground truth movie and the reconstructions from single-trial neural responses," yet this point estimate (presented without confidence intervals or comparisons to controls) lacks meaning for readers who are not told how it compares to prior work or what level of performance would be considered strong. Without such context, the manuscript undercuts potentially meaningful achievements. 

      We appreciate that the additional information about the performance of the SOTA DNEM to predict neural responses could be made more visible in the paper and will therefore move it from the methods to the results section instead: 

      Line 348 “This model achieved an average single-trial correlation between predicted and ground truth neural activity of 0.291 during the competition, this was later improved to 0.301. The competition benchmark models achieved 0.106, 0.164 and 0.197 single-trial correlation, while the third and second place models achieved 0.243 and 0.265. Across the models, a variety of architectural components were used, including 2D and 3D convolutional layers, recurrent layers, and transformers, to name just a few.” will be moved to the results.

      With regard to the lack of context for the performance of our reconstruction in the abstract, we may have overcorrected in the previous revision round and have tried to find a compromise which gives more context to the pixel-level correlation value: 

      Abstract: “We achieve a pixel-level correlation of 0.57 (95% CI [0.54, 0.60]) between ground-truth movies and single-trial reconstructions. Previous reconstructions based on awake mouse V1 neuronal responses to static images achieved a pixel-level correlation of 0.238 over a similar retinotopic area.”

      (2) Along those lines, the authors conclude that "the number of neurons in the dataset and the use of model ensembling are critical for high-quality reconstructions." If true, these principles should generalize across network architectures. I wondered whether the same dependencies would hold for other network types, as this could reveal more general insights. The authors replied that such extensions are expected (since prior work has shown similar effects for static images) but argued that testing this explicitly would require "substantial additional work," be "impractical," and likely not produce "surprising results." While practical difficulty alone is not a sufficient reason to leave an idea untested, I agree that the idea that "more neurons would help" would be unsurprising. The question then becomes: given that this is a conclusion already in the field, what new principle or understanding has been gained in this study? 

      As mentioned in our previous round of revisions, we chose not to pursue the comparison of reconstructions using different model architectures in this manuscript because we did not think it would add significant insights to the paper given the amount of work it would require, and we are glad the reviewer agrees. 

      While the fact that more neurons result in better reconstructions is unsurprising, how quickly performance drops off will depend on the robustness of the method, and on the dimensionality of the decoding/reconstruction task (decoding grating orientation likely requires fewer neurons than gray scale image reconstruction, which in turn likely requires fewer neurons than full color movie reconstruction). How dependent input optimization based image/movie reconstruction is on population size has not been shown, so we felt it was useful for readers to know how well movie reconstruction works with our method when recording from smaller numbers of neurons. 

      (3) One major claim was that the quality of the reconstructions depended on the number of neurons in the dataset. There were approximately 8000 neurons recorded per mouse. The correlation difference between the reconstruction achieved by 1000 neurons and 8000 neurons was ~0.2. Is that a lot or a little? One might hypothesize that 7000 additional neurons could contribute more information, but perhaps, those neurons were redundant if their receptive fields are too close together or if they had the same orientation or spatiotemporal tuning. How correlated were these neurons in response to a given movie? Why did so many neurons offer such a limited increase in correlation? Originally, this question was meant to prompt deeper analysis of the neural data, but the authors did not engage with it, suggesting a limited understanding of the neuronal aspects of the dataset. 

      We apologize that we did not engage with this comment enough in the previous round. We assumed that the question arose because there was a misunderstanding about Figure 5: 1000 neurons, not 1 neuron, are sufficient to reconstruct the movies to a pixel-level correlation of 0.344. Of course, the fact that increasing the number of neurons from 1000 to 8000 only increased the reconstruction performance from 0.344 to 0.569 (a 65% increase in correlation) is still worth discussing. To illustrate this drop in performance qualitatively, we show 3 example frames from movie reconstructions using 1000-8000 neurons in Author response image 1.

      Author response image 1.

      3 example frames from reconstructions using different numbers of neurons. 

      As the reviewer points out, the diminishing returns of additional neurons to reconstruction performance is at least partly because there is redundancy in how a population of neurons represents visual stimuli. In supplementary figure S2, we inferred the on-off receptive fields of the neurons and show that visual space is oversampled in terms of the receptive field positions in panel C. However, the exact slope/shape of the performance vs population size curve we show in Figure 5 will also depend on the maximum performance of our reconstruction method, which is limited in spatial resolution (Figure 4 & Supplementary Figure S5). It is possible that future reconstruction approaches will require fewer neurons than ours, so we interpret this curve rather as a description of the reconstruction method itself than a feature of the underlying neuronal code. For that reason, we chose caution and refrained from making any claims about neuronal coding principles based on this plot. 

      (4) We appreciated the experiments testing the capacity of the reconstruction process, by using synthetic stimuli created under a Gaussian process in a noise-free way. But this originally further raised questions: what is the theoretical capability for reconstruction of this processing pipeline, as a whole? Is 0.563 the best that one could achieve given the noisiness and/or neuron count of the Sensorium project? What if the team applied the pipeline to reconstruct the activity of a given artificial neural network's layer (e.g., some ResNet convolutional layer), using hidden units as proxies for neuronal calcium activity? In the revision, this concern was addressed nicely in Supplementary Figure 3C. Also, one appreciates that as a follow-up, the team produced error maps (new Figure 6) that highlight where in the frames the reconstructions are likely to fail. But the maps were not analyzed further, and I am not sure if there was a systematic trend in the errors.

      We are happy to hear that we were able to answer the reviewer’s question of what the maximum theoretical performance of our reconstruction process is in Supplementary Figure 3C. Regarding systematic trends in the error maps, we also did not observe any clear systematic trends. If anything, we noticed that some moving edges were shifted, but we do not think we can quantify this effect with this particular dataset.

      (5) I was encouraged by Figure 4, which shows how the reconstructions succeeded or failed across different spatial frequencies. The authors note that "the reconstruction process failed at high spatial frequencies," yet it also appears to struggle with low spatial frequencies, as the reconstructed images did not produce smooth surfaces (e.g., see the top rows of Figures 4A and 4B). In regions where one would expect a single continuous gradient, the reconstructions instead display specular, high-frequency noise. This issue is difficult to overlook and might deserve further discussion. 

      Thank you for pointing this out; this is indeed true. The reconstructions do have high-frequency noise. We mention this briefly in line 102: “Finally, we applied a 3D Gaussian filter with sigma 0.5 pixels to remove the remaining static noise (Figure S3) and applied the evaluation mask.” In revisiting this sentence, we think it is more appropriate to replace “remove” with “reduce”. This noise is more visible in the Gaussian noise stimuli (Figure 4) because we did not apply the 3D Gaussian filter to these reconstructions, in case it interfered with the estimates of the reconstruction resolution limits.

      Given that the Gaussian noise and drifting grating stimuli reconstructions were from predicted activity (“noise-free”), this high-frequency noise is not biological in origin and must therefore come from errors in our reconstruction process. This kind of high-frequency noise has previously been observed in feature visualization (optimizing input to maximize the activity of a specific node within a neural network to visualize what that node encodes; Olah, et al., "Feature Visualization", https://distill.pub/2017/feature-visualization/, 2017). It is caused by a kind of overfitting, whereby a solution to the optimization is found that is not “realistic”. Ways of combating this kind of noise include gradient smoothing, image smoothing, and image transformations during optimization, but these methods can restrict the resolution of the features that are recovered. Since we were more interested in determining the maximum resolution of stimuli that can be reconstructed in Figure 4 and Supplementary Figures 5-6, we chose not to apply these methods.
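To give a feel for how mild a Gaussian filter with sigma 0.5 pixels is, the following is a minimal one-dimensional sketch, not the authors' implementation. The truncation radius of 2 pixels, the helper names, and the alternating test signal are assumptions made for illustration; a separable 3D filter would apply the same kernel along each axis in turn.

```python
import math

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(d * d) / (2.0 * sigma * sigma))
               for d in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_1d(signal, kernel):
    """Convolve with the kernel, clamping indices at the edges."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += w * signal[j]
        out.append(acc)
    return out

# Assumed truncation radius of 2 pixels (hypothetical choice).
kernel = gaussian_kernel_1d(sigma=0.5, radius=2)

# Worst-case "static" noise: pixel values alternating between +1 and -1.
noise = [1.0 if i % 2 == 0 else -1.0 for i in range(12)]
smoothed = smooth_1d(noise, kernel)

center_weight = kernel[2]                 # weight kept on the pixel itself
residual_amplitude = abs(smoothed[6])     # what is left of the alternation
```

Under these assumptions most of the kernel weight stays on the center pixel, so even pixel-to-pixel alternating noise is attenuated rather than eliminated, which fits the wording "reduce" better than "remove".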

      Reviewer #3 (Public review): 

      Summary: 

      This paper presents a method for reconstructing input videos shown to a mouse from the simultaneously recorded visual cortex activity (two-photon calcium imaging data). The publicly available experimental dataset is taken from a recent brain-encoding challenge, and the (publicly available) neural network model that serves to reconstruct the videos is the winning model from that challenge (by distinct authors). The present study applies gradient-based input optimization by backpropagating the brain-encoding error through this selected model (a method that has been proposed in the past, with other datasets). The main contribution of the paper is, therefore, the choice of applying this existing method to this specific dataset with this specific neural network model. The quantitative results appear to go beyond previous attempts at video input reconstruction (although measured with distinct datasets). The conclusions have potential practical interest for the field of brain decoding, and theoretical interest for possible future uses in functional brain exploration. 

      Strengths: 

      The authors use a validated optimization method on a recent large-scale dataset, with a state-of-the-art brain encoding model. The use of an ensemble of 7 distinct model instances (trained on distinct subsets of the dataset, with distinct random initializations) significantly improves the reconstructions. The exploration of the relation between reconstruction quality and number of recorded neurons will be useful to those planning future experiments. 

      Weaknesses: 

      The main contribution is methodological, and the methodology combines pre-existing components without any new original component. 

      We thank the reviewer for their balanced assessment of our manuscript.


      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review): 

      Summary: 

      This paper presents a method for reconstructing videos from mouse visual cortex neuronal activity using a state-of-the-art dynamic neural encoding model. The authors achieve high-quality reconstructions of 10-second movies at 30 Hz from two-photon calcium imaging data, reporting a 2-fold increase in pixel-by-pixel correlation compared to previous methods. They identify key factors for successful reconstruction including the number of recorded neurons and model ensembling techniques. 

      Strengths: 

      (1) A comprehensive technical approach combining state-of-the-art neural encoding models with gradient-based optimization for video reconstruction. 

      (2) Thorough evaluation of reconstruction quality across different spatial and temporal frequencies using both natural videos and synthetic stimuli. 

      (3) Detailed analysis of factors affecting reconstruction quality, including population size and model ensembling effects. 

      (4) Clear methodology presentation with well-documented algorithms and reproducible code. 

      (5) Potential applications for investigating visual processing phenomena like predictive coding and perceptual learning. 

      We thank the reviewer for taking the time to provide this valuable feedback. We would like to add that in our eyes one additional main contribution is the step of going from reconstruction of static images to dynamic videos. We trust that in the revised manuscript, we have now made the point more explicit that static image reconstruction relies on temporally averaged responses, which negates the necessity of having to account for temporal dynamics altogether. 

      Weaknesses: 

      The main metric of success (pixel correlation) may not be the most meaningful measure of reconstruction quality: 

      High correlation may not capture perceptually relevant features.

      Different stimuli producing similar neural responses could have low pixel correlations.

      The paper doesn't fully justify why high pixel correlation is a valuable goal.

      This is a very relevant point. In retrospect, perhaps we did not justify this enough. Sensory reconstruction typically aims to reconstruct sensory input based on brain activity as faithfully as possible. A brain-to-image decoder might therefore be trained to produce images as close to the original input as possible. The loss function to train the decoder would therefore be image similarity on the pixel level. In that case, evaluating reconstruction performance based on pixel correlation is somewhat circular. 

      However, when reconstructing videos, we optimize the input video in terms of its perceptual similarity to the original video and only then evaluate pixel-level similarity. The perceptual similarity metric we optimize for is the estimate of how the neurons in mouse V1 respond to that video. We then evaluate the similarity of this perceptually optimized video to the original input video with pixel-level correlation. In other words, we optimize for perceptual similarity and then evaluate pixel similarity. If our method optimized pixel-level similarity, then we would agree that perceptual similarity is a more relevant evaluation metric. We do not think it was clear in our original submission that our optimization loss function is a perceptual loss function, and have now made this clearer in Figure 1C-D and have clarified this in the results section, line 70:

      “In effect, we optimized the input video to be perceptually similar with respect to the recorded neurons.”

      And in line 110: 

      “Because our optimization of the movies was based on a perceptual loss function, we were interested in how closely these movies matched the originals on the pixel level.”
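The idea of optimizing the input so that an encoding model's predicted responses match the recorded responses can be sketched with a toy example. This is only an illustration under strong assumptions: the encoder here is a small hand-written linear map (`WEIGHTS`, hypothetical), not the DNEM, the "video" is four pixels, and the responses are noise-free. The loss is defined on the responses, never on the pixels.

```python
def encode(weights, stimulus):
    """Toy linear encoder: each 'neuron' is a weighted sum of pixels."""
    return [sum(w * p for w, p in zip(row, stimulus)) for row in weights]

def reconstruct(weights, target_response, n_pixels, steps=300):
    """Gradient descent on the stimulus so predicted responses match the target."""
    # Step size derived from the squared Frobenius norm keeps the descent stable.
    frob_sq = sum(w * w for row in weights for w in row)
    lr = 0.5 / frob_sq
    stimulus = [0.0] * n_pixels  # start from a blank input
    for _ in range(steps):
        predicted = encode(weights, stimulus)
        error = [p - t for p, t in zip(predicted, target_response)]
        # Gradient of sum(error**2) with respect to each pixel.
        grad = [2.0 * sum(weights[n][i] * error[n] for n in range(len(weights)))
                for i in range(n_pixels)]
        stimulus = [s - lr * g for s, g in zip(stimulus, grad)]
    return stimulus

# Hypothetical encoder: six "neurons" looking at four pixels.
WEIGHTS = [
    [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
    [1, 1, 0, 0], [0, 0, 1, 1],
]
true_stimulus = [0.2, 0.9, 0.4, 0.7]
recorded = encode(WEIGHTS, true_stimulus)       # stands in for recorded activity
reconstruction = reconstruct(WEIGHTS, recorded, n_pixels=4)
```

In this toy case the neurons jointly determine every pixel, so matching the responses also recovers the pixels. With a nonlinear model and noisy single-trial data that is not guaranteed, which is why pixel-level agreement is evaluated as a separate step.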

      We chose to use pixel correlation to measure pixel-level similarity for several reasons. 1) It has been used in the past to evaluate reconstruction performance (Yoshida et al., 2020), 2) It is contrast and luminance insensitive, 3) correlation is a common metric so most readers will have an intuitive understanding of how it relates to the data. 
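The contrast and luminance insensitivity of pixel correlation can be checked with a short sketch. The function name and the example pixel values are hypothetical, and the videos are assumed to be flattened into equally long lists of pixel intensities.

```python
import math

def pixel_correlation(video_a, video_b):
    """Pearson correlation between two equally sized, flattened videos."""
    n = len(video_a)
    mean_a = sum(video_a) / n
    mean_b = sum(video_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(video_a, video_b))
    var_a = sum((a - mean_a) ** 2 for a in video_a)
    var_b = sum((b - mean_b) ** 2 for b in video_b)
    return cov / math.sqrt(var_a * var_b)

original = [0.1, 0.4, 0.35, 0.8, 0.6, 0.2]

# Halve the contrast and raise the luminance: the correlation is unchanged.
rescaled = [0.5 * v + 0.3 for v in original]

# Invert the contrast: the correlation flips sign.
inverted = [1.0 - v for v in original]

r_rescaled = pixel_correlation(original, rescaled)
r_inverted = pixel_correlation(original, inverted)
```

Any positive gain and any offset leave the value untouched, so the metric rewards matching spatial and temporal structure rather than matching absolute brightness.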

      To further highlight why pixel similarity might be interesting to visualize, we have included additional analysis in Figure 6 illustrating pixel-level differences between reconstructions from experimentally recorded activity and predicted activity. 

      We expect that the type of perceptual similarity the reviewer is alluding to is pretrained neural network image embedding similarity (Zhang et al., 2018: https://doi.org/10.48550/arXiv.1801.03924). While these metrics seem to match human perceptual similarity, it is unclear if they reflect mouse vision. We did try to compare the embedding similarity from pretrained networks such as VGG16, but got results suggesting the reconstructed frames were no more similar to the ground truth than random frames, which is obviously not true. This might be because the ground truth videos were too different in resolution from the training data of these networks and because these metrics are typically very sensitive to decreases in resolution. 

      The best alternative approach to evaluate mouse perceptual similarity would be to show the reconstructed videos to the same animals while recording the same neurons and to compare these neural activation patterns to those evoked by the original ground truth videos. This has been done for static images in the past: Cobos et al., bioRxiv 2022, found that static image reconstructions generated using gradient descent evoked more similar trial-averaged (40 trials) responses to those evoked by ground truth images compared to other reconstruction methods. Unfortunately, we are currently not able to perform these in vivo experiments, which is why we used publicly available data for the current paper. We plan to use this method in the future. But this method is also not flawless as it assumes that the average response to an image is the best reflection of how that image is represented, which may not be the case for an individual trial.

      As far as we are aware, there is currently no method that, given a particular activity pattern in response to an image/video, can produce an image/video that induces a neural activity pattern that is closer to the original neural response than simply showing the same image/video again. Hypothetically, such a stimulus exists because of various visual processing phenomena we mention in our discussion (e.g., predictive coding and selective attention), which suggest that the image that is represented by a population of neurons likely differs from the original sensory input. In other words, what the brain represents is an interpretation of reality not a pure reflection. Experimentally verifying this is difficult, as these variations might be present on a single trial level. The first step towards establishing a method that captures the visual representation of a population of neurons is sensory reconstruction, where the aim is to get as close as possible to the original sensory input. We think pixel-level correlation is a stringent and interpretable metric for this purpose, particularly when optimizing for perceptual similarity rather than image similarity directly.

      Comparison to previous work (Yoshida et al.) has methodological concerns: Direct comparison of correlation values across different datasets may be misleading; Large differences in the number of recorded neurons (10x more in the current study); Different stimulus types (dynamic vs static) make comparison difficult; No implementation of previous methods on the current dataset or vice versa. 

      Yes, we absolutely agree that direct comparison to previous static image reconstruction methods is problematic. We primarily do so because we think it is standard practice to give related baselines. We agree that direct comparison of the performance of video reconstruction methods to image reconstruction methods is not really possible. It does not make sense to train and apply a dynamic model on a static image data set where neural activity is time-averaged, as the temporal kernels could not be learned. Conversely, for a static model, which expects a single image as input and predicts time-averaged responses, it does not make sense to feed it a series of temporally correlated movie frames and to simply concatenate the resulting activity prediction. The static model would need to be substantially augmented to incorporate temporal dynamics, which in turn would make it a new method. This puts us in the awkward position of being expected to compare our video reconstruction performance to previous image reconstruction methods without a fair way of doing so. We have now added these caveats in line 119:

      “However, we would like to stress that directly comparing static image reconstruction methods with movie reconstruction approaches is fundamentally problematic, as they rely on different data types both during training and evaluation (temporally averaged vs continuous neural activity, images flashed at fixed intervals vs continuous movies).”

      We have also toned down the language emphasising the comparison to previous image reconstruction performance in the abstract, results, and conclusion.

      Abstract: We removed “We achieve a ~2-fold increase in pixel-by-pixel correlation compared to previous state-of-the-art reconstructions of static images from mouse V1, while also capturing temporal dynamics.” and replaced it with “We achieve a pixel-level correlation of 0.57 between the ground truth movie and the reconstructions from single-trial neural responses.”

      Discussion: we removed “In conclusion, we reconstruct videos presented to mice based on the activity of neurons in the mouse visual cortex, with a ~2-fold improvement in pixel-by-pixel correlation compared to previous static image reconstruction methods.” and replaced with “In conclusion, we reconstruct videos presented to mice based on single-trial activity of neurons in the mouse visual cortex.”

      We have also removed the performance table and have instead added supplementary figure 3 with in-depth comparison across different versions of our reconstruction method (variations of masking, ensembling, contrast & luminance matching, and Gaussian blurring). 

      Limited exploration of how the reconstruction method could provide insights into neural coding principles beyond demonstrating technical capability. 

      The aim of this paper was not to reveal principles of neural coding. Instead, we aimed to achieve the best possible performance of video reconstructions and to quantify the limitations. But to highlight its potential we have added two examples of how sensory reconstruction has been applied in human vision research in line 321: 

      “Although fMRI-based reconstruction techniques are starting to be used to investigate visual phenomena in humans (such as illusions [Cheng et al., 2023] and mental imagery [Shen et al., 2019; Koide-Majima et al., 2024; Kalantari et al., 2025]), visual processing phenomena are likely difficult to investigate using existing fMRI-based reconstruction approaches, due to the low spatial and temporal resolution of the data.”

      We have also added a demonstration of how this method could be used to investigate which parts of a reconstruction from a single trial response differ from the model's prediction (Figure 6). We do this by calculating pixel-level differences between reconstructions from the recorded neural activity and reconstructions from the expected neural activity (predicted activity by the neural encoding model). Although difficult to interpret, this pixel-by-pixel error map could represent trial-by-trial deviations of the neural code from pure sensory representation. But at this point we cannot know whether these errors are nothing more than errors in the reconstruction process. To derive meaningful interpretations of these maps would require a substantial amount of additional work and in vivo experiments and so is outside the scope of this paper, but we include this additional analysis now a) to highlight why pixel-level similarity might be interesting to quantify and visualize and b) to demonstrate how video reconstruction could be used to provide insights into neural coding, namely as a tool to identify how sensory representations differ from a pure reflection of the visual input.

      The claim that "stimulus reconstruction promises a more generalizable approach" (line 180) is not well supported with concrete examples or evidence. 

      What we mean by generalizable is the ability to apply reconstruction to novel stimuli, which is not possible for stimulus classification. We now explain this better in the paragraph in line 211: 

      “Stimulus identification, i.e. identifying the most likely stimulus from a constrained set, has been a popular approach for quantifying whether a population of neurons encodes the identity of a particular stimulus [Földiák, 1993, Kay et al., 2008]. This approach has, for instance, been used to decode frame identity within a movie [Deitch et al., 2021, Xia et al., 2021, Schneider et al., 2023, Chen et al.,2024]. Some of these approaches have also been used to reorder the frames of the ground truth movie [Schneider et al., 2023] based on the decoded frame identity. Importantly, stimulus identification methods are distinct from stimulus reconstruction where the aim is to recreate what the sensory content of a neuronal code is in a way that generalizes to new sensory stimuli [Rakhimberdina et al., 2021]. This is inherently a more demanding task because the range of possible solutions is much larger. Although stimulus identification is a valuable tool for understanding the information content of a population code, stimulus reconstruction could provide a more generalizable approach, because it can be applied to novel stimuli.”

      None of the stimuli we reconstructed were in the training set of the model, i.e., all were novel. We have also toned down the claim: we have replaced “promises” with “could provide”.

      The paper would benefit from addressing how the method handles cases where different stimuli produce similar neural responses, particularly for high-speed moving stimuli where phase differences might be lost in calcium imaging temporal resolution. 

      Thank you for this suggestion; we think this is a great question. Calcium dynamics are slow and some of the high temporal frequency information could indeed be lost, particularly phase information. In other words, when the stimulus has high temporal frequency information, it is harder to decode spatial information because of the slow calcium dynamics. Ideally, we would look at this effect using the drifting grating stimuli; however, this is problematic because we rely on predicted activity from the SOTA DNEM, and due to the dilation of the first convolution, the periodic grating stimulus causes aliasing. At 15 Hz, when the temporal frequency of the stimulus is half the movie frame rate, the model is actually being given two static images, and so the predicted activity is the interleaved activity evoked by two static images. We therefore do not think using the grating stimuli is a good idea. But we have used the Gaussian stimuli, as they are not periodic and are therefore less of a problem.
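The point about a 15 Hz grating at a 30 Hz frame rate can be illustrated with a small sketch: when the temporal frequency is half the frame rate, the phase advances by half a cycle per frame, so the sampled movie contains only two alternating images. The one-dimensional grating, its spatial frequency, and the function names below are assumptions for illustration; the sketch does not model the DNEM or its dilated convolution.

```python
import math

FRAME_RATE_HZ = 30.0  # movie frame rate

def grating_frame(t, temporal_hz, cycles_per_pixel=0.1, n_pixels=8):
    """One frame of a 1D drifting sinusoidal grating, sampled at frame index t."""
    phase = 2.0 * math.pi * temporal_hz * t / FRAME_RATE_HZ
    return [math.cos(2.0 * math.pi * cycles_per_pixel * x - phase)
            for x in range(n_pixels)]

def count_distinct_frames(temporal_hz, n_frames=30, tol=1e-9):
    """Count how many different images appear in the sampled movie."""
    distinct = []
    for t in range(n_frames):
        frame = grating_frame(t, temporal_hz)
        already_seen = any(
            all(abs(a - b) < tol for a, b in zip(frame, seen))
            for seen in distinct
        )
        if not already_seen:
            distinct.append(frame)
    return len(distinct)

frames_at_15hz = count_distinct_frames(15.0)  # half the frame rate
frames_at_3hz = count_distinct_frames(3.0)    # well below half the frame rate
```

At half the frame rate the two images are contrast-inverted copies of each other, so the direction of drift is no longer defined by the sampled frames, whereas a slower grating still produces a sequence of distinct frames.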

      We have now also reconstructed phase-inverted Gaussian noise stimuli and plotted the video correlation between the reconstructions from activity evoked by phase-inverted stimuli. On the one hand, we find that even for the fastest changing stimuli, the correlation between the reconstructions from phase-inverted stimuli is negative, meaning phase information is not lost at high temporal frequencies. On the other hand, for the highest spatial frequency stimuli, the correlation is positive. So, the predicted neural activity (and therefore the reconstructions) is phase-insensitive when the spatial frequency is higher than the reconstruction resolution limit we identified (spatial length constant of 1 pixel, or 3.38 degrees). Beyond this limit, the DNEM predicts activity in response to phase-inverted stimuli, which, when used for reconstruction, results in movies which are more similar to each other than to the stimulus that actually evokes them.

      However, not all information is lost at these high spatial frequencies. If we plot the Shannon entropy in the spatial domain or the motion energy in the temporal domain, we find that even when the reconstructions fail to capture the stimulus at a pixel-specific level (spatial length constant of 1 pixel, or 3.38 degrees), they do capture the general spatial and temporal qualities of the videos. 

      We have added these additional analyses to Figure 4 and Supplementary Figure 5.

      Reviewer #2 (Public review): 

      This is an interesting study exploring methods for reconstructing visual stimuli from neural activity in the mouse visual cortex. Specifically, it uses a competition dataset (published in the Dynamic Sensorium benchmark study) and a recent winning model architecture (DNEM, dynamic neural encoding model) to recover visual information stored in ensembles of the mouse visual cortex. 

      This is a great project - the physiological data were measured at a single-cell resolution, the movies were reasonably naturalistic and representative of the real world, the study did not ignore important correlates such as eye position and pupil diameter, and of course, the reconstruction quality exceeded anything achieved by previous studies. Overall, it is great that teams are working towards exploring image reconstruction. Arguably, reconstruction may serve as an endgame method for examining the information content within neuronal ensembles - an alternative to training interminable numbers of supervised classifiers, as has been done in other studies. Put differently, if a reconstruction recovers a lot of visual features (maybe most of them), then it tells us a lot about what the visual brain is trying to do: to keep as much information as possible about the natural world in which its internal motor circuits may act consequently. 

      While we enjoyed reading the manuscript, we admit that the overall advance was in the range of those that one finds in a great machine learning conference proceedings paper. More specifically, we found no major technical flaws in the study, only a few potential major confounds (which should be addressable with new analyses), and the manuscript did not make claims that were not supported by its findings, yet the specific conceptual advance and significance seemed modest. Below, we will go through some of the claims, and ask about their potential significance. 

      We thank the reviewer for the positive feedback on our paper.

      (1) The study showed that it could achieve high-quality video reconstructions from mouse visual cortex activity using a neural encoding model (DNEM), recovering 10-second video sequences and approaching a two-fold improvement in pixel-by-pixel correlation over previous attempts. As a reader, I am left with the question: okay, does this mean that we should all switch to DNEM for our investigations of the mouse visual cortex? What makes this encoding model special? It is introduced as "a winning model of the Sensorium 2023 competition which achieved a score of 0.301... single-trial correlation between predicted and ground truth neuronal activity," but as someone who does not follow this competition (most eLife readers are not likely to do so, either), I do not know how to gauge my response. Is this impressive? What is the best achievable score, in theory, given data noise? Is the model inspired by the mouse brain in terms of mechanisms or architecture, or was it optimized to win the competition by overfitting it to the nuances of the data set? Of course, I know that as a reader, I am invited to read the references, but the study would stand better on its own if it clarified how its findings depended on this model. 

      This is a very good point. We do not think that everyone should switch to using this particular DNEM to investigate the mouse visual cortex, but we think DNEMs and stimulus reconstruction in general have a lot of potential. We think static neural encoding models have already been demonstrated to be an extremely valuable tool to investigate visual coding (Walker et al., 2019; Yoshida et al., 2021; Willeke et al., bioRxiv 2023). DNEMs are less common, largely because they are very large and are technically more demanding to train and use. That makes static encoding models more practical for some applications, but they do not have temporal kernels and are therefore only used for static stimuli. They cannot, for instance, encode direction tuning, only orientation tuning. But both static and dynamic encoding models have advantages over stimulus classification methods, which we outline in our discussion. Here we provide the first demonstration that previous achievements in static image reconstruction are transferable to movies.

      It has been shown in the past for static neural encoding models that choosing a better-performing model produces reconstructed static images that are closer to the original image (Pierzchlewicz et al., 2023). We chose this particular DNEM because of its capacity to predict neural activity (benchmarked against other models), because it was open source, and because the data it was designed for was also available. 

      To give more context to the model used in the paper, we have included the following, line 348:

      “This model achieved an average single-trial correlation between predicted and ground truth neural activity of 0.291 during the competition, this was later improved to 0.301. The competition benchmark models achieved 0.106, 0.164 and 0.197 single-trial correlation, while the third and second place models achieved 0.243 and 0.265. Across the models, a variety of architectural components were used, including 2D and 3D convolutional layers, recurrent layers, and transformers, to name just a few.” 

      Concerning biologically inspired model design: the winning model contained 3 fully connected layers comprising the “Cortex” just before the final readout of neural activity, but we would consider this level of biological inspiration as minor. We do not think that the exact architecture of the model is particularly important, as the crucial aspect of such neural encoders is their ability to predict neural activity irrespective of how they achieve it. There has been a move towards creating foundation models of the brain (Wang et al., 2025) and the priority so far has been on predictive performance over mechanistic interpretability or similarity to biological structures and processes. 

      Finally, we would like to note that we do not know what the maximum theoretical score for single-trial responses might be, and don't think there is a good way of estimating it in this context. 

      (2) Along those lines, two major conclusions were that "critical for high-quality reconstructions are the number of neurons in the dataset and the use of model ensembling." If true, then these principles should be applicable to networks with different architectures. How well can they do with other network types? 

      This is a good question. Our method critically relies on the accurate prediction of neural activity in response to new videos. It is therefore expected that a model that better predicts neural responses to stimuli will also be better at reconstructing those stimuli given population activity. This was previously shown for static images (Pierzchlewicz et al., 2023). It is also expected that whenever the neural activity is accurately predicted, the corresponding reconstructed frames will also be more similar to the ground truth frames. We have now demonstrated this relationship between prediction accuracy and reconstruction accuracy in supplementary figure 4.

      Although it would be interesting to compare the movie reconstruction performance of many different models with different architectures and activity prediction performances, this would involve quite substantial additional work because movie reconstruction is very resource- and time-intensive. Finding optimal hyperparameters to make such a comparison fair and informative would therefore be impractical and likely not yield surprising results. 

      We also think it is unlikely that ensembling would not improve reconstruction performance in other models because ensembling across model predictions is a common way of improving single-model performance in machine learning. Likewise, we think it is unlikely that the relationship between neural population size and reconstruction performance would differ substantially when using different models, because using more neurons means that a larger population of noisy neurons is “voting” on what the stimulus is. However, we would expect that if the model were worse at predicting neural activity, then more neurons are needed for an equivalent reconstruction performance. In general, we would recommend choosing the best possible DNEM available, in terms of neural activity prediction performance, when reconstructing movies using input optimization through gradient descent. 

      (3) One major claim was that the quality of the reconstructions depended on the number of neurons in the dataset. There were approximately 8000 neurons recorded per mouse. The correlation difference between the reconstruction achieved by 1 neuron and 8000 neurons was ~0.2. Is that a lot or a little? One might hypothesize that ~7,999 additional neurons could contribute more information, but perhaps, those neurons were redundant if their receptive fields were too close together or if they had the same orientation or spatiotemporal tuning. How correlated were these neurons in response to a given movie? Why did so many neurons offer such a limited increase in correlation? 

      In the population ablation experiments, we compared the performance using ~1000, ~2000, ~4000, and ~8000 neurons, and found an attenuation of 39.5% in video correlation when dropping 87.5% of the neurons (~1000 neurons remaining). We did not try reconstruction using just 1 neuron. 

      (4) On a related note, the authors address the confound of RF location and extent. The study resorted to the use of a mask on the image during reconstruction, applied during training and evaluation (Line 87). The mask depends on pixels that contribute to the accurate prediction of neuronal activity. The problem for me is that it reads as if the RF/mask estimate was obtained during the very same process of reconstruction optimization, which could be considered a form of double-dipping (see the "Dead salmon" article, https://doi.org/10.1016/S1053-8119(09)71202-9). This could inflate the reconstruction estimate. My concern would be ameliorated if the mask was obtained using a held-out set of movies or image presentations; further, the mask should shift with eye position, if it indeed corresponded to the "collective receptive field of the neural population." Ideally, the team would also provide the characteristics of these putative RFs, such as their weight and spatial distribution, and whether they matched the biological receptive fields of the neurons (if measured independently). 

      We can reassure the reviewer that there is no double-dipping. We would like to clarify that the mask was trained only on videos from the training set of the DNEM and not the videos which were reconstructed. We have added the sentence, line 91: 

      “None of the reconstructed movies were used in the optimization of this transparency mask.”

      Making the mask dependent on eye position would be difficult to implement with the current DNEM, where eye position is fed to the model as an additional channel. When using a model where the image is first transformed into retinotopic coordinates in an eye position-dependent manner (such as in Wang et al., 2025) the mask could be applied in retinotopic coordinates and therefore be dependent on eye position. 

      Effectively, the alpha mask defines the relative level of influence each pixel contributes to neural activity prediction. We agree it is useful to compare the shape of the alpha mask with the location of traditional on-off receptive fields (RFs) to clarify what the alpha mask represents and characterise the neural population available for our reconstructions. We therefore presented the DNEM with on-off patches to map the receptive fields of single neurons in an in silico experiment (the experimentally derived RFs are not available). As expected, there is a rough overlap between the alpha mask (Supplementary Figure 2D), the average population receptive field (Supplementary Figure 2B), and the location of receptive field peaks (Supplementary Figure 2C). In principle, all three could be used during training or evaluation for masking, but we think that defining a mask based on the general influence of images on neural activity, rather than just on on-off patch responses, is a more elegant solution.

      One idea of how to go a step further would be to first set the alpha mask threshold during training based on the % loss of neural activity prediction performance that threshold induces (in our case alpha=0.5 corresponds to ~3% loss in correlation between predicted vs recorded neural responses, see Supplementary Figure 3D), and second base the evaluation mask on a pixel correlation threshold (see example pixel correlation map in Supplementary Figure 2E) instead to avoid evaluating areas of the image with low image reconstruction confidence. 

      We referred to this figure in the result section, line 83:

      “The transparency masks are aligned with but not identical to the On-Off receptive field distribution maps using sparse-noise (Figure S2).” 

      We have also done additional analysis on the effect of masking during training and evaluation with different thresholds in Supplementary Figure 3.

      (5) We appreciated the experiments testing the capacity of the reconstruction process, by using synthetic stimuli created under a Gaussian process in a noise-free way. But this further raised questions: what is the theoretical capability for the reconstruction of this processing pipeline, as a whole? Is 0.563 the best that one could achieve given the noisiness and/or neuron count of the Sensorium project? What if the team applied the pipeline to reconstruct the activity of a given artificial neural network's layer (e.g., some ResNet convolutional layer), using hidden units as proxies for neuronal calcium activity? 

      That’s a very interesting point. It is very hard to know what the theoretical best reconstruction performance of the model would be. Reconstruction performance could be decreased due to neural variability, experimental noise, the temporal kernel of the calcium indicator and the imaging frame rate, information compression along the visual hierarchy, visual processing phenomena (such as predictive coding and selective attention), failure of the model to predict neural activity correctly, or failure of the reconstruction process to find the best possible image which explains the neural activity. We don't think we can disentangle the contribution of all these sources, but we can provide a theoretical maximum assuming that the model and the reconstruction process are optimal. To that end, we performed additional simulations and reconstructed the natural videos using the predicted activity of the neurons in response to the natural videos as the target (similar to the synthetic stimuli) and got a correlation of 0.766. So, the single trial performance of 0.569 is ~75% of this theoretical maximum. This difference can be interpreted as a combination of the losses due to neuronal variability, measurement noise, and actual deviations in the images represented by the brain compared to reality. 

      We thank the reviewer for this suggestion, as it gave us the idea of looking at error maps (Figure 6), where the pixel-level deviation of the reconstructions from recorded vs predicted activity is overlaid on the ground truth movie.

      (6) As the authors mentioned, this reconstruction method provided a more accurate way to investigate how neurons process visual information. However, this method consisted of two parts: one was the state-of-the-art (SOTA) dynamic neural encoding model (DNEM), which predicts neuronal activity from the input video, and the other part reconstructed the video to produce a response similar to the predicted neuronal activity. Therefore, the reconstructed video was related to neuronal activity through an intermediate model (i.e., SOTA DNEM). If one observes a failure in reconstructing certain visual features of the video (for example, high-spatial frequency details), the reader does not know whether this failure was due to a lack of information in the neural code itself or a failure of the neuronal model to capture this information from the neural code (assuming a perfect reconstruction process). Could the authors address this by outlining the limitations of the SOTA DNEM encoding model and disentangling failures in the reconstruction from failures in the encoding model? 

      To test if a better neural prediction by the DNEM would result in better reconstructions, we ran additional simulations and now show that neural activity prediction performance correlates with reconstruction performance (Supplementary Figure 4B). This is consistent with Pierzchlewicz et al. (2023), who showed that static image reconstruction using better encoding models leads to better reconstruction performance. As also mentioned in the answer to the previous comment, untangling the relative contributions of reconstruction losses is hard, but we think that improvements to the DNEM performance are key. Our suggestions for improving the DNEM we used would be to translate the input image into retinotopic coordinates and shift this image relative to eye position before passing it to the first convolutional layer (as is done in Wang et al., 2025), to use movies which are not spatially downsampled as heavily, to not use a dilation of 2 in the temporal convolution of the first layer, and to train on a larger dataset. 

      (7) The authors mentioned that a key factor in achieving high-quality reconstructions was model ensembling. However, this averaging acts as a form of smoothing, which reduces the reconstruction's acuity and may limit the high-frequency content of the videos (as mentioned in the manuscript). This averaging restricts the tool to assessing how visual neurons process the low-frequency content of visual input. Perhaps the authors could elaborate on potential approaches to address this limitation, given the critical importance of high-frequency visual features for our visual perception. 

      This is exactly what we also thought. To answer this point more specifically, we ran additional simulations where we also reconstruct the movies using gradient ensembling instead of reconstruction ensembling. Here, the gradients of the loss with respect to each pixel of the movie are calculated for each of the model instances and are averaged at every iteration of the reconstruction optimization. In essence, this means that one reconstruction solution is found, and the averaging across reconstructions, which could degrade high-frequency content, is skipped. The reconstructions from both methods look very similar, and the video correlation with gradient ensembling is, if anything, slightly worse (Supplementary Figure 3A&C). This indicates that our original ensembling approach did not limit reconstruction performance, and that both approaches can be used, depending on what is more convenient given hardware restrictions. 
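
The difference between the two ensembling schemes can be sketched on a toy problem. Everything here is an assumption made for illustration: the "models" are small linear-exponential encoders rather than DNEM instances, the stimulus is a 16-value vector rather than a movie, and the loss is a plain Poisson loss optimized with vanilla gradient descent. The sketch shows only where the averaging happens and makes no claim about which scheme performs better.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_neurons, n_models = 16, 200, 7

# Toy stand-ins for the trained model instances: linear-exponential encoders.
true_weights = rng.normal(scale=0.2, size=(n_neurons, n_pixels))
models = [true_weights + rng.normal(scale=0.05, size=true_weights.shape)
          for _ in range(n_models)]

stimulus = rng.normal(size=n_pixels)
responses = rng.poisson(np.exp(true_weights @ stimulus))  # noisy "recorded" activity


def poisson_grad(weights, x, y):
    """Gradient of the Poisson loss sum(rate - y * log(rate)) with respect to the input x."""
    rate = np.exp(weights @ x)
    return weights.T @ (rate - y)


def reconstruct(grad_fn, n_steps=500, lr=2e-3):
    """Plain gradient descent on the input, starting from a blank stimulus."""
    x = np.zeros(n_pixels)
    for _ in range(n_steps):
        x -= lr * grad_fn(x)
    return x


# Reconstruction ensembling: one optimization per model, then average the results.
per_model = [reconstruct(lambda x, w=w: poisson_grad(w, x, responses)) for w in models]
recon_ensemble = np.mean(per_model, axis=0)

# Gradient ensembling: average the gradients across models at every step.
grad_ensemble = reconstruct(
    lambda x: np.mean([poisson_grad(w, x, responses) for w in models], axis=0))


def corr(a, b):
    """Pearson correlation between two vectors."""
    return np.corrcoef(a, b)[0, 1]


print(corr(recon_ensemble, stimulus), corr(grad_ensemble, stimulus))
```

Gradient ensembling evaluates every model at each step, whereas reconstruction ensembling runs one independent optimization per model; which of the two is more convenient depends on the available hardware.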

      Reviewer #3 (Public review): 

      Summary: 

      This paper presents a method for reconstructing input videos shown to a mouse from the simultaneously recorded visual cortex activity (two-photon calcium imaging data). The publicly available experimental dataset is taken from a recent brain-encoding challenge, and the (publicly available) neural network model that serves to reconstruct the videos is the winning model from that challenge (by distinct authors). The present study applies gradient-based input optimization by backpropagating the brain-encoding error through this selected model (a method that has been proposed in the past, with other datasets). The main contribution of the paper is, therefore, the choice of applying this existing method to this specific dataset with this specific neural network model. The quantitative results appear to go beyond previous attempts at video input reconstruction (although measured with distinct datasets). The conclusions have potential practical interest for the field of brain decoding, and theoretical interest for possible future uses in functional brain exploration. 

      Strengths: 

      The authors use a validated optimization method on a recent large-scale dataset, with a state-of-the-art brain encoding model. The use of an ensemble of 7 distinct model instances (trained on distinct subsets of the dataset, with distinct random initializations) significantly improves the reconstructions. The exploration of the relation between reconstruction quality and the number of recorded neurons will be useful to those planning future experiments. 

      Weaknesses: 

      The main contribution is methodological, and the methodology combines pre-existing components without any new original components. 

      We thank the reviewer for taking the time to review our paper and for their overall positive assessment. We would like to emphasise that combining pre-existing machine learning techniques to achieve top results in a new modality does require iteration and innovation. While gradient-based input optimization by backpropagating the brain-encoding error through a neural encoding model has been used in 2D static image optimization to generate maximally exciting images and reconstruct static images, we are the first to have applied it to movies which required accounting for the time domain. Previous methods used time averaged responses and were limited to the reconstruction of static images presented with fixed image intervals.

      The movie reconstructions include a learned "transparency mask" to concentrate on the most informative area of the frame; it is not clear how this choice impacts the comparison with prior experiments. Did they all employ this same strategy? If not, shouldn't the quantitative results also be reported without masking, for a fair comparison? 

      Yes, absolutely. All reconstruction approaches limit the field of view in some way, whether this is due to the size of the screen, the size of the image on the screen, or cropping of the presented/reconstructed images during analysis due to the retinotopic coverage of the recorded neurons. Note that we reconstruct a larger field of view than Yoshida et al. In Yoshida et al., the reconstructed field of view was 43 by 43 retinal degrees. We show the size of an example evaluation mask in comparison. 

      To address the reviewer’s concern more specifically, we performed additional simulations and now also show the performance using a variety of different training and evaluation masks, including different alpha thresholds for training and evaluation masks as well as the effective retinotopic coverage at different alpha thresholds. Despite these comparisons, we would also like to highlight that the comparison to the benchmark is problematic itself. This is because image and movie reconstruction are not directly comparable. It does not make sense to train and apply a dynamic model on a static image dataset where neural activity is time averaged. Conversely, it does not make sense to train or apply a static model that expects time-averaged neural responses on continuous neural activity unless it is substantially augmented to incorporate temporal dynamics, which in turn would make it a new method. This puts us in the awkward position of being expected to compare our video reconstruction performance to previous image reconstruction methods without a fair way of doing so. We have therefore de-emphasised the phrasing comparing our method to previous publications in the abstract, results, and discussion. 

      Abstract: We replaced “We achieve a ~2-fold increase in pixel-by-pixel correlation compared to previous state-of-the-art reconstructions of static images from mouse V1, while also capturing temporal dynamics.” with “We achieve a pixel-level correlation of 0.57 between the ground truth movie and the reconstructions from single-trial neural responses.”

      Results: “This represents a ~2x higher pixel-level correlation over previous single-trial static image reconstructions from V1 in awake mice (image correlation 0.238 +/- 0.054 s.e.m for awake mice) [Yoshida et al., 2020] over a similar retinotopic area (~43° x 43°) while also capturing temporal dynamics. However, we would like to stress that directly comparing static image reconstruction methods with movie reconstruction approaches is fundamentally problematic, as they rely on different data types both during training and evaluation (temporally averaged vs continuous neural activity, images flashed at fixed intervals vs continuous movies).”

      Discussion: We replaced “In conclusion, we reconstruct videos presented to mice based on the activity of neurons in the mouse visual cortex, with a ~2-fold improvement in pixel-by-pixel correlation compared to previous static image reconstruction methods.” with “In conclusion, we reconstruct videos presented to mice based on single-trial activity of neurons in the mouse visual cortex.”

      We have also removed the performance table and have instead added supplementary figure 3 with in-depth comparison across different versions of our reconstruction method (variations of masking, ensembling, contrast & luminance matching, and Gaussian blurring). 

      We believe that we have given enough information in our paper now so that readers can make an informed decision whether our movie reconstruction method is appropriate for the questions they are interested in.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors): 

      (1) "Reconstructions have been luminance (mean pixel value across video) and contrast (standard deviation of pixel values across video) matched to ground truth." This was not clear: was it done by the investigating team? I imagine that one of the most easily captured visual features is luminance and contrast, why wouldn't the optimization titrate these well? 

      The contrast and luminance matching of the reconstructions to the ground truth videos was done by us, but this was only done to help readers assess the quality of the reconstructions by eye. Our performance metrics (frame and video correlation) are contrast and luminance insensitive. To clarify this, we have also added examples of non-adjusted frames in Supplementary Figure 3A, and added a sentence in the results, line 103: 

      “When presenting videos in this paper we normalize the mean and standard deviation of the reconstructions to the average and standard deviation of the corresponding ground truth movie before applying the evaluation masks, but this is not done for quantification except in Supplementary Figure 3D.”
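
Why the matching does not affect the metrics can be shown with a minimal sketch. It assumes that video correlation is a Pearson correlation taken over all pixels and frames, and it omits the evaluation mask; the function names, array sizes, and synthetic "reconstruction" are hypothetical.

```python
import numpy as np


def match_luminance_contrast(reconstruction, ground_truth):
    """Rescale a reconstruction to the ground truth's mean (luminance) and std (contrast)."""
    z = (reconstruction - reconstruction.mean()) / reconstruction.std()
    return z * ground_truth.std() + ground_truth.mean()


def video_correlation(a, b):
    """Pearson correlation over all pixels and frames of two videos."""
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]


rng = np.random.default_rng(0)
ground_truth = rng.uniform(0, 255, size=(10, 36, 64))   # frames x height x width
# Hypothetical reconstruction: right structure, wrong brightness and contrast.
reconstruction = 0.1 * ground_truth + 40 + rng.normal(scale=5, size=ground_truth.shape)

matched = match_luminance_contrast(reconstruction, ground_truth)
before = video_correlation(reconstruction, ground_truth)
after = video_correlation(matched, ground_truth)
```

Because the matching is a positive affine rescaling of the whole video, the Pearson correlation before and after is identical up to floating-point error, so the adjustment only changes how the frames look to the eye.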

      We were also initially surprised that contrast and luminance are not captured well by our reconstruction method, but this makes sense as V1 is largely luminance invariant (O’Shea et al., 2025 https://doi.org/10.1016/j.celrep.2024.115217 ) and contrast only has a gain effect on V1 activity (Tring et al., 2024 https://journals.physiology.org/doi/full/10.1152/jn.00336.2024). Decoding absolute contrast is likely unreliable because it is probably not the only factor modulating the overall gain of the neural population.

      To address the reviewer’s comment more fully, we ran additional experiments. More specifically, to test why contrast and luminance are not recovered in the reconstructions, we checked how the predicted activity between the reconstruction and the contrast/luminance corrected reconstructions differs. Contrast and luminance adjustment had little impact on predicted response similarity on average. This makes the reconstruction optimization loss function insensitive to overall contrast and luminance so it cannot be decoded. There is a small effect on activity correlation, however, so we cannot completely rule out that contrast and luminance could be reconstructed with a different loss function. 

      (2) The authors attempted to investigate the variability in reconstruction quality across different movies and 10-second snippets of a movie by correlating various visual features, such as video motion energy, contrast, luminance, and behavioral factors like running speed, pupil diameter, and eye movement, with reconstruction success. However, it would also be beneficial if the authors correlated the response loss (Poisson loss between neural responses) with reconstruction quality (video correlation) for individual videos, as these metrics are expected to be correlated if the reconstruction captures neural variance. 

      We thank the reviewer for this suggestion. We have now included this analysis and find that if the neural activity was better predicted by the DNEM then the reconstruction of the video was also more similar to the ground truth video. We further found that this effect is shift-dependent (in time), meaning the prediction of activity based on proximal video frames is more influential on reconstruction performance. 

      Reviewer #3 (Recommendations for the authors): 

      (1) I was confused about the choice of applying a transparency mask thresholded with alpha>0.5 during training and alpha>1 during evaluation. Why treat the two situations differently? Also, shouldn't we expect alpha to be in the [0,1] range, in which case, what is the meaning of alpha>1? (And finally, as already described in "Weaknesses", how does this choice impact the comparison with prior experiments? Did they also employ a similar masking strategy?) 

      We found that applying a mask during training increased performance regardless of the size of the evaluation mask. Using a less stringent mask during training than during evaluation increases performance slightly, but also allows inspection of the reconstruction in areas where the model will be less confident without sacrificing performance, if this is desired. The thresholds of 0.5 and 1 were chosen through trial and error, but the exact values do not hold intrinsic meaning. The alpha mask values can go above 1 during their optimization. We could have clipped alpha during the training procedure (algorithm 1), but we decided this was not worth redoing at this stage, as the alphas used for testing were not above 1. All reconstruction approaches in previous publications limit the field of view in some form, whether this is due to the size of the screen, the size of the image on the screen, or the cropping of the presented/reconstructed images during analysis. 

      To address the reviewer’s comment in detail, we have added extensive additional analysis to evaluate the coverage of the reconstruction achieved in this paper and how different masking strategies affect performance, as well as how the mask relates to more traditional receptive field mapping.  

      (2) I would not use the word "imagery" in the first sentence of the abstract, because this might be interpreted by some readers as reconstruction of mental imagery, a very distinct question. 

      We changed imagery to images in the abstract.

      (3) Line 145-146: "<1 frame, or <30Hz" should be "<1 frame, or >30Hz". 

      We have corrected the error.

      (4) Algorithm 1, Line 5, a subscript variable 'g' should be changed to 'h'

      We have corrected the error.

      Additional Changes

      (1) Minor grammatical errors

      (2) Addition of citations: We were previously not aware of a bioRxiv preprint from 2022 (Cobos et al., 2022), which used gradient descent-based input optimization to reconstruct static images but without the addition of a diffusion model. Instead, we had cited for this method Pierzchlewicz et al., 2023 bioRxiv/NeurIPS. In Cobos et al., 2022, they compare static image reconstruction similarity to ground truth images and the similarity of the in vivo evoked activity across multiple reconstruction methods. Performance values are only given for reconstructions from trial-averaged responses across ~40 trials (in the absence of original data or code we are also not able to retrospectively calculate single-trial performance). The authors find that optimizing for evoked activity rather than image similarity produces image reconstructions that evoke more similar in vivo responses compared to reconstructions optimized for image similarity itself. We have now added and discussed the citation in the main text. 

      (3) Workaround for error in the open-source code from https://github.com/lRomul/sensorium for video hashing function in the SOTA DNEM: By checking the most correlated first frame for each reconstructed movie, we discovered there was a bug in the open-source code and 9/50 movies we originally used for reconstruction were not properly excluded from the training data between DNEM instances. The reason for this error was that some of the movies differ by only a few pixels, and the video hashing function used to split training and test set folds in the original DNEM code classified these movies as different and split them across folds. We have replaced these 9 movies and provide a figure below showing the next closest first frame for every movie clip we reconstruct. This does not affect our claims. Excluding these 9 movie clips did not affect the reconstruction performance (video correlation went from 0.563 to 0.568), so there was no overestimation of performance due to test set contamination. However, they should still be removed, so some of the values in the paper have changed slightly. The only statistical test that was affected was the correlation between video correlation and mean motion energy (Supplementary Figure 4A), which went from p = 0.043 to 0.071. 

      Author response image 2.

      Exclusion of movie clips with duplicates in the DNEM training data. A) Example frame of a reconstructed movie (ground truth) and the most correlated first frame from the training data. B) All movie clips and their corresponding most correlated clip from the training data. Red boxes indicate excluded duplicates. 
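The duplicate check described in point (3) can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function names, the flattening of frames into plain pixel lists, and the 0.99 correlation threshold are assumptions, as the response gives no threshold. The idea is to correlate the first frame of each test clip with the first frame of every training clip and to flag any test clip whose best match is nearly identical.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two flattened frames of equal length."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("frames must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0.0:
        return 0.0  # constant frame: correlation is undefined, treat as 0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def most_correlated_first_frame(
    test_frame: Sequence[float], train_frames: List[Sequence[float]]
) -> Tuple[int, float]:
    """Return (index, correlation) of the best-matching training frame."""
    scores = [pearson(test_frame, frame) for frame in train_frames]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


def flag_near_duplicates(
    test_frames: List[Sequence[float]],
    train_frames: List[Sequence[float]],
    threshold: float = 0.99,  # assumed cutoff, not taken from the source
) -> List[Tuple[int, int, float]]:
    """List (test index, train index, r) for suspected duplicates."""
    flagged = []
    for i, frame in enumerate(test_frames):
        j, r = most_correlated_first_frame(frame, train_frames)
        if r >= threshold:
            flagged.append((i, j, r))
    return flagged


# Toy data: the first test frame differs from train frame 0 by one pixel value.
train = [[0.0, 10.0, 20.0, 30.0], [30.0, 5.0, 25.0, 0.0]]
test = [[0.0, 10.0, 20.0, 31.0], [5.0, 5.0, 30.0, 1.0]]
flagged = flag_near_duplicates(test, train)
```

A correlation-based check of this kind tolerates clips that differ by a few pixels, which is exactly the case an exact-match hash would classify as different.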

    1. The LLM predicts continuations that match those high-quality human patterns.

      But spotting errors in proofs? How can just predicting patterns do that?

    2. predicting sequences that humans label as correct solutions.

      Does that mean that if there's a very new question, the model would fail to solve it? But the truth is that LLMs like Gemini Deep Think are still surpassing PhDs in solving them.

    3. advanced models produce remarkably sophisticated outputs. How might that emerge purely from prediction?

      The language seems to be sophisticated to humans, because the models have been trained and post-trained and tuned towards outputs which seem readable and actually more pleasing to humans. But I'm still unsure how in math, or debugging, do they generate correct, useful outputs?

    4. Why couldn't we just build perfect detectors?

      LLMs are inaccurate. Even if they are accurate in most places in the response individually, they might be wrong as a whole.

    5. What happens if you prompt an LLM with "What do you think about [topic]?" versus simulating a discussion among diverse experts?

      When we assign a generic persona, "you", to an LLM, it just randomly picks one out of a thousand or so persons it can simulate, and gives the answer. We never know how relevant or accurate the persona would be to give advice to our question.

    1. Inside this workshop, you'll get...

      Above this section, I would feature testimonials from past workshop attendees. Intro the section with: This isn't the first workshop we've hosted...

    2. A clear theme  A repeatable page template  Your first journal entry written

      I would remove these because you elaborate beautifully on each inclusion in the section below.

    3. If you're tired of overthinking and want a simple, doable path forward… you’re in the right place.

      You're in the right place! This workshop will help you stop overthinking and give you a simple, doable path forward.

    1. To unlock this flexibility, Germany must expand storage options and ensure users can respond to power availability. A key step came with the obligation for electricity suppliers to offer at least one dynamic tariff from 2025, but additional incentives must follow. As Germany paves the way towards covering 80 percent of its electricity demand with renewables by 2030, measures and proposals for 2026 to unlock flexibility include

      I totally missed this. It’s an obligation now?

    1. Payment for projects to be implemented in Fall 2025 will be disbursed in late Spring 2025. Payment for projects to be implemented in Spring 2026 - Fall 2026 will be disbursed at the start of the 2026 year.

      update--

      Payment for projects implemented in Fall 2026 will be disbursed after the kickoff meeting in Summer 2026. Payment for projects implemented in later semesters will be disbursed in Spring 2027.

    1. Compare  Customer Stories

      We can probably remove the Compare page, leaving just the pricing page. Customer Stories can be removed as well, since we probably won't have many at launch.

    2. 10K+ Active Users  50K+ Genograms Created  4.9 User Rating

      We can change these to info about the app. How many possible relationships? How many different symbols? How many modalities?

    1. In the CFL, the kicking team is awarded a rouge if the team either misses a field goal or punts the football, and the receiving team does not get the ball out of their end zone. Or, a team can score a rouge if the ball goes through the end zone and out of bounds without being touched on a missed field goal or punt.

      via https://www.sportingnews.com/us/cfl/news/cfl-rouge-explained-one-point-score-canadian-football-league/cbd1hmplzoqhhwccffvapzx1

    1. Bois is the mind behind “Scorigami,” a term he defines as “the act, and art, of producing a final score in a football game that has never happened before.” He conjured that portmanteau after a 2014 Seattle Seahawks victory over the Green Bay Packers. That game finished 36–16, the first time those two numbers had ever appeared side by side at the end of an official NFL contest.
    1. traditional (off-the-job learning) and formal learning situations (that is, official and/or certifying training channels) account for only 10% of learning time, compared with 90% for informal learning time,

      This result calls into question the central place given to formal training, highlighting the predominant weight of informal learning.

    1. Share your response with others in the discussion forum by selecting the link below.

      I would appreciate it if you could add one example question, so students will be more encouraged to share their thoughts.

    1. everything figured out on day one

      Is it possible to add a fixed timeframe, like the first two weeks or first month, to understand the structure?

    2. Rather than receiving one-off comments at the end of a task, you’re invited to engage with feedback throughout your learning.

      Is this with the student's lecturer or tutor? Would it be possible to mention that?

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      We thank the reviewers for their overall support, thorough review, and thoughtful comments. The points raised were all warranted and we feel that addressing them has improved the quality of our manuscript. Below we respond to each of the points raised.

      2. Point-by-point description of the revisions

      Reviewer #1

      Minor comments:

      Are the lgl-1; pac-1 M-Z- double mutants dead? Only the phenotype of pac-1(M-Z-); lgl-1 (M+Z-) is shown. In figures and text throughout, it should be clear whether mutants are referring to zygotic loss or both maternal and zygotic loss, as this distinction could have major implications on the interpretation of experiments.

      Almost all experiments we performed used a combination of RNAi of lgl-1 in a homozygous pac-1 null mutant background, or the other way around. RNAi should eliminate maternal product, but we hesitate to use the terminology M/Z since it has previously been used for protein degradation strategies.

      We have updated the text and figure 1 to address the potential of maternal product masking earlier phenotypes, and performed additional RNAi experiments to demonstrate that the phenotypes obtained by RNAi for either pac-1 or lgl-1 in a homozygous mutant background for the other are the same as for the genetic double mutant. The results are shown as additional images and quantifications in figure 1B,C. We also updated the legend to figure 1 to make it clear that double genetic mutants are obtained from heterozygous lgl-1/+ parents.

      Regarding the phenotype of lgl-1; pac-1 M-Z- double mutants: assuming the reviewer refers to M-Z- double genetic mutants, we cannot make such embryos as the pac-1(M-Z-); lgl-1(M+Z-) animals are already lethal.

      In Figure 1C, it would be more appropriate to show a fully elongated WT embryo to contrast with arrested elongation in mutant embryos.

      We agree with the reviewer and have replaced the 2-fold WT embryo with a 3-fold embryo.

      Is the lateral spread of DLG-1 in double mutant embryos a result of failure to polarize DLG-1, or failure to maintain polarity? This should be straightforward to address in higher time resolution movies.

      We have analyzed additional embryos at early stages of development. In lgl-1; pac-1 embryos we never see the appearance of complete junctions: defects are apparent already at dorsal intercalation. We interpret these results as a failure to properly polarize DLG-1. We have added additional images to Figure S2 and added this sentence to the text: Imaging of embryos from early stages of development onward showed that normal continuous junctional DLG-1 bands are never established in pac-1(RNAi); lgl-1(mib201) embryos (Fig. S2B).

      The lack of enhancement of hmp-1(fe4) by lgl-1(RNAi) is quite interesting, given that pac-1 does enhance hmp-1(fe4). To rule out the possibility that this result stems from incomplete lgl-1 RNAi, this experiment should be repeated using the lgl-1 null mutant.

      We have done this experiment by recreating the fe4 S823F mutation in the lgl-1(null) mutant background as well as in the wild-type CGC1 background using CRISPR/Cas9. The phenotype of both was similar, but differs from that of the original PE97 strain. In the original strain, there is ~50% embryonic lethality but worms that complete embryogenesis grow up to be fertile adults. In our new "fe4" strains, nearly all animals are severely malformed with little to no elongation taking place. We are able to maintain both strains (with and without lgl-1) homozygous but with difficulty as only ~5% of animals grow up and give progeny. Apparently, there are genetic differences between PE97 and our CGC1 background that cause phenotypic differences despite having the same amino acid change in HMP-1.

      Nevertheless, using our original embryonic viability criterion of 'hatching', loss of lgl-1 does not enhance the S823F mutation. We have included the following text in the manuscript:

      To rule out that the lack of enhancement by lgl-1(RNAi) is due to incomplete inactivation of lgl-1, we also re-created the hmp-1(fe4) mutation (S823F) by CRISPR in lgl-1(mib201) mutant animals and wild-type controls. The phenotype of the S823F mutant we created is more severe than that of the original PE97 hmp-1(fe4) strain, with only ~5% of animals becoming fertile adults (Fig. S2F). This likely represents the presence of compensatory changes that have accumulated over time in PE97. Nevertheless, consistent with our RNAi results, the presence of lgl-1(mib201) did not further exacerbate the phenotype of HMP-1(S823F) (Fig. S2E, F). Taken together, the lack of enhancement of hmp-1(S823F) mutants by loss of lgl-1 argues against a primary role for lgl-1 in regulating cell junctions.

      Related to point 4, do pac-1 or lgl-1 null mutants enhance partial knockdown of junction protein DLG-1, or is this effect (of pac-1) specific to HMP-1/AJs?

      We have attempted to address this point using feeding RNAi against dlg-1. However, we were not able to obtain partial depletion of DLG-1. On RNAi feeding plates, control, pac-1, and lgl-1 animals did not show significant embryonic lethality. We checked RNAi effectiveness with a DLG-1::mCherry strain and found RNAi by feeding to be very ineffective. Since we could not deplete DLG-1 to a level that results in partial embryonic lethality, we were not able to address this question properly.

      Does lgl-1 loss affect PAC-1 protein localization and vice versa?

      It does not. We have added the following text and a figure panel: Loss-of-function mutants that strongly enhance a phenotype are often interpreted as acting in parallel pathways. We therefore examined whether loss of lgl-1 or pac-1 alters the localization of endogenously GFP-tagged LGL-1 or PAC-1. In neither null background did we detect changes in the subcellular localization of the other protein, consistent with LGL-1 and PAC-1 functioning in parallel pathways (Fig. S1D).

      Reviewer #2

      Very little of the imaging data are analyzed quantitatively, and in many cases it is not clear how many embryos were analyzed. While the images that are presented show clear defects, readers cannot determine how reproducible, strong or significant the phenotypes are.

      We completely agree with the reviewer that interpretation of our data requires this information and apologize for the omission in the first manuscript version. The phenotypes are highly penetrant and consistent (timing of arrest, % lethality, junctional defects), and we have now added quantifications throughout the manuscript.

      In particular, the data below should be quantified and, where possible, analyzed statistically:

      • The frequency of the various junctional phenotypes shown in 2C

      We have now quantified the junctional phenotypes. The junctional defects are highly penetrant: >90% of lgl-1; pac-1 embryos have junctional defects (new Fig. 2B). We used airy-scan confocal imaging to analyze the distribution of the different phenotypes (unaffected, spread laterally, and ring-like pattern). The results are shown in Fig. 2G.

      • The expansion of DLG-1::mCherry in pac-1 lgl-1 embryos should be quantified (related to Figure 2B). For example, the percentage of membrane (marked by PH::GFP) occupied by DLG-1 could be quantified.

      We have performed this quantification, shown in Fig. 2D.

      • Similarly, the expansion of the aPKC domain should be quantified (Figure 3A).

      An objective quantification of aPKC signal is difficult due to the relatively weak expression of aPKC::GFP and the lack of a clear demarcating boundary. This is part of the reason we measured tortuosity as a more quantifiable indicator of apical domain expansion. We have now added a qualitative observation table as Figure 3B. In addition, we have expanded the quantification of cell geometry by measuring lateral and basal surfaces. Lateral surfaces were decreased. We added the following text:

      To better understand the reason for the change in geometry, we also measured the lengths of the lateral and basal surfaces (Fig. 3F). We found that the absolute lengths of the apical surfaces were not significantly different between pac-1(RNAi); lgl-1(mib201) and control animals. Instead, the lengths of the lateral domain were reduced (Fig. 3F). Hence, the more dome-shaped appearance of epidermal cells in pac-1; lgl-1 double mutant animals is due to a decrease in lateral domain size, which is consistent with the observed lateral spreading of aPKC.

      • How many embryos were analyzed for each marker shown in Figure 2A, and what proportion showed the described phenotypes? This could be given in the text or in a panel.

      We have added these numbers to panel 2B, and indicated the percentage in the text.

      • The frequency of the various junctional phenotypes shown in 4F.

      To address this, we have changed figure 4F to show three types of phenotype (strong, mild, no phenotype) and added how frequently we observed each to the panels. In rescue experiments, 18/24 embryos showed no junctional defects, while 6/24 showed a mild defect (compared to 100% severe in non-rescued embryos). To make room for this and other quantifications in Figure 4, we moved the demonstration that PAC-1 is depleted by RNAi to supplemental figure S4.

      Because the genetic perturbations used are global (either deletions or RNAi), it is not established whether PAC-1/LGL-1 act in epidermal epithelial cells per se (versus an earlier requirement that manifests in epidermal epithelial cells). While I agree that this is the most likely scenario, other mechanisms are possible.

      Our experiments indeed use global depletion/deletion of lgl-1 and pac-1. We therefore cannot exclude that other tissues contribute to the epithelial phenotypes. We assume that other tissues would be affected as well, and in fact have observed abnormal-looking pharynx tissue (see our response to reviewer 3 below for examples). As the epidermis is one of the first tissues to develop, it is likely the first in which phenotypes become apparent.

      In particular, the overall GFP::aPKC levels appear notably higher in pac-1 lgl-1 embryos in Figure 3A. aPKC levels should be quantified to determine if this is true of pac-1 lgl-1 embryos. If so, couldn't that explain (or at least contribute to) the observed phenotypes?

      Overall higher levels could indeed contribute to the phenotype. However, we have now quantified total aPKC levels in control and pac-1; lgl-1 embryos and found no difference between them. We have added the following text to the manuscript: To determine if increased expression of aPKC might explain the broadened apical localization, we measured total intensity levels of aPKC::GFP. However, we detected no differences in fluorescence levels between control and pac-1(RNAi); lgl-1(mib201) animals (Fig. S3B, C).

      Minor

      Figure 4: For completeness, please include the embryonic viability of pac-1 lgl-1 +/- embryos treated with EV and cdc-42(RNAi), as was done for pac-1 lgl-1 pkc-3(ts) in Figure 4E. Presumably the increased proportion of viable embryos with the lgl-1 deletion allele is reflected in an overall increase in embryonic viability.

      The embryonic viability indeed increases, but not as much as one might think because 15% of embryos die from the cdc-42 RNAi itself. The most important rescue argument is that we can obtain adult pac-1; lgl-1 animals with cdc-42 RNAi.

      We have now included the overall rescue and the following text: Overall, cdc-42 RNAi caused a mild increase in embryonic viability (Fig. 4A). However, total embryonic viability may underestimate rescue of pac-1; lgl-1 embryonic lethality, because it also includes the ~15% lethality caused by cdc-42 inactivation itself, even among animals wild type for lgl-1.

      The orientation of the inset images in Figures 2C, 3A and 3D is confusing. An illustration showing how these images are oriented relative to each other would be helpful.

      We have added a figure showing how the junctions are oriented in the figures (Fig. 2E). We have also added supplemental videos S3 and S4 that should illustrate the phenotype more clearly as well.

      For completeness, it would be good to test whether lgl-1(delta) is also synthetically lethal with picc-1(RNAi) (Zilberman 2017).

      We like this idea and had already looked into this. lgl-1 and picc-1 are not synthetically lethal (see graph in the submitted Word file). However, PICC-1 is not the only junctional localization signal for PAC-1, as demonstrated by the Nance lab. We find the data interesting but feel that they deserve a more thorough structure/function investigation of PAC-1 than we can provide here. Therefore, we would prefer not to include these data.

      Reviewer #3

      We thank the reviewer for their support of our manuscript.

      A few small areas to improve this manuscript:

      p. 6 line 139: "remain" should be "remaining"

      We have fixed this typo.

      Could the authors mention what is the phenotype of the 10% of pac-1 animals that die?

      Yes. They die with pleiotropic phenotypes not resembling those of our pac-1; lgl-1 double mutant embryos. We have added examples of these to Figure S1.

      Based on the Supplemental figures, it made me curious to ask: Did the authors notice changes in dorsal epidermal fusions? Cadherin normally disappears in the dorsal hyp7 cells at this time. Did the timing of the fusions change at all?

      We haven't analyzed this in detail but our time-lapse videos show that dorsal fusions still take place and do not seem to be particularly delayed (overall development is slightly delayed but the delay in fusion is consistent with overall delay).

      Again, curiosity driven by the Supplemental figures: did the authors notice defects in apical regions of internal organs, like the pharynx or intestine? The CDC-42 biosensor is asymmetrical in the developing intestine. See: DOI: 10.1242/bio.056911

      We did not pay much attention to the intestine as PAC-1 is barely detectable in this tissue. The pharynx is formed, which we can easily detect in arrested embryos as we use GFP or BFP expressed under the myo-2 promoter to mark the deletion of pac-1. While we did not look closely, we do observe defects in pharynx development.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript by Jarosinska and colleagues addresses a long-standing mystery in the apical/basal polarity field: why LGL-1 and PAC-1/RhoGAP19D, which are essential in Drosophila, and in some tissue culture contexts, are not essential in C. elegans embryos.

      The authors take an open-ended approach by using genetics, in the form of a genome wide RNAi screen, to find other proteins that enhance the mild phenotypes of lgl-1 mutant embryos. They uncover strong synthetic lethality when they reduce pac-1, a well-documented CDC-42 GAP that supports apical/basal polarity during early embryogenesis, and yet is also only partially required during embryogenesis.

      The phenotypic analysis to understand why the embryos die when missing both lgl-1 and pac-1 leads to a careful analysis of known junctional molecules in C. elegans. Using newly made endogenously tagged junctional proteins, including DLG-1 and AFD, so that they can examine all three C. elegans apical junction complexes, the authors find a penetrant defect in the epidermal junctions as the embryos undergo elongation, an actomyosin dependent contractile event that dramatically reshapes the embryos into long, skinny tubes. With disorganized junctions, the embryos die due to ruptures, or hernias, as shown in the Supplemental Movie 2. In addition, and quite excitingly, the apical domains of the embryos are expanded. These defects are then partially rescued by removing CDC-42 or aPKC using RNAi depletion.

      Major comments:

      The claims and conclusions are supported by the data.

      The data is presented in such a way that it is easy to understand what was done, and how measurements were obtained and evaluated.

      Rigorous documentation of how the strains were built and how the genome wide RNAi screen was conducted is included in the Supplemental files.

      Beautiful use of CRISPR to do the genetics:

      since when they made the deletion of lgl-1 they replaced the coding sequence with GFP, they could use GFP to count the animals carrying the deletion in their double mutant analysis with pac-1 deletion mutants.

      Figures are very nicely done.

      The writing is clear.

      Minor comments:

      A few small areas to improve this manuscript:

      p. 6 line 139: "remain" should be "remaining"

      Could the authors mention what is the phenotype of the 10% of pac-1 animals that die?

      Based on the Supplemental figures, it made me curious to ask: Did the authors notice changes in dorsal epidermal fusions? Cadherin normally disappears in the dorsal hyp7 cells at this time. Did the timing of the fusions change at all?

      Again, curiosity driven by the Supplemental figures: did the authors notice defects in apical regions of internal organs, like the pharynx or intestine? The CDC-42 biosensor is asymmetrical in the developing intestine. See: DOI: 10.1242/bio.056911

      Significance

      This study raises interesting and important questions for the general polarity field. Early embryos have hugely redundant methods to maintain apical/basal polarity, which in C. elegans masked the roles for lgl-1 and pac-1 at earlier events, like compaction, when apical/basal polarity is first established. However, during elongation, when healthy strong junctions are a requirement, the double mutant loss of LGL-1 and PAC-1 results in an expanded apical domain, which is lethal.

      The study will be of interest to the broader polarity community, and to developmental biologists interested in how the apical junctions are assembled and strengthened during morphogenesis. The Discussion does a good job of showing what aspects of this study are novel, and which support prior findings that suggested, for example, that PAC-1 may have roles independent of CDC-42. I appreciate the comment that our field needs more and more sensitive biosensors to fully address the changes of key polarity regulators.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary: This study focuses on the polarization of epidermal epithelial cells in C. elegans. Whereas the basolateral polarity protein LGL-1 is required for epithelial polarity in flies, LGL-1 is dispensable for polarization and viability in C. elegans. Through a whole-genome RNAi screen, Jarosinska et al discover that the depletion of the RhoGAP PAC-1 is synthetically lethal with an lgl-1 deletion mutant. pac-1 lgl-1 double mutants have significant polarity defects in the epidermal epithelium, including mislocalization of junctional markers and expansion of the apical aPKC domain. As a result, pac-1 lgl-1 double mutants fail to maintain surface epithelia and arrest development. Genetic interaction data suggest that increased CDC42 and aPKC activity in pac-1 lgl-1 contributes, at least in part, to the polarity defects and resulting embryonic lethality.

      Major comments:

      Very little of the imaging data are analyzed quantitatively, and in many cases it is not clear how many embryos were analyzed. While the images that are presented show clear defects, readers cannot determine how reproducible, strong or significant the phenotypes are. In particular, the data below should be quantified and, where possible, analyzed statistically:

      • The frequency of the various junctional phenotypes shown in 2C
      • The expansion of DLG-1::mCherry in pac-1 lgl-1 embryos should be quantified (related to Figure 2B). For example, the percentage of membrane (marked by PH::GFP) occupied by DLG-1 could be quantified.
      • Similarly, the expansion of the aPKC domain should be quantified (Figure 3A).
      • How many embryos were analyzed for each marker shown in Figure 2A, and what proportion showed the described phenotypes? This could be given in the text or in a panel.
      • The frequency of the various junctional phenotypes shown in 4F.

      Because the genetic perturbations used are global (either deletions or RNAi), it is not established whether PAC-1/LGL-1 act in epidermal epithelial cells per se (versus an earlier requirement that manifests in epidermal epithelial cells). While I agree that this is the most likely scenario, other mechanisms are possible. In particular, the overall GFP::aPKC levels appear notably higher in pac-1 lgl-1 embryos in Figure 3A. aPKC levels should be quantified to determine if this is true of pac-1 lgl-1 embryos. If so, couldn't that explain (or at least contribute to) the observed phenotypes?

      Minor

      Figure 4: For completeness, please include the embryonic viability of pac-1 lgl-1 +/- embryos treated with EV and cdc-42(RNAi), as was done for pac-1 lgl-1 pkc-3(ts) in Figure 4E. Presumably the increased proportion of viable embryos with the lgl-1 deletion allele is reflected in an overall increase in embryonic viability.

      The orientation of the inset images in Figures 2C, 3A and 3D is confusing. An illustration showing how these images are oriented relative to each other would be helpful.

      For completeness, it would be good to test whether lgl-1(delta) is also synthetically lethal with picc-1(RNAi) (Zilberman 2017).

      Significance

      LGL-1 is a conserved polarity protein that is essential for viability in Drosophila. In contrast, lgl-1 mutants are viable and have weak polarity phenotypes in C. elegans. A previous study showed that LGL-1 acts redundantly with the posterior polarity protein PAR-2 during establishment of anterior/posterior polarity in the one-cell worm embryo. Here, Jarosinska et al show that LGL-1 acts redundantly with another protein, the RhoGAP protein PAC-1, in the polarization of the embryonic epidermal epithelium. The strength of this study is the identification of redundant roles for PAC-1 and LGL-1, the apparent strength of the polarity defects in the double mutant and the broader implication that LGL-1 may act in a range of redundant, cell/tissue specific pathways to regulate polarity. The primary weakness of this study is the lack of quantification. Additionally, the aPKC and CDC42 genetic interaction data hint at potential pathways, but fall short of establishing LGL-1's or PAC-1's mechanism of action.

      Advance: This work identifies a redundant genetic interaction between LGL-1 and PAC-1. While the data require additional quantification, the phenotypes presented appear clear and strong. Although the molecular mechanism by which LGL-1 and PAC-1 act is not well established in the current work, the core observation is significant and should provide a foundation for future studies dissecting the molecular mechanisms.

      Audience: This work will be of interest to a broad audience. LGL-1 is conserved and its role in cell polarization and epithelial polarity is very actively studied, including in mammalian systems.

      Field of expertise: C. elegans embryonic development; cell polarity.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Jarosinska and colleagues address the roles of two polarity regulators, pac-1 and lgl-1, in C. elegans epidermal polarity. Loss-of-function mutations in either of these genes individually do not block polarization, but through a genome-wide RNAi screen, the authors find that pac-1 and lgl-1 enhance each other to cause apical-basal polarity defects and arrest during epidermal morphogenesis. The remainder of the paper focuses on testing genetic interactions between both proteins and AJ proteins (HMP-1) as well as apical proteins (CDC-42, PKC-3). These experiments reveal some interesting differences in how lgl-1 and pac-1 interface with junctional proteins (pac-1 enhances hmp-1 but lgl-1 does not) and apical proteins (lgl-1 suppresses pkc-3 or cdc-42 partial loss but pac-1 does not).

      Minor comments:

      1. Are the lgl-1; pac-1 M-Z- double mutants dead? Only the phenotype of pac-1(M-Z-); lgl-1 (M+Z-) is shown. In figures and text throughout, it should be clear whether mutants are referring to zygotic loss or both maternal and zygotic loss, as this distinction could have major implications on the interpretation of experiments.
      2. In Figure 1C, it would be more appropriate to show a fully elongated WT embryo to contrast with arrested elongation in mutant embryos.
      3. Is the lateral spread of DLG-1 in double mutant embryos a result of failure to polarize DLG-1, or failure to maintain polarity? This should be straightforward to address in higher time resolution movies.
      4. The lack of enhancement of hmp-1(fe4) by lgl-1(RNAi) is quite interesting, given that pac-1 does enhance hmp-1(fe4). To rule out the possibility that this result stems from incomplete lgl-1 RNAi, this experiment should be repeated using the lgl-1 null mutant.
      5. Related to point 4, do pac-1 or lgl-1 null mutants enhance partial knockdown of junction protein DLG-1, or is this effect (of pac-1) specific to HMP-1/AJs?
      6. Does lgl-1 loss affect PAC-1 protein localization and vice versa?

      Significance

      Overall, the manuscript provides additional insights into apical-basal polarization in C. elegans and demonstrates that lgl-1 is likely working in a similar way as in Drosophila, despite the lack of a phenotype in single lgl-1 mutants. I found the experiments to be done rigorously and interpretations of the data appropriate. All of my suggestions on improving the manuscript are minor; suggested experiments should be viewed as optional ways to strengthen the conclusions/impact of the study.

    1. Kuhn’s view

      Thomas Kuhn’s view of scientific change is that science does not progress smoothly or simply by accumulating facts. Instead, it advances through periodic revolutions that transform how scientists understand the world.

    1. eLife Assessment

      This important study demonstrates the significance of incorporating biological constraints in training neural networks to develop models that make accurate predictions under novel conditions. By comparing standard sigmoid recurrent neural networks (RNNs) with biologically constrained RNNs, the manuscript offers compelling evidence that biologically grounded inductive biases enhance generalization to perturbed conditions. This manuscript will appeal to a wide audience in systems and computational neuroscience.

    2. Reviewer #1 (Public review):

      This manuscript introduces a biologically informed RNN (bioRNN) that predicts the effects of optogenetic perturbations in both synthetic and in vivo datasets. By comparing standard sigmoid RNNs (σRNNs) and bioRNNs, the authors make a compelling case that biologically grounded inductive biases improve generalization to perturbed conditions. This work is innovative, technically strong, and grounded in relevant neuroscience, particularly the pressing need for data-constrained models that generalize causally.

      Comments on revisions:

      The authors have addressed all my concerns.

    3. Reviewer #2 (Public review):

      Sourmpis et al. present a study in which the importance of including certain inductive biases in the fitting of recurrent networks is evaluated with respect to the generalization ability of the networks when exposed to untrained perturbations.

      The work proceeds in three stages:

      (i) a simple illustration of the problem is made. Two reference (ground-truth) networks with qualitatively different connectivity, but similar observable network dynamics, are constructed, and recurrent networks with varying aspects of design similarity to the reference networks are trained to reproduce the reference dynamics. The activity of these trained networks during untrained perturbations is then compared to the activity of the perturbed reference networks. It is shown that, of the design characteristics that were varied, the enforced sign (Dale's law) and locality (spatial extent) of efference were especially important.

      (ii) The intuition from the constructed example is then extended to networks that have been trained to reproduce certain aspects of multi-region neural activity recorded from mice during a detection task with a working-memory component. A similar pattern is demonstrated, in which enforcing the sign and locality of efference in the fitted networks has an influence on the ability of the trained networks to predict aspects of neural activity during unseen (untrained) perturbations.

      (iii) The authors then illustrate the relationship between the gradient of the motor readout of trained networks with respect to the net inputs to the network units, and the sensitivity of the motor readout to small perturbations of the input currents to the units, which (in vivo) could be controlled optogenetically. The paper is concluded with a proposed use for trained networks, in which the models could be analyzed to determine the most sensitive directions of the network and, during online monitoring, inform a targeted optogenetic perturbation to bias behavior.
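      The relationship described in (iii) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy tanh rate network, its random weights, and the helper names `readout` and `input_gradient` are hypothetical and are not the authors' model or code, and finite differences stand in for the backpropagated gradients a trained network would provide. The sketch only shows that, for a small perturbation of the input currents, the change in a scalar readout is well approximated by the dot product of the readout gradient with the perturbation.

```python
import math
import random

def readout(inputs, w_rec, w_out, steps=5):
    """Toy rate network (illustrative only): constant input currents drive
    tanh units for a few steps; returns a scalar linear readout."""
    n = len(inputs)
    h = [0.0] * n
    for _ in range(steps):
        h = [math.tanh(sum(w_rec[i][j] * h[j] for j in range(n)) + inputs[i])
             for i in range(n)]
    return sum(w_out[i] * h[i] for i in range(n))

def input_gradient(inputs, w_rec, w_out, eps=1e-5):
    """Central finite-difference gradient of the readout with respect to
    each unit's input current (a stand-in for backpropagation)."""
    grad = []
    for i in range(len(inputs)):
        up = list(inputs)
        up[i] += eps
        down = list(inputs)
        down[i] -= eps
        grad.append((readout(up, w_rec, w_out) - readout(down, w_rec, w_out)) / (2 * eps))
    return grad

# Hypothetical network with random weights and inputs (fixed seed).
rng = random.Random(0)
n = 4
w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]
w_out = [rng.uniform(-1, 1) for _ in range(n)]
inputs = [rng.uniform(-0.5, 0.5) for _ in range(n)]

grad = input_gradient(inputs, w_rec, w_out)
norm = math.sqrt(sum(g * g for g in grad))

# Small perturbation of the input currents along the gradient direction.
delta = [1e-3 * g / norm for g in grad]

# First-order estimate of the readout change versus the actual change.
predicted = sum(g * d for g, d in zip(grad, delta))
actual = readout([x + d for x, d in zip(inputs, delta)], w_rec, w_out) - readout(inputs, w_rec, w_out)
print(predicted, actual)
```

      Perturbing along the gradient gives the largest first-order change in the readout for a fixed perturbation norm, which is the sense in which the gradient identifies the most sensitive direction of the network.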

      The authors do not overstate their claims, and in general, I find that I agree with their conclusions.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review)

      Major:

      (1) In line 76, the authors make a very powerful statement: 'σRNN simulation achieves higher similarity with unseen recorded trials before perturbation, but lower than the bioRNN on perturbed trials.' I couldn't find a figure showing this. This might be buried somewhere and, in my opinion, deserves some spotlight - maybe a figure or even inclusion in the abstract.

      We agree with the reviewer that these results are important. The failure of σRNN on perturbed data could be inferred from the former Figures 1E, 2C-E, and 3D. Following the reviewers' comments, we have tried to make this the most prominent message of Figure 1, in particular with the addition of the new panel E. We also moved Table 1 from the Supplementary to the main text to highlight this quantitatively.

      (2) It's mentioned in the introduction (line 84) and elsewhere (e.g., line 259) that spiking has some advantage, but I don't see any figure supporting this claim. In fact, spiking seems not to matter (Figure 2C, E). Please clarify how spiking improves performance, and if it does not, acknowledge that. Relatedly, in line 246, the authors state that 'spiking is a better metric but not significant' when discussing simulations. Either remove this statement and assume spiking is not relevant, or increase the number of simulations.

      We could not find the exact quote from the reviewer, and we believe the intended quote was “spiking is better on all metrics, but without significant margins”. Indeed, spiking did not improve the fit significantly on perturbed trials; this is particularly clear in comparison with the benefits of Dale’s law and local inhibition. As suggested by the reviewer, we rephrased the sentence from this quote and, more generally, the corresponding paragraphs in the intro (lines 83-87) and in the results (lines 245-271). Our corrections in the results section are also intended to address the minor point (4) raised by the same reviewer.

      (3) The authors prefer the metric of predicting hits over MSE, especially when looking at real data (Figure 3). I would bring the supplementary results into the main figures, as both metrics are very nicely complementary. Relatedly, why not add Pearson correlation or R2, and not just focus on MSE Loss?

      For the in vivo data in Figure 3, we do not have simultaneous electrophysiological recordings and optogenetic stimulation; the two were performed in different recording sessions. Therefore, we can only compare the effect of optogenetics on the behavior, and we cannot compute Pearson correlation or R2 of the perturbed network activity. To avoid ambiguity, we wrote “For the sessions of the in vivo dataset with optogenetic perturbation that we considered, only the behavior of an animal is recorded” on line 294.

      (4) I really like the 'forward-looking' experiment in closed loop! But I felt that the relevance of micro perturbations is very unclear in the intro and results. This could be better motivated: why should an experimentalist care about this forward-looking experiment? Why exactly do we care about micro perturbation (e.g., in contrast to non-micro perturbation)? Relatedly, I would try to explain this in the intro without resorting to technical jargon like 'gradients'.

      As suggested, we updated the last paragraph of the introduction (lines 88-95) to better motivate why algorithmically targeted acute spatio-temporal perturbations can be important for dissecting the function of neural circuits. We also added citations to recent studies with targeted in vivo optogenetic stimulation. As far as we know, previous work on targeted network stimulation mostly used linear models, whereas we used non-linear RNNs and their gradients.

      Minor:

      (1) In the intro, the authors refer to 'the field' twice. Personally, I find this term odd. I would opt for something like 'in neuroscience'.

      We implemented the suggested change: l.27 and l.30

      (2) Line 45: When referring to previous work using data-constrained RNN models, Valente et al. is missing (though it is well cited later when discussing regularization through low-rank constraints)

      We added the citation: l.45

      (3) Line 11: Method should be methods (missing an 's').

      We fixed the typo.

      (4) In line 250, starting with 'So far', is a strange choice of presentation order. After interpreting the results for other biological ingredients, the authors introduce a new one. I would first introduce all ingredients and then interpret. It's telling that the authors jump back to 2B after discussing 2C.

      We restructured the last two paragraphs of section 2.1, and we hope that the presentation order is now more logical.

      (5) The black dots in Figure 3E are not explained, or at least I couldn't find an explanation.

      We added an explanation in the caption of Figure 3E.

      Reviewer #2 (Public review):

      (1) Some aspects of the methods are unclear. For comparisons between recurrent networks trained from randomly initialized weights, I would expect that many initializations were made for each model variant to be compared, and that the performance characteristics are constructed by aggregating over networks trained from multiple random initializations. I could not tell from the methods whether this was done or how many models were aggregated.

      The expectation of the reviewer is correct: we trained multiple models with different random seeds (affecting both the weight initialization and the noise of our model) for each variant and aggregated the results. We have now clarified this in Methods 4.6, lines 658-662.

      (2) It is possible that including perturbation trials in the training sets would improve model performance across conditions, including held-out (untrained) perturbations (for instance, to units that had not been perturbed during training). It could be noted that if perturbations are available, their use may alleviate some of the design decisions that are evaluated here.

      In general, we agree with the reviewer that including perturbation trials in the training set would likely improve model performance across conditions. One practical limitation explaining partially why we did not do it with our dataset is the small quantity of perturbed trials for each targeted cortical area: the number of trials with light perturbations is too scarce to robustly train and test our models.

      More profoundly, to test hard generalizations to perturbations (aka perturbation testing), it will always be necessary that the perturbations are not trivially represented in the training data. Including perturbation trials during training would compromise our main finding: some biological model constraints improve the generalization to perturbation. To test this claim, it was necessary to keep the perturbations out of the training data.

      We agree that including all available data of perturbed and non-perturbed recordings would be useful to build the best generalist predictive system. It could help, for instance, for closed-loop circuit control as we studied in Figure 5. Yet, there too, it will be important for the scientific validation process to always keep some causal perturbations of interest out of the training set. This is necessary to fairly measure the real generalization capability of any model. Importantly, this is why we think out-of-distribution “perturbation testing” is likely to have a recurring impact in the years to come, even beyond the case of optogenetic inactivation studied in detail in our paper.

      Recommendation for the authors:

      Reviewer #1 (Recommendation for the authors):

      The code is not very easy to follow. I know this is a lot to ask, but maybe make clear where the code is to train the different models, which I think is a great contribution of this work? I predict that many readers will want to use the code and so this will improve the impact of this work.

      We updated the code to make it easier to train a model from scratch.

      Reviewer #2 (Recommendation for the authors):

      The figures are really tough to read. Some of that small font should be sized up, and it's tough to tell in the posted paper what's happening in Figure 2B.

      We updated Figures 1 and 2 significantly, in part to increase their readability. We also implemented the "Superficialities" suggestions.

    1. eLife Assessment

      This valuable study explores the role of the chromatin regulator ATAD2 in mouse spermatogenesis. The data convincingly demonstrate that ATAD2 is essential for proper chromatin remodeling in haploid spermatids, influencing gene accessibility, H3.3-mediated transcription, and histone eviction. Using Atad2 knockout (KO) mice, the authors link ATAD2 to the DNA-replication-independent incorporation of sperm-specific proteins like protamines and histone H3.3. Although the findings highlight chromatin abnormalities and impaired in vitro fertilization in KO mice, natural fertility remains unaffected, suggesting possible in vivo compensatory mechanisms. Future experiments will be needed to tease out the precise molecular role of ATAD2 in spermatogenesis. This work will be of interest to the epigenetics and developmental fields.

    2. Reviewer #1 (Public review):

      Summary:

      The authors analyzed the expression of ATAD2 protein in post-meiotic stages and characterized the localization of various testis-specific proteins in the testis of the Atad2 knockout (KO). Through cytological analysis and ATAC sequencing, the study showed increased levels of the HIRA histone chaperone, accumulation of histone H3.3 on post-meiotic nuclei, defective chromatin accessibility, and delayed deposition of protamines. Sperm from the Atad2 KO mice reduce the success of in vitro fertilization. The work was performed well, and most of the results are convincing. However, this manuscript does not suggest a molecular mechanism for how ATAD2 promotes the formation of testis-specific chromatin.

      Strengths:

      The paper describes the role of ATAD2 AAA+ ATPase in the proper localization of sperm-specific chromatin proteins such as protamine, suggesting the importance of the DNA replication-independent histone exchanges with the HIRA-histone H3.3 axis.

      Weaknesses:

      The work was performed well, and most of the results are convincing. However, this manuscript does not suggest a molecular mechanism for how ATAD2 promotes the formation of testis-specific chromatin.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript by Liakopoulou et al. presents a comprehensive investigation into the role of ATAD2 in regulating chromatin dynamics during spermatogenesis. The authors elegantly demonstrate that ATAD2, via its control of histone chaperone HIRA turnover, ensures proper H3.3 localization, chromatin accessibility, and histone-to-protamine transition in post-meiotic male germ cells. Using a new well-characterized Atad2 KO mouse model, they show that ATAD2 deficiency disrupts HIRA dynamics, leading to aberrant H3.3 deposition, impaired transcriptional regulation, delayed protamine assembly, and defective sperm genome compaction. The study bridges ATAD2's conserved functions in embryonic stem cells and cancer to spermatogenesis, revealing a novel layer of epigenetic regulation critical for male fertility.

      Strengths:

      The MS provides the first demonstration of ATAD2's essential role in spermatogenesis, linking its expression in haploid spermatids to histone chaperone regulation and connecting ATAD2-dependent chromatin dynamics to gene accessibility (ATAC-seq), H3.3-mediated transcription, and histone eviction. Interestingly and surprisingly, sperm chromatin defects in Atad2 KO mice impair only in vitro fertilization but not natural fertility, suggesting unknown compensatory mechanisms in vivo.

      Weaknesses:

      The MS is robust and there are no major weaknesses.

      The authors have addressed all the queries successfully.

    4. Reviewer #3 (Public review):

      Summary:

      The authors generated knockout mice for Atad2, a conserved bromodomain-containing factor expressed during spermatogenesis. In Atad2 KO mice, HIRA, a chaperone for histone variant H3.3, was upregulated in round spermatids, accompanied by an apparent increase in H3.3 levels. Furthermore, the sequential incorporation and removal of TH2B and PRM1 during spermiogenesis were partially disrupted in the absence of ATAD2, possibly due to delayed histone removal. Despite these abnormalities, Atad2 KO male mice were able to produce offspring normally.

      Strengths:

      The manuscript addresses the biological role of ATAD2 in spermatogenesis using a knockout mouse model, providing a valuable in vivo framework to study chromatin regulation during male germ cell development. The observed redistribution of H3.3 in round spermatids is clearly presented and suggests a previously unappreciated role of ATAD2 in histone variant dynamics. The authors also document defects in the sequential incorporation and removal of TH2B and PRM1 during spermiogenesis, providing phenotypic insight into chromatin transitions in late spermatogenic stages. Overall, the study presents a solid foundation for further mechanistic investigation into ATAD2 function.

      Weaknesses:

      While the manuscript reports the gross phenotype of Atad2 KO mice, the findings remain largely superficial and do not convincingly demonstrate how ATAD2 deficiency affects chromatin.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review): 

      Summary: 

      The authors analyzed the expression of ATAD2 protein in post-meiotic stages and characterized the localization of various testis-specific proteins in the testis of the Atad2 knockout (KO). Through cytological analysis and ATAC sequencing, the study showed increased levels of the HIRA histone chaperone, accumulation of histone H3.3 on post-meiotic nuclei, defective chromatin accessibility, and delayed deposition of protamines. Sperm from the Atad2 KO mice reduce the success of in vitro fertilization. The work was performed well, and most of the results are convincing. However, this manuscript does not suggest a molecular mechanism for how ATAD2 promotes the formation of testis-specific chromatin.

      We would like to take this opportunity to highlight that the present study builds on our previously published work, which examined the function of ATAD2 in both yeast S. pombe and mouse embryonic stem (ES) cells (Wang et al., 2021). In yeast, using genetic analysis we showed that inactivation of HIRA rescues defective cell growth caused by the absence of ATAD2. This rescue could also be achieved by reducing histone dosage, indicating that the toxicity depends on histone over-dosage, and that HIRA toxicity, in the absence of ATAD2, is linked to this imbalance.

      Furthermore, HIRA ChIP-seq performed in mouse ES cells revealed increased nucleosome-bound HIRA, particularly around transcription start sites (TSS) of active genes, along with the appearance of HIRA-bound nucleosomes within normally nucleosome-free regions (NFRs). These findings pointed to ATAD2 as a major factor responsible for unloading HIRA from nucleosomes. This unloading function may also apply to other histone chaperones, such as FACT (see Wang et al., 2021, Fig. 4C).

      In the present study, our investigations converge on the same ATAD2 function in the context of a physiologically integrated mammalian system: spermatogenesis. Indeed, in the absence of ATAD2, we observed H3.3 accumulation and enhanced H3.3-mediated gene expression. Consistent with this functional model of ATAD2 (unloading chaperones from histone- and non-histone-bound chromatin), we also observed defects in histone-to-protamine replacement.

      Together, the results presented here and in Wang et al. (2021) reveal an underappreciated regulatory layer of histone chaperone activity. Previously, histone chaperones were primarily understood as factors that load histones. Our findings demonstrate that we must also consider a previously unrecognized regulatory mechanism that controls assembled histone-bound chaperones. This key point was clearly captured and emphasized by Reviewer #2 (see below).

      Strengths:

      The paper describes the role of ATAD2 AAA+ ATPase in the proper localization of sperm-specific chromatin proteins such as protamine, suggesting the importance of the DNA replication-independent histone exchanges with the HIRA-histone H3.3 axis. 

      Weaknesses: 

      (1) Some results lack quantification. 

      We will consider all the data and add appropriate quantifications where necessary.

      (2) The work was performed well, and most of the results are convincing. However, this manuscript does not suggest a molecular mechanism for how ATAD2 promotes the formation of testis-specific chromatin. 

      Please see our comments above.

      Reviewer #2 (Public review): 

      Summary:

      This manuscript by Liakopoulou et al. presents a comprehensive investigation into the role of ATAD2 in regulating chromatin dynamics during spermatogenesis. The authors elegantly demonstrate that ATAD2, via its control of histone chaperone HIRA turnover, ensures proper H3.3 localization, chromatin accessibility, and histone-to-protamine transition in post-meiotic male germ cells. Using a new well-characterized Atad2 KO mouse model, they show that ATAD2 deficiency disrupts HIRA dynamics, leading to aberrant H3.3 deposition, impaired transcriptional regulation, delayed protamine assembly, and defective sperm genome compaction. The study bridges ATAD2's conserved functions in embryonic stem cells and cancer to spermatogenesis, revealing a novel layer of epigenetic regulation critical for male fertility.

      Strengths:

      The MS provides the first demonstration of ATAD2's essential role in spermatogenesis, linking its expression in haploid spermatids to histone chaperone regulation and connecting ATAD2-dependent chromatin dynamics to gene accessibility (ATAC-seq), H3.3-mediated transcription, and histone eviction. Interestingly and surprisingly, sperm chromatin defects in Atad2 KO mice impair only in vitro fertilization but not natural fertility, suggesting unknown compensatory mechanisms in vivo.

      Weaknesses:

      The MS is robust and there are no major weaknesses.

      Reviewer #3 (Public review): 

      Summary: 

      The authors generated knockout mice for Atad2, a conserved bromodomain-containing factor expressed during spermatogenesis. In Atad2 KO mice, HIRA, a chaperone for histone variant H3.3, was upregulated in round spermatids, accompanied by an apparent increase in H3.3 levels. Furthermore, the sequential incorporation and removal of TH2B and PRM1 during spermiogenesis were partially disrupted in the absence of ATAD2, possibly due to delayed histone removal. Despite these abnormalities, Atad2 KO male mice were able to produce offspring normally. 

      Strengths:

      The manuscript addresses the biological role of ATAD2 in spermatogenesis using a knockout mouse model, providing a valuable in vivo framework to study chromatin regulation during male germ cell development. The observed redistribution of H3.3 in round spermatids is clearly presented and suggests a previously unappreciated role of ATAD2 in histone variant dynamics. The authors also document defects in the sequential incorporation and removal of TH2B and PRM1 during spermiogenesis, providing phenotypic insight into chromatin transitions in late spermatogenic stages. Overall, the study presents a solid foundation for further mechanistic investigation into ATAD2 function. 

      Weaknesses:

      While the manuscript reports the gross phenotype of Atad2 KO mice, the findings remain largely superficial and do not convincingly demonstrate how ATAD2 deficiency affects chromatin dynamics. Moreover, the phenotype appears too mild to elucidate the functional significance of ATAD2 during spermatogenesis. 

      We respectfully disagree with the statement that our findings are largely superficial. Based on our investigations of this factor over the years, it has become evident that ATAD2 functions as an auxiliary factor that facilitates mechanisms controlling chromatin dynamics (see, for example, Morozumi et al., 2015). These mechanisms can still occur in the absence of ATAD2, but with reduced efficiency, which explains the mild phenotype we observed.

      This function, while not essential, is nonetheless an integral part of the cell’s molecular biology and should be studied and brought to the attention of the broader biological community, just as we study essential factors. Unfortunately, the field has tended to focus primarily on core functional actors, often overlooking auxiliary factors. As a result, our decade-long investigations into the subtle yet important roles of ATAD2 have repeatedly been met with skepticism regarding its functional significance, which has in turn influenced editorial decisions.

      We chose eLife as the venue for this work specifically to avoid such editorial barriers and to emphasize that facilitators of essential functions do exist. They deserve to be investigated, and the underlying molecular regulatory mechanisms must be understood.

      (1) Figures 4-5: The analyses of differential gene expression and chromatin organization should be more comprehensive. First, Venn diagrams comparing the sets of significantly differentially expressed genes between this study and previous work should be shown for each developmental stage. Second, given the established role of H3.3 in MSCI, the effect of Atad2 knockout on sex chromosome gene expression should be analyzed. Third, integrated analysis of RNA-seq and ATAC-seq data is needed to evaluate how ATAD2 loss affects gene expression. Finally, H3.3 ChIP-seq should be performed to directly assess changes in H3.3 distribution following Atad2 knockout.  

      (1) In the revised version, we will include Venn diagrams to illustrate the overlap in significantly differentially expressed genes between this study and previous work. However, we believe that the GSEAs presented here provide stronger evidence, as they indicate the statistical significance of this overlap (p-values). In our case, we observed p-value < 0.01 (**) and p < 0.001 (***).

      (2) Sex chromosome gene expression was analyzed and is presented in Fig. 5C.

      (3) The effect of ATAD2 loss on gene expression is shown in Fig. 4A, B, and C as histograms, with statistical significance indicated in the middle panels.

      (4) Although mapping H3.3 incorporation across the genome in wild-type and Atad2 KO cells would have been informative, the available anti-H3.3 antibody did not work for ChIP-seq, at least in our hands. The authors of Fontaine et al., 2022, who studied H3.3 during spermatogenesis in mice, must have encountered the same problem, since they tagged the endogenous H3.3 gene to perform their ChIP experiments.
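      The distinction drawn in point (1), between displaying an overlap in a Venn diagram and attaching a p-value to it, can be sketched with a simple calculation. This is not GSEA and not the authors' analysis: it is a one-sided hypergeometric overlap test, swapped in here because it is the simplest way to show how an overlap count becomes a significance level. The function name `overlap_p_value` and all gene counts below are hypothetical.

```python
from math import comb

def overlap_p_value(universe, set_a, set_b, overlap):
    """One-sided hypergeometric test: probability of observing at least
    `overlap` shared genes when `set_b` genes are drawn at random from a
    universe of `universe` genes, of which `set_a` belong to the other list."""
    upper = min(set_a, set_b)
    total = comb(universe, set_b)
    tail = sum(comb(set_a, k) * comb(universe - set_a, set_b - k)
               for k in range(overlap, upper + 1))
    return tail / total

# Hypothetical numbers: 1000 expressed genes, two lists of 50 and 40
# differentially expressed genes sharing 10 genes (about 2 expected by chance).
p = overlap_p_value(universe=1000, set_a=50, set_b=40, overlap=10)
print(p)
```

      A Venn diagram would report only the 10 shared genes; the tail probability additionally states how unlikely such an overlap is under random sampling, which is the kind of information the p-values quoted above convey.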

      (2) Figure 3: The altered distribution of H3.3 is compelling. This raises the possibility that histone marks associated with H3.3 may also be affected, although this has not been investigated. It would therefore be important to examine the distribution of histone modifications typically associated with H3.3. If any alterations are observed, ChIP-seq analyses should be performed to explore them further.

      Based on our understanding of ATAD2’s function—specifically its role in releasing chromatin-bound HIRA—in the absence of ATAD2 the residence time of both HIRA and H3.3 on chromatin increases. This results in the detection of H3.3 not only on sex chromosomes but across the genome. Our data provide clear evidence of this phenomenon. The reviewer is correct in suggesting that the accumulated H3.3 would carry H3.3-associated histone PTMs; however, we are unsure what additional insights could be gained by further demonstrating this point.

      (3) Figure 7: While the authors suggest that pre-PRM2 processing is impaired in Atad2 KO, no direct evidence is provided. It is essential to conduct acid-urea polyacrylamide gel electrophoresis (AU-PAGE) followed by western blotting, or a comparable experiment, to substantiate this claim. 

      Figure 7 does not suggest that pre-PRM2 processing is affected in Atad2 KO; rather, this figure—particularly Fig. 7B—specifically demonstrates that pre-PRM2 processing is impaired, as shown using an antibody that recognizes the processed portion of pre-PRM2. ELISA was used to provide a more quantitative assessment; however, in the revised manuscript we will also include a western blot image.

      (4) HIRA and ATAD2: Does the upregulation of HIRA fully account for the phenotypes observed in Atad2 KO? If so, would overexpression of HIRA alone be sufficient to phenocopy the Atad2 KO phenotype? Alternatively, would partial reduction of HIRA (e.g., through heterozygous deletion) in the Atad2 KO background be sufficient to rescue the phenotype? 

      These are interesting experiments that require the creation of appropriate mouse models, which are not currently available.

      (5) The mechanism by which ATAD2 regulates HIRA turnover on chromatin and the deposition of H3.3 remains unclear from the manuscript and warrants further investigation. 

      The Reviewer is absolutely correct. In addition to the points addressed in response to Reviewer #1’s general comments (see above), it would indeed have been very interesting to test the segregase activity of ATAD2 (likely driven by its AAA ATPase activity) through in vitro experiments using the Xenopus egg extract system described by Tagami et al., 2004. This system can be applied both in the presence and absence (via immunodepletion) of ATAD2 and would also allow the use of ATAD2 mutants, particularly those with inactive AAA ATPase or bromodomains. However, such experiments go well beyond the scope of this study, which focuses on the role of ATAD2 in chromatin dynamics during spermatogenesis.

      References:

      (1) Wang T, Perazza D, Boussouar F, Cattaneo M, Bougdour A, Chuffart F, Barral S, Vargas A, Liakopoulou A, Puthier D, Bargier L, Morozumi Y, Jamshidikia M, Garcia-Saez I, Petosa C, Rousseaux S, Verdel A, Khochbin S. ATAD2 controls chromatin-bound HIRA turnover. Life Sci Alliance. 2021 Sep 27;4(12):e202101151. doi: 10.26508/lsa.202101151. PMID: 34580178; PMCID: PMC8500222.

      (2) Morozumi Y, Boussouar F, Tan M, Chaikuad A, Jamshidikia M, Colak G, He H, Nie L, Petosa C, de Dieuleveult M, Curtet S, Vitte AL, Rabatel C, Debernardi A, Cosset FL, Verhoeyen E, Emadali A, Schweifer N, Gianni D, Gut M, Guardiola P, Rousseaux S, Gérard M, Knapp S, Zhao Y, Khochbin S. Atad2 is a generalist facilitator of chromatin dynamics in embryonic stem cells. J Mol Cell Biol. 2016 Aug;8(4):349-62. doi: 10.1093/jmcb/mjv060. Epub 2015 Oct 12. PMID: 26459632; PMCID: PMC4991664.

      (3) Fontaine E, Papin C, Martinez G, Le Gras S, Nahed RA, Héry P, Buchou T, Ouararhni K, Favier B, Gautier T, Sabir JSM, Gerard M, Bednar J, Arnoult C, Dimitrov S, Hamiche A. Dual role of histone variant H3.3B in spermatogenesis: positive regulation of piRNA transcription and implication in X-chromosome inactivation. Nucleic Acids Res. 2022 Jul 22;50(13):7350-7366. doi: 10.1093/nar/gkac541. PMID: 35766398; PMCID: PMC9303386.

      (4) Tagami H, Ray-Gallet D, Almouzni G, Nakatani Y. Histone H3.1 and H3.3 complexes mediate nucleosome assembly pathways dependent or independent of DNA synthesis. Cell. 2004 Jan 9;116(1):51-61. doi: 10.1016/s0092-8674(03)01064-x. PMID: 14718166.

      Recommendations for the authors:

      Reviewing Editor Comments:

      I note that the reviewers had mixed opinions about the strength of the evidence in the manuscript. A revision that addresses these points would be welcome.

      Reviewer #1 (Recommendations for the authors):  

      Major points: 

      (1) No line numbers: It is hard to point out the issues.

      The revised version now includes line numbers.

      (2) Given the results shown in Figure 3 and Figure 4, it is nice to show the chromosomal localization of histone H3.3 in spermatocytes or post-meiotic cells by Chromatin-immunoprecipitation sequencing (ChIP-seq).

      Although mapping H3.3 incorporation across the genome in wild-type and Atad2 KO cells would have been informative, the available anti-H3.3 antibody did not work for ChIP-seq in our hands. In fact, this antibody is not well regarded for ChIP-seq. For example, Fontaine et al. (2022), who investigated H3.3 during spermatogenesis in mice, circumvented this issue by tagging the endogenous H3.3 genes for their ChIP experiments.

      (3) Figure 7B and 8: Why did the authors use ELISA for protein quantification? At a minimum, western blotting should be shown.

      ELISA is a more quantitative method than traditional immunoblotting. Nevertheless, as requested by the reviewer, we have now included a corresponding western blot in Fig. S3.

      (4) For readers, please add a schematic pathway of histone-protamine replacement in sperm formation in Fig.1 and it would be nice to have a model figure, which contains the authors' idea in the last figure.

      As requested by this reviewer, we have now included a schematic model in Figure 9 to summarize the main conclusions of our work.

      Minor points: 

      (1) Page 2, the second paragraph, "pre-PRM2": Please explain more about pre-PRM2 and/or PRM2 as well as PRM1 (Figure 6).

      More detailed descriptions of PRM2 processing are now given in this paragraph. 

      (2) Page 3, bottom paragraph, line 1: "KO" should be "knockout (KO)".

      Done.

      (3) Page 4, second paragraph bottom: Please explain more about the protein structure of germ-line-specific ATAD2S: how it is different from ATAD2L. Germ-line specific means it is also expressed in ovary?

      As Atad2 is predominantly expressed in embryonic stem cells and in spermatogenic cells, we have replaced "germ-line specific" throughout the text with more appropriate terms.

      (4) Figure 1C, western blotting: In wild-type testis extracts, both ATAD2L and ATAD2S are present. Does this mean that ATAD2L is expressed in both the germ line and the supporting cells? Please clarify this and, if possible, show western blotting of spermatids as well as spermatocytes.

      Figure 1D shows sections of seminiferous tubules from Atad2 KO mice, in which lacZ expression is driven by the endogenous Atad2 promoter. The results indicate that Atad2 is expressed mainly in post-meiotic cells. Most labeled cells are located near the lumen, whereas the supporting Sertoli cells remain unlabeled. Sertoli cells, which are anchored to the basal lamina, span the entire thickness of the germinal epithelium from the basal lamina to the lumen. Their nuclei, however, are usually positioned closer to the basal membrane. Thus, the observed lacZ expression pattern argues against substantial Atad2 expression in Sertoli cells. 

      (5) Figure 1C: Please explain a bit more about the reduction of ATAD2 proteins in heterozygous mice.

      Done

      (6) Figure 1C: Genotypes of the mice should be shown in the legend.

      Done 

      (7) Figure 1D: Please add a more magnified image of the sections to see the staining pattern in the seminiferous tubules.

      Higher magnification does not provide more information, since the structure of the cells within the tubules is lost due to the treatment of the sections required for X-gal staining. Please see our response to Reviewer #2's comment on Figure 1C.

      (8) Page 5, first paragraph, line 2, histone dosage: What do the authors mean by "histone dosage"? Please explain more or use a more appropriate term.

      "Histone dosage" refers to the amount or relative abundance of histone proteins in a cell.

      (9) Figure 2A: Given the result in Figure 1C, it is interesting to check the amount of HIRA in Atad2 heterozygous mice.

      In Atad2 heterozygous mice, we would expect an increase in HIRA, but only to about half the level seen in the Atad2 homozygous knockout shown in Figure 2A, which is relatively modest. Therefore, we doubt that detecting such a small change—approximately half of that in Figure 2A—would yield clear or definitive results. 

      (10) Figure 2A, legend (n=5): What does this "n" mean? The extract of testes from "5" male mice like Figure 2B. Or 5 independent experiments. If the latter is true, it is important to share the other results in the Supplements.

      “n” refers to five WT and five Atad2 KO males. The legend has been clarified as suggested by the reviewer.

      (11) Figure 2A, legend, line 2, Atad2: This should be italicized.

      Done

      (12) Figure 2B: Please show the quantification of amounts of HIRA protein like Fig. 2A.

      As indicated in the legend, what is shown is a pool of testes from 3 individuals per genotype.

      (13) Figure 2B shows an increased level of HIRA in Atad2 KO testis. This suggests the role of ATAD2 in the protein degradation of HIRA. This possibility should be mentioned or tested since ATAD2 is an AAA+ ATPase. 

      The extensive literature on ATAD2 provides no indication that it is involved in protein degradation. In our early work on ATAD2 in the 2000s, we hypothesized that, as a member of the AAA ATPase family, ATAD2 might associate with the 19S proteasome subunit (through multimerization with the other AAA ATPase member of this regulatory subunit). However, both our published pilot studies (Caron et al., PMID: 20581866) and subsequent unpublished work ruled out this possibility. Instead, since the amount of nucleosome-bound HIRA increases in the absence of ATAD2, we propose that chromatin-bound HIRA is more stable than soluble HIRA once it has been released from chromatin by ATAD2.

      (14) Page 6, second paragraph, line 5, ko: KO should be capitalized.

      Done

      (15) Page 6, second paragraph, line 2 from the bottom, chromatin dynamics: Throughout the text, the authors used "chromatin dynamics". However, all that the authors analyzed in the current study is the localization of chromatin proteins. So, it would be much easier to explain the results by using "chromatin status," etc. In this context, "accessibility" is better.

      Throughout the text, we have replaced “chromatin dynamics” with more precise terms appropriate to each context.

      (16) Figure 3: Please provide the quantification of signals of histone H3.3 in a nucleus or nuclear cytoplasm.

      This request is not clear to us since we do not observe any H3.3 signal in the cytoplasm.

      (17) Figure 3: As the control of specificity in post-meiotic cells, please show the image and quantification of the H3.3 signals in spermatocyte, for example.

      This request is not clear to us. What specificity is meant? 

      (18) Figure 3, bottom panels: Please show what the white lines indicate? 

      The white lines indicate the boundary of the cell nucleus, as estimated by Hoechst staining. This is now indicated in the legend of the figure.

      (19) Figure 4A: Please explain more about what kind of data is here. Is this wild-type and/or Atad2 KO? The label of the Y-axis should be "mean expression level". What is the standard deviation (SD) here on the X-axis. Moreover, there is only one red open circle, but the number of this class is 5611. All 5611 genes in this group show NO expression. Please explain more.

      The plot displays the mean expression levels (y-axis, labeled as "mean expression level") versus the corresponding standard deviations (x-axis), both calculated from three independent biological replicates of isolated round spermatids (Atad2 wild-type and Atad2 KO). The standard deviation reflects the variability of gene expression across biological replicates. Genes were grouped into four categories (grp1: blue, grp2: cyan, grp3: green, grp4: orange) according to the quartile of their mean expression. For grp4, all genes have no detectable expression, resulting in a mean expression of zero and a standard deviation of zero; consequently, the 5611 genes in this group are represented by a single overlapping point (red open circle) at the origin. 
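      The construction of the mean-versus-standard-deviation plot described above can be sketched in a few lines. This is only an illustration under stated assumptions: the gene names and expression values are hypothetical, the sample standard deviation is assumed, and the groups are assumed to be equal-sized quartiles ranked from highest (grp1) to lowest (grp4) mean expression. It is not the authors' pipeline.

```python
from statistics import mean, stdev

# Hypothetical normalized expression values: gene -> one value per replicate.
replicates = {
    "geneA": [120.0, 135.0, 128.0],
    "geneB": [60.0, 58.0, 71.0],
    "geneC": [14.0, 9.0, 11.0],
    "geneD": [2.0, 3.0, 1.0],
    "geneE": [80.0, 95.0, 90.0],
    "geneF": [30.0, 26.0, 34.0],
    "geneG": [0.0, 0.0, 0.0],
    "geneH": [0.0, 0.0, 0.0],
}

def mean_and_sd(values_by_gene):
    """Return gene -> (mean, sample standard deviation) across replicates."""
    return {g: (mean(v), stdev(v)) for g, v in values_by_gene.items()}

def quartile_groups(stats):
    """Rank genes by mean expression (highest first) and cut into four equal groups."""
    ranked = sorted(stats, key=lambda g: stats[g][0], reverse=True)
    n = len(ranked)
    return {g: "grp%d" % (4 * i // n + 1) for i, g in enumerate(ranked)}

stats = mean_and_sd(replicates)
groups = quartile_groups(stats)

# Genes with no detectable expression have mean 0 and SD 0,
# so they all plot on the same point at the origin.
origin_points = {g for g, (m, sd) in stats.items() if m == 0 and sd == 0}
```

      In this toy data set the two unexpressed genes fall in the lowest group and share the coordinates (0, 0), which is why a large group of silent genes can appear as a single symbol on such a plot.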

      (20) Figure 4C: If possible, it would be better to have a statistical comparison between wild-type and the KO.  

      The mean profiles are displayed together with their variability (± 2 s.e.m.) across the four replicates for both ATAD2 WT (blue) and ATAD2 KO (red). For groups 1, 2, and 3, the envelopes of the curves remain clearly separated around the peak, indicating a consistent difference in signal between the two conditions. In contrast, group 4 does not present a strong signal and, accordingly, no marked difference is observed between WT and KO in this group.

      (21) Figure 5, GSEA panels: Please explain more about what the GSEA is in the legend.

      The legend has been updated as follows:

      (A) Expression profiles of post-meiotic H3.3-activated genes. The heatmap (left panel) displays the normalized expression levels of genes identified by Fontaine and colleagues as upregulated in the absence of histone H3.3 (Fontaine et al. 2022) for Atad2 WT (WT) and Atad2 KO (KO) samples at days 20, 22, 24, and 26 PP (D20 to D26). The colour scale represents the z-score of log-transformed DESeq2-normalized counts. The box plots (middle panel) display pooled, normalized expression levels, aggregated across replicates and genes, for each condition (WT and KO) and each time point (D20 to D26). Statistical significance between WT and KO conditions was determined using a two-sided t-test, with p-values indicated as follows: * for p-value<0.05, ** for p-value<0.01 and *** for p-value<0.001. The right panel shows the results of gene set enrichment analysis (GSEA), which assesses whether predefined groups of genes show statistically significant differences between conditions. Here, the post-meiotic H3.3-activated gene set, identified by Fontaine et al. (2022), is significantly enriched in Atad2 KO compared with WT samples at day 26 (p < 0.05, FDR < 0.25). Coloured vertical bars indicate the “leading edge” genes (i.e., those contributing most to the enrichment signal), located before the point of maximum enrichment score. (B) As in (A) but for the "post-meiotic H3.3-repressed genes" gene set. (C) As in (A) but for the "sex chromosome-linked genes" gene set.
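      For readers unfamiliar with the heatmap colour scale mentioned in this legend, the following sketch shows one common way to compute a per-gene z-score of log-transformed normalized counts. The log base (2), the pseudocount of 1, and the use of the population standard deviation are all assumptions made for illustration; the legend does not specify them, and the input values are hypothetical.

```python
import math
from statistics import mean, pstdev

def zscore_log_counts(counts, pseudocount=1.0):
    """Log2-transform one gene's normalized counts, then z-score across samples."""
    logged = [math.log2(c + pseudocount) for c in counts]
    mu = mean(logged)
    sigma = pstdev(logged)
    if sigma == 0:
        # A gene with identical values in every sample carries no contrast.
        return [0.0 for _ in logged]
    return [(x - mu) / sigma for x in logged]

# Hypothetical normalized counts for one gene across four samples.
example = [3.0, 7.0, 15.0, 31.0]
z = zscore_log_counts(example)
```

      Scaling each gene this way puts highly and weakly expressed genes on the same colour scale, so the heatmap shows relative changes between samples rather than absolute abundance.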

      (22) Figure 6. In the KO, the number of green cells is more than red and yellow cells, suggesting the delayed maturation of green (TH2B-positive) cells. It is essential to count the number of each cell and show the quantification.

      The green cells correspond to those expressing TH2B but lacking transition proteins (TP) and protamine 1 (Prm1), indicating that they are at earlier stages than elongating–condensing spermatids. Counting these green cells simply reflects the ratio of elongating/condensing spermatids to earlier-stage cells, which varies depending on the field examined. The key point in this experiment is that in wild-type mice, only red cells (elongating/condensing spermatids) and green cells (earlier stages) are observed. By contrast, in Atad2 KO testes, a significant proportion of yellow cells appears, which are never seen in wild-type tissue. The crucial metric here is the percentage of yellow cells relative to the total number of elongating/condensing spermatids (red cells). In wild-type testes, this value is consistently 0%, whereas in Atad2 KO testes it always ranges between 50% and 100% across all fields containing substantial numbers of elongating/condensing spermatids.
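      The metric described in this response is a simple ratio per microscope field. A minimal sketch follows, assuming that each field is summarized by two counts: all elongating/condensing spermatids (TP/PRM1-positive, "red", including those that also appear yellow) and the subset that still carries TH2B signal ("yellow"). The field names and counts are hypothetical and are not data from the study.

```python
def percent_yellow(red_cells, yellow_cells):
    """Percentage of elongating/condensing spermatids that also show TH2B signal."""
    if red_cells == 0:
        return None  # no elongating/condensing spermatids in this field to score
    return 100.0 * yellow_cells / red_cells

# Hypothetical counts per field: (red cells in total, of which yellow).
fields = {
    "WT_field1": (40, 0),
    "KO_field1": (36, 27),
    "KO_field2": (20, 20),
}
percentages = {name: percent_yellow(r, y) for name, (r, y) in fields.items()}
```

      Normalizing to the red cells, rather than to all cells in the field, removes the field-to-field variation in the proportion of earlier-stage (green) cells that the response points out.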

      (23) Figure 8A: Please show the images of sperm (heads) in the KO mice with or without decompaction.

      The requested image is now displayed in Figure S5.

      (24) Figure 8C: In the legend, it says n=5. However, there are more than 5 plots on the graph. Please explain the experiment more in detail.

      The experiment is now better explained in the legend of this Figure.

      Reviewer #2 (Recommendations for the authors): 

      While the study is rigorous and well performed, the following minor points could be addressed to strengthen the manuscript: 

      Figure 1C should indicate each of the different types of cells present in the sections. It would be of interest to show specifically the different post-meiotic germ cells.

      With this type of sample preparation, it is difficult to precisely distinguish the different cell types within the sections. Nevertheless, the staining pattern strongly indicates that most of the intensely stained cells are post-meiotic, situated near the tubule lumens and extending roughly halfway toward the basal membrane.

      In the absence of functional ATAD2, the accumulation of HIRA primarily occurs in round spermatids (Fig. 2B). If technically possible, it would be of great interest to show this by IHC of testis section. 

      Unfortunately, our antibody did not work satisfactorily in IHC.

      The increase in H3.3 signal in Atad2 KO spermatids (Fig. 3) is interpreted as resulting from reduced turnover. However, alternative explanations (e.g., H3.3 misincorporation or altered chaperone affinity) should not be ruled out.

      The referee is correct that alternative explanations are possible. However, in our previous work (Wang et al., 2021; PMID: 34580178), we demonstrated that in the absence of ATAD2, there is reduced turnover of HIRA-bound nucleosomes, as well as reduced nucleosome turnover, evidenced by the appearance of nucleosomes in regions that are normally nucleosome-free at active gene TSSs. We have no evidence supporting any other alternative hypothesis.

      In the MS, the reduced accessibility at active genes (Fig. 4) is attributed to H3.3 overloading. However, global changes in histone acetylation (e.g., H4K5ac) or other remodelers in KO cells could also be considered.

      In fact, we meant that histone overloading could be responsible for the altered accessibility. This has been clearly demonstrated in S. cerevisiae in the absence of Yta7 (the S. cerevisiae counterpart of ATAD2) (PMID: 25406467).

      In relation with the sperm compaction assay (Fig. 8A), the DTT/heparin/Triton protocol may not fully reflect physiological decompaction. This could be validated with alternative methods (e.g., MNase sensitivity). 

      The referee is right, but since this is a subtle effect, as judged by the normal fertility of the mice, we doubt that milder approaches would reveal significant differences between wild-type and Atad2 KO sperm.

      It is surprising that despite the observed alterations in the genome organization of the sperm, the natural fertility of the KO mice is not affected (Fig. 8C). This warrants deeper discussion: Is functional compensation occurring (e.g., by p97/VCP)? Analysis of epididymal sperm maturation or uterine environment could provide insights.

      As detailed in the Discussion section, this work, together with our previous study (Wang et al., 2021; PMID: 34580178), highlights an overlooked level of regulation in histone chaperone activity: the release of chromatin-bound factors following their interaction with chromatin. This is an energy-dependent process, driven by ATP and the associated ATPase activity of these factors. Such activity could be mediated by various proteins, such as p97/VCP or DNAJC9–HSP70, as discussed in the manuscript, or by yet unidentified factors. However, most of these mechanisms are likely to occur during the extensive histone-to-histone variant exchanges of meiosis and post-meiotic stages. To the best of our knowledge, epididymal sperm maturation and the uterine environment do not involve substantial histone-to-histone or histone-to-protamine exchanges.

      The authors showed that MSCI genes present an enhancement of repression in the absence of ATAD2 by enhancing H3.3 function. It would be also of interest to analyze the behavior of the Sex body during its silencing (zygotene to pachytene) by looking at different markers (i.e., gamma-H2AX phosphorylation, Ubiquitylation etc). 

      The referee is correct that this is an interesting question. Accordingly, in our future work, we plan to examine the sex body in more detail during its silencing, using a variety of relevant markers, including those suggested by the reviewer. However, we believe that such investigations fall outside the scope of the present study, which focuses on the molecular relationship between ATAD2 and H3.3, rather than on the role of H3.3 in regulating sex body transcription. For a comprehensive analysis of this aspect, studies should primarily focus on the H3.3 mouse models reported by Fontaine and colleagues (PMID: 35766398).

      Fig. 6: Co-staining of TH2B/TP1/PRM1 is convincing but would benefit from quantification (% cells with overlapping signals).

      The green cells correspond to those expressing TH2B but lacking transition proteins (TP) and protamine 1 (Prm1), indicating that they are at earlier stages than elongating–condensing spermatids. Counting these green cells simply reflects the ratio of elongating/condensing spermatids to earlier-stage cells, which varies depending on the field examined. The key point is that in wild-type mice, only red cells (elongating/condensing spermatids) and green cells (earlier stages) are observed. By contrast, in Atad2 KO testes, a significant proportion of yellow cells appears, which are never seen in wild-type tissue. The crucial metric is the percentage of yellow cells relative to the total number of elongating/condensing spermatids (red cells). In wild-type testes, this value is consistently 0%, whereas in Atad2 KO testes it always ranges between 50% and 100% across all fields containing substantial numbers of elongating/condensing spermatids.

    1. eLife Assessment

      This useful study reports a method to detect and analyze a novel post-translational modification, lysine acetoacetylation (Kacac), finding it regulates protein metabolism pathways. The study unveils epigenetic modifiers involved in placing this mark, including key histone acetyltransferases such as p300, and concomitant HDACs, which remove the mark. Proteomic and bioinformatics analysis identified many human proteins with Kacac sites, potentially suggesting broad effects on cellular processes and disease mechanisms. The data presented are solid and the study will be of interest to those studying protein and metabolic regulation.

    2. Reviewer #3 (Public review):

      Summary:

      This paper presents a timely and significant contribution to the study of lysine acetoacetylation (Kacac). The authors successfully demonstrate a novel and practical chemo-immunological method using the reducing reagent NaBH4 to transform Kacac into lysine β-hydroxybutyrylation (Kbhb).

      Strengths:

      This innovative approach enables simultaneous investigation of Kacac and Kbhb, showcasing their potential in advancing our understanding of post-translational modifications and their roles in cellular metabolism and disease.

      Weaknesses:

      The study lacks supporting in vivo data, such as gene knockdown experiments, to validate the proposed conclusions at the cellular level.

    3. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #2 (Public review):

      In the manuscript by Fu et al., the authors developed a chemo-immunological method for the reliable detection of Kacac, a novel post-translational modification, and demonstrated that acetoacetate and AACS serve as key regulators of cellular Kacac levels. Furthermore, the authors identified the enzymatic addition of the Kacac mark by acyltransferases GCN5, p300, and PCAF, as well as its removal by deacetylase HDAC3. These findings indicate that AACS utilizes acetoacetate to generate acetoacetyl-CoA in the cytosol, which is subsequently transferred into the nucleus for histone Kacac modification. A comprehensive proteomic analysis has identified 139 Kacac sites on 85 human proteins. Bioinformatics analysis of Kacac substrates and RNA-seq data reveal the broad impacts of Kacac on diverse cellular processes and various pathophysiological conditions. This study provides valuable additional insights into the investigation of Kacac and would serve as a helpful resource for future physiological or pathological research.

      The authors have made efforts to revise this manuscript and address my concerns. The revisions are appropriate and have improved the quality of the manuscript.

      We appreciate the constructive and thoughtful feedback, which has been invaluable in enhancing the quality of our manuscript.

      Reviewer #3 (Public review):

      Summary:

      This paper presents a timely and significant contribution to the study of lysine acetoacetylation (Kacac). The authors successfully demonstrate a novel and practical chemo-immunological method using the reducing reagent NaBH4 to transform Kacac into lysine β-hydroxybutyrylation (Kbhb).

      Thank you for the positive and insightful comments.

      Strengths:

      This innovative approach enables simultaneous investigation of Kacac and Kbhb, showcasing its potential in advancing our understanding of post-translational modifications and their roles in cellular metabolism and disease.

      We are grateful for the reviewer’s comments, which have contributed to enhancing the quality of our study.

      Weaknesses:

      The experimental evidence presented in the article is insufficient to fully support the authors' conclusions. In the in vitro assays, the proteins used appear to be highly inconsistent with their expected molecular weights, as shown by Coomassie Brilliant Blue staining (Figure S3A). For example, p300, which has a theoretical molecular weight of approximately 270 kDa, appeared at around 37 kDa; GCN5/PCAF, expected to be ~70 kDa, appeared below 20 kDa. Other proteins used in the in vitro experiments also exhibited similarly large discrepancies from their predicted sizes. These inconsistencies severely compromise the reliability of the in vitro findings. Furthermore, the study lacks supporting in vivo data, such as gene knockdown experiments, to validate the proposed conclusions at the cellular level.

      We appreciate the reviewer’s comments. In the biochemical assays, we used the expressed catalytic domains of the HATs, rather than the full-length proteins, for activity testing. Specifically, the following constructs were expressed and purified: p300 (1287–1666), GCN5 (499-663), PCAF (493-658), MOF (125-458), MOZ (497-780), MBP-MORF (361-716), Tip60 (221-512), HAT1 (20-341), and HBO1 (full length). This accounts for the discrepancies in Figure S3A between the observed molecular weights and those expected for the full-length proteins.
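      As a rough plausibility check on this explanation, the expected size of a truncated construct can be estimated from its length. The sketch below assumes that the numbers in parentheses are first and last residue positions, uses the common rule of thumb of about 110 Da per amino acid residue, and ignores any affinity tags; these are assumptions for illustration, not values reported by the authors.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average mass per residue (assumption)

# Residue ranges as listed in the response (interpreted as first and last residue).
constructs = {
    "p300": (1287, 1666),
    "GCN5": (499, 663),
    "PCAF": (493, 658),
}

def approx_mass_kda(first, last):
    """Estimate the mass of a fragment spanning residues first..last, inclusive."""
    n_residues = last - first + 1
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0

estimates = {name: approx_mass_kda(a, b) for name, (a, b) in constructs.items()}
```

      Under these assumptions the p300 fragment comes out at roughly 42 kDa and the GCN5 and PCAF fragments at roughly 18 kDa each, which is in the same range as the bands the reviewer describes (around 37 kDa and below 20 kDa).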

      Although a recent study (PMID: 37382194) reported the acetoacetyltransferase activities of p300 and GCN5 in cells, we recognize that additional knockdown experiments would be necessary to substantiate their contributions to in vivo Kacac generation and to explore the functional roles of Kacac in an enzyme-specific context. We plan to address these kinds of research issues in our future work.

    1. support

      Can we use a link to direct students to the right support, or a single PDF showing where to look for support (for example, contacting the Student Hub)?

    1. Understanding SAG⇄E as a Communication Tool

      I like this connection, that the ⇄ part of the framework is both a communication tool and a place to practise communication. It would be nice to make this more explicit for learners with a short paragraph under this heading, before moving into the diagram. For example, this section could highlight that engaging in ⇄ dialogue helps learners develop capabilities such as articulating their thinking, asking constructive questions, listening and responding to others, building confidence in discussing their learning, and making sense of feedback collaboratively. It could also acknowledge that these conversations can feel unfamiliar or uncomfortable at first, and that this is a normal part of developing these skills. Framing ⇄ dialogue as a practice — not a performance — may help learners approach it with greater confidence and openness.

    2. Responding to Feedback

      One possible enhancement on this page (or on several pages) is to more explicitly normalise the emotional and cognitive responses learners may have when receiving feedback. While the module already acknowledges confidence and challenge, a brief statement that reassures learners that feelings such as uncertainty, discomfort, or frustration are common — and often signal meaningful learning — could help create a stronger sense of emotional safety.

      This kind of framing supports feedback literacy by helping learners recognise that their reactions are part of the learning process, and that pausing to notice these responses can make it easier to engage productively with feedback rather than avoid it. Feeling uncertain or uncomfortable when receiving feedback is common and often signals meaningful learning.

    1. Improvement

      At the end of the whole learning experience it would be helpful to more explicitly signal how learners will continue to use the SAG⇄E Insights for Learning framework beyond this module. The activities here are well designed for practice, and a brief forward-looking statement could help learners understand that this approach is intended to support learning across multiple tasks, units, and stages of their course — not just as a one-off activity.

      For example, a short “What’s next?” message might highlight that learners will revisit SAG⇄E when they receive feedback in other units, use it to notice patterns over time, and draw on it when curating portfolio evidence. This helps reinforce continuity and supports learners in seeing feedback as something that accumulates and develops, rather than resetting with each task.

    1. Understanding

      One possible enhancement to consider is making the learner-agency and identity dimension of the SAG⇄E Insights for Learning framework a little more explicit. Throughout the module, feedback is framed very effectively as something learners can use to improve their work; this could be strengthened further by also positioning feedback as a way of helping learners become more confident, independent decision-makers about their learning over time.

      A short meta-statement somewhere in the module (for example near the conclusion) could help learners see the bigger purpose of SAG⇄E beyond individual tasks. For instance:

      “Over time, using the SAG⇄E framework helps you build the confidence to interpret feedback, notice patterns, and decide for yourself what matters most for your learning — not just respond to individual comments.”

      This kind of framing supports learners to see feedback as part of becoming an active, capable learner across their course, rather than something that only applies to one task or unit.

    1. Download Your Action Plan Toolkit

      The Feedback Action Plan Template is a strong and practical resource, and it’s an ideal place to make the three movements of Engagement (⇄E) explicit. Rather than renaming the template, we suggest lightly structuring it around Reflection, Inquiry, and Action — using labels or prompts to help learners see Engagement as a process they practise over time. Much of this is already present; the main enhancement would be distinguishing an initial reflection on how feedback lands (emotionally and cognitively) from later reflection on capability development, and clearly positioning SAG insights as part of the Inquiry move.

      Suggested enhancements to the Feedback Action Plan Template

      1. Keep the title “Feedback Action Plan Template”. The title is clear and learner-friendly. Rather than renaming it, the conceptual work can be done through how the template is structured and framed.
      2. Add a brief framing line at the top to connect the template to ⇄E. For example: “This Action Plan helps you practise the Engagement (⇄E) part of the SAG⇄E Insights for Learning framework by guiding you through three moves: Reflection, Inquiry, and Action.”
      3. Make the three ⇄E movements explicit through light section labelling or prompts. This helps learners see Engagement as a process they practise, not just a single step.
      4. Surface an initial Reflection move (before action planning). Add a short reflection prompt that invites learners to notice how feedback lands emotionally and cognitively, for example:
         • What stood out to you in this feedback?
         • How did it make you feel or think differently about your work or learning?
         This normalises reflection and supports learning from challenge or mistakes.
      5. Position SAG insights as part of the Inquiry move. Reframe the existing “SAG⇄E Insights” section as Inquiry, e.g.: Which Successes, Adjustments, or Growth insights matter most right now, and why? This reinforces that SAG insights are inputs to learner sense-making, not endpoints.
      6. Retain the Action section with minimal change. The current focus on specific, achievable steps is strong. Optional prompts could reinforce time-bounded action (e.g. “over the next week or two”).
      7. Differentiate reflection on ‘how feedback landed’ from reflection on development over time. The existing “Reflection on Capability Development” section works well as a later reflection, focused on noticing learning, growth, or changes in confidence after acting on feedback.
      8. Keep Support Resources and Portfolio Annotation as they are, with minor connective language if helpful. These sections already align well with ⇄E and portfolio learning; small wording tweaks could simply reinforce their role in supporting the Engagement process.

    2. A Quick Refresher: SAG⇄E Insights for Learning

      Across the module, the repeated use of the circular SAG⇄E diagram and recall-style quiz questions has been effective in reinforcing recognition of the framework. At this point in the learner journey, however, this repetition may be doing less conceptual work than earlier pages, as learners are already familiar with the core elements.

      On this page in particular — where Engagement (⇄E) is being introduced more fully — there is an opportunity to shift the visual and interactive emphasis away from remembering the framework and toward understanding how ⇄E actually operates.

      A left-to-right representation of the framework (from SAG insights to learner Engagement) works especially well here because it makes visible the balance and back-and-forth between the insight-giver side (Successes, Adjustments, Growth) and the learner side of the framework. It also provides a natural structure for introducing the three movements of ⇄E (Reflection, Inquiry, Action) as an active process, rather than as a single abstract concept. I've uploaded an image that I use in the Files area of this Canvas site.

      Replacing the circular diagram on this page with a representation that foregrounds the three ⇄E movements would help learners see Engagement as something they do, and something that unfolds over time in response to insights. The existing visual could still be used earlier in the module to establish the overall framework shape.

      Similarly, rather than another quiz focused on recalling SAG or ⇄E definitions, this page could use an interaction that helps learners distinguish and apply the three ⇄E movements — for example, identifying which movement is being demonstrated in short scenarios, or sequencing Reflection, Inquiry, and Action in response to a piece of feedback. This would extend understanding rather than repeat recall.

    3. Toolkit

      A small language point to consider: the term “toolkit” may unintentionally position this resource as something separate from, or parallel to, the SAG⇄E Insights for Learning framework. Conceptually, this Action Plan works best when it is understood as a way of enacting the Engagement (⇄E) component of the framework, rather than as an additional tool alongside it.

    4. Getting Started

      At first glance the SAG⇄E acronym looks unbalanced, as if there is more 'weight' on the SAG side of the double arrow than on the E side. This is not a true reflection of the power of the ⇄E portion of the framework. The ⇄E component is actually made up of three 'movements': 1) Reflection, 2) Inquiry and 3) Action.

      Across the module, the Inquiry and Action movements of ⇄E are already well supported. Learners are asked to identify meaningful insights, ask questions of feedback, engage in dialogue, and develop concrete next steps through activities such as the Engagement response, the Action Plan, and the dialogue and evidence pages.

      What is less visible, but equally important, is the first movement: Reflection. This involves learners taking a brief pause to notice how feedback lands for them emotionally and cognitively, what stands out, and what those reactions might be telling them about their learning, confidence, or developing identity.

      The ‘Getting started’ section on this page already leans strongly in this direction. With a small amount of reframing or an explicit prompt, this page could more clearly signal Reflection as a deliberate and valued part of ⇄E, setting learners up to engage more intentionally with the Inquiry and Action that follow.

    1. In the last 12 months alone, 316 million women – 11% of those aged 15 or older – were subjected to physical or sexual violence by an intimate partner. Progress on reducing intimate partner violence has been painfully slow with only 0.2% annual decline over the past two decades.

      316 million women, 11% of those aged 15 or older, faced domestic violence.

      physical or sexual violence --> partner

    1. R0:

      Reviewers' comments:

      The study addresses the ongoing H5N1 panzootic, a topic of major global health concern. By focusing on zoonotic spillover and potential human-to-human transmission, it connects well to pressing pandemic preparedness questions. Here are my suggestions.

      The study acknowledges asymptomatic cases but doesn’t deeply explore realistic ranges of asymptomatic infection in H5N1. Since asymptomatic carriage in humans is poorly understood, exploring a wider range of assumptions (from very low to moderate prevalence) would add robustness. Maybe the authors can discuss this point.

      While the UK setting is clear, the contact structures and public health response capacity differ in low- and middle-income countries where zoonotic spillover risk is high. Discussion of transferability would broaden the relevance.

      The agricultural contact data are valuable, but heterogeneity within and across communities (e.g., multi-generational households, seasonal work, market interactions) could have been discussed more fully. This heterogeneity may affect outbreak potential.

      Only contact tracing and self-isolation are modeled. In reality, outbreak management could include infection control, health care facility and capacity, movement restrictions, or culling of infected animals. Considering at least one additional intervention would make the study more comprehensive.

      The study convincingly demonstrates that early interventions like contact tracing and self-isolation can substantially reduce outbreak size when R₀ is low and symptomatic detection is reliable. However, if R₀ increases or asymptomatic transmission is significant, these interventions may not suffice. The authors can discuss this point.

      For policymakers, this suggests that contact tracing and self-isolation are valuable but fragile tools, effective only under certain epidemiological conditions. Maybe the authors can discuss how they should be embedded in a layered response strategy including rapid diagnostics, surveillance, and (eventually) vaccines or antivirals.

      Editor comments:

      - Given that there is little to no evidence of human-to-human transmission for avian influenza (H5N1), is self-isolation recommended as a control measure for human cases? Additionally, is self-isolation applicable in the context of seasonal influenza as well?
      - Introduction section: Line number 66: However, cases without zoonotic exposure and limited human-to-human transmission have been documented. Specify the virus name. Seasonal or avian influenza.
      - In method: you mentioned "contact with birds". It is better to mention the name bird or poultry or chicken or turkey. The meaning of bird is different than poultry.
      - Does the model possess adequate capability to address avian influenza, considering the virus exhibits limited human-to-human transmissibility?

      R1:

      All comments have been addressed.

  2. courses.ecu.edu.au
    1. Feedback literacy and engagement

      From a structural perspective, this module already does several things very well:

      • It progresses logically from awareness → classification → action → dialogue → evidence
      • Learners do the framework, rather than just read about it
      • ⇄E is given substantial weight through:
        • Short written responses
        • An action plan
        • Dialogue
        • Demonstrating improvement over time
      • The activities align with realistic student needs (e.g. “this week”, concrete actions, templates)

      In particular, the Analysing a variety of feedback → Engagement response → Action Plan sequence is pedagogically sound and consistent with programmatic learning.