  1. Oct 2023
    1. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the role of orexin receptors in dopamine neurons is studied. Considering the importance of both orexin and dopamine signalling in the brain, with critical roles in arousal and drug seeking, this study is important to understand the anatomical and functional interaction between these two neuromodulators. This work suggests that such interaction is direct and occurs at the level of SN and VTA, via the expression of OX1R-type orexin receptors by dopaminergic neurons.

      Strengths:

      The use of a transgenic line that lacks OX1R in dopamine-transporter-expressing neurons is a strong approach to dissecting the direct role of orexin in modulating dopamine signalling in the brain. The battery of behavioural assays to study this line provides a valuable source of information for researchers interested in the role of orexin-A in animal physiology.

      Weaknesses:

      The choice of methods to demonstrate the role of orexin in the activation of dopamine neurons is not justified and the quantification methods are not described with enough detail. The representation of results can be dramatically improved and the data can be statistically analysed with more appropriate methods.

    2. Reviewer #2 (Public Review):

      Summary:

      This manuscript examines the expression of orexin receptors in the midbrain - with a focus on dopamine neurons - and uses several fairly sophisticated manipulation techniques to explore the role of this peptide neurotransmitter in reward-related behaviors. Specifically, the authors use in situ hybridization to show that dopamine neurons predominantly express the orexin receptor 1 subtype and then delete this receptor in dopamine neurons using a transgenic strategy. Ex vivo calcium imaging of midbrain neurons is used to show that in the absence of this receptor orexin is no longer able to excite dopamine neurons of the substantia nigra.

      The authors proceed to use this same model to study the effect of orexin receptor 1 deletion on a series of behavioral tests, namely, novelty-induced locomotion and exploration, anxiety-related behavior, preference for sweet solutions, cocaine-induced conditioned place preference, and energy metabolism. Of these, the most consistent effects are seen in the tests of novelty-induced locomotion and exploration in which the mice with orexin 1 receptor deletion are observed to show greater levels of exploration, relative to wild-type, when placed in a novel environment, an effect that is augmented after icv administration of orexin.

      In the final part of the paper, the authors use PET imaging to compare brain-wide activity patterns between mutant and wildtype mice. They find differences in several areas both under control conditions (i.e., after injection of saline) as well as after injection of orexin. They focus on changes in the dorsal bed nucleus of stria terminalis (dBNST) and the lateral paragigantocellular nucleus (LPGi) and perform analysis of the dopaminergic projections to these areas. They provide anatomical evidence that these regions are innervated by dopamine fibers from the midbrain, are activated by orexin in control, but not mutant mice, and that dopamine receptors are present. Thus, they argue these anatomical data support the hypothesis that behavioral effects of orexin receptor 1 deletion in dopamine neurons are due to changes in dopamine signaling in these areas.

      Strengths:

      Understanding how orexin interacts with the dopamine system is an important question and this paper contains several novel findings along these lines. Specifically:

      (1) The distribution of orexin receptor subtypes in VTA and SN is explored thoroughly.

      (2) The genetic model that knocks out a specific orexin receptor subtype from only dopamine neurons is useful and helps to narrow down the behavioral significance of this interaction.

      (3) The PET studies showing how central administration of orexin evokes dopamine release across the brain are intriguing, especially since two key areas are pursued - BNST and LPGi - where the dopamine projection is not as well described/understood.

      Weaknesses:

      The role of the orexin-dopamine interaction is not explored in enough detail. The manuscript presents several related findings, but the combination of anatomy and manipulation studies does not quite tell a cogent story. Ideally, one would like to see the authors focus on a specific behavioral parameter and show that one of their final target areas (dBNST or LPGi) was responsible for, or at least correlated with, this behavioral readout. In addition, some more discussion on what the results tell us about orexin signaling to dopamine neurons under normal physiological conditions would be very useful. For example, what is the relevance of the orexin-dopamine interaction blunting novelty-induced locomotion under wildtype conditions?

      In some places in the Results, insufficient explanation and reporting are provided. For example, when reporting the behavioral effects of the Ox1 deletion in two bottle preference, it is stated that "[mutant] mice showed significant changes..." without stating the direction in which preference was affected.

      The cocaine CPP results are difficult to interpret because it is unclear whether any of the control mice developed a CPP preference. Therefore, it is difficult to conclude that the knockout animals were unaffected by drug reward learning. Similarly, the sucrose/sucralose preference scores are also difficult to interpret because no test of preference vs. water is performed (although the data appear to show that there is a preference at least at higher concentrations, it has not been tested).

    1. Reviewer #1 (Public Review):

      This manuscript represents an elegant bioinformatics approach to addressing causal pathways in vascular and liver tissue related to atherosclerosis/coronary artery disease, including those shared by humans and mice and those that are specific to only one of these species. The authors constructed co-expression networks using bulk transcriptome data from human (aorta, coronary) and mouse (aorta) vascular and liver tissue. They mapped human CAD GWAS data onto these modules, mapped GWAS SNPs to putatively causal genes, identified pathways and modules enriched in CAD GWAS hits, assessed those shared between vascular and liver tissues and between humans and mice, determined key driver genes in CAD-associated supersets, and used mouse single-cell transcriptome data to infer the roles of specific vascular and liver cell types. The overall approach used by the authors is rigorous and provides new insights into potentially causal pathways in vascular tissue and liver involved in atherosclerosis/CAD that are shared between humans and mice as well as those that are species-specific. This approach could be applied to a variety of other common complex conditions.

    2. Reviewer #2 (Public Review):

      Summary:

      Mouse models are widely used to determine key molecular mechanisms of atherosclerosis, the underlying pathology that leads to coronary artery disease. The authors use various systems biology approaches, namely co-expression and Bayesian Network analysis, as well as key driver analysis, to identify co-regulated genes and pathways involved in human and mouse atherosclerosis in artery and liver tissues. They identify species-specific and tissue-specific pathways enriched for the genetic association signals obtained in genome-wide association studies of human and mouse cohorts.

      Strengths:

      The manuscript is well executed with appropriate analysis methods. It also provides a compelling series of results regarding mouse and human atherosclerosis.

    1. Reviewer #1 (Public Review):

      Summary:

      This study examines the context-dependent modulation of auditory cortical neurons in response to expected sensory input, either self-generated sounds or expected perturbations of self-generated sounds. Specifically, using songbirds, the authors ask whether social context (the presence of a female conspecific) affects 1) the response of auditory cortical neurons to the bird's own song when he is singing; and 2) the response of neurons to perturbations of auditory feedback that the bird has been trained to expect.

      Strengths:

      First, the authors report that across the population, the responses of the neurons do not differ when a male bird sings alone or if he sings to a female. A fraction of auditory cortical neurons, however, do show significant differences in the firing rate, precision, and/or degree of burst firing when males sing alone vs. when they sing to females. This finding is broadly consistent with the literature showing that sensory neurons (visual, auditory, somatosensory, etc.) can be rapidly reconfigured into different "information processing modes" depending on behavioral state (e.g., quiescence vs vigilance).

      For the perturbation experiments, the authors trained birds to expect distorted auditory feedback during a particular syllable. They found that some neurons showed greater responses during perturbation when a female was present (compared to when males were alone) while other neurons had smaller responses during perturbation when a female was present. In addition, the responses of a small number of auditory cortical neurons were not affected by behavioral state. These results contrast with their prior report that the responses of midbrain dopaminergic neurons that project to the basal ganglia are "uniformly reduced" in the presence of a female, raising a question of how an evaluation signal is transformed in the circuit from the primary sensory region to the midbrain.

      Weaknesses:

      While the experiments and analysis are solid, the finding that social context can alter responses of auditory cortical neurons in a multitude of ways (increase, decrease or no change) raises several questions that can be examined with additional analysis. For example, do context-dependent differences in auditory responses derive from context-dependent differences in the songs? Are context-dependent differences present in all classes of neurons and throughout the auditory system?

      The observed heterogeneity in the firing properties of auditory cortical neurons, both in response to self-generated sounds and during perturbations of auditory feedback, raises the question of which neurons are sensitive to social context (which likely can be addressed by the authors in a revision). The authors should provide additional details about the recordings:

      a) What are the locations of the recording sites?

      Prior work has shown that there is an organized map of spectrotemporal features of sounds in the auditory cortex of songbirds; spectral tuning widths change along the medial-lateral axis and temporal tuning widths differ between the input and output layers of Field L. Were the recordings primarily in Field L2 (thalamo-recipient region), L1 or L3? Were some recordings lateral to Field L in secondary auditory regions? Were the neurons that showed context-dependent changes in firing properties localized or distributed throughout Field L (i.e., were the context-dependent differences in neural responses truly brain-wide)? At a minimum, the authors should include a schematic showing the different regions of Field L and a summary of the location of the recording sites. Images of the processed tissue with electrolytic lesions would also be helpful.

      b) Was the context-dependent modulation limited to a particular class of neurons (distinguished by spike waveform shape, spontaneous firing rate, or other feature)?

      While the authors attribute differences in the responses of single auditory cortical neurons to the presence of a female, other potential explanations for the observed differences should be examined (and potentially ruled out):

      a) Prior work has shown that songs of zebra finches differ slightly when males sing alone compared to when they sing to females: songs are faster; pitch is less variable; and the number of introductory elements is greater when males sing to females. Do some of the observed social context-dependent differences in the responses of auditory neurons reflect differences in the songs in the two conditions? This idea is supported in part by a prior study in juvenile zebra finches (Keller & Hahnloser, 2009) showing that ~20% of the neurons they recorded in Field L and a secondary auditory region (CLM) showed anticipatory activity even before the onset of a song bout, suggesting a source of premotor (or at least non-auditory) drive to neurons in the auditory cortex. Did the authors of this study also find premotor activity in Field L, and if so, did it differ between the two social contexts? Might differences in Field L responses reflect motor/song differences?

      b) For the perturbation experiments, the authors report heterogeneous responses to playback, with some neurons firing more and others firing less when a female is present compared to when the male is alone. Keller and Hahnloser (2009) found that in juvenile birds, responses of Field L to perturbations of auditory feedback were sensitive to sound amplitude; perturbation responses increased with relative perturbation amplitude. This raises a question of whether perturbation amplitude is different when a male is alone and when a female is present (i.e., the male may move towards the female when she is present and if the speaker is close to the female, the perturbation may be louder than when the male is alone; alternatively, the male may be more active when he is alone so the loudness of the perturbation may be more variable across song bouts). It would be useful to know if (and how much) perturbation amplitude varied depending on the location inside the cage as well as whether the sound pressure level of the underlying song was higher (e.g., Lombard effect). Addition of details of the experimental setup/procedure would help to allay concerns that the amplitude of the white noise varied significantly depending on behavioral context.

      Finally, I am still trying to make sense of the differences in the context-dependent modulation of responses of auditory cortical neurons vs. midbrain dopaminergic neurons. Given the heterogeneity of responses in Field L, both to self-generated sounds and to expected perturbations during singing, how are the signals decoded downstream of Field L? At the population level, neither the mean firing rate nor the timing of firing of Field L neurons changed with courtship. Similarly, across the population, the responses to perturbations of auditory feedback were not affected by courtship state (error signal attenuated in 11 neurons, increased in 22 neurons and not affected in 10 neurons). Yet, the courtship state "uniformly" reduces the response of midbrain dopaminergic neurons to auditory perturbation. It would be helpful if the authors could include a model and/or more discussion of how this change may arise.

    2. Reviewer #2 (Public Review):

      Summary:

      In the manuscript 'Auditory cortical error signals retune during songbird courtship', Jones and Goldberg study auditory cortex in male zebra finches. They explore song-related responses in two different contexts, when the male is either alone or in the presence of a female. Social-context related responses are hypothesized based on previous results on downstream VTA neurons where such modulation is found. They play jamming stimuli through a loudspeaker to probe sensitivity of song-related neural responses to these external stimuli. They find a heterogeneity of responses, in line with auditory cortical neurons computing the social modulation of responses found in VTA.

      Strengths:

      In general, the work is interesting and sheds light onto auditory processing and self-perception mechanisms in songbirds.

      Weaknesses:

      Stability of responses has not been studied: some neurons seem to have responses that slowly drift in time, which could lead to observed differences between alone and with-female conditions. Also, possible motor confounds and sound-of-audience confounds should be addressed. The language is often imprecise.

      Stability and Reversal: It is a bit unfortunate that stability of effects seemingly has not been studied by reversing experimental conditions. The work would be much stronger if authors could show that audience-dependent tuning is robust in individual cells. Did they record from some neurons during reversal back to the alone condition? Ideally, the responses should be identical before and after recording with an audience. This would control for possible non-stationarities in their neuron recordings/spike-sorting/circadian trends. If authors do not have such data, it would be worthwhile to even just try to divide the dataset for each neuron and condition (either the audience or isolate condition) into two parts to verify that the response is the same in either part (provided sufficient song renditions are recorded). See also my comment below about Fig. 2A.
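
      A minimal sketch of the split-half check described above, under stated assumptions: the function name `split_half_shift` is hypothetical, the per-rendition response values are made up for illustration, and this is not the authors' analysis. It compares the mean response in the first and second halves of the renditions recorded in one condition and expresses the shift in units of the overall standard deviation.

```python
from statistics import mean, stdev

def split_half_shift(responses):
    """Standardized shift in mean response between the first and second
    halves of a rendition-ordered list (hypothetical helper)."""
    half = len(responses) // 2
    first, second = responses[:half], responses[half:]
    spread = stdev(responses)
    if spread == 0:
        return 0.0  # constant responses: no measurable drift
    return (mean(second) - mean(first)) / spread

# Made-up firing rates (Hz) per song rendition, in recording order.
stable_unit = [4, 5, 4, 5, 4, 5, 4, 5]
drifting_unit = [10, 9, 8, 7, 2, 1, 0, 0]

stable_shift = split_half_shift(stable_unit)
drifting_shift = split_half_shift(drifting_unit)
```

      A shift near zero is what a stable unit would be expected to show; a large shift within a single condition would suggest that drift, rather than the audience, could account for part of the between-condition difference.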

      Motor responses: Does DAF playback change song? If so, especially if it applies only in one of the two conditions (audience/no audience), then the observed response differences could be motor-related rather than auditory responses. Analyses of song spectrograms right after DAF would presumably provide the answer.

      Similarly, motif-aligned spiking activity was time warped to the median duration of undirected or directed motifs. Could the shorter motifs during directed song (as has been reported in other studies) lead to alignment differences that would account for the different error responses in alone/with-female conditions? In other words, could increased error responses be due to the fixed 100 ms analysis window of the audience condition that extends into a song region beyond the 100 ms region of the no-audience condition where there is increased firing? And vice versa for observed decreases in error responses, i.e. is there a firing pause just after the offset of the 100 ms window in the no-audience condition that causes audience dependence of responses? A simple compensation of song tempo differences by shortening/stretching the analysis window in one of the two conditions would allow testing for this.
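
      One possible form of the tempo compensation mentioned above, as a sketch only: the helper `tempo_scaled_window` and the motif durations are hypothetical, and scaling by the ratio of median motif durations is an assumption about how the window could be adjusted, not a description of the authors' method.

```python
from statistics import median

def tempo_scaled_window(base_window_ms, reference_durations_ms, condition_durations_ms):
    """Scale an analysis window by the ratio of median motif durations
    (condition relative to reference). Hypothetical helper."""
    ratio = median(condition_durations_ms) / median(reference_durations_ms)
    return base_window_ms * ratio

# Made-up motif durations in ms; directed motifs are slightly shorter here.
undirected_ms = [598.0, 600.0, 603.0]
directed_ms = [568.0, 570.0, 573.0]

# Window for the directed condition, given a 100 ms window in undirected song.
directed_window_ms = tempo_scaled_window(100.0, undirected_ms, directed_ms)
```

      With these illustrative numbers the directed window comes out at roughly 95 ms, so both windows would cover the same fraction of the motif.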

      Audience versus sound of audience: In the first sentence of the discussion authors write: "we discovered that auditory representations of an animal's own vocalizations change with an audience." Is it truly the audience that causes the difference in error responses or is it the sounds the audience makes? One way to control for that would be to play back stimuli that simulate a non-silent audience through a loudspeaker to see whether error responses depend on the soundscape created by a typical audience (either present or absent). Authors probably do not have such data and to record it would go beyond the scope of this study, but it would be important to discuss this possibility or perform some analysis in that vein.

    3. Reviewer #3 (Public Review):

      Summary:

      In this study, Jones et al. examine how neural activity in a primary auditory area (field L) of singing male songbirds is modulated by the presence or absence of an audience (a female conspecific). Prior work has demonstrated that the presence of an audience attenuates the responses of dopaminergic neurons to distortions of auditory feedback (DAF). Here the authors report that even in a region that is primarily considered sensory, responses to DAF are also modulated by the audience, although in a heterogeneous manner that does not readily explain previously observed attenuation. These findings address an interesting question and will potentially be important in adding to an understanding of how non-sensory factors can alter response properties of neurons even in primary sensory regions in a context dependent fashion. However, to be fully persuasive, additional analyses will be required to address how much of the apparent modulation by audience may be explained by other factors such as changes in recorded neurons or their properties over time.

      Full Public Review:

      In this study, Jones et al. examine how neural activity in a primary auditory area (field L) of singing male songbirds is modulated by the presence or absence of an audience (a female conspecific). They test whether activity in Field L differs between conditions in which the male is singing to a female (directed song) or alone (undirected song) and whether responses to distortions of auditory feedback (DAF) differ between these conditions. Previous work has shown that in other parts of the songbird brain, sensory-motor activity can differ between directed and undirected song, and that responses to DAF are attenuated when males sing directed song versus undirected song. These prior results raise the interesting question of the extent to which such modulations of activity by the presence of an audience are already present in primary sensory areas such as Field L. This possibility is also motivated by prior work that has shown that Field L activity is not exclusively explained by auditory input, but can also be modulated by the bird's state - whether it is singing or not.

      Against this background, the questions asked here are of interest for two inter-related reasons:

      1) the authors address whether the presence of an audience (a female conspecific) alters activity in a primary auditory area during singing. Primary auditory areas such as Field L, and analogous mammalian thalamo-recipient cortical regions such as A1, are often thought of as responding very specifically to the features of sensory stimuli, but are also understood to be modulated by a variety of factors including the attentional and behavioral state of the animal. For audition, such modulation includes whether or not animals are vocalizing and listening to themselves or listening to playback of their own vocalizations. Cited works from Keller (2009) as well as Eliades and Wang (2008) have indicated that the act of vocalizing can modulate auditory responses to self-generated feedback in primary auditory areas relative to those arising from playback of the same sounds. Here, the question is whether responses to self-generated feedback differ between conditions of singing alone versus singing to a female audience. A demonstration that the presence of an audience matters to responses in Field L would add to a general understanding of how it is that non-auditory factors can modulate sensory responses.

      2) the authors address the possible source of an audience-dependent modulation of responses to feedback perturbation in the VTA previously reported by Goldberg and colleagues (2023). In the VTA, responses to perturbations during singing are consistently attenuated when males are singing to females versus when they are singing alone, but the underlying mechanisms of this modulation are unknown. Here, the authors test the possibility that such modulation by an audience is already present at the level of Field L. The previously reported attenuation in VTA is quite striking and reflects a nice example of how neural processing can differ with varying behavioral priorities. Understanding whether this modulation of responses to DAF arises already in primary auditory areas would further a mechanistic understanding of an intriguing example of state-dependent modulation of sensory processing and behavior, and lend broad insight into related phenomena.

      The authors report 1) that activity in Field L differs between directed and undirected singing at many individual recording sites, but that these changes are heterogeneous, with both increases and decreases in activity, so that there is no consistent change across the population and 2) that the responses to DAF differ between directed and undirected song, but that there is no consistent attenuation of response (as observed in the VTA) and instead heterogeneous increases and decreases in response to DAF so that there is no net change at the population level.

      These findings, if firmly established, are important and of general interest. While they do not readily explain the source of the audience-dependent attenuation of auditory responses to DAF in the VTA, the demonstration of audience-dependent modulation of self-generated feedback and its disruption in a primary auditory area is an exciting result that would provide an opportunity for further investigation of how changes in social context influence brain and behavior. The manuscript is generally well written, although the presentation is terse. My main reservations about the current manuscript relate to aspects of experimental design and analysis that need to be clarified and addressed before these conclusions will be fully persuasive. There are also some places where further discussion of the findings and their relationship to prior studies would be helpful.

      1. A central concern relates to whether the main reported effects associated with differences in singing directed versus undirected song reflect only those changes in conditions, versus contributions from changes in unit isolation or response properties over time. The authors record undirected song in a block in the morning and only after collecting at least 40 renditions do they later record responses during directed song over a series of repeated exposures to a female. Therefore, differences between data collected during undirected song and directed song also reflect differences between data collected initially during the morning versus later. It is unclear from methods whether any of these recordings during undirected and directed conditions are interleaved, but if this is not the case, then it is crucial to ask how stable were neural recordings with respect to unit isolation, and potential changes to response properties, over the duration of the experiments. This would be less of a concern if the results mirrored those observed in the VTA, where attenuation of responses was observed across the entire population during directed versus undirected conditions - it is hard to explain a phenomenon that is consistently observed across the population as arising from a change in which neurons and spikes are contributing to responses, or other forms of non-stationarity. However, because there are no significant differences reported at the population level in the current study, it is important to address the possibility that observed differences between conditions reflect some form of noise or drift in recorded units, rather than being entirely due to directed versus undirected singing. I have elaborated in more detail below on this concern, including places where the data seems to suggest some non-stationarity of responses, and have some suggestions for ways in which this concern might be addressed.

      2. A second concern, related to this first one, has to do with the categorical definition of 'error neurons'. The authors note in their text that it could be problematic to apply categorical definitions to continuous distributions, and yet that seems to be what they then do. The authors have a metric of error sensitivity that they apply to each neuron's response to DAF in both undirected and directed conditions (the error score). They show that there is a continuous distribution of error scores (Figure 2 - figure supplement 1) across the population, with no bimodality that would be suggestive of distinct error sensitive and error-insensitive neurons. One nice feature of their analysis is that they also show the distribution of error scores computed in an analogous fashion for a period of neural activity in the song prior to DAF. This control data set makes it persuasive that there is a significant response to DAF, but also shows that there can be a broad range of error scores even when no DAF has been played, and that this range of 'noise' responses to DAF overlaps substantially with the actual responses to DAF. Despite the continuum of error scores, the authors define a subset of neurons as error responsive only if their responses to DAF exceed a specific threshold (2.5 standard deviations). One of the main conclusions of the paper is based on finding a subset of 22 neurons that exhibited error responses (by this definition) only during singing to a female and 11 neurons that exhibited error responses only when singing alone. These neurons are described as 'retuned' because they have error responses in only one condition.

      The problem here is that for some, if not many, of the neurons that are categorically defined as being responsive to DAF in only one condition (directed versus undirected) there is almost certainly not a significant difference in the actual responses to DAF between conditions. This is apparent in the relevant data figure (figure 2 - figure supplement 1) and is a consequence of using a threshold to split a continuous distribution into groups defined as error responsive or not. For example, several neurons in this plot that have almost identical scores in the directed and undirected condition are counted as examples of retuning because the error scores are just a bit over 2.5 in the directed condition and just a bit under 2.5 in the undirected condition.

      That this kind of categorical approach may be problematic is apparent in the control data in the plot. Despite the absence of any perturbation, there are error responsive neurons present in these data that are considered selective for directed versus undirected singing - this is an expected consequence of using a threshold on dispersed or noisy biological data. Shifting to a more stringent threshold of three standard deviations, as the authors do, does not help with this problem, as that still treats as categorically different responses that fall on either side of a line, even if only by a tiny amount. I suggest that the authors devise a measure for each neuron to test whether the responses to DAF are significantly different under the two conditions (directed versus undirected). As noted above, this measure should take into account some assessment of the stationarity of responses, as well as the distribution of responses (which, in some of the examples does not seem to be Gaussian around a mean response level, but rather highly variable across trials).
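
      As one illustration of the per-neuron comparison suggested above, a permutation test on trial-by-trial responses avoids both the fixed threshold and any Gaussian assumption. This is a minimal sketch: the function name `permutation_p_value`, the choice of the difference in means as the test statistic, and the example values are all assumptions made for illustration, and the test does not by itself address non-stationarity.

```python
import random
from statistics import mean

def permutation_p_value(undirected, directed, n_perm=5000, seed=0):
    """Two-sided permutation test on the difference in mean response
    between two conditions (hypothetical helper)."""
    rng = random.Random(seed)
    observed = abs(mean(directed) - mean(undirected))
    pooled = list(undirected) + list(directed)
    n_u = len(undirected)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign condition labels at random
        diff = abs(mean(pooled[n_u:]) - mean(pooled[:n_u]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)

# Made-up per-trial responses to DAF (spikes in the analysis window).
undirected_trials = [5, 6, 5, 7, 6, 5, 6, 7]
directed_trials = [12, 13, 11, 14, 12, 13, 12, 11]

p_different = permutation_p_value(undirected_trials, directed_trials)
p_identical = permutation_p_value(undirected_trials, undirected_trials)
```

      Under a test of this kind, a neuron whose error scores sit just above and just below 2.5 in the two conditions would be reported as modulated only if its trial-by-trial responses actually differ.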

      3. There are several places where further discussion of the previous literature and how the current results relate to that literature would be helpful. This includes:

      3a. Some discussion of what is already known about the auditory tuning of field L, and the extent to which responses associated with distortion of feedback may reflect the frequency tuning of field L neurons versus something that might be construed more specifically as detecting an error in perceived feedback. For example, Field L neurons have previously been characterized as having relatively simple spectro-temporal receptive fields, often with a single frequency band that is excitatory and nearby frequency bands that are inhibitory. It would be beyond the scope of this paper to directly assess the extent to which both song responses and responses to DAF are well predicted by simple STRFs that might be measured for the recorded neurons, or computed from activity during a range of vocalizations, but perhaps worth discussing whether a neuron with such frequency tuning would potentially exhibit 'error responses' of the sort described here, simply because the DAF stimulus happens to fall into the excitatory or inhibitory regions of the neuron's receptive field. While it is OK to use the term 'error responsive' in the current study, it would be good to make clear that changes in firing associated with playing DAF should be expected even for neurons that have simple auditory receptive fields (i.e. with center surround tuning to specific frequencies in a tonotopic map, as has been described for Field L) without necessarily indicating that these neurons are specifically registering any deviation or 'error' between expected feedback and experienced feedback. In this respect, there are multiple subdivisions of Field L with different tuning properties. Please specify further what criteria were used to determine recording locations and how these correspond with previously defined subdivisions.

      3b. It would also be useful to discuss further previous work on differences in auditory tuning or responses between conditions when subjects are vocalizing, versus when vocalizations are played back (as in Keller, Eliades) and whether the results in the current study are similar or different. For example, this prior work has indicated that efference copy or other signals that precede vocalizations can reach and influence activity in auditory areas - with the most compelling evidence for this being the modulation of activity prior to the onset of vocalizations. Was this also observed in the current study, and to what extent might this kind of mechanism contribute to the processing of feedback distortions? With respect to this kind of efference signal, or other possibilities, can the authors provide some discussion or speculation about possible mechanisms that might be differentially engaged between conditions of singing directed versus undirected song?

      3c. The previous study on DAF responses in VTA indicates enhanced responses to female calls during directed song. To what extent did the current study control for any vocalizations or other sounds produced by females during the directed singing, and could this have contributed to differences in Field L activity between conditions? This question is motivated partly by the highly variable responses in raster plots even within one condition - might some of this reflect motifs during which transient noises are produced from female calling or other movements by the male or female?

      More regarding stability of recordings:

      The data presented in Figure 1D illustrate some of my concerns about the stationarity of recordings. In the directed condition there are no spikes at all following the first handful of motif renditions. Were the directed and undirected recordings interleaved here? If not, could the recorded neuron simply have been lost, changed in amplitude of recorded spikes so that it was no longer counted, or reduced its responsiveness over the course of the recordings? Because the recordings of undirected and directed singing are described as occurring sequentially, it seems likely that this type of change in recorded signal could contribute to changes in measured responses over time, independently of effects due to directed versus undirected singing.

      A minor issue with this example is that the raw example trace with male alone does not seem to have a corresponding set of points in the raster plot. For panel E, I also cannot find rasters that correspond to the example recordings shown at top.

      Figure 2A also shows a neuron that looks like it has non-stationarity; for the alone condition without altered feedback, the main peak has no spikes for the bottom half of the rasters. For the directed condition, much of the difference between control and distorted feedback conditions seems to come from a few trials towards the bottom of the raster plot that show more and earlier firing than most other rasters.

      Other more subtle examples are suggested in the figures, such as Figure 1F where responses in the alone condition seem to increase over the course of recordings. A related issue apparent in some of the raster plots is that the firing rate distributions within a given condition sometimes appear to be very non-Gaussian, with some motifs during which there is a lot of activity, or apparent bursting, and others in which there is little activity. In addition to the examples above, this includes responses in Fig 1E and Fig 2F. Does anything distinguish these cases or trials? Where differences between conditions are driven by firing differences that are present on only a subset of trials, such as in Fig 2A, there is some deviation from the normal criteria for use of T-tests/Z-scores. Please consider this point and discuss any caveats and/or apply other tests (Monte Carlo? Non-parametric?) as appropriate.
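      As one illustration of the non-parametric route mentioned above, a label-shuffling (Monte Carlo permutation) test makes no Gaussian assumption about per-motif spike counts. The following is only a minimal sketch: the function name, the choice of the difference in means as the test statistic, and the example spike counts are all hypothetical and are not taken from the manuscript.

```python
import random
from statistics import mean


def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in mean spike count.

    a, b: per-motif spike counts from two conditions (hypothetical data).
    Returns a Monte Carlo estimate of the p-value.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_perm):
        # Shuffle condition labels by shuffling the pooled trials.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical counts: a few bursty motifs dominate the "undirected" mean.
undirected = [0, 1, 0, 12, 0, 2, 14, 1, 0, 11]
directed = [0, 0, 1, 0, 2, 0, 1, 0, 0, 1]
p_value = permutation_test(undirected, directed)
```

      A test of this kind addresses only the distributional assumption; it still treats motif renditions as exchangeable, so it does not by itself deal with drift over the course of a recording.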

      These potential issues of non-stationarity and non-Gaussian firing rate distributions in each condition make it complicated to determine which differences in activity reflect changes from undirected to directed conditions and which reflect these other factors.

      Approaches to addressing this issue could include more specifically indicating examples in which recordings from the alone condition and directed condition are interleaved and exhibit reversible (between conditions) changes in the pattern of responses (both without DAF in comparing alone versus directed, and with DAF demonstrating differences in DAF influences between conditions). Some good interleaved examples of this sort would be very helpful to illustrate the robustness of differences between conditions. More generally, the methods and/or raster plots should include some further explanation of the time periods over which recordings were made in the alone versus directed conditions, and the extent to which they are interleaved or not.

      Another approach that could be used if there are not many instances of interleaved recordings is to try to document the stationarity or stability of unit isolation and/or responses over time. It would be most helpful when applied to recordings from a given singing condition (i.e. alone or directed) that are interleaved, but even in cases where this is not possible perhaps one could assess the stability of waveforms and unit isolation across time. For example, in Figure 2 - Supplementary figure 2, the left-hand and middle examples appear to have quite good unit isolation, and might be the sorts of cases where measures of unit isolation and waveform stability could be used to argue that a gain or loss of spikes due to drift in recordings or changes to SNR and spike detection are not contributing to changes in firing patterns over time (and across conditions).

      It potentially would also be informative to present the prevalence of the main effects reported in the study as a function of some measures of unit isolation, SNR, and recording stability. It would be reassuring to see that significant differences between conditions are equally or more prevalent under the conditions of greatest unit isolation and recording stability than in cases with worse SNR or stability.

      One other way that the authors might be able to address my main concern would be to look at the stability of firing patterns within conditions, where differences across trials most directly indicate the potential contributions of technical or biological changes in neural activity over time that are not related to the experimental conditions.
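      A within-condition stability check of this sort could be as simple as a rank correlation between rendition order and spike count. The sketch below is an assumption about how one might do this (the helper names and example counts are hypothetical); a value near -1 or +1 flags a monotonic trend across renditions within a single condition.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def trial_drift(counts):
    """Spearman correlation between rendition order and spike count."""
    trial_index = list(range(len(counts)))
    return pearson(average_ranks(trial_index), average_ranks(counts))


# Hypothetical unit whose spikes vanish partway through one condition.
drifting = [9, 8, 10, 7, 3, 1, 0, 0, 0, 0]
rho = trial_drift(drifting)
```

      A strong trend within one condition would suggest that differences between sequentially recorded conditions could partly reflect drift rather than social context.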

      To further address some of these issues, it would be helpful to have additional explanations in this paper (rather than by reference to Goldberg and Fee, 2010) of the criteria that were used for counting spikes and assessing stability of recordings. All I found about this in the Goldberg and Fee, 2010 reference was that "Spikes were sorted off-line using custom Matlab software". Does this require human inspection and judgment? Is there a simple threshold or waveform measurement used for detecting spikes from single units? Are signal-to-noise measures or ISI violations used to score how well units are isolated?
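      For concreteness, an ISI-violation score of the kind asked about above can be computed as the fraction of interspike intervals shorter than an assumed refractory period. This is a sketch only: the 1 ms refractory period and the function name are illustrative assumptions, not criteria reported by the authors.

```python
def isi_violation_fraction(spike_times, refractory_s=0.001):
    """Fraction of interspike intervals shorter than the refractory period.

    spike_times: sorted spike times in seconds for one putative single unit.
    refractory_s: assumed absolute refractory period (1 ms here, hypothetical).
    """
    intervals = [t1 - t0 for t0, t1 in zip(spike_times, spike_times[1:])]
    if not intervals:
        return 0.0
    violations = sum(1 for isi in intervals if isi < refractory_s)
    return violations / len(intervals)


# Hypothetical spike train with one implausibly short interval (0.5 ms).
example_fraction = isi_violation_fraction([0.0, 0.0005, 0.01, 0.02])
```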

      For the specific examples shown in figures, it would be useful to indicate by small tick marks or otherwise which spikes were counted as single units. For example in figure 2 column B, for the condition with female, did only the 1-3 largest spikes get counted, or also the spikes of medium height?

      Page 11: "Many channels on the probes recorded multi-unit activity, which were taken note of but not analyzed in this study."

      What were the criteria for this? For several of the examples in the figures there are spikes of varying amplitudes and as mentioned above it would be helpful to clarify how the spikes were sorted into single units in such cases.

      Categorical scores:

      Page 13: "Neurons with error responses greater than 2.5 in only one condition (undirected versus directed) were considered to have retuned; neurons with error scores greater than 2.5 in both conditions were considered not to have retuned."

      This definition results in cases where responses of 2.45 vs 2.55 are described as 'retuned', even if these responses are not significantly different. The figure (Figure 2 - figure supplement 1) indicates that multiple neurons that were scored as retuning had responses that fall very near the threshold in this way.

      Page 13, "Our results did not fundamentally change with ... a more stringent threshold of 3..."

      The stringency is not the issue here; rather, the categorical threshold is. Retuning would be more persuasively demonstrated if the authors could provide a test of whether or not the responses for individual neurons differ significantly between conditions, appropriately taking into account multiple comparisons, stability of recordings, non-Gaussian firing rate distributions across motif renditions, etc., and use this metric to report effects, rather than setting a categorical threshold.
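      To illustrate the multiple-comparisons step, per-neuron p-values (from whatever between-condition test is chosen) could be passed through a false discovery rate correction before any neuron is reported as retuned. The sketch below uses the Benjamini-Hochberg procedure as one possible choice; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the null hypothesis is rejected
    while controlling the false discovery rate at level alpha.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Largest rank whose p-value lies under the BH line alpha * rank / m.
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical per-neuron p-values for a directed-versus-undirected test.
neuron_p = [0.001, 0.04, 0.03, 0.5]
retuned = benjamini_hochberg(neuron_p)
```

      With these example values only the first neuron survives correction, even though three of the four have uncorrected p-values below 0.05.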

    1. Reviewer #1 (Public Review):

      Summary:<br /> Through an unbiased genome-wide KO screen, the authors found that loss of DBT suppresses MG132-mediated death of cultured RPE cells. Further analyses suggested that loss of DBT reduces ubiquitinated proteins by promoting autophagy. Mechanistic studies indicated that DBT loss promotes autophagy via AMPK and its downstream ULK and mTOR signaling. Furthermore, loss of DBT suppresses polyglutamine- or TDP-43-mediated cytotoxicity and/or neurodegeneration in fly models. Finally, the authors showed that DBT protein levels are increased in ALS patient tissues compared to non-neurological controls.

      Strengths:<br /> The idea is novel, the evidence is mostly convincing, and the data are clean. The findings have implications for human diseases.

      Weaknesses:<br /> More experiments are needed to establish the connections between DBT and autophagy. The mechanistic studies are somewhat biased, and it's unclear whether the same mechanism (i.e., AMPK-->mTOR) can be applied to TDP-43-mediated neurodegeneration. Also, some data interpretation has to be more accurate.

    2. Reviewer #2 (Public Review):

      Summary:<br /> Hwang, Ran-Der et al. utilized a CRISPR-Cas9 knockout screen in human retinal pigment epithelium (RPE1) cells to identify suppressors of toxicity by the proteasome inhibitor MG132 and found that knockout of dihydrolipoamide branched chain transacylase E2 (DBT) suppressed cell death. They show that DBT knockout in RPE1 cells does not alter proteasome or autophagy function at baseline. However, with MG132 treatment, they show a reduction in ubiquitinated proteins but with no change in proteasome function. Instead, they show that DBT knockout cells treated with MG132 have improved autophagy flux compared to wild-type cells treated with MG132. They show that MG132 treatment decreases ATP/ADP ratios to a greater extent in DBT knockout cells, and accordingly causes activation of AMPK. They then show downstream altered autophagy signaling in DBT knockout cells treated with MG132 compared to wild-type cells treated with MG132. Then they express the ALS mutant TDP43 M337V or expanded polyglutamine repeats to model Huntington's disease and show that knockdown of DBT improves cell survival in RPE1 cells with improved autophagic flux. They also utilize a Drosophila model and show that either RNAi knockdown or CRISPR-Cas9 knockout of DBT improves eye pigment in TDP43M337V and polyglutamine repeat-expressing transgenic flies. Finally, they show evidence for increased DBT in postmortem spinal cord tissue from patients with ALS via both immunoblotting and immunofluorescence.

      Strengths:<br /> This is a mechanistic and well-designed paper that identifies DBT as a novel regulator of proteotoxicity via activating autophagy in the setting of proteasome inhibition. Major strengths include careful delineation of a mechanistic pathway to define how DBT is protective. These conclusions are largely justified, but additional experiments and information would be useful to clarify and extend these conclusions.

      Weaknesses:<br /> The large majority of the experiments are evaluating suppression of drug (MG132) toxicity in an in vitro epithelial cell line, so the generalizability to disease is unclear. Indeed, MG132 itself has been shown to modulate autophagy, and off-target effects of MG132 are not addressed. While this paper is strengthened by the inclusion of mouse-induced motor neurons, Drosophila models, and postmortem tissue, the putative mechanisms are minimally evaluated in these models.

      Also, this effect is only seen with MG132 treatment, at a dose that causes markedly impaired cell survival. In this setting, it is certainly plausible that changes in autophagy could be the result of differences in cell survival, as opposed to an underlying mechanism for cell survival. Additional controls would be useful to increase confidence that DBT knockdown is protective via modulation of autophagy.

      While the authors report increased DBT in postmortem ALS tissue as suggestive that DBT may modulate proteotoxicity in neurodegeneration, this point would be better supported with the evaluation of overexpression of DBT in their model.

    1. Reviewer #1 (Public Review):

      Summary:<br /> In this work, Xie, Prescott, and colleagues have reevaluated the role of Nav1.7 in nociceptive sensory neuron excitability. They find that nociceptors can make use of different sodium channel subtypes to reach equivalent excitability. The existence of this degeneracy is critical to understanding neuronal physiology under normal and pathological conditions and could explain why Nav subtype-selective drugs have failed in clinical trials. More concretely, nociceptor repetitive spiking relies on Nav1.8 at DIV0 (and probably under normal conditions in vivo), but on Nav1.7 and Nav1.3 at DIV4-7 (and after inflammation in vivo).

      The conclusions of this paper are mostly well supported by data, and these findings should be of broad interest to scientists working on pain, drug development, neuronal excitability, and ion channels.

      Strengths:<br /> The authors have employed elegant electrophysiology experiments (including specific pharmacology and dynamic clamp) and computational simulations to study the excitability of a subpopulation of DRGs that would very likely match with nociceptors (they take advantage of using transgenic mice to detect Nav1.8-expressing neurons). They make a strong point showing the degeneracy that occurs at the ion channel expression level in nociceptors, adding this new data to previous observations in other neuronal types. They also demonstrate that the different Nav subtypes functionally overlap and are able to interchange their "typical" roles in action potential generation. As Xie, Prescott, and colleagues argue, the functional implications of the degenerate character of nociceptive sensory neuron excitability need to be seriously taken into account regarding drug development and clinical trials with Nav subtype-selective inhibitors.

      Weaknesses:<br /> The following comments are minor criticisms, as the major conclusions of the paper are well substantiated. Most of the results presented in the article have been obtained from experiments with DRG neuron cultures, and surely there is a greater degree of complexity and heterogeneity in the degeneracy of nociceptor excitability in the "in vivo" condition. Indeed, the authors show in Figures 7 and 8 data that support their hypothesis and an increased influence of Nav1.7 on nociceptor excitability after inflammation, but also a higher variability in the nociceptor spiking responses. On the other hand, DRG neurons targeted in this study (YFP (+) after crossing with Nav1.8-Cre mice) are >90% nociceptors, but not all nociceptors express Nav1.8 in vivo. As shown by Li et al., 2016 ("Somatosensory neuron types identified by high-coverage single-cell RNA-sequencing and functional heterogeneity"), there is a high heterogeneity of neuron subtypes within sensory neurons. Therefore, some caution should be taken when translating the results obtained with the DRG neuron cultures to the more complex "in vivo" panorama.

      Although the authors have focused their attention on Nav channels, it should be noted that degeneracy concerning other ion channels (such as potassium ion channels) could also impact the nociceptor excitability. The action potential AHP in Figure 1, panel A is very different comparing the DIV0 (blue) and DIV4-7 examples. Indeed, the conductance density values for the AHP current are higher at DIV0 than at DIV7 in the computational model (supplementary table 5). The role of other ion channels in order to obtain equivalent excitability should not be underestimated.

    2. Reviewer #2 (Public Review):

      Summary:<br /> The authors have noted in preliminary work that tetrodotoxin (TTX), which inhibits NaV1.7 and several other TTX-sensitive sodium channels, has differential effects on nociceptors, dramatically reducing their excitability under certain conditions but not under others. Partly because of this coincidental observation, the aim of the present work was to re-examine or characterize the role of NaV1.7 in nociceptor excitability and its effects on drug efficacy. The manuscript demonstrates that a NaV1.7-selective inhibitor produces analgesia only when nociceptor excitability is based on NaV1.7. More generally and comprehensively, the results show that nociceptors can achieve equivalent excitability through changes in differential NaV inactivation and NaV expression of different NaV subtypes (NaV 1.3/1.7 and 1.8). This can cause widespread changes in the role of a particular subtype over time. The degenerate nature of nociceptor excitability shows functional implications that make the assignment of pathological changes to a particular NaV subtype difficult or even impossible.

      Thus, the analgesic efficacy of NaV1.7- or NaV1.8-selective agents depends essentially on which NaV subtype controls excitability at a given time point. These results explain, at least in part, the poor clinical outcomes with the use of subtype-selective NaV inhibitors and therefore have major implications for the future development of Nav-selective analgesics.

      Strengths:<br /> The above results are clearly and impressively supported by the experiments and data shown. All methods are described in detail, presumably allow good reproducibility, and were suitable to address the corresponding question. The only exception is the description of the computer model, which should be described in more detail.

      The results showing that nociceptors can achieve equivalent excitability through changes in differential NaV inactivation and expression of different NaV subtypes are of great importance in the fields of basic and clinical pain research and sodium channel physiology and pharmacology, but also for a broad readership and community. The degenerate nature of nociceptor excitability, which is clearly shown and well supported by data has large functional implications. The results are of great importance because they may explain, at least in part, the poor clinical outcomes with the use of subtype-selective NaV inhibitors and therefore have major implications for the future development of Nav-selective analgesics.

      In summary, the authors achieved their overall aim of elucidating the role of NaV1.7 in nociceptor excitability and its effects on drug efficacy. The data support the conclusions, although the clinical implications could be discussed in more detail.

      Weaknesses:<br /> As mentioned before, the results that nociceptors can achieve equivalent excitability through changes in differential NaV inactivation and NaV expression of different NaV subtypes are impressive. However, there is some "gap" between the DRG culture experiments and acutely dissociated DRGs from mice after CFA injection. In the extensive experiments with cultured DRG neurons, different time points after dissociation were compared. Although it would have been difficult for functional testing to examine additional time points (besides DIV0 and DIV4-7), at least mRNA and protein levels should have been determined at additional time points (DIV) to examine the time course or whether gene expression (mRNA) or membrane expression (protein) changes slowly and gradually or rapidly and more abruptly. It would also be interesting to clarify whether the changes that occur in culture (DIV0 vs. DIV4-7) are accompanied by (pro-)inflammatory changes in gene and protein expression, such as those known for nociceptors after CFA injection. This would better link the following data demonstrating that in acutely dissociated nociceptors after CFA injection, the inflammation-induced increase in NaV1.7 membrane expression enhances the effect of (or more neurons respond to) the NaV1.7 inhibitor PF-71, whereas fewer CFA neurons respond to the NaV1.8 inhibitor PF-24.

      The results shown explain, at least in part, the poor clinical outcomes with the use of subtype-selective NaV inhibitors and therefore have important implications for the future development of Nav-selective analgesics. However, this point, which is also evident from the title of the manuscript, is discussed only superficially with respect to clinical outcomes. In particular, the promising role of NaV1.7, which plays a role in nociceptor hyperexcitability but not in "normal" neurons, should be discussed in light of clinical results and not just covered with a citation of a review. Which clinical results of NaV1.7-selective drugs can now be better explained and how?

      Another point directly related to the previous one, which should at least be discussed, is that all the data are from rodents, or in this case from mice, and this should explain the clinical data in humans. Even if "impediment to translation" is briefly mentioned in a slightly different context, one could (as mentioned above) discuss in more detail which human clinical data support the existence of "equivalent excitability through different sodium channels" also in humans.

      Although speculative, it would be interesting for readers to know whether a treatment regimen based on "time since injury" with NaV1.7 and NaV1.8 inhibitors might offer benefits. Based on the data, could one hypothesize that NaV1.7 inhibitors are more likely to benefit (albeit in the short term) in patients with neuropathic pain with better patient selection (e.g., defined interval between injury and treatment)?

    3. Reviewer #3 (Public Review):

      Summary:<br /> In this study, the authors used patch-clamp to characterize the implication of various voltage-gated Na+ channels in the firing properties of mouse nociceptive sensory neurons. They report that depending on the culture conditions NaV1.3, NaV1.7, and NaV1.8 have distinct contributions to action potential firing and that similar firing patterns can result from distinct relative roles of these channels. The findings may be relevant for the design of better strategies targeting NaV channels to treat pain.

      Strengths:<br /> The paper addresses the important issue of understanding, from an interesting perspective, the lack of success of therapeutic strategies targeting NaV channels in the context of pain. Specifically, the authors test the hypothesis that different NaV channels contribute in a plastic manner to action potential firing, which may be the reason why it is difficult to target pain by inhibiting these channels. The experiments seem to have been properly performed and most conclusions are justified. The paper is concisely written and easy to follow.

      Weaknesses:<br /> 1) The most critical issue I find in the manuscript is the claim that different combinations of NaV channels result in equivalent excitability. For example, in the Abstract it is stated that: "...we show that nociceptors can achieve equivalent excitability using different combinations of NaV1.3, NaV1.7, and NaV1.8". The gating properties of these channels are not identical, and therefore their contributions to excitability should not be the same. I think that the culprit of this issue is that the authors reach their conclusion from the comparison of the (average) firing rate determined over 1 s current stimulation in distinct conditions. However, this is not the only parameter that determines how sensory neurons convey information. For instance, the time dependence of the instantaneous frequency, the actual firing pattern, may be important too. Moreover, the use of 1 s of current stimulation might not be sufficient to characterize the firing pattern if one wants to obtain conclusions that could translate to clinical settings (i.e., sustained pain). A neuron in which NaV1.7 is the main contributor is expected to have a damping firing pattern due to cumulative channel inactivation, whereas another depending mainly on NaV1.8 is expected to display more sustained firing. This is actually seen in the results of the modelling.
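      To make the distinction between average rate and firing pattern concrete, the sketch below compares two hypothetical spike trains that have the same spike count over a 1 s step but very different time courses of instantaneous frequency. The spike times, function names, and the last-to-first rate ratio used as an adaptation measure are illustrative assumptions, not data or methods from the manuscript.

```python
def instantaneous_rates(spike_times):
    """Instantaneous frequency (Hz): reciprocal of each interspike interval (s)."""
    return [1.0 / (t1 - t0) for t0, t1 in zip(spike_times, spike_times[1:])]


def adaptation_ratio(spike_times):
    """Last instantaneous rate divided by the first; values below 1 indicate damping."""
    rates = instantaneous_rates(spike_times)
    return rates[-1] / rates[0]


# Two hypothetical responses to a 1 s current step, six spikes each,
# so the average firing rate over the step is identical (6 Hz).
sustained = [0.10, 0.25, 0.40, 0.55, 0.70, 0.85]
damping = [0.10, 0.13, 0.18, 0.27, 0.45, 0.85]

sustained_ratio = adaptation_ratio(sustained)  # close to 1: regular firing
damping_ratio = adaptation_ratio(damping)      # well below 1: strong adaptation
```

      Under this measure the two trains are clearly distinguishable even though a spike count over the 1 s window would score them as equivalent.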

      2) In Fig. 1, is 100 nM TTX sufficient to inhibit all TTX-sensitive NaV currents? Concentrations more commonly used in the literature to fully inhibit these currents are between 300 and 500 nM. The currents shown as TTX-sensitive in Fig. 1D look very strange (not like the ones at Baseline DIV4-7). It seems that 100 nM TTX was not enough, leading to an underestimation of the amplitude of the TTX-sensitive currents.

      3) Page 8, the authors conclude that "Inflammation caused nociceptors to become much more variable in their reliance of specific NaV subtypes". However, how did the authors ensure that all neurons tested were affected by the CFA model? It could be that the heterogeneity in neuron properties results from distinct levels of effects of CFA.

    1. Reviewer #1 (Public Review):

      Summary:<br /> This study aimed to investigate the effects of optically stimulating the A13 region in healthy mice and a unilateral 6-OHDA mouse model of Parkinson's disease (PD). The primary objectives were to assess changes in locomotion, motor behaviors, and the neural connectome. For this, the authors examined the dopaminergic loss induced by 6-OHDA lesioning. They found a significant loss of tyrosine hydroxylase (TH+) neurons in the substantia nigra pars compacta (SNc) while the dopaminergic cells in the A13 region were largely preserved. Then, they optically stimulated the A13 region using a viral vector to deliver channelrhodopsin (CamKII promoter). In both sham and PD model mice, optogenetic stimulation of the A13 region induced pro-locomotor effects, including increased locomotion, more locomotion bouts, longer durations of locomotion, and higher movement speeds. Additionally, PD model mice exhibited increased ipsilesional turning during A13 region photoactivation. Lastly, the authors used whole-brain imaging to explore changes in the A13 region's connectome after 6-OHDA lesions. These alterations involved a complex rewiring of neural circuits, impacting both afferent and efferent projections. In summary, this study unveiled the pro-locomotor effects of A13 region photoactivation in both healthy and PD model mice. The study also indicates the preservation of A13 dopaminergic cells and the anatomical changes in neural circuitry following PD-like lesions that represent the anatomical substrate for a parallel motor pathway.

      Strengths:<br /> These findings hold significant relevance for the field of motor control, providing valuable insights into the organization of the motor system in mammals. Additionally, they offer potential avenues for addressing motor deficits in Parkinson's disease (PD). The study fills a crucial knowledge gap, underscoring its importance, and the results bolster its clinical relevance and overall strength.

      The authors adeptly set the stage for their research by framing the central questions in the introduction, and they provide thoughtful interpretations of the data in the discussion section. The results section, while straightforward, effectively supports the study's primary conclusion - the pro-locomotor effects of A13 region stimulation, both in normal motor control and in the 6-OHDA model of brain damage.

      Weaknesses:<br /> 1) Anatomical investigation. I have a major concern regarding the anatomical investigation of plastic changes in the A13 connectome (Figures 4 and 5). While the methodology employed to assess the connectome is technically advanced and powerful, the results lack mechanistic insight at the cell or circuit level into the pro-locomotor effects of A13 region stimulation in both physiological and pathological conditions. This concern is exacerbated by a textual description of results that doesn't pinpoint precise brain areas or subareas but instead references large brain portions like the cortical plate, making it challenging to discern the implications for A13 stimulation. Lastly, the study is generally well-written with a smooth and straightforward style, but the connectome section presents challenges in readability and comprehension. The presentation of results, particularly the correlation matrices and correlation strength, doesn't facilitate biological understanding. It would be beneficial to explore specific pathways responsible for driving the locomotor effects of A13 stimulation, including examining the strength of connections to well-known locomotor-associated regions like the Pedunculopontine nucleus, Cuneiformis nucleus, LPGi, and others in the diencephalon, midbrain, pons, and medulla. Additionally, identifying the primary inputs to A13 associated with motor function would enhance the study's clarity and relevance.

      The study raises intriguing questions about compensatory mechanisms in Parkinson's disease and offers a new perspective on the preservation of dopaminergic cells in A13, despite the SNc degeneration, and on the plastic changes to input/output matrices. To gain inspiration for a more straightforward reanalysis and discussion of the results, I recommend the authors refer to the paper titled "Specific populations of basal ganglia output neurons target distinct brain stem areas while collateralizing throughout the diencephalon" from the David Kleinfeld laboratory. This could guide the authors in investigating motor pathways across different brain regions.

      2) Description of locomotor performance. Figure 3 provides valuable data on the locomotor effects of A13 region photoactivation in both control and 6-OHDA mice. However, a more detailed analysis of the changes in locomotion during stimulation would enhance our understanding of the pro-locomotor effects, especially in the context of 6-OHDA lesions. For example, it would be informative to explore whether the probability of locomotion changes during stimulation in the control and 6-OHDA groups. Investigating reaction time, speed, total distance, and even kinematic aspects during stimulation could reveal how A13 is influencing locomotion, particularly after 6-OHDA lesions. The laboratory of Whelan has a deep knowledge of locomotion and the neural circuits driving it, so these features may be instructive for inferring which neural circuits drive the movement. Along the same lines, examining features like the frequency or power of stimulation related to walking patterns may help elucidate whether A13 is engaging with the Mesencephalic Locomotor Region (MLR) to drive the pro-locomotor effects. These insights would provide a more comprehensive understanding of the mechanisms underlying A13-mediated locomotor changes in both healthy and pathological conditions.

    2. Reviewer #3 (Public Review):

      Kim, Lognon et al. present an important finding on pro-locomotor effects of optogenetic activation of the A13 region, which they identify as a dopamine-containing area of the medial zona incerta that undergoes profound remodeling in terms of afferent and efferent connectivity after administration of 6-OHDA to the MFB. The authors claim to address a model of PD-related gait dysfunction, a contentious problem that can be difficult to treat with dopaminergic medication or DBS in conventional targets. They make use of an impressive array of technologies to gain insight into the role of A13 remodeling in the 6-OHDA model of PD. The evidence provided is solid and the paper is well written, but there are several general issues that reduce the value of the paper in its current form, and a number of specific, more minor ones. Some suggestions that may improve the paper are also offered.

      The most fundamental issue that needs to be addressed is the relation of the structural to the behavioral findings. It would be very interesting to see whether the structural heterogeneity in afferent/efferent projections induced by 6-OHDA is related to the degree of symptom severity and motor improvement during A13 stimulation.

      The authors provide extensive interrogation of large-scale changes in the organization of the A13 region afferent and efferent distributions. It remains unclear how many animals were included to produce Fig 4 and 5. Fig S5 suggests that only 3 animals were used, is that correct? Please provide details about the heterogeneity between animals. Please provide a table detailing how many animals were used for which experiment. Were the same animals used for several experiments?

      While the authors provide evidence that photoactivation of the A13 is sufficient to drive locomotion in the OFT, this pro-locomotor effect seems to be independent of 6-OHDA-induced pathophysiology. Only in the pole test do they find a difference between Sham and 6-OHDA animals in the effects of A13 photoactivation. Because of these behavioral findings, optogenetic activation of A13 may represent a gain of function rather than disease-specific rescue. This needs to be highlighted more explicitly in the title, abstract, and conclusion.

      The authors claim that A13 may be a possible target for DBS to treat gait dysfunction. However, the experimental evidence provided (in particular the lack of disease-specific changes in the OFT) seems insufficient to draw such conclusions. It needs to be highlighted that optogenetic activation does not necessarily have the same effects as DBS (see the recent review from Neumann et al. in Brain: https://pubmed.ncbi.nlm.nih.gov/37450573/). This is important because ZI-DBS has so far had very mixed clinical effects. The authors should provide plausible reasons for these discrepancies. Is cell-specificity, which only optogenetic interventions can achieve, necessary? Can new forms of cyclic burst DBS achieve similar specificity (Spix et al, Science 2021)? Please comment.

      In a recent study, Jeon et al (Topographic connectivity and cellular profiling reveal detailed input pathways and functionally distinct cell types in the subthalamic nucleus, 2022, Cell Reports) provided evidence on the topographically graded organization of STN afferents and McElvain et al. (Specific populations of basal ganglia output neurons target distinct brain stem areas while collateralizing throughout the diencephalon, 2021, Neuron) have shown similar topographical resolution for SNr efferents. Can a similar topographical organization of efferents and afferents be derived for the A13/ ZI in total?

      In conclusion, this is an interesting study that can be improved by taking into consideration the points mentioned above.

    1. Reviewer #1 (Public Review):

      This is an interesting study of the nature of representations across the visual field. The question of how peripheral vision differs from foveal vision is a fascinating and important one. The majority of our visual field is extra-foveal yet our sensory and perceptual capabilities decline in pronounced and well-documented ways away from the fovea. Part of the decline is thought to be due to spatial averaging ('pooling') of features. Here, the authors contrast two models of such feature pooling with human judgments of image content. They use much larger visual stimuli than in most previous studies, and some sophisticated image synthesis methods to tease apart the prediction of the distinct models.

      More importantly, in so doing, the researchers thoroughly explore the general approach of probing visual representations through metamers: stimuli that are physically distinct but perceptually indistinguishable. The work is embedded within a rigorous and general mathematical framework for expressing equivalence classes of images and how visual representations influence these. They describe how image-computable models can be used to make predictions about metamers, which can then be compared to make inferences about the underlying sensory representations. The main merit of the work lies in providing a formal framework for reasoning about metamers and their implications, for comparing models of sensory processing in terms of the metamers that they predict, and for mapping such models onto physiology. Importantly, they also consider the limits of what can be inferred about sensory processing from metamers derived from different models.

      Overall, the work is of a very high standard and represents a significant advance over our current understanding of perceptual representations of image structure at different locations across the visual field. The authors do a good job of capturing the limits of their approach and I particularly appreciated the detailed and thoughtful Discussion section and the suggestion to extend the metamer-based approach described in the MS with observer models. The work will have an impact on researchers studying many different aspects of visual function including texture perception, crowding, natural image statistics, and the physiology of low- and mid-level vision.

      The main weaknesses of the original submission relate to the writing. A clearer motivation could have been provided for the specific models that they consider, and the text could have been written in a more didactic and easy-to-follow manner. The authors could also have been more explicit about the assumptions that they make.

    2. Reviewer #2 (Public Review):

      Summary<br /> This paper expands on the literature on spatial metamers, evaluating different aspects of spatial metamers including the effect of different models and initialization conditions, as well as the relationship between metamers of the human visual system and metamers for a model. The authors conduct psychophysics experiments testing variations of metamer synthesis parameters including type of target image, scaling factor, and initialization parameters, and also compare two different metamer models (luminance vs energy). An additional contribution is doing this for a field of view larger than has been explored previously.

      General Comments<br /> Overall, this paper addresses some important outstanding questions regarding comparing original to synthesized images in metamer experiments and begins to explore the effect of noise vs image seed on the resulting syntheses. While the paper tests some model classes that could be better motivated, and the results are not particularly groundbreaking, the contributions are convincing and undoubtedly important to the field. The paper includes an interesting Voronoi-like schematic of how to think about perceptual metamers, which I found helpful, but for which I do have some questions and suggestions. I also have some major concerns regarding incomplete psychophysical methodology including lack of eye-tracking, results inferred from a single subject, and a huge number of trials. I have only minor typographical criticisms and suggestions to improve clarity. The authors also use very good data reproducibility practices.

      Specific Comments

      Experimental Setup<br /> Firstly, the experiments do not appear to utilize an eye tracker to monitor fixation. Without eye tracking or another manipulation to ensure fixation, we cannot ensure the subjects were fixating the center of the image, and viewing the metamer as intended. While the short stimulus time (200ms) can help minimize eye movements, this does not guarantee that subjects began the trial with correct fixation, especially in such a long experiment. While Covid-19 did at one point limit in-person eye-tracked experiments, the paper reports no such restrictions that would have made the addition of eye-tracking impossible. While such a large-scale experiment may be difficult to repeat with the addition of eye tracking, the paper would be greatly improved with, at a minimum, an explanation as to why eye tracking was not included.

      Secondly, many of the comparisons later in the paper (Figures 9,10) are made from a single subject. N=1 is not typically accepted as sufficient to draw conclusions in such a psychophysics experiment. Again, if there were restrictions limiting this, they should be discussed. Also (P11), is subject sub-00 an author? Another expert? A naive subject? The subject's expertise in viewing metamers will likely affect their performance.

      Finally, the number of trials per subject is quite large. 13,000 over 9 sessions is much larger than most human experiments in this area. The reason for this should be justified.

      Model<br /> For the main experiment, the authors compare the results of two models: a 'luminance model' that spatially pools mean luminance values, and an 'energy model' that spatially pools energy calculated from a multi-scale pyramid decomposition. They show that these models create metamers that result in different thresholds for human performance, and therefore different critical scaling parameters, with the basic luminance pooling model producing a scaling factor 1/4 that of the energy model. While this is certain to be true, due to the luminance model being so much simpler, the motivation for the simple luminance-based model as a comparison is unclear.

      The authors claim that this luminance model captures the response of retinal ganglion cells, often modeled as a center-surround operation (Rodieck, 1964). I am unclear in what aspect(s) the authors claim these center-surround neurons mimic a simple mean luminance, especially in the context of evidence supporting a much more complex role of RGCs in vision (Atick & Redlich, 1992). Why do the authors not compare the energy model to a model that captures center-surround responses instead? Do the authors mean to claim that the luminance model captures only the pooling aspects of an RGC model? This is particularly confusing as Figures 6 and 9 show the luminance and energy models for original vs synth aligning with the scaling of Midget and Parasol RGCs, respectively. These claims should be more clearly stated, and citations included to motivate this. Similarly, with the energy model, the physiological evidence is very loosely connected to the model discussed.

      Prior Work:<br /> While the explorations in this paper clearly have value, it does not present any particularly groundbreaking results, and those reported are consistent with previous literature. The explorations around critical eccentricity measurement have been done for texture models (Figure 11) in multiple papers (Freeman 2011, Wallis, 2019, Balas 2009). In particular, Freeman 2011 demonstrated that simpler models, representing measurements presumed to occur earlier in visual processing, need smaller pooling regions to achieve metamerism. This work's measurements for the simpler models tested here are consistent with those results, though the model details are different. In addition, Brown, 2023 (which is miscited) also used an extended field of view (though not as large as in this work). Both Brown 2023, and Wallis 2019 performed an exploration of the effect of the target image. Also, much of the more recent previous work uses color images, while the authors' exploration is only done for greyscale.

      Discussion of Prior Work:<br /> The prior work on testing metamerism between original vs. synthesized and synthesized vs. synthesized images is presented in a misleading way. Wallis et al.'s prior work on this should not be a minor remark in the post-experiment discussion. Rather, it was surely a motivation for the experiment. The text should make this clear; a discussion of Wallis et al. should appear at the start of that section. The authors similarly cite much of the most relevant literature in this area as a minor remark at the end of the introduction (P3L72).

      White Noise:<br /> The authors make an analogy to the inability of humans to distinguish samples of white noise. It is unclear, however, that human difficulty distinguishing samples of white noise is a perceptual issue; it could instead be due to cognitive/memory limitations. If one concentrates on an individual patch, one can usually tell two samples apart. Support for these difficulties emerging from perceptual limitations should be provided, or the possibility that these limitations are more cognitive should be discussed, or a different analogy employed.

      Relatedly, in Figure 14, the authors do not explain why the white noise seeds would be more likely to produce syntheses that end up in different human equivalence classes.

      It would be nice to see the effect of pink noise seeds, which mirror the power spectrum of natural images, but do not contain the same structure as natural images - this may address the artifacts noted in Figure 9b.

      Finally, the authors note high-frequency artifacts in Figure 4 & P5L135, that remain after syntheses from the luminance model. They hypothesize that this is due to a lack of constraints on frequencies above that defined by the pooling region size. Could these be addressed with a white noise image seed that is pre-blurred with a low pass filter removing the frequencies above the spatial frequency constrained at the given eccentricity?

      Schematic of metamerism:<br /> Figures 1,2,12, and 13 show a visual schematic of the state space of images, and their relationship to both model and human metamers. This is depicted as a Voronoi diagram, with individual images near the center of each shape, and other images that fall at different locations within the same cell producing the same human visual system response. I felt this conceptualization was helpful. However, implicitly it seems to make a distinction between metamerism and JND (just noticeable difference). I felt this would be better made explicit. In the case of JND, neighboring points, despite having different visual system responses, might not be distinguishable to a human observer.

      In these diagrams and throughout the paper, the phrase 'visual stimulus' rather than 'image' would improve clarity, because the location of the stimulus in relation to the fovea matters whereas the image can be interpreted as the pixels displayed on the computer.

      Other<br /> The authors show good reproducibility practices with links to relevant code, datasets, and figures.

    1. Reviewer #1 (Public Review):

      In the manuscript by Urban et al., the authors attempt to further delineate the role that non-neuronal CNS cells play in the development of ALS. Towards this goal, the transmembrane signaling molecule ephrinB2 was studied. It was found that there is an increased expression of ephrinB2 in astrocytes within the cervical ventral horn of the spinal cord in a rodent model of ALS. Moreover, reduction of ephrinB2 reduced motoneuron loss and prevented respiratory dysfunction at the NMJ. Further driving the importance of ephrinB2 is an increased expression in the spinal cords of human ALS individuals. Collectively, these findings present compelling evidence implicating ephrinB2 as a contributing factor towards the development of ALS.

    1. Reviewer #1 (Public Review):

      To further understand the plasticity of vestibular compensation, Schenberg et al. sought to characterize the response of the vestibular system to short-term and partial impairment using gaze stabilization behaviors. A transient ototoxic protocol affected type I hair cells and produced gain changes in the vestibulo-ocular reflex and optokinetic response. Interestingly, decreases in vestibular function occurred in coordination with an increase in ocular reflex gain at frequencies where vestibular information is more highly weighted over visual. Moreover, computational approaches revealed unexpected detriment from low reproducibility on combined gaze responses. These results inform the current understanding of visual-vestibular integration especially in the face of dysfunction.

      Strengths<br /> The manuscript takes advantage of VOR measurements that can be activated by targeted organs, are used in many species including clinically, and indicate additional adverse effects of vestibular dysfunction.

      The authors use a variety of experimental procedures and analysis methods to verify results and consider individual performance effects on the population data.

      The conclusions are well-justified by current data and supported by previous research and theories of visuo-vestibular function and plasticity.

    2. Reviewer #2 (Public Review):

      This is a very nice study showing how partial loss of vestibular function leads to long-term alterations in behavioural responses of mice. Specifically, the authors show that VOR involving both canal and otolith afferents are strongly attenuated following treatment and partially recover. The main result is that loss of VOR is partially "compensated" by increased OKR in treated animals. Finally, the authors show that treatment primarily affects type I hair cells as opposed to type II hair cells. Overall, these results have important implications for our understanding of how the VOR is generated using input from both type I and type II hair cells.

      The major strength of the study lies in the use of partial inactivation of hair cells to look at the effects on behaviors such as VOR and OKR. Some weaknesses stem from the fact that the effects of inactivation are highly variable across specimens and that there is no recovery of behavioral function.

    1. Reviewer #1 Public Review:

      Summary:<br /> This study examines to what extent this phenomenon varies based on the visibility of the saccade target. Visibility is defined as the contrast level of the target with respect to the noise background, and it is related to the signal-to-noise ratio of the target. A more visible target facilitates the planning and execution of oculomotor behavior; however, as speculated by the authors, it can also benefit foveal prediction even if the visibility of the foveal stimulus is held constant. Remarkably, the authors show that presenting a highly visible saccade target is beneficial for foveal vision, as the detection of stimuli with an orientation similar to that of the saccade target is improved; the lower the saccade target visibility, the less prominent the effect.

      Strengths:<br /> The results are convincing and the research methodology is technically sound.

      Weaknesses:<br /> Discussion on how this phenomenon may unfold in natural viewing conditions when the foveal and saccade target stimuli are complex and are constituted by different visual properties is lacking. Some speculations regarding feedforward vs feedback neural processing involved in the phenomenon and the speed of the feedforward signal in relation to the visibility of the target, are not well justified and not clearly supported by the data.

    2. Reviewer #2 Public Review:

      Summary:<br /> In this manuscript, the authors ran a dual task. Subjects monitored a peripheral location for a target onset (to generate a saccade to), and they also monitored a foveal location for a foveal probe. The foveal probe could be congruent or incongruent with the orientation of the peripheral target. In this study, the authors manipulated the conspicuity of the peripheral target, and they saw changes in performance in the foveal task. However, the changes were somewhat counterintuitive.

      Strengths:<br /> The authors use solid analysis methods and careful experimental design.

      Weaknesses:<br /> I have some issues with the interpretation of the results, as explained below. In general, I feel that a lot of effects are being explained by attention and target-probe onset asynchrony etc, but this seems to be against the idea put forth by the authors of "foveal prediction for visual continuity across saccades". Why would foveal prediction be so dependent on such other processes? This needs to be better clarified and justified.

      Specifics:<br /> The explanation of decreased hit rates with increased peripheral target opacity is not convincing. The authors suggest that higher contrast stimuli in the periphery attract attention. But, then, why are the foveal results occurring earlier (as per the later descriptions in the manuscript)? And, more importantly, why would foveal prediction need to be weaker with stronger pre-saccadic attention to the periphery? What is the function of foveal prediction? What of the other interpretation that could be invoked in general for this type of task used by the authors: that the dual task is challenging and that subjects somehow misattribute what they saw in the peripheral task when planning the saccade. i.e. foveal hit rates are misperceptions of the peripheral target. When the peripheral target is easier to see, then the foveal hit rate drops.

      The analyses of Fig. 3C appear to be overly convoluted. They also imply an acknowledgment by the authors that target-probe temporal difference matters. Doesn't this already negate the idea that the foveal effects are associated with the saccade generation process itself? If the effect is related to target onset, how is it interpreted as related to a foveal prediction that is associated with the saccade itself? Also, the oscillatory nature of the effect in Fig. 3C for 59% and 90% opacity is quite confusing and not addressed. The authors simply state that enhancement occurs earlier before the saccade for higher contrasts. But, this is not entirely true. The enhancement emerges then disappears and then emerges again leading up to the saccade. Why would foveal prediction do that?

      The interpretation of Fig. 4 is also confusing. Doesn't the longer latency already account for the lapse in attention, such that visual continuity can proceed normally now that the saccade is actually eventually made? In all results, it seems that the effects are all related to the dual nature of the task and/or attention, rather than to the act of making the saccade itself. Why should visual continuity (when a saccade is actually made, whether with short or long latency) have different "fidelity"? And, isn't this disruptive to the whole idea of visual continuity in the first place?

      Small question: is it just me or do the data in general seem to be excessively smoothed?

    1. Reviewer #1 (Public Review):

      Summary:<br /> In this manuscript, the authors have applied an asymmetric split mNeonGreen2 (mNG2) system to human iPSCs. Integrating a constitutively expressed long fragment of mNG2 at the AAVS1 locus allows other proteins to be tagged through the use of available ssODN donors. This removes the need to generate long AAV donors for tagging, thus greatly facilitating high-throughput tagging efforts. The authors then demonstrate the feasibility of the method by successfully tagging 9 markers expressed in iPSC at various levels, and one expressed upon endoderm differentiation. Several additional differentiation markers were also successfully tagged but not subsequently tested for expression/visibility. As one might expect for high-throughput tagging, a few proteins, while successfully tagged at the genomic level, failed to be visible. Finally, to demonstrate the utility of the tagged cells, the authors isolated clones with genes relevant to cytokinesis tagged, and together with an AI to enhance signal-to-noise ratios, monitored their localization over cell division.

      Strengths:<br /> Characterization of the mNG2 tagged parental iPSC line was well and carefully done including validation of a single integration, the presence of markers for continued pluripotency, selected off-target analysis, and G-banding-based structural rearrangement detection.

      The ability to tag proteins with simple ssODNs in iPSC capable of multi-lineage differentiation will undoubtedly be useful for localization tracking and reporter line generation.

      Validation of clone genotypes was carefully performed and highlights the continued need for caution with regard to editing outcomes.

      Weaknesses:<br /> IF and flow cytometry figures lack quantification and information on replication. How consistent are the brightness and localization of the markers? How representative are the specific images? Stability is mentioned in the text but data on the stability of expression/brightness are not shown.

      The localization of markers, while consistent with expectations, is not validated by a second technique such as antibody staining, and in many cases not even with Hoechst to show nuclear vs cytoplasmic.

      For the multi-germ layer differentiation validation, NCAM is also expressed by ectoderm, so isn't a good solo marker for mesoderm as it was used. Indeed, the kit used for the differentiation suggests Brachyury combined with either NCAM or CXCR4, not NCAM alone.

      Only a single female parental line has been generated and characterized. It would have been useful to have several lines and both male and female to allow sex differences to be explored.

      The AI-based signal-to-noise enhancement needs more details and testing. Such models can introduce strong assumptions and thus artefacts into the resolved data. Was the model trained on all markers or were multiple models trained on a single marker each? For example, if trained to enhance a single marker (or co-localized group of markers), it could introduce artefacts where it forces signal localization to those areas even for others. What happens if you feed in images with scrambled pixel locations, does it still say the structures are where the training data says they should be? What about markers with different localization from the training set? If you feed those in, does it force them to the location expected by the training data or does it retain their differential true localization and simply enhance the signal?

    2. Reviewer #2 (Public Review):

      Summary:<br /> The authors have generated human iPSC cells constitutively expressing mNG2(1-10) and tested them by endogenously tagging multiple genes with mNG2(11) (several tagged iPS cell line clones were isolated). With this tool, they have explored several weakly expressed cytokinesis genes and gained insights into how cytokinesis occurs.

      Strengths:<br /> Human iPSC cells are used.

      Weaknesses:<br /> i) The manuscript is extremely incremental; no improvements are present in the split fluorescent protein (split-FP) variant used nor in the approach for endogenous tagging with split-FPs (both of them are already very well established and used in the literature as well as in different cell types).

      ii) The fluorescence intensity of the split mNeonGreen appears rather low; for example, in Figure 2C the H2BC11, ANLN, SOX2, and TUBB3 signals are very noisy (differences between the structures observed are almost absent). For low-expression targets, this is an important limitation. This is also stated by the authors, but image restoration may not be the best solution since a lot of biologically relevant information will be lost anyway.

      iii) There is no comparison with other existing split-FP variants, methods, or imaging and it is unclear what the advantages of the system are.

    3. Reviewer #3 (Public Review):

      The authors report on the engineering of an induced Pluripotent Stem Cell (iPSC) line that harbours a single copy of a split mNeonGreen, mNG2(1-10). This cell line is subsequently used to tag endogenous proteins with a smaller part of mNeonGreen, mNG2(11), enabling the complementation of mNG into a fluorescent protein that is then used to visualize the protein. The parental cell line is validated and used to construct several iPSC lines with endogenously tagged proteins. These are used to visualize and quantify endogenous protein localisation during mitosis.

      I see the advantage of tagging endogenous loci with small fragments, but the complementation strategy has disadvantages that deserve some attention. One potential issue is the level of the mNG2(1-10). Is it clear that the current level is saturating? Based on the data in Figure S3, the expression levels and fluorescence intensity levels show a similar dose-dependency which is reassuring, but not definitive proof that all the mNG2(11)-tagged protein is detected.

      Do the authors see a difference in fluorescence intensity for homo- and heterozygous cell lines that have the same protein tagged with mNG2(11)? One would expect two-fold differences, or not?

      Related to this, would it be favourable to have a homozygous line for expressing mNG2(1-10)?

      The complementation seems to work well for the proteins that are tested. Would this also work for secreted (or other organelle-resident) proteins, for which the mNG2(11) tag is localised in a membrane-enclosed compartment?

      The authors present a technological advance and it would be great if others could benefit from this as well by having access to the cell lines.

    1. Reviewer #1 (Public Review):

      Summary:<br /> The authors develop a method to fluorescently tag peptides loaded onto dendritic cells using a two-step method, with a tetracysteine-motif-modified peptide and a labelling step done on the surface of live DC using a dye with high affinity for the added motif. The results are convincing in demonstrating in vitro and in vivo T cell activation and efficient label transfer to specific T cells in vivo. The label transfer technique will be useful to identify T cells that have recognised a DC presenting a specific peptide antigen to allow the isolation of the T cell and cloning of its TCR subunits, for example. It may also be useful as a general assay for in vitro or in vivo T-DC communication that can allow the detection of genetic or chemical modulators.

      Strengths:<br /> The study includes both in vitro and in vivo analysis including flow cytometry and two-photon laser scanning microscopy. The results are convincing and the level of T cell labelling with the fluorescent pMHC is surprisingly robust and suggests that the approach is potentially revealing something about fundamental mechanisms beyond the state of the art.

      Weaknesses:<br /> The method is demonstrated only at high pMHC density and it is not clear if it can operate at lower peptide doses where T cells normally operate. However, this doesn't limit the utility of the method for applications where the peptide of interest is known. It's not clear to me how it could be used to de-orphan known TCR and this should be explained if they want to claim this as an application. Previous methods based on biotin-streptavidin and phycoerythrin had single pMHC sensitivity, but there were limitations to the PE-based probe so the use of organic dyes could offer advantages.

    2. Reviewer #2 (Public Review):

      Summary:<br /> The authors here develop a novel Ovalbumin model peptide that can be labeled with a site-specific FlAsH dye to track agonist peptides both in vitro and in vivo. The utility of this tool could allow better tracking of activated polyclonal T cells particularly in novel systems. The authors have provided solid evidence that peptides are functional, capable of activating OTII T cells, and that these peptides can undergo trogocytosis by cognate T cells only.

      Strengths:<br /> -An array of in vitro and in vivo studies are used to assess peptide functionality.<br /> -Nice use of cutting-edge intravital imaging.<br /> -Internal controls such as non-cognate T cells to improve the robustness of the results (such as Fig 5A-D).<br /> -One of the strengths is the direct labeling of the peptide and the potential utility in other systems.

      Weaknesses:<br /> 1. What is the background signal from FlAsH?<br /> The baselines for Figure 1 flow plots are all quite different. Hard to follow. What does the background signal look like without FlAsH (how much fluorescence shift is there from unlabeled cells to No antigen+FlAsH)? How much of the FlAsH in cells is actually conjugated to the peptide? In Figure 2E, it doesn't look like it's very specific to pMHC complexes. Maybe you could double-stain with Ab for MHCII. Figure 4e suggests there is no background without MHCII but I'm not fully convinced. Potentially some MassSpec for FlAsH-containing peptides.

      2. On the flip side, how much of the variant peptides are getting conjugated in cells? I'd like to see some quantification (HPLC or MassSpec). If it's ~10% of peptides that get labeled, this could explain the low shifts in fluorescence and the similar T cell activation to native peptides if FlAsH has any deleterious effects on TCR recognition. But if it's a high rate of labeling, then it adds confidence to this system.

      3. Conceptually, what is the value of labeling peptides after loading with DCs? Why not preconjugate peptides with dye, before loading, so you have a cleaner, potentially higher fluorescence signal? If there is a potential utility, I do not see it being well exploited in this paper. There are some hints in the discussion of additional use cases, but it was not clear exactly how they would work. One mention was that the dye could be added in real-time in vivo to label complexes, but I believe this was not done here. Is that feasible to show?

      4. In Figure 5D-F, the imaging data aren't fully convincing. For example, in 5F and 2G, the speeds for T cells with no Ag should be much higher (10-15 micron/min, or roughly 0.17-0.25 micron/sec). The fact that yours are much lower suggests technical or biological issues, which might need to be acknowledged, or other readouts such as flow cytometry could be used.

    1. Reviewer #1 (Public Review):

      Summary:<br /> The authors have studied the effects of platelets on OPC biology and remyelination. For this, they used mutant mice with lower platelet levels in a demyelinating/remyelinating scenario, as well as a model with large numbers of circulating platelets.

      Strengths:<br /> -The work is very focused, with defined objectives.<br /> -The work is properly done.

      Weaknesses:<br /> -There is no clear effect on a single cell type and/or mechanism involved.

    2. Reviewer #2 (Public Review):

      Summary:<br /> This paper examined whether circulating platelets regulate oligodendrocyte progenitor cell (OPC) differentiation, with a view to the link with multiple sclerosis (MS). The authors found that interaction with platelets enhances OPC differentiation, although persistent contact inhibits the process in the long term. The mouse model with increased platelet levels in the blood showed reduced numbers of mature oligodendrocytes, while how platelets might regulate OPC differentiation is not clear yet.

      Strengths:<br /> The use of both partial platelet depletion and thrombocytosis mouse models gives in vivo evidence. The presentation of platelet accumulation in a time-course manner is rigorous. The in vitro co-culture model tested the role of platelets in OPC differentiation, which was supportive of in vivo observations.

      Weaknesses:<br /> How platelets regulate OPC differentiation is not clear. What the significance of platelets is in MS progression is not clear.

    1. Reviewer #1 (Public Review):

      Summary:<br /> The manuscript by Xia et al. investigated the mechanisms underlying glucocorticoid-induced osteonecrosis of the femoral head (GONFH). The authors observed that abnormal osteogenesis and adipogenesis are associated with decreased β-catenin in the necrotic femoral head of GONFH patients, and that the inhibition of β-catenin signalling leads to abnormal osteogenesis and adipogenesis in GONFH rats. Of interest, the deletion of β-catenin in Col2-expressing cells rather than in Osx-expressing cells leads to a GONFH-like phenotype in the femoral head of mice.

      Strengths:<br /> A strength of the study is that it sets up a Col2-expressing cell-specific β-catenin knockout mouse model that mimics the full spectrum of osteonecrosis phenotype of GONFH. This is interesting and provides new insights into the understanding of GONFH. Overall, the data are solid and support their conclusions.

    2. Reviewer #2 (Public Review):

      Summary:<br /> In this manuscript, the authors report that β-catenin inhibition, by disrupting the homeostasis of osteogenic/adipogenic differentiation, contributes to the development of glucocorticoid-induced osteonecrosis of the femoral head (GONFH). In this study, they first observed abnormal osteogenesis and adipogenesis associated with decreased β-catenin in the necrotic femoral head of GONFH patients, but the exact pathological mechanisms of GONFH remain unknown. They then performed in vivo and in vitro studies to further reveal that glucocorticoid exposure disrupted osteogenic/adipogenic differentiation of bone marrow stromal cells (BMSCs) by inhibiting β-catenin signaling in GONFH rats, and that specific deletion of β-catenin in Col2+ cells shifted BMSC commitment from osteoblasts to adipocytes, leading to a full spectrum of disease phenotype of GONFH in adult mice.

      Strengths:<br /> This innovative study provides strong evidence supporting that β-catenin inhibition disrupts the homeostasis of osteogenic/adipogenic differentiation that contributes to the development of GONFH. This study also identifies an ideal genetically modified mouse model of GONFH. Overall, the experiment is logically designed, the figures are clear, and the data generated from humans and animals are abundant and support their conclusions.

      Weaknesses:<br /> There is a lack of discussion to explain how the Wnt agonist 1 works. There are several types of Wnt ligands. It is not clear if this agonist only targets Wnt1 or other Wnts as well. Also, why Wnt agonist 1 couldn't rescue the GONFH-like phenotype in β-cateninCol2ER mice needs to be discussed.

    3. Reviewer #3 (Public Review):

      Summary:<br /> In this manuscript, the authors are trying to delineate the mechanism underlying the osteonecrosis of the femoral head.

      Strengths:<br /> The authors provided compelling in vivo and in vitro data to demonstrate Col2+ cells and Osx+ cells were differentially expressed in the femoral head. Moreover, inducible knockout of β-catenin in Col2+ cells but not Osx+ cells leads to a GONFH-like phenotype including fat accumulation, subchondral bone destruction, and femoral head collapse, indicating that imbalance of osteogenic/adipogenic differentiation of Col2+ cells plays an important role in GONFH pathogenesis. Therefore, this manuscript provided mechanistic insights into osteonecrosis as well as potential therapeutic targets for disease treatment.

      Weaknesses:<br /> However, additional in-depth discussion regarding the phenotype observed in mice is highly encouraged.

    1. Reviewer #1 (Public Review):

      Summary:<br /> In this report, Yu et al ascribe potential tumor suppressive functions to the non-core regions of RAG1/2 recombinases. Using a well-established BCR-ABL oncogene-driven system, the authors model the development of B cell acute lymphoblastic leukemia in mice and find that RAG mutants lacking non-core regions show accelerated leukemogenesis. They further report that the loss of non-core regions of RAG1/2 increases genomic instability, possibly caused by increased off-target recombination of aberrant RAG-induced breaks. The authors conclude that the non-core regions of RAG1 in particular not only increase the fidelity of VDJ recombination, but may also influence the recombination "range" of off-target joints, and that in the absence of the non-core regions, mutant RAG1/2 (termed cRAGs) catalyze high levels of off-target recombination leading to the development of aggressive leukemia.

      Strengths:<br /> The authors used a genetically defined oncogene-driven model to study the effect of RAG non-core regions on leukemogenesis. The animal studies were well performed and generally included a good number of mice. Therefore, the finding that cRAG expression led to the development of more aggressive BCR-ABL+ leukemia compared to fRAG is solid.

      Weaknesses:<br /> In general, I find the mechanistic explanation offered by the authors to explain how the non-core regions of RAG1/2 suppress leukemogenesis to be less convincing. My main concern is that cRAG1 and cRAG2 are overexpressed relative to fRAG1/2. This raises the possibility that the observed increased aggressiveness of cRAG tumors compared to fRAG tumors could be solely due to cRAG1/2 overexpression, rather than any intrinsic differences in the activity of cRAG1/2 vs fRAG1/2; and indeed, the authors allude to this possibility in Fig S8, where it was shown that elevated expression of RAG (i.e. fRAG) correlated with decreased survival in pediatric ALL. Although it doesn't mean the authors' assertions are incorrect, this potential caveat should nevertheless be discussed.

      Some of the conclusions drawn were not supported by the data.<br /> 1. I'm not sure that the authors can conclude based on μHC expression that there is a loss of pre-BCR checkpoint in cRAG tumors. In fact, Fig. 2B showed that the differences are not statistically significant overall, and more importantly, μHC expression should be detectable in small pre-B cells (CD43-). This is also corroborated by the authors' analysis of VDJ rearrangements, showing that it has occurred at the H chain locus in cRAG cells.

      2. The authors found a high degree of polyclonal VDJ rearrangements in fRAG tumor cells but a much more limited oligoclonal VDJ repertoire in cRAG tumors. They concluded that this explains why cRAG tumors are more aggressive because BCR-ABL induced leukemia requires secondary oncogenic hits, resulting in the outgrowth of a few dominant clones (Page 19, lines 381-398). I'm not sure this is necessarily a causal relationship since we don't know if the oligoclonality of cRAG tumors is due to selection based on oncogenic potential or if it may actually reflect a more restricted usage of different VDJ gene segments during rearrangement.

      3. What constitutes a cancer gene can be highly context- and tissue-dependent. Given that there is no additional information on how any putative cancer gene was disrupted (e.g., truncation of regulatory or coding regions), it is not possible to infer whether increased off-target cRAG activity really directly contributed to the increased aggressiveness of leukemia.

      4. Fig. 6A, it seems that it is really the first four nucleotides (CACA) that determine fRAG binding and the first three (CAC) that determine cRAG binding, as opposed to five for fRAG and four for cRAG, as the authors wrote (page 24, lines 493-497).

      5. Fig S3B, I don't really see why "significant variations in NHEJ" would necessarily equate to "aberrant expression of DNA repair pathways in cRAG leukemic cells". This is purely speculative. Since it has been reported previously that alt-EJ/MMEJ can join off-target RAG breaks, do the authors detect high levels of microhomology usage at break points in cRAG tumors?

      6. Fig. S7, CDKN2B inhibits CDK4/6 activation by cyclin D, but I don't think it has been shown to regulate CDK6 mRNA expression. The increase in CDK6 mRNA likely just reflects a more proliferative tumor but may have nothing to do with CDKN2B deletion in cRAG1 tumors.

      Insufficient details in some figures. For instance, Fig. 1A, please include statistics in the plot showing a comparison of fRAG vs cRAG1, fRAG vs cRAG2, cRAG1 vs cRAG2. As of now, there's a single p-value (0.0425) stated in the main text and the legend but why is there only one p-value when fRAG is compared to cRAG1 or cRAG2? Similarly, the authors wrote "median survival days 11-26, 10-16, 11-21 days, P < 0.0023-0.0299, Fig. S2B." However, it is difficult for me to figure out what the numbers refer to. For instance, is 11-26 referring to median survival of fRAG inoculated with three different concentrations of GFP+ leukemic cells or is 11-26 referring to median survival of fRAG, cRAG1, cRAG2 inoculated with 10^5 cells? It would be much clearer if the authors could provide the numbers for each pair-wise comparison, if not in the main text, then at least in the figure legend. In Fig. 5A-B, do the plots depict SVs in cRAG tumors or both cRAG and fRAG cells? Also in Fig. 5, why did 24 SVs give rise to 42 breakpoints, and not 48? Doesn't it take 2 breaks to accomplish rearrangement? In Fig. 6B-C, it is not clear how the recombination sizes were calculated. In the examples shown in Fig. 4, only cRAG1 tumors show intra-chromosomal joins (chr 12), while fRAG and cRAG2 tumors show exclusively inter-chromosomal joins.

      Insufficient details on certain reagents/methods. For instance, are the cRAG1/2 mice of the same genetic background as fRAG mice (C57BL/6 WT)? On Page 23, line 481, what is a cancer gene? How are they defined? In Fig. 3C, are the FACS plots gated on intact cells? Since apoptotic cells show high levels of gH2AX, I'm surprised that the fraction of gH2AX+ cells is so much lower in fRAG tumors compared to cRAG tumors. The in vitro VDJ assay shown in Fig 3B is not described in the Methods section (although it is described in Fig S5b).

    2. Reviewer #2 (Public Review):

      Summary: In the manuscript, the authors describe the relationship between the non-core regions of RAG1 and RAG2 and off-target recombination in BCR-ABL1+ acute B lymphoblastic leukemia, which has some innovative and clinical significance.

    1. Reviewer #1 (Public Review):

      Summary of Authors' Objectives:

      The authors aimed to explore JMJD6's role in MYC-driven neuroblastoma, particularly in the interplay between pre-mRNA splicing and cancer metabolism, and to investigate the potential for targeting this pathway.

      Strengths:

      1. The study employs a diverse range of experimental techniques, including molecular biology assays, next-generation sequencing, interactome profiling, and metabolic analysis. Moreover, the authors specifically focused on gained chromosome 17q in neuroblastoma, combining an analysis of cancer dependency genes screened with a CRISPR/Cas9 library with an analysis of the association between gene expression and prognosis in a large clinical cohort of neuroblastoma patients. This comprehensive approach strengthens the credibility of the findings. The identification of the link between JMJD6-mediated pre-mRNA splicing and metabolic reprogramming in MYC-driven cancer cells is innovative.

      2. The authors effectively integrate data from multiple sources, such as gene expression analysis, RNA splicing analysis, JMJD6 interactome assay, and metabolic profiling. This holistic approach provides a more complete understanding of JMJD6's role.

      3. The identification of JMJD6 as a potential therapeutic target and its correlation with the response to indisulam have significant clinical implications, addressing an unmet need in cancer treatment.

      Weaknesses:

      1. The manuscript contains complex technical details and terminology that may pose challenges for readers without a deep background in molecular biology and cancer research. Providing simplified explanations or additional context would enhance accessibility.

      2. It would be beneficial to explore whether treatment with JMJD6 inhibitors, both in vitro and in vivo, can effectively target the enhanced pre-mRNA splicing of metabolic genes in MYC-driven cancer cells.

      Appraisal of Achievement and Conclusion Support:

      The authors have effectively met their objectives by offering valuable insights into JMJD6's role in MYC-driven neuroblastoma. The results robustly underpin their conclusions about JMJD6's contribution to metabolic reprogramming through alternative splicing and its connection to the therapeutic response to indisulam.

      Likely Impact on the Field and Utility of Methods/Data:

      The study's findings have the potential to significantly impact the field of cancer research by identifying JMJD6 as a promising therapeutic target for MYC-driven cancers. The methods and data presented in the manuscript offer valuable resources to the research community for further investigations into cancer metabolism and splicing regulation.

      Additional Context for Interpretation:

      Understanding the complex interplay between cancer metabolism and splicing regulation is crucial for developing effective cancer treatments. This study sheds light on a previously poorly understood aspect of MYC-driven cancers and opens new avenues for targeted therapies. However, the transition from preclinical findings to clinical applications may face challenges, which should be considered in future research and clinical trials.

    2. Reviewer #2 (Public Review):

      Summary:

      Jablonowski and colleagues studied key characteristics of MYC-driven cancers: dysregulated pre-mRNA splicing and altered metabolism. This is an important field of study as it remains largely unclear as to how these processes are coordinated in response to malignant transformation and how they are exploitable for future treatments. In the present study, the authors attempt to show that Jumonji Domain Containing 6, Arginine Demethylase And Lysine Hydroxylase (JMJD6) plays a central role in connecting pre-mRNA splicing and metabolism in MYC-driven neuroblastoma. JMJD6 collaborates with the MYC protein in driving cellular transformation by physically interacting with RNA-binding proteins involved in pre-mRNA splicing and protein regulation. In cell line experiments, JMJD6 affected the alternative splicing of two forms of glutaminase (GLS), an essential enzyme in the glutaminolysis process within the central carbon metabolism of neuroblastoma cells. Additionally, the study provides in vitro (and in silico) evidence for JMJD6 being associated with the anti-proliferation effects of a compound called indisulam, which degrades the splicing factor RBM39, known to interact with JMJD6.

      Overall, the findings presented by Jablonowski et al. begin to illuminate a cancer-promoting metabolic, and potentially, a protein synthesis suppression program that may be linked to alternative pre-mRNA splicing through the action of JMJD6 - downstream of MYC. This discovery can provide further evidence for considering JMJD6 as a potential therapeutic target for the treatment of MYC-driven cancers.

      Strengths:

      Alternative Splicing Induced by JMJD6 Knockdown: the study presents evidence for the role of JMJD6 in alternative splicing in neuroblastoma cells. Specifically, the RNA immunoprecipitation experiments demonstrated a significant shift from the GAC to the KGA GLS isoform upon JMJD6 knockdown. Moreover, a significant correlation between JMJD6 levels and GAC/KGA isoform expression was identified in two distinct neuroblastoma cohorts. This suggests a causative link between JMJD6 activity and isoform prevalence.

      Physical Interaction of JMJD6 in Neuroblastoma Cells: The paper provides preliminary insight into the physical interactome of JMJD6 in neuroblastoma cells. This offers a potential mechanistic avenue for the observed effects on metabolism and protein synthesis and could be exploited for a deeper investigation into the exact nature and implications of neuroblastoma-specific JMJD6 protein-protein interactions.

      Weaknesses:

      There are several areas that would benefit from improvements with regard to the current data supporting the claims of the paper (i.e., the conclusion presented in Figure 8).

      Neuroblastoma Modelling Strategy: The study heavily relies on cell lines without incorporating patient-derived cells/biomaterials. Using databases to fill gaps in the experimental design can only fortify the observations to a certain extent. A critical oversight is the absence of non-cancerous control cells in many figures, and the rationale for selecting specific cell lines for assays/approaches remains somewhat unclear. A foundational control for such experiments should involve the non-transformed neural crest cell line, which the authors have readily available. Are the observed splicing and metabolic effects of JMJD6 specific to neuroblastoma? Is there a neuroblastoma-specific JMJD6 interactome? Is MYC function essential?

      In Vivo Modelling: The inclusion of a genetic mouse model combined with an inducible JMJD6 knockdown would enhance the study by allowing examination of JMJD6's role during both tumor initiation and growth in vivo. For instance, TH-MYCN mice, which overexpress MYCN in neural crest cells, could be a promising choice.

      Dependence on Colony Formation Assay: The study leans on 2D and semi-quantitative colony formation assays to assess malignant growth. To validate the link between the mechanistic insights discussed (e.g., reduced protein synthesis) and JMJD6-mediated malignant growth as a potential therapeutic target, evidence from in vivo or representative 3D models would be crucial.

      Data Presentation and Rigor: The presented data is predominantly qualitative and necessitates quantification. For instance, Western blots should be quantified. The RNAseq, metabolism, and pull-down data should be transparently and numerically presented. The figure legends seem elusive and their lack of transparency (often with regards to biological repeats, error bars, cell line used etc.) is concerning. Adequate citation and identification of all data sources, including online resources, are imperative. The manuscript would also benefit from a more rigorous depiction and quantification of RNA interference of both stable and transient knockdowns with quantitative validation at mRNA and protein levels.

      Novelty Concerns: The emphasis on JMJD6 as a novel neuroblastoma target is contingent on the new mechanistic revelations about the JMJD6-centered link between splicing, metabolism, and protein synthesis. Given that JMJD6 has been previously linked to neuroblastoma biology, the rationale (particularly in Figure 1) for concentrating on JMJD6 may stem more from bias than from data-driven reasoning.

      Depth of Mechanistic Investigation: Current evidence lacks depth in key areas such as JMJD6-RNA binding. A more thorough approach would involve pinpointing specific JMJD6 binding sites on endogenous RNAs using techniques such as cross-linking and immunoprecipitation, paired with complementary proximity-based methodologies. Regarding the presented metabolism data, diving deeper into metabolic flux via isotope labeling experiments could shed light on dynamic processes like TCA and glutaminolysis. As it stands, the 'pathway cartoon' in Figure 6d appears overly qualitative.

    1. Reviewer #1 (Public Review):

      Assessment:

      The manuscript titled 'Rab7 dependent regulation of goblet cell protein CLCA1 modulates gastrointestinal homeostasis' by Gaur et al discusses the role of Rab7 in the development of ulcerative colitis by regulating the lysosomal degradation of Clca1, a mucin protease. The manuscript presents interesting data and provides a potential molecular mechanism for the pathological alterations observed in ulcerative colitis. Gaur et al demonstrate that Rab7 levels are lowered in UC and CD. However, a similar analysis of Rab7 levels in ulcerative colitis (UC) and Crohn's disease (CD) patient samples was conducted recently (Du et al, Dev Cell, 2020), which showed that Rab7 levels are elevated under these conditions. While Gaur et al have briefly mentioned Du et al's paper in passing in the discussion, they need to discuss these contradictory results in their paper and clarify these differences. Additionally, Du et al is not included in the list of references.

      Strengths:

      The manuscript uses a multi-pronged approach and compares patient samples, mouse models of DSS, and protocols that allow differentiation of goblet cells. The authors also use a nanogel-based delivery system for siRNAs, which is ideal for the knockdown of specific genes in the gut.

      Weaknesses:

      Du et al, Dev Cell 2020 (https://doi.org/10.1016/j.devcel.2020.03.002) have previously shown that Rab7 levels are elevated in a similar set of colonic samples (age group, number etc) from UC and CD patients. Gaur et al have not discussed this paper or its findings in detail, which directly contradicts their results. Clarification regarding this should be provided.

    2. Reviewer #2 (Public Review):

      Summary:

      In this work, the authors report a role for the well-studied GTPase Rab7 in gut homeostasis. The study combines cell culture experiments with mouse models and human ulcerative colitis patient tissues to propose a model where Rab7, by delivering a key mucus component, CLCA1, to lysosomes, regulates its secretion in the goblet cells. This is important for the maintenance of mucus permeability and gut microbiota composition. In the absence of Rab7, CLCA1 protein levels are higher in tissues as well as the mucus layer, consistent with the anti-correlation of Rab7 (reduced) and CLCA1 (increased) in ulcerative colitis patients. The authors conclude that Rab7 maintains CLCA1 level by controlling its lysosomal degradation, thereby playing a vital role in mucus composition, colon integrity, and gut homeostasis.

      Strengths:

      The biggest strength of this manuscript is the combination of cell culture, mouse model, and human tissues. The experiments are largely well done and in most cases, the results support their conclusions. The authors go to substantial lengths to find a link, such as alteration in microbiota, or mucus proteomics.

      Weaknesses:

      There are also some weaknesses that need to be addressed. The association of Rab7 with UC in both mice and humans is clear; however, claims on the underlying mechanisms are less clear. Does Rab7 regulate specifically CLCA1 delivery to lysosomes, or is it an outcome of a generic trafficking defect? CLCA1 is a secretory protein, so how does it get routed to lysosomes, i.e. through Golgi-derived vesicles, or by endocytosis of mucus components? Mechanistic details on how CLCA1 is routed to lysosomes will add substantial value.

      Why does the level of Rab7 fluctuate during DSS treatment (Fig 1B)? Does the reduction seen in Rab7 levels (by WB) also reflect in reduced Rab7 endosome numbers? Are other late endosomal (and lysosomal) populations also reduced upon DSS treatment and UC? Is there a general defect in lysosomal function?

      The evidence for lysosomal delivery of CLCA1 (Fig 7 I, J) is weak. Although used sometimes in combination with antibodies, lysotracker red is not well compatible with permeabilization and immunofluorescence staining. The authors can substantiate this result further using lysosomal antibodies such as Lamp1 and Lamp2. For Fig 7J, it will be good to see a reduction in Rab7 levels upon KD in the same cell. In this connection, Fig S3D is somewhat confusing. While it is clear that the patterns of Muc2 in WT and Rab7-/- cells are different, how this corroborates the in vivo data on alterations in mucus layer permeability -- as claimed -- is not clear.

      Overall, the work shows a role for a well-studied GTPase, Rab7, in gut homeostasis. This is an important finding and could provide scope and testable hypotheses for future studies aimed at understanding in detail the mechanisms involved.

    1. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors describe the participation of the Hes4-BEST4-Twist axis in controlling the process of epithelial-mesenchymal transition (EMT) and the advancement of colorectal cancers (CRC). They assert that this axis diminishes the EMT capabilities of CRC cells through a variety of molecular mechanisms. Additionally, they propose that reduced BEST4 expression within tumor cells might serve as an indicator of an adverse prognosis for individuals with CRC.

      Strengths:

      • Exploring the correlation between the Hes4-BEST4-Twist axis, EMT, and the advancement of CRC is a novel perspective and gives readers a fresh standpoint.<br /> • The whole transcriptome sequence analysis (Figure 5) showing low expression of BEST4 in CRC samples will be of broad interest to cancer specialists as well as cell biologists although further corroborative data is essential to strengthen these findings (See Weaknesses).

      Weaknesses:

      • The authors employed three kinds of CRC cell lines, but not untransformed cells such as intestinal epithelial organoids which are commonly used in recent research.<br /> • The authors use three different human CRC cell lines with a lack of consistency in the selection of them. Please clarify 1) how these lines are different from each other, 2) why only one or two of them were picked for each experiment. To be more convincing, at least two lines should be employed for each in vitro experiment.<br /> • The authors demonstrated associations between BEST4 and cell proliferation/viability as well as migration/invasion, utilizing CRC cell lines, but it should be noted that these findings do not indicate a tumor-suppressive role of BEST4 as mentioned in line 120. Furthermore, while the authors propose that "BEST4 functions as a tumor suppressor in CRC" in line 50, there seem to be no supporting data to suggest BEST4 as a tumor suppressor gene.<br /> • The HES4-BEST4-Twist1 axis likely plays a significant role in CRC progression via EMT but not CRC initiation. Some sentences could lead to a misunderstanding that the axis is important for CRC initiation.<br /> • The authors mostly focus on the relationship of the HES4-BEST4-Twist1 axis with EMT, but their claims sometimes appear to deviate from this focus.<br /> • Some experiments do not appear to have a direct relevance to their claims. For example, the analysis using the xenograft model in Figure 2E-J is not optimal for analyzing EMT. The authors should analyze metastatic or invasive properties of the transplanted tumors if they intend to provide some supporting evidence for their claims.<br /> • In Figure 4H, ZO-1 and E-cad expression looks unchanged in the BEST4 KD.<br /> • The in vivo and in vitro data supporting the whole transcriptome sequence analysis (Figure 5) is mostly insufficient. Including the following experiments will substantiate their claims: 1) BEST4 and HES4 immunostaining of human surgical tissue samples, 2) qPCR data of HES4, Twist1, Vimentin, etc. as shown in Figure 5C, 5D.<br /> • Some statements are inconsistent probably due to grammatical errors. (For example, some High/low may be reversed in lines 234-244.)

    1. Joint Public Review:

      Summary:

      This is an interesting study with high-quality imaging and quantitative data. The authors devise a robust quantitative parameter that is easily applicable to any experimental system. The drug screen data can potentially be helpful to the wider community studying nucleolar architecture and effects of chemotherapy drugs. Additionally, the authors find Treacle phosphorylation as a potential link between CDK9 inhibition, rDNA transcription and nucleolar stress. Therefore I think this would be of broad interest to researchers studying transcription, CDKs, nucleolus and chemotherapy drug mechanisms.

      Revised manuscript:

      While most of my concerns were addressed, a PolI ChIP on rDNA, using well-established protocols with validated antibodies, would be an important experiment to establish the relevance of some of the conclusions of the paper. Furthermore, an additional S-to-A mutant of Treacle, S1299A/S1301A, is an important control that could have provided evidence of whether S1299/S1301 are indeed the only sites phosphorylated by CDK9. To support their model, the authors should test if overexpression of the Treacle S1299A/S1301A mutant can partially phenocopy the nucleolar stress seen upon CDK9 inhibition. This would considerably strengthen the authors' claim that reduced Treacle phosphorylation leads to Pol I dissociation from rDNA and consequently to nucleolar stress. If not, it would have strengthened the authors' argument that Treacle could have multiple sites targeted by CDK9 and that mutating any one or two may not be sufficient to cause dissociation from PolI.

      Overall, I believe the primary conclusions regarding the impact of various chemotherapy drugs on nucleolar state are solid and valuable to the broader scientific community. However, the mechanistic exploration of CDK9i is not sufficiently developed, and the authors have not adequately addressed the feedback provided in the original manuscript.

    1. Reviewer #1 (Public Review):

      The manuscript by Muthana et al. describes the effect of injection of an antibody specific for human CTLA4 conjugated to a cytotoxic molecule (Ipi-DM1) in knock-in mice expressing human CTLA4. The authors show that Ipi-DM1 administration causes a partial decrease (about 50% in absolute number) of mature B cells in blood and bone marrow 9-14 days after the beginning of treatment. B cell progenitors and pre-B cells in the BM are not affected. Ipi-DM1 also results in a partial decrease in Foxp3+ Tregs (about 40% in absolute number) and a slight increase in activation of conventional T cells (Tconvs) in the blood, spleen, BM and LNs at D9 as well as increased plasma immunoglobulins, especially IgE. Tconv depletion, CTLA4-Ig or anti-TNF mAb partially prevents the effect of Ipi-DM1 on B cells. The Ipi-DM1-induced reduction of B cells and Tregs at D9 is not observed in the spleen and lymph nodes (perhaps because this is not the right time point to see it), and there is even an increase in the number of Tregs and in the frequency and number of B cells in lymph nodes. This work is interesting but has the following major limitations:

      1- This work could have been of more interest if the Ipi-DM1 molecule were used in the clinic. As this is not the case, the intimate mechanism of the effect of this molecule in mice is of reduced interest.

      2- The fact that a partial deletion of Tregs is associated with activation of Tconvs and a decrease in B cells is not new. According to the authors, their work would be the first to show that activation of Tconvs would lead to B cell death. However, this is shown in an indirect way and the mechanisms are not really elucidated. The experiments to try to show a causal link are of 2 types: deletion of T cells (Fig 5) and blocking T cell activation with CTLA4-Ig (Fig 6). These 2 experiments are not fully convincing. The absence of B cell depletion in the blood when T cells are deleted can be explained by other mechanisms, such as B cell recirculation to lymphoid tissues or an effect of massive T cell death for example. The experiment with CTLA4-Ig is more convincing because the effect is targeted to activated T cells only. However, the prevention of B cell ablation is only partial. Since only blood is analyzed, other mechanisms could explain the B cell loss, such as their recirculation in lymphoid tissues.

      3- The authors propose that the drop in B cell numbers in the blood in mice treated with Ipi-DM1 results from reduced mature B cells in the bone marrow. However, B cells are continuously recirculating between the blood and secondary lymphoid tissues. The drop of blood B cells could be well explained by an increased recirculation to lymphoid organs. The increased numbers of B cells in lymph nodes support this latter hypothesis.

      4- The new Figure 2 suggests direct evidence of apoptosis of mature B cells in the BM of treated mice using a PI/annexin V staining assay. This is an important point to support the point of the manuscript. However, using the same assay, the level of B cell apoptosis is 80% in lymph nodes and 50% in the spleen in control mice (see new Figure 2-figure supplement 1), which is way too high and questions the reliability of this assay. It is likely that B cells enter apoptosis only in vitro due to some artefactual stress.

    2. Reviewer #2 (Public Review):

      Despite the fact that CTLA-4 is a critical molecule for inhibiting the immune response, surprisingly individuals with heterozygous CTLA-4 mutations exhibit immunodeficiency, presenting with antibody deficiency secondary to B cell loss. Why the loss of a molecule that regulates T cell activation should lead to B cell loss has remained unclear. In this study, Muthana and colleagues use an anti-CTLA-4 antibody drug conjugate (aCTLA-4 ADC) to delete cells expressing high levels of CTLA-4, and show that this leads to a reduction in B cells. The aCTLA-4 ADC is found to delete a subset of Tregs, leading to hyperactivation of T cells that is associated with B cell depletion. Using blocking antibodies, the authors implicate TNFa in the observed B cell loss.

      The reciprocal regulation of T and B cell homeostasis is an important research area. While it has been shown that Treg defects are associated with B cell loss, the mechanisms at play are incompletely understood. CTLA-4 is not normally expressed in B cells so an indirect mechanism of action is assumed. The authors show that the decrease in Treg following aCTLA-4 ADC treatment is associated with activation of T cells, and that B cell loss is blunted if T cells are depleted. A role for both CD4 and CD8 T cells is identified by selective CD4/CD8 depletion. T cells appear to require CD28 costimulation in order to mediate B cell loss, since the response is partially inhibited in the presence of the costimulation blockade drug belatacept (CTLA-4-Ig). Finally, experiments using the anti-TNFa antibody adalimumab suggest a potential role for TNFa in the depletion of B cells.

      While the manuscript makes a useful contribution, a number of limitations remain. Perhaps most important is the extent to which this model mimics the natural situation in individuals with CTLA-4 mutations (or following CTLA-4-based clinical interventions). aCTLA-4 ADC treatment permits acute deletion of Treg expressing high levels of CTLA-4, whereas in patients the Treg population remains but is specifically impaired in CTLA-4 function. Secondly, although the requirement for T cells to mediate B cell loss is convincingly demonstrated, the incomplete reversal by TNFa blockade suggests additional unidentified factors contribute to this effect. Finally, although the manuscript favours peripheral killing of mature B cells over alterations to B cell lymphopoiesis, one concern is that this may simply reflect the model employed: the short-term (6 day) treatment used here may be too acute to alter B cell development, but this may nevertheless be a feature of prolonged immune dysregulation in humans.

    3. Reviewer #3 (Public Review):

      The co-suppressive molecule CTLA-4 has a critical role in the maintenance of peripheral tolerance, primarily by Treg mediated control of the co-stimulatory molecules CD80 and CD86. As stated by the authors, previous studies have found a variety of effects of anti-CTLA-4 antibody treatment or genetic loss of CTLA-4 on B-cells. These include increased B-cell activation and antibody production, autoantibody production, impairment of B-cell production in the bone marrow and loss of peripheral B-cells. In this article Muthana et al use a CTLA-4 humanized mouse model and examine the effects of drug conjugated CTLA-4 on the immune system. They observe a transient loss of B-cells in the blood of the treated mice. They then use a range of immune interventions such as T-cell depletion and blocking antibodies to demonstrate that this effect is dependent on T-cell activation.

      Since anti-CTLA-4 immunotherapy is in active clinical use, exploration of its effects is welcome. This is helped by the use of a humanized CTLA-4 system, which should be considered a strength of the paper. However, the central premise of this paper, that B-cells are depleted, currently seems underexplored. Direct evidence of T-cell killing of B-cells is never presented; rather, it is inferred from the reduced numbers of B-cells in the blood and increased apoptosis in the bone marrow. It is not made clear if B-cell numbers in the bone marrow are reduced.

      Upon examining lymphoid organs, it seems that the spleen is relatively unchanged while the lymph nodes have a large increase in B-cells alongside increased serum antibody levels. The paper does underline the importance of looking at the differences between multiple immune compartments, and interesting phenomena are described in each compartment. Simultaneous inhibition of B-cell lymphopoiesis and blood trafficking with strong activation and antibody production of lymphoid resident (presumably germinal center) B-cells appears to be occurring. However, the current overall interpretation that B-cells are broadly depleted is perhaps too simplistic and largely ignores the lymphoid organs and serum antibodies.

    1. Reviewer #1 (Public Review):

      Summary:<br /> This study introduces an innovative method for assessing the mean kurtosis, utilizing the mathematical foundation of the sub-diffusion framework. In particular, a new fitting technique that incorporates two different diffusion times is proposed to estimate the parameters of the sub-diffusion model. The evaluation of this technique, which generates kurtosis maps based on the sub-diffusion framework, is conducted through simulations and the examination of data obtained from human subjects.

      Strengths:<br /> The utilization of the sub-diffusion model for tissue characterization is a significant conceptual advancement for the field of diffusion MRI. This study adeptly harnesses this approach for an accurate estimation of the parameters of the widely employed diffusion model, DKI, leveraging their established analytical interconnection as evidenced in prior research. Notably, this approach not only proposes a robust, fast, and accurate technique for DKI parameter estimation but also underscores the viability of deploying the sub-diffusion model for tissue characterization, substantiated by both simulated and human subject analyses. The paper is very well written, well organized, and coherent. The simulation study included different aspects of water diffusion as captured by diffusion-weighted MRI such as varying diffusion times and different b-value subpopulations, resulting in a comprehensive and thorough discussion.

      Weaknesses:<br /> The primary objective of this study is to demonstrate a robust approach for estimating DKI parameters by directly calculating them using the parameters of the sub-diffusion model. This premise, however, relies on the assumption that the sub-diffusion model effectively characterizes the diffusion MRI signal and that its parameters are both robust and accurate. Throughout the manuscript, the term "ground truth kurtosis K" is frequently used to denote the "true K" value in the context of the simulation study. Nonetheless, given that the data is simulated using the new sub-diffusion model - an approximation of the DKI-based signal expression- this value cannot truly be considered the "ground truth K". The simulation study highlights the robustness and accuracy of D* and K*, but it inherently operates under the assumption that the observed data is in the form of the sub-diffusion model.

    2. Reviewer #2 (Public Review):

      Summary: The authors present a technique for fitting diffusion magnetic resonance images (dMRI) to a sub-diffusion model of the diffusion process within brain imaging. The authors suggest that their technique provides robust and accurate calculation of diffusional kurtosis imaging parameters from which high quality images can be calculated from short dMRI data acquisitions at two diffusion times.

      Strengths: If the authors can show that the dMRI signal in brain tissue follows a sub-diffusion model decay curve then their technique for accurately and robustly calculating diffusional kurtosis parameters from multiple diffusion times would be of benefit for tissue microstructural imaging in research and clinical arenas.

      Weaknesses: The applied sub-diffusion model has two parameters that are invariant to diffusion time, D_β and β, which are used to calculate the diffusional kurtosis measures of a diffusion time dependent D* and a diffusion time invariant K*. However, the authors do not demonstrate that the D_β, β and K* parameters are invariant to diffusion time in brain tissue. The authors' results visually show that there is time dependence of the K* measure (in Figure 6) that is more apparent in white matter, with K* values being higher for diffusion times of ∆ = 49 ms than ∆ = 19 ms. The diffusion time dependence of K* indicates there is also diffusion time dependence of β. Furthermore, Figure 7 shows that there is a tissue specific root mean squared error in model fitting over the two diffusion times, which indicates greater deviation from the model fit in white matter than grey matter. To show that the sub-diffusion model is robust and accurate (and consequently that K* is robust and accurate) the authors would have to demonstrate that there is no diffusion time-dependence in both D_β and β in application to brain imaging data for each diffusion time separately. Simulated data should not be used to demonstrate the robustness and accuracy of the sub-diffusion model or to determine optimization of dMRI acquisition parameters without first demonstrating that D_β and β are invariant to diffusion time. This is because simulated signals calculated by using the sub-diffusion characteristic equation of dMRI signal decay will necessarily have diffusion time invariant D_β and β parameters.

      Without further information demonstrating diffusion time invariance of D_β, β and K* it is not possible to determine whether the authors have achieved their aims or that their results support their conclusions.
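      The link between β and K* that this review relies on can be made concrete with a small sketch. It is not taken from the manuscript: it assumes the sub-diffusion signal is a Mittag-Leffler function of D_β q² Δ^β and matches its second-order expansion to the DKI expression ln S = -bD + (1/6) b² D² K with b = q² Δ. The function names and the use of a single effective diffusion time `delta_eff` are illustrative assumptions.

```python
from math import gamma


def kurtosis_from_beta(beta):
    """K* implied by the sub-diffusion exponent beta (0 < beta <= 1).

    Obtained by matching the second-order terms of ln E_beta(-x)
    to the DKI expansion; depends on beta only, not on diffusion time.
    """
    return 6.0 * gamma(1.0 + beta) ** 2 / gamma(1.0 + 2.0 * beta) - 3.0


def apparent_diffusivity(d_beta, beta, delta_eff):
    """D* implied by D_beta and beta at an effective diffusion time.

    Assumes b = q^2 * delta_eff, so D* scales as delta_eff**(beta - 1).
    """
    return d_beta * delta_eff ** (beta - 1.0) / gamma(1.0 + beta)


if __name__ == "__main__":
    beta, d_beta = 0.8, 1.0  # hypothetical values, arbitrary units
    for delta_eff in (19.0, 49.0):
        print(delta_eff, apparent_diffusivity(d_beta, beta, delta_eff),
              kurtosis_from_beta(beta))
```

      Under these assumptions K* is the same at both diffusion times whenever β is the same, while D* changes with diffusion time. An observed difference in K* between ∆ = 19 ms and ∆ = 49 ms would therefore imply a different fitted β, which is the reviewer's point.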

    1. Joint Public Review:

      The manuscript presents compelling evidence for the role of the zona incerta area of the brain in regulating movement and sensory stimuli in mice. The study uses appropriate and validated methodology in line with the current state-of-the-art, including optogenetic manipulation and recording of single-unit activity. The authors' claims and conclusions are well-supported by their data, which includes a comprehensive review of previous research on the zona incerta. Overall, the manuscript provides solid evidence for the role of the zona incerta in regulating movement and sensory processing.

      Major strengths and weaknesses of the methods and results.<br /> The zona incerta has many integrative functions that link sensory stimuli with motor responses to guide behavior.<br /> The study explored the activation of zona incerta GABAergic neurons during cued avoidance tasks and found that these neurons activate during goal-directed avoidance movement. Optogenetic manipulation of these neurons affected movement speed and performance during active avoidance tasks.<br /> The findings suggest that the zona incerta area of the brain plays a significant role in regulating movement and responding to salient auditory tones in association with movement in mice. The evidence presented is fundamental and provides a comprehensive review of previous research on the zona incerta and its involvement in various behaviors and sensory processing.

      The article is very well written, with a correct hypothesis and a cutting-edge methodology to achieve the expected objectives. Moreover, the authors use statistically rigorous approaches in the analysis of the results. Also, analyses are performed using scripts that automate all aspects of data analysis, ensuring their objectivity. The results are very novel and provide solid evidence for the role of the zona incerta in regulating movement and sensory processing.

    1. Joint Public Review:

      In the current paper, Jones et al. describe a new framework, named "coccinella", for real-time high-throughput behavioral analysis aimed at reducing the cost of analyzing behavior. In the setup used here, each fly is confined to a small circular arena and able to walk around on an agar bed spiked with nutrients or pharmacological agents. The new framework, built on the researchers' previously developed platform Ethoscope, relies on relatively low-cost Raspberry Pi video cameras to acquire images at ~0.5 Hz and pull out, in real time, the maximal velocity (parameter extraction) during 10 second windows from each video. Thus, the program produces a text file, and not voluminous videos requiring storage facilities for large amounts of video data, a prohibitive step in many behavioral analyses. The maximal velocity time-series is then fed to an algorithm called Highly Comparative Time-Series Classification (HCTSA), which itself is based on a large number of feature extraction algorithms and was developed by other researchers. HCTSA identifies statistically salient features in the time-series, which are then passed on to a type of linear classifier algorithm called support vector machines (SVM). In cases where such analyses are sufficient for characterizing the behaviors of interest, this system performs as well as other state-of-the-art systems used in behavioral analysis (e.g., DeepLabCut).
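      The parameter-extraction step described above (one maximal velocity value per 10 second window from positions sampled at roughly 0.5 Hz) can be sketched as follows. This is only an illustration of the idea, not the Ethoscope or coccinella code: the function name, the `(t, x, y)` input format and the handling of empty windows are assumptions, and the downstream HCTSA feature extraction and SVM classification are not shown.

```python
from math import hypot


def max_velocity_per_window(samples, window_s=10.0):
    """Reduce tracked positions to one maximal speed per time window.

    samples: list of (t_seconds, x, y) tuples sorted by time
             (hypothetical input format).
    Returns a list with the maximal speed observed in each consecutive
    window of length window_s; windows without data are reported as 0.0.
    """
    if len(samples) < 2:
        return []
    t0 = samples[0][0]
    maxima = {}
    for (t_a, x_a, y_a), (t_b, x_b, y_b) in zip(samples, samples[1:]):
        dt = t_b - t_a
        if dt <= 0:
            continue  # skip duplicated or out-of-order timestamps
        speed = hypot(x_b - x_a, y_b - y_a) / dt
        # Assign the step to the window in which it ends.
        idx = int((t_b - t0) // window_s)
        maxima[idx] = max(maxima.get(idx, 0.0), speed)
    n_windows = max(maxima) + 1 if maxima else 0
    return [maxima.get(i, 0.0) for i in range(n_windows)]


if __name__ == "__main__":
    # One position every 2 s (0.5 Hz), arbitrary distance units.
    track = [(0, 0, 0), (2, 2, 0), (4, 2, 0), (6, 8, 0),
             (8, 8, 0), (10, 8, 0), (12, 9, 0)]
    print(max_velocity_per_window(track))
```

      The output is a short numeric series per animal rather than stored video, which is the storage saving the review refers to.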

      In a pharmacobehavior paradigm testing different chemicals, the authors show that coccinella can identify specific compounds as effectively as other more time-consuming and resource-consuming systems.

      The new paradigm should be of interest to researchers involved in drug screens, and more generally, in high-throughput analysis focused on gross locomotor defects in fruit flies such as identification of sleep phenotypes. By extracting/saving only the maximal velocity from video clips, the method is fast. However, the rapidity of the platform comes at a cost--loss of information on subtle but important behavioral alterations. When seeking subtle modifications in animal behavior, solutions like DeepLabCut, which are admittedly slower but far superior in terms of the level of details they yield, would be more appropriate.

      The manuscript reads well, and it is scientifically solid. The comments listed below were directed to the original submission and were satisfactorily addressed in the revised version.

      1- The fact that Coccinella runs on Ethoscopes, an open source hardware platform described by the same group, is very useful because the relevant publication describes Ethoscope in detail. However, the current version of the paper does not offer details or alternatives for users that would like to test the framework, but do not have an Ethoscope. Would it be possible to overcome this barrier and have coccinella run with any video data (and, thus, potentially be used to analyze data obtained from other animal models)?

      2- Readers who want background on the analytical approaches that the platform relies on following maximal velocity extraction will have to consult the original publications. In particular, the current manuscript does not provide much explanation on Highly Comparative Time-Series Classification (HCTSA) or SVM; this may be reasonable because the methods were developed earlier by others. While some readers may find that the lack of details increases the manuscript's readability, others may be left wanting to see more discussion on these not-so-trivial approaches. In addition, it is worth noting that the same authors that published the HCTSA method also described a shorter version named catch22, which runs faster with a similar output. Thus, explaining in more detail how HCTSA operates, considering it is a relatively new method, will make the method more convincing.

    1. Reviewer #1 (Public Review):

      Summary:

      This manuscript explores the multiple cell types present in the wall of murine-collecting lymphatic vessels with the goal of identifying cells that initiate the autonomous action potentials and contractions needed to drive lymphatic pumping. Through the use of genetic models to delete individual genes or detect cytosolic calcium in specific cell types, the authors convincingly determine that lymphatic muscle cells are the origin of the action potential that triggers lymphatic contraction.

      Strengths:

      The experiments are rigorously performed, the data justify the conclusions, and the limitations of the study are appropriately discussed.

      There is a need to identify therapeutic targets to improve lymphatic contraction and this work helps identify lymphatic muscle cells as potential cellular targets for intervention.

      Weaknesses:

      My only major comment would be that the manuscript provides a lot of rich information describing the cellular components of the muscular lymphatic vessel wall and that these data are not well represented by the title. The title (while currently accurate) could be tweaked to better represent all that is in this manuscript. Maybe something like "Characterization/Interrogation of the cellular components of murine collecting lymphatic vessels reveals that lymphatic muscle cells are the innate pacemaker cells regulating lymphatic contractions" or "Discovery/Confirmation of lymphatic muscle cells as innate pacemaker cells of lymphatic contraction through characterization of the cellular components of murine collecting lymphatic vessels". Potentially a cartoon summary figure of the components that make up the collecting lymphatic vessel wall could also be included. In my opinion, these changes will make this manuscript of more interest to a broader group of scientists. I have a few additional comments for consideration to improve the clarity and enhance the discussion of this work.

    2. Reviewer #2 (Public Review):

      Summary:

      This is a well-written manuscript describing studies directed at identifying the cell type responsible for pacemaking in murine-collecting lymphatics. Using state-of-the-art approaches, the authors identified a number of different cell types in the wall of these lymphatics and then using targeted expression of Channel Rhodopsin and GCaMP, the authors convincingly demonstrate that only activation of lymphatic muscle cells produces coordinated lymphatic contraction and that only lymphatic muscle cells display pressure-dependent Ca2+ transients as would be expected of a pacemaker in these lymphatics.

      Strengths:

      The use of targeted expression of channel rhodopsin and GCaMP to test the hypothesis that lymphatic muscle cells serve as the pacemakers in murine lymphatic collecting vessels.

      Weaknesses:

      The only significant weakness was the lack of quantitative analysis of most of the imaging data shown in Figures 1-11. In particular, the colocalization analysis should be extended to show cells not expected to demonstrate colocalization as a negative control for the colocalization analysis that the authors present.

    3. Reviewer #3 (Public Review):

      Summary:

      Zawieja et al. aimed to identify the pacemaker cells in the lymphatic collecting vessels. The authors have used various Cre-based expression systems and optogenetic tools to identify these cells. Their findings suggest these cells are lymphatic muscle cells that drive the pacemaker activity in the lymphatic collecting vessels.

      Strengths:

      The authors have used multiple approaches to test their hypothesis. Some findings are presented as qualitative images, while some quantitative measurements are provided.

      Weaknesses:

      - More quantitative measurements.<br /> - Possible mechanisms associated with the pacemaker activity.<br /> - Membrane potential measurements.

    1. Reviewer #1 (Public Review):

      The evolution of dioecy in angiosperms has significant implications for plant reproductive efficiency, adaptation, evolutionary potential, and resilience to environmental changes. Dioecy allows for the specialization and division of labor between male and female plants, where each sex can focus on specific aspects of reproduction and allocate resources accordingly. This division of labor creates an opportunity for sexual selection to act and can drive the evolution of sexual dimorphism.

      In the present study, the authors investigate sex-biased gene expression patterns in juvenile and mature dioecious flowers to gain insights into the molecular basis of sexual dimorphism. They find that a large proportion of the plant transcriptome is differentially regulated between males and females with the number of sex-biased genes in floral buds being approximately 15 times higher than in mature flowers. The functional analysis of sex-biased genes reveals that chemical defense pathways against herbivores are up-regulated in the female buds along with genes involved in the acquisition of resources such as carbon for fruit and seed production, whereas male buds are enriched in genes related to signaling, inflorescence development and senescence of male flowers. Furthermore, the authors implement sophisticated maximum likelihood methods to understand the forces driving the evolution of sex-biased genes. They highlight the influence of positive and relaxed purifying selection on the evolution of male-biased genes, which show significantly higher rates of non-synonymous to synonymous substitutions than female or unbiased genes. This is the first report (to my knowledge) highlighting the occurrence of this pattern in plants. Overall, this study provides important insights into the genetic basis of sexual dimorphism and the evolution of reproductive genes in Cucurbitaceae.

    2. Reviewer #2 (Public Review):

      Summary:

      This study uses transcriptome sequence from a dioecious plant to compare evolutionary rates between genes with male- and female-biased expression and distinguish between relaxed selection and positive selection as causes for more rapid evolution. These questions have been explored in animals and algae, but few studies have investigated this in dioecious angiosperms, and none have so far identified faster rates of evolution in male-biased genes (though see Hough et al. 2014 https://doi.org/10.1073/pnas.1319227111).

      Strengths:

      The methods are appropriate to the questions asked. Both the sample size and the depth of sequencing are sufficient, and the methods used to estimate evolutionary rates and the strength of selection are appropriate. The data presented are consistent with faster evolution of genes with male-biased expression, due to both positive and relaxed selection.

      This is a useful contribution to understanding the effect of sex-biased expression in genetic evolution in plants. It demonstrates the range of variation in evolutionary rates and selective mechanisms, and provides further context to connect these patterns to potential explanatory factors in plant diversity such as the age of sex chromosomes and the developmental trajectories of male and female flowers.

      Weaknesses:

      The presence of sex chromosomes is a potential confounding factor, since there are different evolutionary expectations for X-linked, Y-linked, and autosomal genes. Attempting to distinguish transcripts on the sex chromosomes from autosomal transcripts could provide additional insight into the relative contributions of positive and relaxed selection.

    3. Reviewer #3 (Public Review):

      The potential for sexual selection and the extent of sexual dimorphism in gene expression have been studied in great detail in animals, but hardly examined in plants so far. In this context, the study by Zhao, Zhou et al. represents a welcome addition to the literature.

      Relative to the previous studies in Angiosperms, the dataset is interesting in that it focuses on reproductive rather than somatic tissues (which makes sense to investigate sexual selection), and includes more than a single developmental stage (buds + mature flowers).

      Some aspects of the presentation have been improved in this new version of the manuscript. Specifically:

      - the link between sex-biased and tissue-biased genes is now slightly clearer,<br /> - the limitation related to the de novo assembled transcriptome is now formally acknowledged,<br /> - the interpretation of functional categories of the genes identified is more precise,<br /> - the legends of supplementary figures have been improved<br /> - a large number of typos have been fixed.

      However, overall the analyses are largely unchanged and the manuscript did not mature much in response to this first round of reviews. As I detail below, many of the relevant and constructive suggestions by the previous reviewers were not taken into account in this revision. For instance:

      - Reviewer 2 made precise suggestions for trying to take into account the potential confounding factor of sex-chromosomes. This suggestion was not followed.<br /> - Reviewers 1 & 3 indicated that results were mentioned in the discussion section without having been described before. This was not fixed in this new version.<br /> - Reviewer 1 asked for a comparison between the number of de novo assembled unigenes in this transcriptome and the number of genes in other Cucurbitaceae species. I could not see this comparison reported.<br /> - Reviewer 1 pointed out that permutation tests were more appropriate, but no change was made to the manuscript.<br /> - Reviewer 3 pointed out the small sample size (both for the RNA-seq and the phylogenetic analysis), but again this limitation is not acknowledged very clearly.<br /> - Reviewers 1 & 3 pointed out that Fig 3 was hard to understand and asked for clarifications that I did not see in the text, and the figure is unchanged.<br /> - Reviewer 3 suggested combining all genes with sex-biased expression when evaluating the evolutionary rate, in addition to the analyses already done. This suggestion was not followed.<br /> - Reviewer 3 pointed out that hand-picking specific categories of genes was not statistically valid, and in fact not necessary in the present context. This was not changed.<br /> - Reviewer 1 asked for all data to be public, but I could not find in the manuscript where the link to the data on ResearchGate was provided.<br /> - Reviewers 1 & 3 pointed out that since only two tissues were compared, the claims on pleiotropy should have been toned down, but no change was made to the text.<br /> - Reviewer 1 asked for a clarification on which genes are plotted on the heatmap of Fig 3C and an explanation of the color scale. No change was made.<br /> - Reviewer 1 asked for panel B in Fig 5 and 6 to be removed. They are still there. They asked for abbreviations to be explained in the legend of Fig S8. This was not done. They asked for details about column headers. Such details were not added. They asked for more recent references on lines 53-56: this was not done.
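      For readers unfamiliar with the permutation tests mentioned in the list above, a minimal sketch is given here. The review does not say which comparison the reviewer had in mind, so the two-group comparison of means, the function name and the example values are all hypothetical; the sketch only illustrates the general technique of building a null distribution by shuffling group labels.

```python
import random


def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Group labels are shuffled n_perm times; the p-value is the share of
    shuffles whose absolute mean difference is at least as large as the
    observed one (with the usual +1 correction).
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def abs_mean_diff(values):
        a, b = values[:n_a], values[n_a:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = abs_mean_diff(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs_mean_diff(pooled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up per-gene values for two categories, for illustration only.
    category_1 = [0.31, 0.28, 0.40, 0.35, 0.37, 0.33]
    category_2 = [0.22, 0.25, 0.20, 0.27, 0.24, 0.21]
    print(permutation_test(category_1, category_2))
```

      Because the null distribution comes from the data themselves, the test makes no assumption about the shape of the underlying distribution, which is typically why it is suggested for skewed quantities.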

    1. Reviewer #1 (Public Review):

      This study addresses the fundamental question of how the nucleotide, associated with the beta-subunit of the tubulin dimer, dictates the tubulin-tubulin interaction strength in the microtubule polymer. This problem has been a topic of debate in the field for over a decade, and it is essential for understanding microtubule dynamics.

      McCormick and colleagues focus their attention on two hypotheses, which they call the "self-acting" model and the "interface-acting" model. Both models have been previously discussed in the literature and they are related to the specific way in which GTP hydrolysis in the beta-tubulin subunit exerts an effect on the microtubule lattice. The authors argue that the two considered models can be discriminated based on a quantitative analysis of the sensitivity of the growth rates at the plus- and minus-ends of microtubules to the concentration of GDP-tubulins in mixed nucleotide (GDP/GMPCPP) experiments. By combining computational simulations and in vitro observations, they conclude that the tubulin-tubulin interaction strength is determined by the interfacial nucleotide.

      The major strength of the paper is a systematic and thorough consideration of GDP as a modulator of microtubule dynamics, which brings novel insights about the structure of the stabilizing cap on the growing microtubule end.

    2. Reviewer #2 (Public Review):

      In their manuscript, McCormick, Cleary et al., explore the question of how the nucleotide state of the tubulin heterodimer affects the interaction between adjacent tubulins. They use a solid combination of biochemical reconstitution assays and modeling to reveal that the nucleotide at the interface of two tubulin dimers determines the strength of the interaction between two dimers. Overall, the findings will be valuable to the field of microtubule biology.

    1. Reviewer #1 (Public Review):

      This work challenges previously published results regarding the presence and abundance of 6mA in the Drosophila genome, as well as the claim that the TET or DMAD enzyme serves as the "eraser" of this DNA methylation mark and its roles in development. This information is needed to clarify these questions in the field. I am less familiar with the biochemical approaches in this work, so my comments are mainly on the genetic analyses. Generally speaking, the methods for fly husbandry and treatment seem to be in accordance with those established in the field.

    2. Reviewer #2 (Public Review):

      DNA adenine methylation (6mA) is a rediscovered modification that has been described in a wide range of eukaryotes. However, the presence of 6mA in eukaryotes remains controversial due to the low abundance of this modification in eukaryotic genomes. In this manuscript, Boulet et al. re-investigate 6mA presence in Drosophila using axenic or conventional flies to avoid contaminants from feeding bacteria. Using these flies and LC/MS/MS, they find that 6mA is rare but present in the Drosophila genome. They also find that the loss of TET (also known as DMAD) does not impact 6mA levels in Drosophila, contrary to previous studies. In addition, the authors find that TET is required for fly development in an enzymatic activity-independent manner.

      The strength of this study is that, compared to previous studies of 6mA in Drosophila, the authors employed axenic or conventional flies for 6mA analysis. These fly strains make it possible to analyze 6mA presence in Drosophila without bacterial contaminants. The LC-MS/MS data on 6mA abundance in Drosophila presented in this manuscript are therefore more convincing than those of previous studies. Intriguingly, the authors find that the conserved iron-binding motif required for the catalytic activity of TET is dispensable for its function. This finding could be important to reveal TET function in organisms whose genomic 5mC levels are very low.

      The manuscript is well written but some aspects of data analysis and discussion need to be clarified and extended.<br /> 1) It is convincing that an increase in 6mA levels is not observed in TETnull presented in Fig1. But it seems 6mA levels are altered in Ax.TET1/2 compared with Ax.TETwt and Ax.TETnull presented in Fig1f (and also WT vs TET1/2 presented in Fig1g). Is it certain that no statistically significant difference was observed between Ax.TET1/2 and Ax.TETwt?<br /> 2) The representative data of the in vitro demethylation assay presented in Fig.3 are convincing, but why these results are contrary to previous reports (Yao et al., 2018 and Zhang et al., 2015) is not well discussed or analyzed.

    1. Reviewer #1 (Public Review):

      In this manuscript by DeHaro-Arbona et al., the authors wish to understand how a signaling pathway (Notch) is dynamically decoded to elicit a specific transcriptional output. In particular, they investigate the kinetic properties of Notch-responsive nuclear complexes (the DNA binding factor CSL and its co-activator Mastermind (mam) along with several candidate interacting partners). Their experimental model is the polytene chromosome of the Drosophila salivary gland, in which the naturally inactive Notch can be artificially induced through the expression of a constitutively active form of Notch.

      The authors develop a series of CRISPR and transgenic lines enabling the live imaging of these complexes at a specific locus and in various backgrounds (genetic perturbations/drug treatments). These quantitative live imaging data suggest that Notch nuclear complexes form hubs, and the authors characterize their binding dynamics. Interestingly, they elegantly demonstrate that the content of these hubs and their kinetic properties can evolve, even within Notch ON cells. Hence, they propose the existence of distinct hubs, distinguishing an open (CSL), engaged (CSL-Mam), or active (CSL-Mam-Med-PolII) configuration in Notch ON cells and an inactive hub state (in Notch OFF cells that have previously been exposed to Notch), which would explain the surprising transcriptional memory that the authors observe hours after Notch withdrawal.

    2. Reviewer #2 (Public Review):

      The manuscript from deHaro-Arbona et al, entitled "Dynamic modes of Notch transcription hubs conferring memory and stochastic activation revealed by live imaging the co-activator Mastermind", uses single molecule microscopy imaging in live tissues to understand the dynamics and molecular determinants of transcription factor recruitment to the E(spl)-C locus in Drosophila salivary gland cells under Notch-ON and -OFF conditions. Previous studies have identified the major players that are involved in transcription regulation in the Notch pathway, as well as the importance of general transcriptional coregulators, such as CBP/P300 and the Mediator CDK module, but the detailed steps and dynamics involved in these processes are poorly defined. The authors present a wealth of single molecule data that provides significant insights into Notch pathway activation, including:

      1. Activation complexes, containing CSL and Mam, have slower dynamics than the repressor complexes, containing CSL and Hairless.

      2. Contribution of CSL, NICD, and Mam IDRs to recruitment.

      3. CSL-Mam slow-diffusing complexes are recruited and form a hub of high protein concentrations around the target locus in Notch-ON conditions.

      4. Mam recruitment is not dependent on transcription initiation or RNA production.

      5. CBP/P300 or its associated HAT activity is not required for Mam recruitment.

      6. Mediator CDK module and CDK8 activity are required for Mam recruitment, and vice-versa, but not CSL recruitment.

      7. Mam is not required for chromatin accessibility but is dependent on CSL and NICD.

      8. CSL recruitment and increased chromatin accessibility persist after NICD removal and loss of Mam, which confers a memory state that enables rapid re-activation in response to subsequent Notch activation.

      9. Differences in the proportions of nuclei with both Pol II and with Mam enrichment, which results in transcription being probabilistic/stochastic. These data demonstrate that the presence of Mam-complexes is not sufficient to drive all the steps required for transcription in every Notch-ON nucleus.

      10. The switch from more stochastic to robust transcription initiation was elicited when ecdysone was added.

      Overall, the manuscript is well written, concise, and clear, and makes significant contributions to the Notch field, which are also important for a general understanding of transcription factor regulation and behavior in the nucleus. I recommend that the authors address my relatively minor criticisms detailed below.

      Page 7, bottom. The authors speculate, "It is possible therefore that, once recruited, Mam can be retained at target loci independently of CSL by interactions with other factors so that it resides for longer." Is it possible that another interpretation of that data is that Mam is a limiting factor?

      Page 9. The authors write, "A very low level of enrichment was evident for... for the CSL C-terminus..". The recruitment of CSL ct IDR does not appear to be statistically significant or there is no apparent difference (Figure S2C), suggesting the CSL ct IDR does not play a role in enrichment.

      Page 9. The authors write, "Notably, MamnIDR::GFP fusion was present in droplets, suggesting it can self-associate when present in a high local concentration (Figure S2B)." Is this result only valid for Mam nIDR or does full-length Mam also localize into droplets, as has been previously observed for full-length mammalian Maml1 in transfected cells?

      Previous studies in mammalian cells suggest that Maml1 is a high-confidence target for phosphorylation by CDK8, see Poss et al 2016 Cell Reports https://doi.org/10.1016/j.celrep.2016.03.030. By sequence comparison, does fly Mam have similar potential phosphorylation sites, and might these be critical for Mam/CDK module recruitment?

      Page 11: The authors write, "The differences in the effects on Mam and CSL imply that the CDK module is specifically involved in retaining Mam in the hub, and that in its absence other CSL complexes "win-out", either because the altered conditions favour them and/or because they are the more abundant." Are the "other" complexes the authors are referring to Hairless-containing complexes? With the reagents the authors have in hand couldn't this be explicitly shown for CSL-complexes rather than speculated upon?

      Page 12/13: The authors write, "Based on these results we propose that, after Notch activity decays, the locus remains accessible because when Mam-containing complexes are lost they are replaced by other CSL complexes (e.g. co-repressor complexes)." Again, why not actually test this hypothesis rather than speculate? The dynamics of Hairless complexes following the removal of Notch would be very interesting and build upon previously published results from the Bray lab.

      Page 13: The authors write, "As Notch removal leads to a loss of Mam, but not CSL, from the hub, it should recapitulate the effects of MamDN." While the data in Figure 5B seem to support this hypothesis, it's not clear to me that the loss of Mam and MamDN should phenocopy each other, because in the case of MamDN, NICD would still be present.

      The temporal dynamics for Mam recruitment using the temperature- and optogenetic-paradigms are quite different. For example, in the optogenetic time course experiments, the preactivated cells are in the dark for 4 hours, while in the temperature-controlled experiments, there is still considerable enrichment of Mam at 4 hours. For the preactivated optogenetic experiments, how sure are the authors that Mam is completely gone from the locus, and alternatively, can the optogenetic experimental results be replicated in the temperature-controlled assays? My concern is whether the putative "memory" observation is just due to incomplete Mam removal from the previous activation event.

    3. Reviewer #3 (Public Review):

      Summary:<br /> DeHaro-Arbona and colleagues investigate the in vivo dynamics of Notch-dependent transcriptional activation with a focus on the role of the Mastermind (MAM) transcriptional co-activator. They use GFP and HALO-tagged versions of the CSL DNA-binding protein and MAM to visualize the complex, and Int/ParB to visualize the site of Notch-dependent E(Spl)-C transcription. They make several conclusions. First, MAM accumulates at E(Spl)-C when Notch signaling is active, just like CSL. Second, MAM recruits the CDK module of Mediator but does not initiate chromatin accessibility. Third, after signaling is turned off, MAM leaves the site quickly but CSL and chromatin accessibility are retained. Fourth, RNA pol II recruitment, Mediator recruitment, and active transcription were similar and stochastic. Fifth, ecdysone enhances the probability of transcriptional initiation.

      Strengths:<br /> The conclusions are well supported by multiple lines of extensive data that are carefully executed and controlled. A major strength is the strategic combination of Drosophila genetics, imaging, and quantitative analyses to conduct compelling and easily interpretable experiments. A second major strength is the focus on MAM to gain insights into the dynamics of transcriptional activation specifically.

      Weaknesses:<br /> Weaknesses are minor. There were no p-values reported for data presented in Figure S1D and no indication of how variable measurements were. In addition, the discussion of stochasticity was not integrated optimally with relevant literature.

    1. Joint Public Review:

      The revised version of the manuscript "Delayed postglacial colonization of Betula in Iceland and the circum North Atlantic" by Harning et al. investigates the colonization of shrubs during the Late Pleistocene/Holocene in Northern America and Europe by comparing published sedimentary ancient DNA (sedaDNA) records (and pollen data) with a new sedaDNA record from Iceland. The manuscript aims to identify shrub colonization patterns, discusses their drivers and evaluates the importance of shrubification under future warming.

      The revised version improved the clarity of the methods and discussion, and the results presented are more convincing.

      However, parts of the methods (e.g. assessment of blanks and data filtering) and results (e.g. visualization of plant community data) could still be polished, and the figures should be improved to increase the clarity of the manuscript.

    1. Reviewer #1 (Public Review):

      Summary:

      This study generated 3D cell constructs from endometrial cell mixtures that were seeded in the Matrigel scaffold. The cell assemblies were treated with hormones to induce a "window of implantation" (WOI) state. Although many bioinformatic analyses point in this direction, there are major concerns that must be addressed.

      Strengths:

      The addition of 3 hormones to enhance the WOI state (although not clearly supported in comparison to the secretory state).

      Weaknesses:

      First of all, the term organoid must be discarded. The authors just seed the endometrial cell mixture which assembles and aggregates into a 3D structure which is then immediately used for analysis. Organoids grow from tissue stem cells and must be passage-able (see their own description in lines 69-71). So, the term organoid must be removed everywhere, to not confuse the organoid field. It is not shown that the whole 3D assembly is passageable, which would be very surprising given the fact that immune and stromal cells do not grow in Matrigel because of the unfavorable growing conditions (which are targeted to epithelial cell growth).

      Second, the study remains fully descriptive, bombarding the reader with a mass of bioinformatic analyses without clear descriptions and take-home messages. The paper is very dense, meaning readers may give up. Moreover, functional validation is missing, such as in vivo functionality (e.g. after transplantation) and embryo interaction; the morphological and immunostaining analyses are posed as "functional" but are in fact again only expression analyses. Importantly, the 3D structure lacks the right architecture with a lining luminal epithelium, which is present in the receptive endometrium in vivo and needed as the first contact site with the embryo. So, in contrast to what the authors claim, this is not the best model to study embryo interaction, or the closest model to the in vivo state (line 318, line 326).

      Third, receptive endometrial organoids (assembloids; Rawlings et al., eLife 2021) and receptive organoid-derived "open-faced endometrial layer" (Kagawa et al., Nature 2022) have already been described, which is in contrast to what the authors claim in several places that "they are the first" (e.g. lines 87-88, 316-319, etc). These studies used real organoids to achieve their model (and even showed embryo interaction), while in the present study, different cell types are just seeded and assembled. Hence, logically, immune cells are present which are never found in real organoid models. The only original aspect in the present study is the use of hormones to enhance the WOI phenotype. However, crucial information on this original aspect is missing such as concentration of the hormones, refreshment schedule, all 3 hormones added together or separately, and all 3 required?

      Moreover, it is not a "robust" model at all as the authors claim, given the variability of the initial cell mixture (varying from patient to patient). Actually, the reproducibility is not shown. The proportions of the different cell types seeded in the Matrigel droplet will be different with every endometrial biopsy. It would be much better to recombine epithelial (passageable) organoids with stromal and immune cells in a quantified, standardized manner to establish a "robust" model.

    2. Reviewer #2 (Public Review):

      A wide variety of assays are used to describe the new culture system and compare it both with those previously described and with the endometrial tissue itself. The three different cultures they used are control organoids (CTRL) cultured with described expansion media, secretory organoids (SEC, cultured with E2, MPA and cAMP inducing secretory phase as previously reported) and WOI organoids (cultured with E2, MPA, cAMP, prolactin (PRL), human chorionic gonadotropin (hCG) and human placental lactogen (hPL)). First, they performed morphological characterization of cultures using different antibodies, showing the presence of epithelial glandular cells and stromal cells, as well as their proliferation and absence of apoptosis. Glycogen secretion and progesterone receptor expression complete organoid characterization at the functional and hormone response levels respectively.

      Then, they performed single-cell transcriptomics to analyse its composition in terms of cell type, comparing with different databases, but with an unknown "n". They detect stromal, epithelial, and immune cells (also by microscopy), and analyse gene expression and transcription regulation, showing similarities between WOI organoids and mid-secretory endometrium. With endometrial receptivity analysis, they suggest a successful formation of the implantation window in vitro, but this result is difficult to interpret.

      Analyzing transcriptome and proteome information of WOI organoids, authors demonstrate a strong response to estrogen and progesterone, but some comparisons are made with CTRL and SEC, and others only with CTRL, which limits the power of some results. In the same way, some genes related to Cilia and pinopodes appear dominant in WOI organoids, but the comparison by electron microscopy is made only against CTRL organoids.

      In subsequent analysis, WOI organoids showed a more marked differentiation from proliferative to secretory epithelium, and from proliferative epithelium to EMT-derived stromal cells, than SEC organoids. These statements are based on their upregulation of monocarboxylic acid and lipid metabolism, their enhanced peptide metabolism and mitochondrial energy metabolism, or their pseudotime trajectories. However, other analyses (such as the accumulation of secretory epithelium or decreased proliferative epithelium, the increased ciliated epithelium after hormonal treatment, or the presence of EMT-derived stromal cells) show only small differences between SEC and WOI organoids.

      In summary, the development of an endometrial organoid culture methodology that allows targeting the endometrial situation in the window of implantation could change the experimental approaches of many studies, but more evidence is needed, and above all, more approaches on how different WOI organoids are from SEC organoids, to be sure if it is worth using them in implantation.

    1. Reviewer #1 (Public Review):

      Summary:<br /> TRAP transporters are an unusual class of secondary active transporters that utilize periplasmic binding proteins to deliver their substrates. This paper contributes a new 3 Å structure of the Haemophilus influenzae TRAP transporter. The structure joins two other recent cryo-EM structures of TRAP transporters, including a lower-resolution structure of the same H. influenzae protein (overall 4.7 Å), and a ~3 Å structure of a homologue from P. profundum. In addition to reporting a higher resolution cryo-EM structure, the authors also recapitulate protein activity in a reconstituted system, investigate protein oligomerization using analytic ultracentrifugation, and evaluate interactions and function in "mix and match" configurations with periplasmic subunits from other homologues.

      Strengths:<br /> The strength of the paper is that the better resolution cryo-EM data permits sidechain assignment, the identification of bound lipids, and the identification of sodium ions. It is important to get this structure out there since the resolution passes an important threshold for model-building accuracy. The current structure nicely explains a lot of prior mutagenesis data on the H. influenzae TRAP. This is also the first structure of a TRAP protein to be solved without a fiducial, although the overall structure is not very different from those solved with fiducials.

      Weaknesses:<br /> The experiments examining the monomer/dimer equilibrium appear somewhat preliminary. The biological or mechanistic importance of oligomerization is not established, so these experiments are inherently of limited scope. Moreover, cryo-EM datasets exhibit both parallel and antiparallel dimers, the latter of which are clearly not biologically relevant. It is probably impossible to distinguish these in the AUC experiments, which makes interpretation of these experiments more difficult.

      Similarly, the importance of the lipid binding sites observed in cryo-EM isn't experimentally established (for example by mutating the binding site) and it thus seems too preliminary to infer that they are important for function.

    2. Reviewer #2 (Public Review):

      Summary:<br /> In this manuscript, the membrane component of the sialic acid-specific TRAP transporter, SiaQM (HiSiaQM), from H. influenzae, is structurally characterized. TRAP transporters are substrate binding protein (SBP)-dependent secondary-active transporters, and HiSiaQM is the most comprehensively studied member of this family. While all previous work on fused TRAP transporter membrane proteins suggests that they are monomeric (including the previous structural characterization of HiSiaQM by a different group), a surprising finding from this work is the observation that HiSiaQM can form higher oligomers, consistent with it being a dimer. These higher oligomeric states were initially observed after extraction of the protein with LMNG detergent but were also observed in DDM detergent, amphipol and nanodiscs using analytical ultracentrifugation (AUC). Structural characterization of dimeric HiSiaQM revealed 2 arrangements, parallel and antiparallel arrangements, the latter of which is unlikely to be physiologically relevant.

      The higher resolution of this new structure of HiSiaQM (2.2-2.7 Å compared to 4.7 Å for the previous structure) facilitated the assignment of bound lipids at the dimer interface and a lipid molecule embedded in each of the protomers; allowed for a clearer refinement of the Na+ and putative substrate binding sites, which differ slightly from the previous structure; and produced better-modelled side chains for the residues involved in the SBP:HiSiaQM interaction. The authors developed a useful AUC-based assay to determine the affinity for this interaction revealing an affinity of 65 µM. Finally, the authors make the very interesting observation that a sialic acid-specific SBP from a different TRAP transporter can utilize HiSiaQM for transport, contrary to previous observations, revealing for the first time that TRAP membrane components can recognize multiple SBPs.

      Overall, this is a well-written and presented manuscript detailing some interesting new observations about this interesting protein family. One of the main findings, that the protein can form a dimer, is supported by data, but the physiological relevance of this is questionable, and the possibility that this is artefactual has not been ruled out. Conclusions regarding the mechanistic importance of the lipid-binding sites are not currently supported by the data.

      Strengths:<br /> The main strength of this work is the increased resolution of HiSiaQM, which allows for a much more precise assignment of side chains and their orientation. This will be of importance for subsequent mechanistic studies on the contributions of these residues to Na+ and sialic acid binding and conformational changes.

      The observation of the lipids, especially the lipid embedded near the fusion helix, is an intriguing observation, which lays the groundwork for future work to understand the lipid-dependence of these transporters. The development of the AUC-based approach to measure SBP affinity for the membrane component will likely prove useful to future studies.

      Weaknesses:<br /> One of the main results from this work is the observation that HiSiaQM can form a dimer. Two arrangements were observed, parallel and antiparallel, the latter of which is almost certainly physiologically irrelevant as it would preclude essential interactions with the extracytoplasmic substrate-binding protein. As acknowledged by the author, this non-physiological arrangement is likely a consequence of protein preparation (overexpression, extraction, purification, etc.). However, if one dismisses the antiparallel arrangement as non-relevant and an artefact of protein preparation, it is difficult for the parallel arrangement to maintain its credibility, as it was also processed in the same way. This is especially true when one considers that there is only 100 Å2 buried surface area in the parallel arrangement that does not involve any sidechains; it is difficult to envisage this as a specific interaction, e.g. compared to related proteins that have ~2000 Å2 buried surface area. Unless this dimerization is observed in a bacterial membrane at physiological protein concentrations, it is difficult to rule out the possibility that the observed dimerization is merely an artefact caused by the expression, purification and concentration of the protein.

      The manuscript contains some excellent structural analysis of this protein, whose higher resolution reveals some new and interesting insights. However, a weakness of the current work is a lack of validation of these observations using other approaches. For example, lipid interactions are observed in the structure that the authors claim are mechanistically important. However, without disrupting these interactions to look at the effect on transport, this conclusion is not supported. Similarly, the authors use their structure to predict residues that are important for the SBP:membrane protein interaction, and they develop an AUC-based binding assay to study this interaction, but they do not test their predictions using this approach.

    3. Reviewer #3 (Public Review):

      Summary:<br /> The manuscript reports new molecular characterization of the Haemophilus influenzae tripartite ATP-independent periplasmic (TRAP) transporter of N-acetylneuraminate (Neu5Ac). This membrane transporter is important for the virulence of the pathogen. H. influenzae lacks a Neu5Ac biosynthetic pathway and utilizes the TRAP transporter to import it. Neu5Ac is used as a nutrient source but also as a protection from the human immune response. The transporter is composed of two fused membrane subunits, HiSiaQM, and one soluble, periplasmic subunit HiSiaP. HiSiaP, by binding to the substrate Neu5Ac, changes its conformation, allowing its binding to HiSiaQM, followed by Neu5Ac and Na+ transport to the cytoplasm. The combination of structural, biophysical and biochemical approaches provides a solid basis for describing the functioning of the Haemophilus influenzae Neu5Ac TRAP transporter, which is essential for the pathogen virulence.

      Strengths:<br /> The paper describes the electron microscopy structure of HiSiaQM, thanks to its solubilization in L-MNG followed by the exchange to amphipol or nanodisc. In these conditions, HiSiaQM consists of a mixture of monomers and dimers, as characterized by analytical ultracentrifugation. The cryo-EM analysis shows two types of dimers: one in an antiparallel configuration, which is artifactual, and a parallel one, which may be physiologically relevant. Cryo-EM on the dimers allows high-resolution (≈ 3 Å) structure determination. The structure is the first one of a fused SiaQM, and is the first obtained without megabody. The work highlights structural elements (fusion helix, lipids) that could modulate transport. The authors checked the functionality of the purified HiSiaQM, which, after reconstitution in liposome, displays a significantly larger Neu5Ac transport activity compared to the non-fused PpSiaQM homolog. The work identifies Na+ binding sites, and the putative Neu5Ac binding site. From analytical ultracentrifugation using fluorescently labelled HiSiaP, the authors show that HiSiaP is able to interact with HiSiaQM monomer and dimer, with a low but physiologically relevant affinity. HiSiaP interaction with HiSiaQM was modelled using AlphaFold2, and discussed in view of published activity on mutants, and new transport activity assays using SiaQM and SiaP from different organisms. In conclusion, the combination of structural, biophysical and biochemical approaches provides a solid basis for describing the functioning of this TRAP fused transporter.

      Weakness:<br /> This work provides in vitro evidence of a HiSiaQM dimer, whose in vivo relevance is not ascertained. However, the authors are very careful not to over-interpret their data, and their conclusions regarding the transporter structure and function are valid irrespective of its state of association.

    1. Reviewer #1 (Public Review):

      Summary:<br /> Parkinson and colleagues address an interesting and important question, i.e. whether the bumblebee Bombus terrestris can perceive field-realistic concentrations of different pesticides in a sugar solution mimicking nectar. The study directly follows up on a previous study conducted by the same team (Kessler et al. 2015, Nature), which was partly questioned by another more recent study (Arce et al. 2018, Proc. R. Soc. B). The authors apply a combination of electrophysiological measurements and behavioral feeding tests to answer this question. Their results strongly suggest that B. terrestris workers are not able to perceive field-realistic doses of pesticides in a sugar solution. They additionally show that B. terrestris can physiologically differentiate between solutions varying in sugar composition.

      Strengths:<br /> Sophisticated methodology, combination of approaches, clear and precise language. The stats questions have been addressed to my satisfaction. In terms of interpretation, however, several suggestions and comments were provided from an ecological perspective, which was deemed important, while the authors have expressed their intent to concentrate on the electrophysiological mechanism. Given that this study was motivated by conflicting results from earlier research, which were frequently employed to discuss the authors' findings, I still find that the discussion needs to be expanded in order to encompass a wider context.

    2. Reviewer #2 (Public Review):

      Summary:<br /> This manuscript is part of the Wright lab's ongoing studies that investigate whether the bumblebee B. terrestris can detect the presence of pesticides when feeding. Previously, they showed that B. terrestris cannot detect neonicotinoids and would prefer food containing neonicotinoids (Kessler et al. 2015). However, in that paper, they showed that B. terrestris cannot taste neonicotinoids but did not provide evidence on why B. terrestris prefer food containing neonicotinoids. In the current paper, the authors continue to suggest that B. terrestris cannot taste neonicotinoids as well as another insecticide, sulfoxaflor, based on additional behavioral experiments and electrophysiological experiments focusing on specific GRNs. While the data from these experiments continue to suggest that B. terrestris cannot taste these insecticides using their mouthparts, whether B. terrestris can actually perceive these insecticides, and why this species prefers food containing these compounds remains unknown.

      Strengths:<br /> The authors provided additional evidence that B. terrestris cannot taste neonicotinoids with their mouthparts. The authors have addressed my concerns regarding overgeneralization and that parts of the manuscript were written in a way that sounded combative with studies from other groups that had come to slightly different conclusions from their previous paper.

    1. Reviewer #1 (Public Review):

      The article "A randomized multiplex CRISPRi-Seq approach for the identification of critical combinations of genes" describes the development of a multiplex randomized CRISPRi screening method, which the authors named MuRCiS, and its application to study the redundancy of L. pneumophila virulence factors. The authors used an L. pneumophila strain carrying dCas9 on the chromosome, which they had constructed for a recently published CRISPRi screen, and here combined it with self-assembly randomized multiplex CRISPR arrays that they developed. The strains carrying the dCas9 and the different CRISPRi arrays were used to infect U937 or Acanthamoeba castellanii cells, and the intracellular growth phenotypes were recorded as a readout. This allowed the authors to identify certain gene combinations that induced a growth defect in either or both of the cell types tested when knocked down together, but not when knocked down alone. A particular gene combination caught their attention: in U937 cells, lpg2888 and lpg3000 induced a growth defect only when both were knocked down, whereas in A. castellanii cells knockdown of lpg3000 alone was sufficient to cause a growth defect.

      The concept of using CRISPRi to look at functional redundancy in effectors is very useful to the Legionella field and to other fields where biological redundancy limits studies. It has the potential to uncover virulence effectors of importance that have not been described before.

      Comments on revised version: In this revised version the authors have answered our concerns satisfactorily except the point related to the use of only one guide per gene.

    2. Reviewer #2 (Public Review):

      The study by Ellis et al. documents the development of a CRISPR interference (CRISPRi) screen aiming at identifying virulence-critical genes of Legionella pneumophila, the facultative intracellular bacterium causing Legionnaires' disease. L. pneumophila employs the Dot/Icm type IV secretion system to translocate more than 300 different "effector proteins" into host cells. Many effector proteins appear to have redundant functions, and therefore, depleting several of them is required to observe a strong intracellular replication phenotype. In the current study, Ellis et al. develop a "multiplex, randomized CRISPRi sequencing" (MuRCiS) approach to silence several effector genes simultaneously and randomly, thereby possibly causing synthetic lethality for L. pneumophila upon infection of host cells.

      The MuRCiS approach comprises the ligation of different CRISPR spacers flanked by repeats in the presence of "dead end" oligonucleotide pairs capping a random array of building blocks to be inserted into a library vector. Thus, spacer arrays with an average of 3.3 spacers per array were obtained. As a proof-of-concept, spacer arrays targeting 44 transmembrane effector-encoding L. pneumophila genes were employed to screen for intracellular growth defects in macrophages and amoebae. Consequently, novel pairs of synergistically functioning effector genes were identified by comparative next-generation sequencing of the input and output pools of spacer arrays.

      A major strength of this well-written and straightforward study is the construction and use of random and multiplexed CRISPRi arrays, allowing an unbiased and comprehensive screen for multiple genes affecting the intracellular growth of L. pneumophila. The ingenious approach established by Ellis et al. will be useful for further genetic analysis of L. pneumophila infection and might also be adopted for other pathogens employing a large set of (functionally redundant) virulence factors.<br /> The reviewer's suggestion to test the single and double L. pneumophila effector mutant strains for growth in protozoa other than A. castellanii was considered beyond the scope of the current manuscript describing the optimization of the MuRCiS platform. The authors have satisfactorily addressed the minor points raised previously.

    1. Joint Public Review:

      This concise review provides a clear and instructive picture of the state-of-the-art understanding of protein kinases' activity and sets of approaches and tools to analyse and regulate it.

      Three major parts of the work include: methods to map allosteric communications, tools to control allostery, and allosteric regulation of protein kinases. The work provides an important and timely view of the current status of our understanding of the function of protein kinases and state-of-the-art methods to study its allosteric regulation and to develop allosteric approaches to control it.

    1. Reviewer #1 (Public Review):

      The objective of this investigation was to determine whether experimental pain could induce alterations in cortical inhibitory / facilitatory activity observed in TMS-evoked potentials (TEPs). Previous TMS investigations of pain perception had focused on motor evoked potentials (MEPs), which reflect a combination of cortical, spinal, and peripheral activity, and had restricted their focus to M1. The main strength of this investigation is the combined use of TMS and EEG in the context of experimental pain. More specifically, Experiment 1 investigated whether acute pain altered cortical excitability, reflected in the modulation of TEPs. The main outcome of this study is that relative to non-painful warm stimuli, painful thermal stimuli led to an increase in the amplitude of the TEP N45, with a larger increase associated with higher pain ratings. Because it has been argued that a significant portion of TEPs could reflect auditory potentials elicited by the sound (click) of the TMS, Experiment 2 constituted a control study that aimed to disentangle the cortical response related to TMS and auditory activity. Finally, Experiment 3 aimed to disentangle the cortical response to TMS and reafferent feedback from muscular activity elicited by suprathreshold TMS applied over M1. The fact that the authors accompanied their main experiment with two control experiments strengthens the conclusion that the N45 TEP peak could be implicated in the perception of painful stimuli. Perhaps the addition of a highly salient but non-painful stimulus (i.e. from another modality) would have further ruled out the possibility that the effects on the N45 are predominantly related to the intensity / saliency of the stimulus rather than to pain per se.

    2. Reviewer #3 (Public Review):

      The present study aims to investigate whether pain influences cortical excitability. To this end, heat pain stimuli are applied to healthy human participants. Simultaneously, TMS pulses are applied to M1 and TMS-evoked potentials (TEPs) and pain ratings are assessed after each TMS pulse. TEPs are used as measures of cortical excitability. The results show that TEP amplitudes at 45 msec (N45) after TMS pulses are higher during painful stimulation than during non-painful warm stimulation. Control experiments indicate that auditory, somatosensory, or proprioceptive effects cannot explain this effect. Considering that the N45 might reflect GABAergic activity, the results suggest that pain changes GABAergic activity. The authors conclude that TEP indices of GABAergic transmission might be useful as biomarkers of pain sensitivity.

      Pain-induced cortical excitability changes are an interesting, timely, and potentially clinically relevant topic. The paradigm and the analysis are sound, the results are convincing, and the interpretation is adequate. The findings will be of interest to researchers interested in the brain mechanisms of pain.

    1. Reviewer #2 (Public Review):

      The authors present the OpenApePose database constituting a collection of over 70000 ape images which will be important for many applications within primatology and the behavioural sciences. The authors have also rigorously tested the utility of this database in comparison to available Pose image databases for monkeys and humans to clearly demonstrate its solid potential. However, the variation in the database with regards to individuals, background, source/setting is not clearly articulated and would be beneficial information for those wishing to make use of this resource in the future.

    1. Reviewer #1 (Public Review):

      Summary:<br /> The study used the sci-Plex system to perform an in vitro screen of chemicals and found that two compounds improved the reprogramming efficiency in Ascl1-overexpressing MG (Müller glia). In addition, administration of the identified compounds in the previously established in vivo model (Ascl1, NMDA, TSA) showed that DBZ and metformin increased Otx2+ cells for improved neurogenesis.

      Strengths: The overall study was straightforward and well designed. The method in the study could be potentially useful for large-scale in vitro screens for compounds to further improve reprogramming efficiency. The data and results of the study are of good quality.

      Weaknesses: The findings may not generate significant interest for two main reasons. One, the compounds only increased the population of bipolar neurons but did not generate new retinal neuronal types compared to the earlier methods, and the reprogramming efficiency may not be as high as other earlier strategies such as overexpression of Ascl1 plus Atoh1 reported from the same group. Two, the study produced some interesting initial discoveries but was quite descriptive overall, and was weak on both in-depth analysis and mechanistic examination.

    2. Reviewer #2 (Public Review):

      Summary:

      In the current manuscript, Tresenrider et al. present their recent study focusing on screening of small molecules to enhance the conversion from Müller cells (MG) to retinal neurons induced by ectopic Ascl1 expression.

      Strengths:

      To analyze results from multiple treatment conditions in a single experiment, the authors employed a method called sci-Plex to perform scRNA-seq on mixed samples to investigate the effects of different durations of Ascl1 expression and screen for potential small molecules to promote reprogramming. Ultimately, they identified two compounds with intended activities on mouse retina. The findings may aid in future development of a cell replacement strategy for treating retinal degeneration.

      Weaknesses:

      The mechanistic insights are limited. Certain claims are confusing or superficial at this point, as detailed in issues/concerns.

    1. Reviewer #1 (Public Review):

      Summary:<br /> Herein, Blaeser et al. explored the impact of migraine-related cortical spreading depression (CSD) on the calcium dynamics of meningeal afferents that are considered the putative source of migraine-related pain. Critically, previous studies have identified widespread activation of these meningeal afferents following CSD; however, most studies of this kind have been performed in anesthetized rodents. By conducting a series of technically challenging calcium imaging experiments in conscious head-fixed mice, they find, in contrast, that a much smaller proportion of meningeal afferents are persistently activated following CSD. Instead, they identify that post-CSD responses are differentially altered across a wide array of afferents, including increased and decreased responses to mechanical meningeal deformations and activation of previously non-responsive afferents following CSD. Given that migraine is characterized by worsening head pain in response to movement, the findings offer a potential mechanism that may explain this clinical phenomenon.

      Strengths:<br /> Using head-fixed conscious mice overcomes the limitations of anesthetized preps and the potential impact of anaesthesia on meningeal afferent function, which facilitated novel results when compared to previous anesthetized studies. Further, the authors used a closed cranial window preparation to maximize normal physiological states during recording, although the introduction of a needle prick to induce CSD will have generated a small opening in the cranial preparation, rendering it not fully closed as suggested.

      Weaknesses:<br /> Although this is a well conducted technically challenging study that has added valuable knowledge on the response of meningeal afferents the study would have benefited from the inclusion of more female mice. Migraine is a female dominant condition and an attempt to compare potential sex-differences in afferent responses would undoubtedly have improved the outcome.

      The authors imply that the current method shows clear differences when compared to older anaesthetized studies; however, many of these were conducted in rats and relied on recording from the trigeminal ganglion. Inclusion of a subgroup of anesthetized mice in the current preparation may have helped to answer the outstanding question of whether the differences are species dependent or a result of the different technical approaches.

      The authors discuss meningeal deformations as a result of locomotion; however, despite referring to their previous work (Blaeser et al., 2022), the exact method of how these deformations were measured could be clearer. It is challenging to imagine that simple locomotion would induce such deformations, and the one reference in the introduction refers to straining, such as a cough that may induce intracranial hypertension, which is likely a more powerful stimulus than locomotion.

    2. Reviewer #2 (Public Review):

      This is an interesting study examining the question of whether CSD sensitizes meningeal afferent sensory neurons leading to spontaneous activity or whether CSD sensitizes these neurons to mechanical stimulation related to locomotion. Using two-photon in vivo calcium imaging based on viral expression of GCaMP6 in the TG, awake mice on a running wheel were imaged following CSD induction by cortical pinprick. The CSD wave evoked a rise in intracellular calcium in many sensory neurons during the propagation of the wave, but several patterns of afferent activity developed after the CSD. The minority of recorded neurons (10%) showed spontaneous activity while slightly larger numbers (20%) showed depression of activity; the latter pattern developed earlier than the former. The vast majority of neurons (70%) were unaffected by the CSD. CSD decreased the time spent running and the number of bouts per minute, but each bout was unaffected by CSD. There also was no influence of CSD on the parameters referred to as meningeal deformation, including scale, shear, and Z-shift. Using GLM, the authors then determine that there is an increase in locomotion/deformation-related afferent activity in 51% of neurons, a decrease in 12% of neurons, and no change in 37%. GLM coefficients were increased for deformation-related activity but not locomotion-related activity after CSD. There also was an increase in afferents responsive to locomotion/deformation following CSD that were previously silent. This study shows that, unlike prior reports, CSD does not lead to spontaneous activity in the majority of sensory neurons but that it increases sensitivity to mechanical deformation of the meninges. This has important implications for headache disorders like migraine, where CSD is thought to contribute to the pathology in unclear ways, with this new study suggesting that it may lead to the increased mechanical sensitivity characteristic of migraine attacks.

    3. Reviewer #3 (Public Review):

      Summary:<br /> Blaeser et al. set out to explore the link between CSD and headache pain. How does an electrochemical wave in the brain parenchyma, which lacks nociceptors, result in pain and allodynia in the V1-3 distribution? Prior work had established that CSD increased the firing rate of trigeminal neurons, measured electrophysiologically at the level of the peripheral ganglion. Here, Blaeser et al. focus on the fine afferent processes of the trigeminal neurons, resolving Ca2+ activity of individual fibers within the meninges. To accomplish these experiments, the authors injected AAV encoding the Ca2+-sensitive fluorophore GCaMP6s into the trigeminal ganglion, and 8 weeks later imaged fluorescence signals from the afferent terminals within the meninges through a closed cranial window. They captured activity patterns at rest, with locomotion, and in response to CSD. They found that mechanical forces due to meningeal deformations during locomotion (shearing, scaling, and Z-shifts) drove non-spreading Ca2+ signals throughout the imaging field, whereas CSD caused propagating Ca2+ signals in the trigeminal afferent fibers, moving at the expected speed of CSD (3.8 mm/min). Following CSD, there were variable changes in basal GCaMP6s signals: these signals decreased in the majority of fibers, signals increased (after a 25 min delay) in other fibers, and signals remained unchanged in the remainder of fibers. Bouts of locomotion were less frequent following CSD, but when they did occur, they elicited more robust GCaMP6s signals than pre-CSD. These findings advance the field, suggesting that headache pain following CSD can be explained on the basis of peripheral cranial nerve activity, without invoking central sensitization at the brain stem/thalamic level. This insight could open new pathways for targeting the parenchymal-meningeal interface to develop novel abortive or preventive migraine treatments.

      Strengths:<br /> The manuscript is well-written. The studies are broadly relevant to neuroscientists and physiologists, as well as neurologists, pain clinicians, and patients with migraine with aura and acephalgic migraine. The studies are well-conceived and appear to be technically well-executed.

      Weaknesses:<br /> 1) Lack of anatomic confirmation that the dura was intact in these studies: it is notoriously challenging to create a cranial window in mouse skull without disrupting or even removing the dura. It was unclear which meningeal layers were captured in the imaging plane. Did the visualized trigeminal afferents terminate in the dura, subarachnoid space, or pia (as suggested by Supplemental Fig 1, capturing a pial artery in the imaging plane)? Were z-stacks obtained, to maintain the imaging plane, or to follow visualized afferents when they migrated out of the imaging plane during meningeal deformations?<br /> 2) Findings here, from mice with chronic closed cranial windows, failed to fully replicate prior findings from rats with acute open cranial windows. While the species, differing levels of inflammation and intracranial pressure in these two preparations may contribute, as the authors suggested, the modality of measuring neuronal activity could also contribute to the discrepancy. In the present study, conclusions are based entirely on fluorescence signals from GCaMP6s, whereas prior rat studies relied upon multiunit recordings/local field potentials from tungsten electrodes inserted in the trigeminal ganglion. As a family, GCaMP6 fluorophores are strongly pH dependent, with decreased signal at acidic pH values (at matched Ca2+ concentration). CSD induces an impressive acidosis transient, at least in the brain parenchyma, so one wonders whether the suppression of activity reported in the wake of CSD (Figure 2) in fact reflects decreased sensitivity of the GCaMP6 reporter, rather than decreased activity in the fibers. If intracellular pH in trigeminal afferent fibers acidifies in the wake of CSD, GCaMP6s fluorescence may underestimate the actual neuronal activity.

    1. Reviewer #1 (Public Review):

      Summary:<br /> This study examined the impact of exogenous microapplication of acetylcholine (ACh) on metrics of novelty detection in the anesthetized rat auditory cortex. The authors found that the majority of units showed some degree of modulation of novelty detection, with roughly similar numbers showing enhanced novelty detection, suppressed novelty detection, or no change. Enhanced novelty responses were driven by increases in repetition suppression. Suppressed novelty responses were driven by deviance suppression. There were no compelling differences seen between auditory cortical subfields or layers, though there was heterogeneity in the ACh effects within subfields. Overall, these findings are important because they suggest that fluctuations in cortical ACh, which are known to occur during changes in arousal or attentional states, will likely influence the capacity of individual auditory cortical neurons to respond to novel stimuli.

      Strengths:<br /> The work addresses an important problem in auditory neuroscience. The main strengths of the study are that the work was systematically done with appropriate controls (cascaded stimuli) and utilizes a classical approach that ensures that drug application is isolated to the micro-environment of the recorded neuron. In addition, the authors do not restrict their study to the primary auditory cortex, but examine the impact of ACh across all known auditory cortical subfields.

      Weaknesses:<br /> 1. As acknowledged by the authors, this study explicitly examines a phenomenon of high relevance to active listening but is done in anesthetized animals, limiting its applicability to the waking state.<br /> 2. The authors do not make any attempt to determine, by spike shape/duration, if their units are excitatory or inhibitory, which may explain some of the variance of the data.<br /> 3. The application of exogenous ACh, potentially in supra-physiological amounts, makes this study hard to extrapolate to a behaving animal. A more compelling design would be to block ACh, especially at specific receptor types, to determine the effect of endogenous ACh.

    2. Reviewer #2 (Public Review):

      Summary:<br /> In this study, the authors investigate the effect of ACh on neuronal responses in the auditory cortex of anesthetized rats during an auditory oddball task. The paradigm consisted of two pure tones (selected from the frequency responses at each recording site) presented in a pseudo-random sequence. One tone was presented frequently (the "standard" tone) and the other infrequently (the "deviant" tone). The authors found that ACh enhances the detection of unexpected stimuli in the auditory environment by increasing or decreasing the neuronal responses to deviant and standard tones.

      Strengths:<br /> The study includes the use of appropriate and validated methodology in line with the current state-of-the-art, rigorous statistical analysis, and the demonstration of the effects of acetylcholine on auditory processing.

      Weaknesses:<br /> The study was conducted in anesthetized rats, and further research is needed to determine the behavioral relevance of these findings.

    1. Reviewer #1 (Public Review):

      Summary: The authors study the effects of myelin alterations on working memory via the complementary use of two computational approaches: one based on de- and re-myelination in multicompartmental models of pyramidal neurons, and one based on synaptic changes in a spiking bump attractor model for spatial working memory. The first model provides the most precise angle (biophysically speaking) on the different effects (loss of myelin lamellae or segments, remyelination with thinner and shorter nodes, etc.), while the second model allows the authors to infer the consequences of myelin alterations for working memory performance, including memory stability, duration, and bump diffusion. The results indicate (i) a slowing down and failure of propagation of spikes with demyelination and partial recovery with remyelination, with detailed predictions on the role of nodes and myelin lamellae, and (ii) a decrease in memory duration and an increase in memory drift as a function of the demyelination, in agreement with multiple experimental studies.

      Strengths: Overall, the work offers a very interesting approach to a topic that is hard to address experimentally; the computational take is therefore entirely justified and extremely useful. The authors carefully designed the computational experiments to shed light on the effects of demyelination on working memory from multiple levels of description, increasing the reliability of their conclusions. I think this work is solid and has the potential to be influential in future studies of myelin alterations (and related disorders such as multiple sclerosis).

      Weaknesses: In its current form, the study still presents several issues which prevent it from achieving a higher potential impact. These can be summarized in two main items. First, the manuscript is missing some important details about how demyelination and remyelination are incorporated in both models (and what is the connection between both implementations). For example, it is unclear whether an unperturbed axon and a fully remyelinated axon would be mathematically equivalent in the multicompartment model, or how the changes in the number of nodes, myelin lamella, etc, are implemented in the spiking neural network model. Second, it is unclear whether some of the conclusions are strong computational predictions or just a consequence of the model chosen. For example, the lack of effect of decreasing the conduction velocity on working memory performance could be due to the choice of considering a certain type of working memory model (continuous attractor), and therefore be absent under other valid assumptions (i.e. a silent working memory model, which has a higher dependence on temporal synaptic dynamics).

      With additional simulations to address these issues, I consider that the present study would become a convincing milestone in the computational modeling of myelin-related models, and an important study in the field of working memory.

    2. Reviewer #2 (Public Review):

      This paper analyzes the effect of axon de-myelination and re-myelination on action potential speed and propagation failure. The findings are then incorporated in a standard spiking ring attractor model of working memory.

      I think the results are not very surprising or solid and there are issues with method and presentation.<br /> The authors did many simulations with random parameters, then averaged the result, and found for instance that the Conduction Velocity drops in demyelination. It gives the reader little insight into what is really going on. My personal preference is for a well understood simple model rather than a poorly understood complex model. The link between the model outcome of WM and data remains qualitative, and is further weakened by the existence of known other age-related effects in PFC circuits.

      * For both de- and re-myelination, the spatial patterns are fully random. Why is this justified?<br /> * Similarly, the myelin parameters were drawn from uniform distributions (Table 1, I guess). Again, why is this reasonable?

      * The focus of most analysis is on the conduction velocity but in the end, this has no effect on WM, so the discussion of CV remains sterile.

      * The more important effect of de/re myelination is on failure.<br /> However, the failure is, AFAIK, just characterized by a constant current injection of 380pA.<br /> From Fig 2 it seems however that the first spike is particularly susceptible to failure.<br /> In other words, it has not been justified that it is fine to use the failure rates from this artificial protocol in the I&F model. I would expect the temporal current trace to affect whether the propagation fails or not.<br /> I don't know if there are many axon-collaterals in the WM circuits and/or distance dependence in the connectivity, but if so, then the current implementation of failure would be questionable.<br /> I would also advise against thresholding at 75% failure in Fig3C. Why don't the authors simply plot the failure rate?

      Regarding the presentation, there are a number of dead-end results that are not used further on. The paper is rather extensive, and it would be clearer if written up in half the space. In addition, much information is really supplementary. The issue of the CV I already mentioned, also the Lasso regression for instance remains unused.

    1. Reviewer #1 (Public Review):

      Lines et al. provide evidence for a sequence of events in vivo in adult anesthetized mice that begin with a foot-shock driving activation of neural projections into layer 2/3 somatosensory cortex, which in turn triggers a rise in calcium in astrocytes within "domains" of their "arbor". The authors segment the astrocyte morphology based on SR101 signal and show that the timing of "arbor" Ca2+ activation precedes somatic activation and that somatic activation only occurs if ≥22.6% of the total segmented astrocyte "arbor" area is active. Thus, the authors frame this ≥22.6% activation as a spatial property (spatial threshold) with certain temporal characteristics - i.e., it must occur before soma and global activation. The authors then elaborate on this spatial threshold by providing evidence for its intrinsic nature: it is not set by the level of neuronal stimulus and is dependent on whether IP3R2, which drives Ca2+ release from the endoplasmic reticulum (ER) in astrocytes, is expressed. Lastly, the authors suggest a potential physiologic role for this spatial threshold by showing ex vivo how exogenous activation of layer 2/3 astrocytes by ATP application can gate glutamate gliotransmission to layer 2/3 cortical neurons - with a strong correlation between the number of active astrocyte Ca2+ domains and the slow inward current (SIC) frequency recorded from nearby neurons as a readout of glutamatergic gliotransmission. This is interesting and would potentially be of great interest to readers within and outside the glia research community, especially in how the authors have tried to systematically deconstruct some of the steps underlying signal integration and propagation in astrocytes. Many of the conclusions posited by the authors are potentially important but we think their approach needs experimental/analytical refinement and elaboration.

      The primary issue for us, and which we would encourage the authors to address, relates to the low spatial-temporal resolution of their approach. This issue does not necessarily compromise the concept of a spatial threshold, but more refined observations and analyses are likely to provide more reliable quantitative parameters and a more comprehensive view of the mode of Ca2+ signal integration in astrocytes. For this reason, and because their observations might be perceived as both a conceptual and numerical standard in the field, we believe that the authors should proceed with both experimental and analytical refinement. Notably, we have difficulty with the reported mean delays of astrocyte Ca2+ elevations upon sensory stimulation. The 11s delay for response onset in the "arbor" and 13s in the soma are extremely long, and we do not think they represent a true physiologic latency for astrocyte responses to the sensory activity. Indeed, such delays appear to be slower even than those reported in the initial studies of sensory stimulation in anesthetized mice with limited spatial-temporal resolution (Wang et al. Nat Neurosci., 2006) - not to mention more recent and refined ones in awake mice (Stobart et al. Neuron, 2018) that identified even sub-second astrocyte Ca2+ responses, largely preserved in IP3R2KO mice. Thus, we are inclined to believe that the slowness of responses reported here is an indicator of experimental/analytical issues. There can be several explanations of such slowness that the authors may want to consider for improving their approach: (a) The authors apparently use low zoom imaging for acquiring signals from several astrocytes present in the FOV: do all of these astrocytes respond homogeneously in terms of delay from sensory stimulus? Perhaps some are faster responders than others and only this population is directly activated by the stimulus. Others could be slower in activation because they respond secondarily to stimuli. 
In this case, the authors could focus their analysis specifically on the "fast-responding population". (b) By focusing on individual astrocytes and using higher zoom, the authors could unmask more subtle Ca2+ elevations that precede those reported in the current manuscript. These signals have been reported to occur mainly in regions of the astrocyte that are GCaMP6-positive but SR101-negative and constitute a large percentage of its volume (Bindocci et al., 2017). By restricting analysis to the SR101-positive part of the astrocyte, the authors might miss the fastest components of the astrocyte Ca2+ response likely representing the primary signals triggered by synaptic activity. It would be important if they could identify such signals in their records, and establish if none/few/many of them propagate to the SR-101-positive part of the astrocyte. In other words, if there is only a single spatial threshold, the one the authors reported, or two or more of them along the path of signal propagation towards the cell soma that leads eventually to the transformation of the signal into a global astrocyte Ca2+ surge. In this context, there is another concept that we encourage the authors to better clarify: whether the spatial threshold that they describe is constituted by the enlargement of a continuous wavefront of Ca2+ elevation, e.g. in a single process, that eventually reaches 22.6% of the segmented astrocyte, or can it also be constituted by several distinct Ca2+ elevations occurring in separate domains of the arbor, but overall totaling 22.6% of the segmented surface? Mechanistically, the latter would suggest the presence of a general excitability threshold of the astrocyte, whereas the former would identify a driving force threshold for the centripetal wavefront. In light of the above points, we think the authors should use caution in presenting and interpreting the experiments in which they use SIC as a readout. 
Their results might lead some readers to bluntly interpret the 22.6% spatial threshold as the threshold required for the astrocyte to evoke gliotransmitter release. Indeed, SICs are robust signals recorded somatically from a single neuron and likely integrate activation of many synapses all belonging to that neuron. On the other hand, an astrocyte impinges on a myriad of synapses belonging to several distinct neurons. In our opinion, it is quite possible that more local gliotransmission occurs at lower Ca2+ signal thresholds (see above) that may not be efficiently detected by using SICs as a readout; a more sensitive approach, such as the use of a gliotransmitter sensor expressed all along the astrocyte plasma membrane, could be tested to this aim.

      Additional considerations are that the authors propose an event sequence as follows: stimulus - synaptic drive to L2/3 - arbor activation - spatial threshold - soma activation - post soma activation - gliotransmission. This seems reminiscent of the sequence underlying neuronal spike propagation - from dendrite to soma to axon, and the resulting vesicular release. However, there is no consensus within the glial field about an analogous framework for astrocytes. Thus, "arbor activation", "soma activation", and "post soma activation" are not established terms of art. Similarly, the way the authors use the term "domain" contrasts with how others have (Agarwal et al., 2017; Shigetomi et al., 2013; Di Castro et al., 2011; Grosche et al., 1999) and may produce some confusion. The authors could adopt a more flexible nomenclature or clarify that their terms do not have a defined structural-functional basis, being just constructs that they justifiably adapted to deal with the spatial complexity of astrocytes in line with their past studies (Lines et al., 2020; Lines et al., 2021).

      Our previous points suggest that the paper would be significantly strengthened by new experimental observations focusing on single astrocytes and using acquisitions at higher spatial and temporal resolution. If the authors will not pursue this option, we encourage them to at least improve their analysis, and at the same time recognize in the text some limitations of their experimental approach as discussed above. We indicate here several levels of possible analytical refinement.

      The first relates to the selection of astrocytes being analyzed, and the need to focus on a much narrower subpopulation than (for example) 987 astrocytes used for the core data. This selection would take into greater consideration the aspects of structure and latency. With the structural and latency-based criteria for selection, the number of astrocytes to analyze might be reduced by 10-fold or more, making our second analytical recommendation much more feasible.

      For structure-based selection - Genetically-encoded Ca2+ indicators such as GCaMP6 are in principle expressed throughout an astrocyte, even in regions that are not labelled by SR101. Moreover, astrocytes form independent 3D territories, so one can safely assume that the GCaMP6 signal within an astrocyte volume belongs to that specific astrocyte (this is particularly evident if the neighboring astrocytes are GCaMP6-negative). Therefore, authors could extend their analysis of Ca2+ signals in individual astrocytes to the regions that are SR101-negative and try to better integrate fast signals in their spatial threshold concept. Even if they decided to be conservative on their methods, and stick to the astrocyte segmentation based on the SR-101 signal, they should acknowledge that SR101 dye staining quality can vary considerably between individual astrocytes within a FOV - some astrocytes will have much greater structural visibility in the distal processes than others. This means that some astrocytes may have segmented domains extending more distally than others and we think that authors should privilege such astrocytes for analysis. However, cases like the representative astrocytes shown in Figure 4A or Figure S1B, have segmented domains localized only to proximal processes near the soma. Accordingly, given the reported timing differences between "arbor" and "soma" activation, one might expect there to be comparable timing differences between domains that are distal vs proximal to the soma as well. Fast signals in peripheral regions of astrocytes in contact with synapses are largely IP3R2-independent (Stobart et al., 2018). However, the quality of SR101 staining has implications for interpreting the IP3R2 KO data. There is evidence IP3R2 KO may preferentially impact activity near the soma (Srinivasan et al., 2015). Thus, astrocytes with insufficient staining - visible only in the soma and proximal domains - might show a biased effect for IP3R2 KO. 
While not necessarily disrupting the core conclusions made by the authors based on their analysis of SR101-segmented astrocytes, we think results would be strengthened if only astrocytes with sufficient SR101 staining - i.e. more consistent with previous reports of L2/3 astrocyte area (Lanjakornsiripan et al., 2018) - were included. This could be achieved by using max or cumulative projections of individual astrocytes in combination with SR101 staining to construct more holistic structural maps (Bindocci et al., 2017).

      For latency-based selection - The authors record calcium activity within a FOV containing at least 20+ astrocytes over a period of 60s, during which a 2Hz hindpaw stimulation at 2mA is applied for 20s. As discussed above, presumably some astrocytes in a FOV are the first to respond to the stimulus series, while others likely respond with longer latency to the stimulus. For the shorter-latency responders <3s, it is easier to attribute their calcium increases as "following the sensory information" projecting to L2/3. In other cases, when "arbor" responses occur at 10s or later, only after 20 stimulus events (at 2Hz), it is likely they are being activated by a more complex and recurrent circuit containing several rounds of neuron-glia crosstalk etc., which would be mechanistically distinct from astrocytes responding earlier. We suggest that authors focus more on the shorter latency response astrocytes, as they are more likely to have activity corresponding to the stimulus itself.
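
      A minimal sketch of such a latency-based selection is given below. It is an illustration only and not the authors' pipeline: the function names, the onset criterion (first frame exceeding the pre-stimulus baseline mean plus 3 SD), and the data layout (one fluorescence trace per astrocyte, sampled at a fixed frame rate) are all assumptions. Only the <3 s cutoff comes from the text above.

```python
from statistics import mean, stdev
from typing import Dict, Optional, Sequence


def onset_latency(trace: Sequence[float], frame_rate_hz: float,
                  stim_frame: int, n_sd: float = 3.0) -> Optional[float]:
    """Seconds from stimulus onset to the first frame above baseline mean + n_sd * SD.

    The baseline is every frame before stim_frame (assumed to hold at least
    two samples). Returns None if the trace never crosses the criterion.
    """
    baseline = trace[:stim_frame]
    criterion = mean(baseline) + n_sd * stdev(baseline)
    for i in range(stim_frame, len(trace)):
        if trace[i] > criterion:
            return (i - stim_frame) / frame_rate_hz
    return None


def fast_responders(traces: Dict[str, Sequence[float]], frame_rate_hz: float,
                    stim_frame: int, max_latency_s: float = 3.0) -> Dict[str, float]:
    """Keep only cells whose onset latency is below max_latency_s (hypothetical cutoff)."""
    selected = {}
    for cell, trace in traces.items():
        latency = onset_latency(trace, frame_rate_hz, stim_frame)
        if latency is not None and latency < max_latency_s:
            selected[cell] = latency
    return selected
```

      Applied per field of view, this would split astrocytes into an early group that plausibly follows the sensory input and a late group that may reflect recurrent activity, and the spatial threshold could then be estimated separately for each group.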

      The second level of analysis refinement we suggest relates specifically to the issue of propagation and timing for the activity within "arbor", "soma" and "post-soma". Currently, the authors use an ROI-based approach that segments the "arbor" into domains. We suggest that this approach could be supplemented by a more robust temporal analysis. This could for example involve starting with temporal maps that take pixels above a certain amplitude and plot their timing relative to the stimulus-onset, or (better) the first active pixel of the astrocyte. This type of approach has become increasingly used (Bindocci et al., 2017; Wang et al., 2019; Ruprecht et al., 2022) and we think its use can greatly help clarify both the proposed sequence and better characterize the spatial threshold. We think this analysis should specifically address several important points:

      1. Where/when does the astrocyte activation begin? Understanding the beginning is very important, particularly because another potential spatial threshold - preceding the one the authors describe in the paper - could gate the initial activation of more distal processes, as discussed above. This sequentially earlier spatial threshold could (for example) rely on microdomain interaction with synaptic elements and (in contrast) be IP3R2 independent (Srinivasan et al., 2015, Stobart et al., 2018). We would be interested to know whether, in a subset of astrocytes that meet the structure and latency criteria proposed above and can produce global activation, there is an initial local GCaMP6f response of a minimal size that must occur before propagation towards the soma begins. The data associated with varying stimulus parameters could potentially be useful here and reveal stimulus intensity/duration-dependent differences.

      2. Is the propagation in the authors' experimental model centripetal? This is implied throughout the manuscript but never shown. We think establishing whether (or not) the calcium dynamics are centripetal is important because it would clarify whether spatially adjacent domains within the "arbor" need to be sequentially active before reaching the threshold and then reaching the soma. More broadly, visualizing propagation will help to better visualize summation, which is presumably how the threshold is first reached (and overcome). The alternative hypothesis of a general excitability threshold, as discussed above, would be challenged here and possibly rejected, thereby clarifying the nature of the Ca2+ process that needs to reach a threshold for further expansion to the soma and other parts of the astrocyte.

      3. In complement to the previous point: we understand that the spatial threshold does not per se have a location, but is there some spatial logic underlying the organization of active domains before the soma response occurs? One can easily imagine multiple scenarios of sparse heterogeneous GCaMP6f signal distributions that correspond to ≥22.6% of the arborization, but that would not be expected to trigger soma activation. For example, the diagram in Figure 4C showing the astrocyte response to 2Hz stim (which lacks a soma response) underscores this point. It looks like it has ≥22.6% activation that is sparsely localized throughout the arborization. If an alternative spatial distribution for this activity occurred, such that it localized primarily to a specific process within the arbor, would it be more likely to trigger a soma response?

      4. Does "pre-soma" activation predict the location and onset time of "post-soma" activation? For example, are arbor domains that were part of the "pre-soma" response the first to exhibit GCaMP6f signal in the "post-soma" response?
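
      The temporal maps suggested above could be built along the following lines. This is a sketch under stated assumptions, not a published method: it assumes a per-astrocyte movie stored as `movie[t][y][x]` (for example dF/F per pixel), a single fixed amplitude threshold, and timing expressed in frames relative to the first active pixel of the cell. The name `onset_map` is hypothetical.

```python
from typing import List, Optional

Movie = List[List[List[float]]]  # movie[t][y][x], e.g. dF/F per pixel


def onset_map(movie: Movie, threshold: float) -> List[List[Optional[int]]]:
    """Per-pixel onset time, in frames after the first active pixel of the cell.

    A pixel's onset is its first frame with a value above `threshold`.
    Pixels that never cross the threshold are marked None.
    """
    height, width = len(movie[0]), len(movie[0][0])
    first: List[List[Optional[int]]] = [[None] * width for _ in range(height)]
    for t, frame in enumerate(movie):
        for y in range(height):
            for x in range(width):
                if first[y][x] is None and frame[y][x] > threshold:
                    first[y][x] = t
    active = [v for row in first for v in row if v is not None]
    if not active:
        return first  # no pixel ever became active
    t0 = min(active)  # frame of the earliest active pixel
    return [[None if v is None else v - t0 for v in row] for row in first]
```

      Plotting such a map against each pixel's distance from the soma would show directly whether onsets progress centripetally (point 2) and where activation begins (point 1).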

    2. Reviewer #2 (Public Review):

      Lines et al investigated the integration of calcium signals in astrocytes of the primary somatosensory cortex. Their goal was to better characterize the mechanisms that govern the spatial characteristics of calcium signals in astrocytes. In line with previous reports in the field, they found that most events originated and stayed localized within microdomains in distal astrocyte processes, occasionally coinciding with larger events in the soma, referred to as calcium surges. As a single astrocyte communicates with hundreds of thousands of synapses simultaneously, understanding the spatial integration of calcium signals in astrocytes and the mechanisms governing the latter is of tremendous importance to deepen our understanding of signal processing in the central nervous system. The authors thus aimed to unveil the properties governing the emergence of calcium surges. The main claim of this manuscript is that there would be a spatial threshold of ~23% of microdomain activation above which a calcium surge, i.e. a calcium signal that spreads to the soma, is observed. Although the study provides data that is highly valuable for the community, the conclusions of the current version of the manuscript seem a little too assertive and general compared with what can be deduced from the data and methods used.

      The major strength of this study is the experimental approach that allowed the authors to obtain numerous and informative calcium recordings in vivo in the somatosensory cortex in mice in response to sensory stimuli as well as in situ. Notably, they developed an interesting approach to modulating the number of active domains in peripheral astrocyte processes by varying the intensity of peripheral stimulation (its amplitude, frequency, or duration).

      The major weakness of the manuscript is the method used to analyze and quantify calcium activity, which mostly relies on the analysis of averaged data and overlooks the variability of the signals measured. As a result, the main claims from the manuscript seem to be incompletely supported by the data. The choice of the use of a custom-made semi-automatic ROI-based calcium event detection algorithm rather than established state-of-the-art software, such as the event-based calcium event detection software AQuA (DOI: 10.1038/s41593-019-0492-2), is insufficiently discussed and may bias the analysis. Some references on this matter include: Semyanov et al, Nature Rev Neuro, 2020 (DOI: 10.1038/s41583-020-0361-8); Covelo et al 2022, J Mol Neurosci (DOI: 10.1007/s12031-022-02006-w) & Wang et al, 2019, Nat Neuroscience (DOI: 10.1038/s41593-019-0492-2). Moreover, the ROIs used to quantify calcium activity are based on structural imaging of astrocytes, which may not be functionally relevant.

      For the reasons listed above, the manuscript would probably benefit from some rephrasing of the conclusions and a discussion highlighting the advantages and limitations of the methodological approach. The question investigated by this study is of great importance in the field of neuroscience as the mechanisms dictating the spatio-temporal properties of calcium signals in astrocytes are poorly characterized, yet are essential to understand their involvement in the modulation of signal integration within neural circuits.

    3. Reviewer #3 (Public Review):

      Summary:<br /> The study aims to elucidate the spatial dynamics of subcellular astrocytic calcium signaling. Specifically, they elucidate how subdomain activity above a certain spatial threshold (~23% of domains being active) heralds a calcium surge that also affects the astrocytic soma. Moreover, they demonstrate that processes on average are included earlier than the soma and that IP3R2 is necessary for calcium surges to occur. Finally, they associate calcium surges with slow inward currents.

      Strengths:<br /> The study addresses an interesting topic that is only partially understood. The study uses multiple methods including in vivo two-photon microscopy, acute brain slices, electrophysiology, pharmacology, and knockout models. The conclusions are strengthened by the same findings in both in vivo anesthetized mice and in brain slices.

      Weaknesses:<br /> The method that has been used to quantify astrocytic calcium signals only analyzes what seems to be a small proportion of the total astrocytic domain on the example micrographs, where a structure is visible in the SR101 channel (see for instance Reeves et al. J. Neurosci. 2011, demonstrating to what extent SR101 outlines an astrocyte). This would potentially heavily bias the results: from the example illustrations presented it is clear that the calcium increases in what is putatively the same astrocyte go well beyond what is outlined with automatically placed small ROIs. The smallest astrocytic processes are an order of magnitude smaller than the resolution of optical imaging and would not be outlined by either SR101 or the segmentation method, judging by the ROIs presented in the figures. Completely ignoring these very large parts of the spatial domain of an astrocyte, in particular when making claims about a spatial threshold, seems inappropriate. Several recently published methods use pixel-by-pixel event-based approaches to define calcium signals. The data should have been analyzed using such a method within a complete astrocyte spatial domain in addition to the analyses presented. Also, the authors do not discuss how two-dimensional sampling of calcium signals from an astrocyte that has processes in three dimensions (see Bindocci et al, Science 2017) may affect the results: if subdomain activation is not homogeneously distributed in the three-dimensional space within the astrocyte territory, the assumptions and findings regarding the correlation between subdomain activation and somatic activation may be affected.

      The experiments are performed either in anesthetized mice, or in slices. The study would have come across as much more solid and interesting if at least a small set of experiments were performed also in awake mice (for instance during spontaneous behavior), given the profound effect of anesthesia on astrocytic calcium signaling and the highly invasive nature of preparing acute brain slices. The authors mention the caveat of studying anesthetized mice but claim that the intracellular machinery should remain the same. This explanation appears a bit dismissive as the response of an astrocyte not only depends on the internal machinery of the astrocyte, but also on how the astrocyte is stimulated: for instance synaptic stimulation or sensory input likely would be dependent on brain state and concurrent neuromodulatory signaling which is absent in both experimental paradigms. The discussion would have been more balanced if these aspects were dealt with more thoroughly.

      The study uses a Heaviside step function to define a spatial 'threshold' for somata either being included or not in a calcium signal. However, Fig. 4E and Fig. 5D, which show how the method separates the signal, provide little understanding for the reader. The most informative figure that could support the main finding of the study, namely a ~23% spatial threshold for astrocyte calcium surges reaching the soma, is Fig. 4G, showing the relationship between the percentage of arborizations active and the soma calcium signal. A similar plot should have been presented in Fig. 5 as well. Looking at this distribution, though, it is not clear why ~23% would be a clear threshold to separate soma involvement; one can only speculate how the threshold for a soma event would influence this number. Even if the analyses in Fig. 4H and the fact that the same threshold appears in two experimental paradigms strengthen the case, the results would have been more convincing if several types of statistical modeling describing the continuous distribution of values presented in Fig. 4E (in addition to the Heaviside step function) were presented.
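
      To make the contrast concrete, the sketch below sets a hard step rule next to a simple graded description of the same data. It is illustrative only: the input format (one percentage of active arbor and one binary soma outcome per event), the misclassification criterion, the 10% bin width, and both function names are assumptions, and neither function reproduces the authors' fitting procedure.

```python
from typing import List, Sequence, Tuple


def best_step_threshold(percent_active: Sequence[float],
                        soma_active: Sequence[int]) -> Tuple[float, int]:
    """Step rule 'soma active iff percent_active >= threshold' with fewest errors.

    Candidate thresholds are the observed values (input assumed non-empty).
    Returns (threshold, number_of_misclassified_events); ties keep the lowest threshold.
    """
    candidates = sorted(set(percent_active))
    best = (candidates[0], len(percent_active) + 1)
    for c in candidates:
        errors = sum((x >= c) != bool(y)
                     for x, y in zip(percent_active, soma_active))
        if errors < best[1]:
            best = (c, errors)
    return best


def binned_fraction(percent_active: Sequence[float], soma_active: Sequence[int],
                    bin_width: float = 10.0) -> List[Tuple[float, float, int]]:
    """Fraction of events with soma activation per bin: (bin_start, fraction, n)."""
    counts = {}
    for x, y in zip(percent_active, soma_active):
        start = (x // bin_width) * bin_width
        hits, n = counts.get(start, (0, 0))
        counts[start] = (hits + int(bool(y)), n + 1)
    return [(start, hits / n, n) for start, (hits, n) in sorted(counts.items())]
```

      If the binned fractions jump from near 0 to near 1 within a single bin, a step function is a fair summary; if they rise gradually, a continuous model (for example a logistic curve) would describe the data better, and the reported threshold would be better read as a midpoint than as a sharp boundary.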

      The description of methods should have been considerably more thorough throughout. For instance, the temperature at which the acute slice experiments were performed, and whether slices were prepared in ice-cold solution, are crucial to know, as these parameters heavily influence both astrocyte morphology and signaling. Moreover, no monitoring of physiological parameters (oxygen level, CO2, arterial blood gas analyses, temperature, etc.) of the in vivo anesthetized mice is mentioned. These aspects are critical to control for when working with acute in vivo two-photon microscopy of mice; the physiological parameters rapidly deteriorate within a few hours with anesthesia and following surgery.

    1. Reviewer #1 (Public Review):

      Schnell et al. performed two extensive behavioral experiments concerning the processing of objects in rats and humans. To this aim, they designed a set of objects parametrically varying along alignment and concavity and then they used activations from a pretrained deep convolutional neural network to select stimuli that would require one of two different discrimination strategies, i.e. relying on either low- or high-level processing exclusively. The results show that rodents rely more on low-level processing than humans.

      Strengths:

      1. The results are challenging and call for a different interpretation of previous evidence. Indeed, this work shows that common assumptions about task complexity and visual processing are probably biased by our personal intuitions and are not equivalent in rodents, which instead tend to rely more on low-level properties.<br /> 2. This is an innovative (and assumption-free) approach that will prove useful to many visual neuroscientists. Personally, I second the authors' excitement about the proposed approach, and its potential to overcome the limits of experimenters' creativity and intuitions. In general, the claims seem well supported and the effects sufficiently clear.<br /> 3. This work provides an insightful link between rodent and human literature on object processing. Given the increasing number of studies on visual perception involving rodents, these kinds of comparisons are becoming crucial.<br /> 4. The paper raises several novel questions that will prompt more research in this direction.

      Weaknesses:<br /> 1. The choice of alignment and concavity as baseline properties of the stimuli is not properly discussed.<br /> 2. From the low correlations, I got the feeling that AlexNet is not the best baseline model for rat visual processing.

    2. Reviewer #2 (Public Review):

      Schnell and colleagues trained rats on a two-alternative forced choice visual discrimination task. They used object pairs that differed in their concavity and the alignment of features. They found that rats could discriminate objects across various image transformations. Rat performance correlated best with late convolutional layers of an artificial neural network and was partially explained by factors of brightness and pixel-level similarity. In contrast, human performance showed the strongest correlation with higher, fully connected layers, indicating that rats employed simpler strategies to accomplish this task as compared to humans.

      Strengths:<br /> 1. This is a methodologically rigorous study. The authors tested a substantial number of rats across a large variety of stimuli.<br /> 2. The innovative use of neural networks to generate stimuli with varying levels of complexity is a compelling approach that motivates principled experimental design.<br /> 3. The study provides important data points for cross-species comparisons of object discrimination behavior.<br /> 4. The data strongly support the authors' conclusion that rats and humans rely on different visual features for discrimination tasks.<br /> 5. This is a valuable study that provides novel, important insights into the visual capabilities of rats.

      Weaknesses:<br /> 1. The impact of rat visual acuity (~1 cycle/degree) on the discriminability of stimuli could be more directly modeled and taken into consideration when comparing rat behavior to that of humans, who possess substantially higher acuity.<br /> 2. The distinction between low- and high-level visual behavior is coarse, and it remains uncertain which specific features rats utilized for discrimination. The correlations with brightness and pixel-level similarity do provide some insight.<br /> 3. The relatively weak correspondence between rat behavior and AlexNet raises the question of which network architecture, whether computational or biological, might better capture rat behavior, particularly to the level of cross-rat consistency.

    1. Reviewer #1 (Public Review):

      This manuscript describes a set of four passage-reading experiments which are paired with computational modeling to evaluate how task-optimization might modulate attention during reading. Broadly, participants show faster reading and modulated eye-movement patterns of short passages when given a preview of a question they will be asked. The attention weights of a Transformer-based neural network (BERT and variants) show a statistically reliable fit to these reading patterns above-and-beyond text- and semantic-similarity baseline metrics, as well as a recurrent-network-based baseline. Reading strategies are modulated when questions are not previewed, and when participants are L1 versus L2 readers, and these patterns are also statistically tracked by the same transformer-based network.

      Strengths:

      - Task-optimization is a key notion in current models of reading and the current effort provides a computationally rigorous account of how such task effects might be modeled<br /> - Multiple experiments provide reasonable effort towards generalization across readers and different reading scenarios<br /> - Use of RNN-based baseline, text-based features, and semantic features provides a useful baseline for comparing Transformer-based models like BERT

      Weaknesses:

      - Generalization across neural network models may be limited (models differ in size, training data etc.); it is thus not always clear which specific model characteristics support their fit to human reading patterns.

    2. Reviewer #2 (Public Review):

      In this study, researchers aim to understand the computational principles behind attention allocation in goal-directed reading tasks. They explore how deep neural networks (DNNs) optimized for reading tasks can predict reading time and attention distribution. The findings show that attention weights in transformer-based DNNs predict reading time for each word. Eye tracking reveals that readers focus on basic text features and question-relevant information during initial reading and rereading, respectively. Attention weights in shallow and deep DNN layers are separately influenced by text features and question relevance. Additionally, when readers read without a specific question in mind, DNNs optimized for word prediction tasks can predict their reading time. Based on these findings, the authors suggest that attention in real-world reading can be understood as a result of task optimization.

      Strengths of the Methods and Results:<br /> The present study employed stimuli consisting of paragraphs read by middle and high school students, covering a wide range of diverse topics. This choice ensured that the reading experience for participants remained natural, ultimately enhancing the ecological validity of the findings and conclusions.

      In Experiments 1-3, participants were instructed to read questions before the text, while in Experiment 4 participants were instructed to read questions after the text. This deliberate manipulation allowed the paper to assess how different reading task conditions influence reading and eye movements.

      Weaknesses of the Methods and Results:

      While the study benefits from several strengths, it is important to acknowledge its limitations. Notably, recent months have seen significant advancements in Deep Neural Network (DNN) models, including the development of models such as GPT-3.5 and GPT-4, which have demonstrated remarkable capabilities in tasks resembling human cognition, like Theory of Mind. However, as the code for these cutting-edge models was not publicly accessible, the authors were unable to evaluate whether the attention mechanisms in the most up-to-date DNN models could provide improved predictions for human eye-movement data. This constraint represents a limitation in the investigation.

      The methods and data presented in this study are valuable for gaining insights into the psychological mechanisms of reading. Moreover, the data provided in this paper may prove instrumental in enhancing the performance of future DNN models.

    3. Reviewer #3 (Public Review):

      This paper presents several eyetracking experiments measuring task-directed reading behavior where subjects read texts and answered questions. It then models the measured reading times using attention patterns derived from deep-neural network models from the natural language processing literature. Results are taken to support the theoretical claim that human reading reflects task-optimized attention allocation.

      Strengths:

      (1) The paper leverages modern machine learning to model a high-level behavioral task (reading comprehension). While the claim that human attention reflects optimal behavior is not new, the paper considers a substantially more high-level task in comparison to prior work. The paper leverages recent models from the NLP literature which are known to provide strong performance on such question-answering tasks, and is methodologically well grounded in the NLP literature.

      (2) The modeling uses text- and question-based features in addition to DNNs, specifically evaluates relevant effects, and compares vanilla pretrained and task-finetuned models. This makes the results more transparent and helps assess the contributions of task optimization. In particular, besides fine-tuned DNNs, the role of the task is further established by directly modeling the question relevance of each word. Specifically, the claim that human reading is predicted better by task-optimized attention distributions rests on (i) a role of question relevance in influencing reading in Expts 1-2 but not 4, and (ii) the fact that fine-tuned DNNs improve prediction of gaze in Expts 1-2 but not 4.

      (3) The paper conducts experiments on both L2 and L1 speakers.

      Weaknesses:

      (1) Under the hypothesis advanced, human reading should adapt rationally to task demands. Indeed, Experiment 1 tests questions from different types in blocks (local and global), and the paper provides evidence that this encourages the development of question-type-specific reading strategies -- indeed, this specifically motivates Experiment 2, and is confirmed indirectly in the comparison of the effects found in the two experiments ("all these results indicated that the readers developed question-type-specific strategies in Experiment 1"). On the other hand, finetuning the model on one of the two types does not seem to reproduce this differential behavior, in the sense that fit to reading data is not improved. In this sense, the model seems to have limited abilities in reproducing the observed task dependence of human reading.

      The results support the conclusions well, with the weakness described above a limitation of the modeling approach chosen.

      The data are likely to be useful as a benchmark in further modeling of eye-movements, an area of interest to computational research on psycholinguistics.<br /> The modeling results contribute to the theoretical understanding of human reading behavior, and strengthen a line of research arguing that it reflects task-adaptive behavior.

    1. Reviewer #1 (Public Review):

      Summary:<br /> In the presented manuscript the authors aim at quantifying the costs of locomotion in schooling versus solitary fish across a considerable range of speeds. Specifically, they quantify the possible reduction in the cost of locomotion in fish due to schooling behavior. The main novelty appears to be the direct measurement of absolute swimming costs and total energy expenditure, including the anaerobic costs at higher swimming speeds.

      In addition to metabolic parameters, the authors also recorded some basic kinematic parameters such as average distances or school elongation. For both solitary and schooling fish, they find similar optimal swimming speeds of around 1 BL/s, and a significant reduction in costs of locomotion due to schooling at high speeds, in particular at ~5-8 BL/s.

      Given the lack of experimental data and the direct measurements across a wide range of speeds comparing solitary and schooling fish, this appears indeed like a potentially important contribution of interest to a broader audience beyond the specific field of fish physiology, in particular for researchers working broadly on collective (fish) behavior.

      Strengths:<br /> The manuscript is for the most part well written, and the figures are of good quality. The experimental method and protocols are very thorough and of high quality. The results are quite compelling and interesting. What is particularly interesting, in light of previous literature on the topic, is that the authors conclude that based on their results, specific fixed relative positions or kinematic features (tail beat phase locking) do not seem to be required for energetic savings. They also provide a review of potential different mechanisms that could play a role in the energetic savings.

      Weaknesses:<br /> A weakness is the lack of critical discussion of the different mechanisms, as well as of the conjecture that relative positions and kinematic features do not matter. I found the overall discussion on this rather unsatisfactory: it lacks critical reflection, and relevant statements and explanations are scattered across the discussion section. Here I would suggest a revision of the discussion section.

      Also, there is a statement that Danio regularly move within the school and do not maintain inter-individual positions. However, no quantitative data are shown to support this statement, for example by quantifying the time scales of neighbor switches. This should be addressed, as core conclusions appear to rest on this statement and the authors have 3D tracks of the fish.

      Further, there is a fundamental question on the comparison of schooling in a flow (like a stream or, here, a flow channel) versus schooling in still water. While it is clear that from a pure physics point of view the situation for individual fish is equivalent, as it is about maintaining a certain relative velocity to the fluid, I do think that it makes a huge qualitative difference from a biological point of view in the context of collective swimming. In a flow, individual fish have to align with the external flow to ensure that they remain stationary and do not fall back, which then leads to highly polarized schools. However, this high polarization is induced also for completely non-interacting fish. At high speeds, the capability of individuals to control their relative position in the school is also likely very restricted, simply because they are forced to put most of their effort into maintaining a stationary position in the flow. This appears to me fundamentally different from schooling in still water, where the alignment (high polarization) has to come purely from social interactions. Here, relative positioning with respect to others is much more controlled by the movement decisions of individuals. Thus, I see clearly how this work is relevant for natural behavior in flows and that it provides some insights on the fundamental physiology, but I at least have some doubts about how far it actually extends to "voluntary" highly ordered schooling under still water conditions. Here, I would wish for at least some more critical reflection and/or explanation.

      Related to this, the reported increase in the elongation of the school at higher speed could also have different explanations. The authors speculate briefly that it could be related to the optimal structure of the school, but it could simply reflect inter-individual performance differences, with slower individuals falling back with respect to faster ones. Did the authors test for certain fish being predominantly at the front or back? Did they test for individual swimming performance before testing them in groups together? Again, this should at least be critically reflected on somewhere.

    2. Reviewer #2 (Public Review):

      Summary:<br /> This paper tests the idea that schooling can provide an energetic advantage over solitary swimming. The present study measures oxygen consumption over a wide range of speeds, to determine the differences in aerobic and anaerobic cost of swimming, providing a potentially valuable addition to the literature related to the advantages of group living.

      Strengths:<br /> The strength of this paper is related to providing direct measurements of the energetics (oxygen consumption) of fish while swimming in a group vs solitary. The energetic advantages of schooling have been claimed to be one of the major advantages of schooling and therefore a direct energetic assessment is a useful result.

      Weaknesses:<br /> The manuscript suffers from a number of weaknesses which are summarised below:

      1) The possibility that fish in a school show lower oxygen consumption may also be due to a calming effect. While the authors show that there is no difference at low speed, one cannot rule out that calming effects play a more important role at higher speed, i.e. in a more stressful situation.

      2) The ratio of fish volume to water volume in the respirometer is much higher than that recommended by the methodological paper by Svendsen et al. (J Fish Biol 2016).

      3) Because the same swimming tunnel was used for schools and solitary fish, schooling fish may end up swimming closer to the wall (because of less volume per fish) than solitary fish. Distances to the wall of schooling fish are not given, and they could provide an advantage to schooling fish.

      4) The statistical analysis has a number of problems. The MO2 value of each school results from the oxygen consumption of all the fish in it, and therefore the test compares 5 individuals (i.e. an individual is the statistical unit) with 5 schools (a school made up of 8 fish is the statistical unit). The test is thus comparing two different statistical units. One can see from the graphs that schooling MO2 tends to have a smaller SD than solitary data. This may well be due to the fact that schooling data are based on 5 points (five schools) and each point is the result of the MO2 of eight fish, thereby reducing the variability compared to solitary fish. Other issues are related to data (for example, tail beat frequency) not being independent in schooling fish.
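
      The statistical-unit concern in point 4 can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the authors' analysis: it assumes each school value behaves like the average of eight independent, identically distributed fish (which real schooling fish need not be), and the function name, mean, and SD are hypothetical placeholders rather than values from the manuscript.

```python
import random
import statistics


def simulate_unit_sd(n_units, fish_per_unit, mean=100.0, sd=15.0, seed=0):
    """SD across units when each unit's value is the mean of `fish_per_unit` fish.

    All parameters are illustrative assumptions; fish are drawn independently
    from the same normal distribution.
    """
    rng = random.Random(seed)
    unit_values = []
    for _ in range(n_units):
        fish = [rng.gauss(mean, sd) for _ in range(fish_per_unit)]
        unit_values.append(statistics.mean(fish))
    return statistics.stdev(unit_values)


# Many replicate units are simulated so that the SD estimates themselves are stable.
sd_solitary = simulate_unit_sd(n_units=5000, fish_per_unit=1)
sd_school = simulate_unit_sd(n_units=5000, fish_per_unit=8)
ratio = sd_school / sd_solitary
print(f"SD(school means) / SD(solitary fish) = {ratio:.2f}")
```

      Under these assumptions the ratio should sit near 1/sqrt(8), so a smaller SD for schools would be expected from averaging alone, independent of any hydrodynamic benefit.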

    3. Reviewer #3 (Public Review):

      Summary:<br /> Zhang and Lauder characterized both aerobic and anaerobic metabolic energy contributions in schools and solitary fishes in the giant danio (Devario aequipinnatus) over a wide range of water velocities. By using a highly sophisticated respirometer system, the authors measure aerobic metabolism as oxygen uptake rate and the non-aerobic oxygen cost as excess post-exercise oxygen consumption (EPOC). With these data, the authors model the bioenergetic cost of schools and solitary fishes. The authors found that fish schools have a J-shaped metabolism-speed curve, with reduced total energy expenditure per tail beat compared to solitary fish. Fish in schools also recovered from exercise faster than solitary fish. Finally, the authors conclude that these energetic savings may underlie the prevalence of coordinated group locomotion in fish.

      The conclusions of this paper are mostly well supported by data, but some aspects of methods and data acquisition need to be clarified and extended.

      Strengths:<br /> This work aims to understand whether animals moving through fluids (water in this case) exhibit highly coordinated group movement to reduce the cost of locomotion. By calculating the aerobic and anaerobic metabolic rates of school and solitary fishes, the authors provide direct energetic measurements that demonstrate the energy-saving benefits of coordinated group locomotion in fishes. The results of this paper show that fish schools save anaerobic energy and reduce the recovery time after peak swimming performance, suggesting that fishes can allocate more energy to other fitness-related activities when they move collectively through water.

      Weaknesses:<br /> Although the paper does have strengths in principle, its weakness is the methods section. The methods contain too much irrelevant information, which makes them hard to follow at times for a researcher unfamiliar with the research topic. In addition, it was hard to picture the experimental (respirometer) system used by the authors; therefore, the article would benefit from a diagram/scheme of that respirometer system.

    1. Reviewer #1 (Public Review):

      Summary:<br /> The investigators employed a multi-omics approach to show the functional impact of partial chemical reprogramming in fibroblasts from young and aged mice.

      Strengths:<br /> Multi-omics data were collected, including epigenome, transcriptome, proteome, phosphoproteome, and metabolome. Different analyses were conducted accordingly, including differential expression analysis, gene set enrichment analysis, and transcriptomic and epigenetic clock-based analyses. The impact of partial chemical reprogramming on aging was supported by these multi-source results.

      Weaknesses:<br /> More experimental data may be needed to further validate current findings.

    2. Reviewer #2 (Public Review):

      The short-term administration of reprogramming factors to partially reprogram cells has gained traction in recent years as a potential strategy to reverse aging in cells and organisms. Early studies used Yamanaka factors in transgenic mice to reverse aging phenotypes, but chemical cocktails could present a more feasible approach for in vivo delivery. In this study, Mitchell et al sought to determine the effects that short-term administration of chemical reprogramming cocktails have on biological age and function. To address this question, they treated young and old mouse fibroblasts with chemical reprogramming cocktails and performed transcriptome, proteome, metabolome, and DNA methylation profiling pre- and post-treatment. For each of these datasets, they identified changes associated with treatment, showing downregulation of some previously identified molecular signatures of aging in both young and old cells. From these data, the authors conclude that partial chemical reprogramming can rejuvenate both young and old fibroblasts.

      The main strength of this study is the comprehensive profiling of cells pre- and post-treatment with the reprogramming cocktails, which will be a valuable resource for better understanding the molecular changes induced by chemical reprogramming. The authors highlighted consistent changes across the different datasets that are thought to be associated with aging phenotypes, showing reduction of age-associated signatures previously identified in various tissues. However, from the findings, it remains unclear which changes are functionally relevant in the specific fibroblast system being used. Specifically:

      1) The 4 month and 20 month mouse fibroblasts are designated "young" vs "old" in this study. An important analysis that was not shown for each of the profiled modalities was a comparison of untreated young vs old fibroblasts to determine age-associated molecular changes in this specific model of aging. Then, rather than using aging signatures defined in other tissues, it would be more appropriate to determine whether the chemical cocktails reverted old fibroblasts to a younger state based on the age-associated changes identified in this comparison.<br /> 2) Across all datasets, it appears that the global profiles of young vs old mouse fibroblasts are fairly similar compared to treated fibroblasts, suggesting that the chemical cocktails are not reverting the fibroblasts to a younger state but instead driving them to a different cell state. Similarly, in most cases where specific age-related processes/genes are being compared across untreated and treated samples, no significant differences are observed between young and old fibroblasts.<br /> 3) Functional validation experiments to confirm that specific changes observed after partial reprogramming are indeed reducing biological age are limited.<br /> 4) Partial reprogramming appears to substantially reduce biological age of the young (4 month) fibroblasts based on the aging signatures used. It is unclear how this result should be interpreted.

    1. Reviewer #1 (Public Review):

      Summary:<br /> In this manuscript, Unckless and colleagues address the issue of the maintenance of genetic diversity of the gene diptericin A, which encodes an antimicrobial peptide in the model organism Drosophila melanogaster.

      Strengths:<br /> The data indicate that flies homozygous for the dptA S69 allele are better protected against some bacteria. By contrast, male flies homozygous for the R69 allele better resist starvation than flies homozygous for the S69 allele.

      Weaknesses:<br /> -I am surprised by the inconsistency between the data presented in Fig. 1A and Fig. S2A for the survival of male flies after infection with P. rettgeri. I am not convinced that the data presented support the claim that females have lower survival rates than males when infected with P. rettgeri (lines 176-182).

      -The data in Fig. 2 do not seem to support the claim that female flies with either the dptA S69 or the R69 alleles have a longer lifespan than males (lines 211-215). A comment on the [delta] dpt line, which is one of the CRISPR edited lines, would be welcome.

      -The data in Fig. 2B show that male flies with the dptA S69 or R69 alleles have the same lifespan when poly-associated with L. plantarum and A. tropicalis, which contradicts the claim of the authors (lines 256-260).

    2. Reviewer #2 (Public Review):

      Summary: In this study, the authors delve into the mechanisms responsible for the maintenance of two diptericin alleles within Drosophila populations. Diptericin is a significant antimicrobial peptide that plays a dual role in fly defense against systemic bacterial infections and in shaping the gut bacterial community, contributing to gut homeostasis.

      Strengths: The study unquestionably demonstrates the distinct functions of these two diptericin alleles in responding to systemic infections caused by specific bacteria and in regulating gut homeostasis and fly physiology. Notably, these effects vary between male and female flies.

      Weaknesses: Although the findings are highly intriguing and shed light on crucial mechanisms contributing to the preservation of both diptericin alleles in fly populations, a more comprehensive investigation is warranted to dissect the selection mechanisms at play, particularly concerning diptericin's roles in systemic infection and gut homeostasis. Unfortunately, the results from the association study conducted on wild-caught flies lack conclusive evidence.

      Major Concerns:

      Lines 120-134: The second hypothesis is not adequately defined or articulated. Please revise it to provide more clarity. Additionally, it should be explicitly stated that the first part of the first hypothesis (pathogen specificity), i.e., the superior survival of the S allele in Providencia infections compared to the R allele, has been previously investigated and supported by the results in the Unckless et al. 2016 paper. The current study aims to additionally investigate the opposite scenario: whether the R allele exhibits better survival in a different infection. Please consider revising to emphasize this point.

      Figures and statistical analyses: It is essential to present the results of significant differences from the statistical analyses within Figures 1B, 2B, and 3. Additionally, please include detailed descriptions of the statistical analysis methods in the figure legends. Specify whether the error bars represent standard error or standard deviation, particularly in Figure 3, where assays were conducted with as few as 3 flies.

      Lines 317-318 (as well as 320-328): The data related to P. rettgeri appear somewhat incomplete, and the authors acknowledge that bacterial load varies significantly, and this bacterium establishes poorly in the gut. These data may introduce more noise than clarity to the study. Please consider revising these sections by either providing more data, refining the presentation, or possibly removing them altogether.

      Lines 335-387 and Figure 4: Although these results are intriguing and suggest interactions between functional diptericin and fly physiology, some mediated by the gut microbiome, they remain descriptive and do not significantly contribute to our understanding of the mechanism that maintains the diptericin alleles.

      Lines 399-400: The contrast between this result and statement and the highly reproducible data presented in Figures 2-4 should be discussed.

      Lines 422-429 and Figure 5D: The conclusion regarding an association between diptericin alleles and Morganellaceae bacteria is not clearly supported by Figure 5D and lacks statistical evidence.

    3. Reviewer #3 (Public Review):

      Summary:<br /> This paper investigates the evolutionary aspects around a single amino acid polymorphism in an immune peptide (the antimicrobial peptide Diptericin A) of Drosophila melanogaster. This polymorphism was shown in an earlier population genetic study to be under long-term balancing selection. Using flies with different AA at this immune peptide it was found that one allelic form provides better survival of systemic infections by a bacterial pathogen, but that the alternative allele provides its carriers a longer lifespan under certain conditions (depending on the microbiota). It is suggested that these contrasting fitness effects of the two alleles contribute to balance their long-term evolutionary fate.

      Strengths:<br /> The approach taken and the results presented are interesting and show the way forward for studying such polymorphisms experimentally.

      Weaknesses:<br /> 1. A clear demonstration (in one experiment) of the antagonistic effect of the two selection pressures identified is not provided.

      The study is overwhelming, with many experiments and countless statistical tests. The overall conclusion of the many experiments and tests suggests that "dptS69 flies survive systemic infection better, while dptS69R flies survive some opportunistic gut infections better." (lines 444-446). Given the number of results, different experiments, and hundreds of tests conducted, how can we make sure that the result is not just one of many possible combinations? I suggest experimentally testing this conclusion in one experiment (one may call this the "killer-experiment") with the relevant treatments being conducted at the same time, side by side, and analysed with an appropriate statistical test for a treatment x genotype interaction effect.

      2. The implication that the two forms of selection acting on the immune peptide are maintained by balancing selection is not supported.

      The picture presented of how balancing selection works is rather simplistic and not convincing. In particular, no distinction is made between fluctuating selection (FL) and balancing selection (BL). BL is the result of negative frequency-dependent selection. It may act within populations (e.g. Red Queen type processes, mating types) or between populations (local adaptation). FL is a process that is sometimes suggested to produce BL, but this is only the case when selection is negative frequency dependent. In most cases, FL does not lead to BL.

      The presented study is introduced with a framework of BL, but the aspects investigated are all better described as FL (as the title says: "A suite of selective pressures ..."). The two models presented in the introduction (lines 62 to 69; two pathogens, cost of resistance) are both examples of FL, not of BL.

      Finally, no evidence is presented that the different selection pressures suggested to select on the different allelic forms of the immune peptide are acting to produce a pattern of negative frequency dependence.
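
      The side-by-side test suggested under weakness 1 amounts to estimating a treatment x genotype interaction. A minimal sketch of the quantity such a test targets is given below; the survival proportions are purely hypothetical placeholders (not data from the manuscript), the dictionary keys and function name are assumptions for illustration, and a real analysis would fit a model with an interaction term to per-fly data rather than compare cell means.

```python
import math


def interaction_contrast(cell_means):
    """Difference of genotype differences across the two treatments.

    `cell_means` maps (genotype, treatment) to a mean survival proportion.
    A nonzero contrast indicates a treatment x genotype interaction; opposite
    signs of the two within-treatment differences indicate a crossover.
    """
    diff_systemic = cell_means[("S69", "systemic")] - cell_means[("S69R", "systemic")]
    diff_gut = cell_means[("S69", "gut")] - cell_means[("S69R", "gut")]
    return diff_systemic, diff_gut, diff_systemic - diff_gut


# Hypothetical survival proportions, chosen only to illustrate a crossover pattern.
example_means = {
    ("S69", "systemic"): 0.70,
    ("S69R", "systemic"): 0.40,
    ("S69", "gut"): 0.45,
    ("S69R", "gut"): 0.65,
}

diff_systemic, diff_gut, contrast = interaction_contrast(example_means)
is_crossover = diff_systemic * diff_gut < 0
print(diff_systemic, diff_gut, contrast, is_crossover)
```

      In this illustration the genotype difference changes sign between treatments, which is the antagonistic pattern the suggested single experiment would need to demonstrate with an interaction test.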

    1. Reviewer #1 (Public Review):

      Summary:<br /> This interesting study applies the PSMC model to a set of new genome sequences for migratory and nonmigratory thrushes and seeks to describe differences in the population size history among these groups. The authors create a set of summary statistics describing the PSMC traces - mean and standard deviation of Ne, plus a set of metrics describing the shape of the oldest Ne peak - and use these to compare across migratory and resident species (taking single samples sequenced here as representative of the species). The analyses are framed as supporting or refuting aspects of a biogeographic model describing colonization dynamics from tropical to temperate North and South America.

      Strengths:<br /> At a technical level, the sequencing and analysis up through PSMC looks good and the paper is engaging and interesting to read as an introduction to some verbal biogeographic models of avian evolution in the Pleistocene. The core findings - higher and more variable Ne in migratory species - seem robust, and the biogeographic explanation is plausible.

      Weaknesses:<br /> I did not find the analyses particularly persuasive in linking specific aspects of clade-level PSMC patterns causally to evolutionary driving forces. To their credit, the authors have anticipated my main criticism in the discussion. This is that variation in population size inferred by methods like PSMC is in "effective" terms, and the link between effective and census population size is a morass of bias introduced by population structure and selection, so robustly connecting specific aspects of PSMC traces to causal evolutionary forces is somewhere between extremely difficult and impossible.

      Population structure is the most obvious force that can generate large Ne changes mimicking the census-size-focused patterns the authors discuss. The authors argue in the discussion that since they focus on relatively deep time (>50kya at least, with most analyses focusing on the 5mya - 500kya range) population structure is "likely to become less important", and the resident species are usually more structured today (true) which might bias the findings against the observed higher Ne in migrants.

      But is structure really unimportant in driving PSMC results at these specific timescales? There is no numerical analysis presented to support the claim in this paper. The biogeographic model of increased temperate-latitude land area supporting higher populations could yield high Ne via high census size, but shifts in population structure (for example, from one large panmictic population to a series of isolated refugial populations as a result of glaciation-linked climate changes) could plausibly create elevated and more variable Ne. Is it more land area and ecological release leading to a bigger and faster initial Ne bump, or is it changes in population connectivity over time at expanding range edges, or is the whole single-bump PSMC trace an artifact of the dataset size, or what? The authors have convinced me that the Ne history of migratory thrushes is on average very different from nonmigrant thrushes, but beyond that it's unclear what exactly we've learned here about the underlying process.

      I generally agree with the authors that "at present there is no way to fully disentangle the effects of population structure and geographic space on our results". But given that, I think there are two options - either we can fully acknowledge that oversimplified demographic models like PSMC cannot be interpreted as supporting evidence of any particular mechanistic or biogeographic hypothesis and stop trying to use them to do that, or we have to do our best to understand specifically which models can be distinguished by the analyses we're employing.

      Short of developing some novel theory deep in the PSMC model, I think readers would need to see simulations showing that the analyses employed in this paper are capable of supporting or refuting their biogeographic hypothesis before viewing them as strongly supporting a specific biogeographic model. Tools like msprime and stdpopsim can be used to simulate genome-scale data with fairly complex biogeographic models. Running simulations of a thrush-like population under different biogeographic scenarios and then using PSMC to differentiate those patterns would be a more convincing argument for the biogeographic aspects of this paper. The other benefit of this approach would be to nail down a specific quantitative version of the taxon cycles model referenced in the abstract, and it would allow the authors to better study and explain the motivation behind the specific summary statistics they develop for PSMC posthoc analysis.

    2. Reviewer #2 (Public Review):

      Summary:<br /> Winker and Delmore present a study on the demographic consequences of migratory versus resident behavior by contrasting the evolutionary history of lineages within the same songbird group (thrushes of the genus Catharus).

      Strengths:<br /> I appreciate the test-of-hypothesis design of the study and the explicit formulation of three main expectations to test. The data analysis has been done with appropriate available tools.

      Weaknesses:<br /> The current version of the paper, with the case study chosen, the results, and the relative discussion, is not satisfying enough to support or reject the hypotheses here considered.

      The authors hypothesized that the wider realized breeding and ecological range characterising migrant versus resident lineages could be a major driver of increased effective population size and population expansion in migrants versus residents. I understand that this pattern (wider range in migrants) is a common characteristic across bird lineages and that it is viewed as a result of adapting to migration. A problem that I see in their dataset is that the breeding ranges of the two groups are located in very different geographic areas (mainly South versus North America). The authors could have expanded their dataset to include species whose breeding grounds are from the two areas, regardless of their migratory behaviour, as a comparison to disentangle whether ecological differences of these two areas can affect the population sizes or growth rates.

      As I understand from previous literature, the time-scale to population growth and estimates of effective population sizes considered in the present paper for the resident versus migratory clades seem to widely predate the times to speciation for the same lineages, which were reported in previous work of the same authors (Everson et al 2019) and others (Termignoni-Garcia et al 2022). This piece of information makes the calculation of species-specific population size changes difficult to interpret in the light of lineages' comparison. It is unclear what the authors consider to be lineage-specific in these estimates, as the clades were likely undergoing substantial admixture during the time predating full isolation.

      Regarding the methodological difficulties in interpreting the impact of population structure on the estimates of effective population sizes with the PSMC approach, I would think that performing simulations to compare different scenarios of different degrees of structured populations would have helped substantially understand some of the outcomes.

      Additionally, I have struggled to understand if migratory behaviour in birds is considered to be acquired to relieve species competition, or as a consequence of expanded range (i.e., birds expand their range but their feeding ground is kept where speciation occurred so as to exploit a ground with higher quality and abundance of seasonal local resources).

      The points raised above could be considered to improve the current version of the paper.

    3. Reviewer #3 (Public Review):

      Summary:<br /> This paper applies PSMC and genomic data to test interesting questions about how life history changes impact long-term population sizes.

      Strengths:<br /> This is a creative use of PSMC to test explicit a priori hypotheses about seasonal migration and Ne. The PSMC analyses seem well done and the authors acknowledge much of the complexity of interpretation in the discussion.

      Weaknesses:<br /> The authors use an average generation time for all taxa, but the citations imply generation time is known for at least some of them. Are there differences in generation time associated with migration? I am not a bird biologist, but quick googling suggests maybe this is the case (https://doi.org/10.1111/1365-2656.13983). I think it important the authors address this, as differences in generation time I believe should affect estimates of Ne and growth.
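
      How generation time enters the estimates can be sketched with the usual post hoc rescaling of PSMC output. This is a minimal illustration, assuming the scaling described in the PSMC documentation (N0 = theta0 / (4 * mu) / s, time in generations = 2 * N0 * t_k, Ne = N0 * lambda_k); the function name and every numeric input are illustrative assumptions, not values from the manuscript.

```python
import math


def rescale_psmc(theta0, t_k, lambda_k, mu_per_generation, generation_time, bin_size=100):
    """Convert one scaled PSMC interval to (time in years, effective population size).

    Assumes the standard PSMC rescaling; `bin_size` is the window length in bp.
    """
    n0 = theta0 / (4.0 * mu_per_generation) / bin_size
    years = 2.0 * n0 * t_k * generation_time
    ne = n0 * lambda_k
    return years, ne


# Illustrative scaled values for a single time interval (assumptions).
THETA0, T_K, LAMBDA_K = 0.002, 1.0, 2.0

# Case A: the per-generation mutation rate is held fixed across taxa.
MU_GEN = 4.6e-9
years_a2, ne_a2 = rescale_psmc(THETA0, T_K, LAMBDA_K, MU_GEN, generation_time=2.0)
years_a4, ne_a4 = rescale_psmc(THETA0, T_K, LAMBDA_K, MU_GEN, generation_time=4.0)

# Case B: the per-year mutation rate is held fixed, so mu per generation scales with g.
MU_YEAR = 2.3e-9
years_b2, ne_b2 = rescale_psmc(THETA0, T_K, LAMBDA_K, MU_YEAR * 2.0, generation_time=2.0)
years_b4, ne_b4 = rescale_psmc(THETA0, T_K, LAMBDA_K, MU_YEAR * 4.0, generation_time=4.0)

print(years_a2, ne_a2, years_a4, ne_a4)
print(years_b2, ne_b2, years_b4, ne_b4)
```

      Under this rescaling, a longer generation time stretches the time axis when the per-generation mutation rate is fixed, and lowers Ne when the per-year rate is fixed, so which quantity is assumed constant across migratory and resident taxa matters for the comparison.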

      The writing could be improved, both in the introduction for readers not familiar with the system and in the clarity and focus of the discussion.

    1. Reviewer #1 (Public Review):

      The manuscript has helped address a long-standing mystery in splicing regulation: whether splicing occurs co- or post-transcriptionally. Specifically, the authors (1) uniquely combined smFISH, expansion microscopy, and live cell imaging; (2) revealed the ordering and spatial distribution of splicing steps; and (3) discovered that nascent, not-yet-spliced transcripts move more slowly around the transcription site and undergo splicing as they move through the clouds. Based on the experimental results, the authors suggest that the observation of co-transcriptional splicing in previous literature could be due to the limitation of imaging resolution, meaning that the observed co-transcriptional splicing might actually be post-transcriptional splicing occurring in proximity to the transcription site. Overall, the work presented here clearly provides a comprehensive picture of splicing regulation.

      Major points:<br /> 1. Linearity of expansion microscopy. For Figure 2B, it would be helpful to display the same sample before and after expansion, just like Supplementary Figure 3, but with a transcription site and "cloud". In the current version, the transcription site looks quite different in the not-expanded (more green dots on the left) and expanded image (more green dots on the top).

      2. FISH dot colocalization. What is the colocalization rate of FISH dots in general under experimental conditions? In addition, in Figures 2C and 2G, why do some 3'exon dots not have co-localized 5'exon dots?

      3. It would be helpful if the authors uploaded a few examples of live cell imaging movies.

      4. It is recommended to double-check the text for errors.

    2. Reviewer #2 (Public Review):

      Allison Coté et al. investigated the ordering and spatial distribution of nascent transcripts in several cells using smFISH, expansion microscopy, and live-cell imaging. They find that pre-mRNA splicing occurs post-transcriptionally in the clouds around the transcription start site, termed the transcription site proximal zone. They show that pre-mRNAs may undergo continuous splicing as they pass through the zone after transcription. These data suggest a unifying model for explaining previously reported co-transcriptional splicing events and provide a direction for further study of the nature of the slow-moving zone around the transcription start site.

      This paper is well-written. The findings are very important, and the data supports the conclusions well. However, some aspects of the image and description need to be clarified and revised.

      The authors describe the results of Figures 4E and 4F in the main text as follows: "we performed RNA FISH simultaneously with immunofluorescence for SC35, a component of speckles, and saw that this compartmentalized pre-mRNA did indeed appear near nuclear speckles both before (Supplementary Figure 6C) and after (Figure 4E) splicing inhibition." However, no SC35 staining is shown in Figure 4E. The description of Figure 4F has the same problem.

    1. Reviewer #1 (Public Review):

      Summary:<br /> The authors use a combination of biochemistry and cryo-EM studies to explore a complex between the cap-binding complex and an RNA binding protein, ALYREF, that coordinates mRNA processing and export.

      Strengths:<br /> The biochemistry and structural biology are supported by mutagenesis which tests the model in vitro. The structure provides new insight into how key events in RNA processing and export are likely to be coordinated.

      Weaknesses:<br /> The authors provide biochemical studies to confirm the interactions that they identify; however, they do not perform any studies to test these models in cells or explore the consequences for mRNA export from the nucleus. In fact, several of the amino acids that they identified in ALYREF that are critical for the interaction, as determined by their own biochemical studies, are conserved in budding yeast Yra1 (residues E124/E128 are E/Q in budding yeast and residues Y135/V138/P139 are F/S/P), where the impact on poly(A) RNA export from the nucleus could be readily evaluated. The authors could at least mention this point as part of the implications and the need for future studies. No one seems to have yet targeted any of these conserved residues, so this would be a logical extension of the current work.

      Specific suggestions:<br /> The authors could put their work in context by speculating how some of the amino acids that they identify as critical for these interactions could contribute to cancer. For example, they mention that mutations of interacting residues in NCBP2 are associated with human cancers, pointing out that the NCBP2 R105C amino acid substitution has been reported in colorectal cancer and the NCBP2 I110M mutation has been found in head and neck cancer. Do the authors speculate that these changes would decrease the interaction between NCBP2 and ALYREF and, if so, how would this contribute to cancer? They also mention a K330N mutation in NCBP1 found in human uterine corpus endometrial carcinoma; Y135 on the α2 helix of mALYREF2 makes a hydrogen bond with K330 of NCBP1. How do they speculate loss of this interaction would contribute to cancer?

    2. Reviewer #2 (Public Review):

      Summary:<br /> In this manuscript, Bradley and his colleagues present the cryo-EM structure of the nuclear cap-binding complex (CBC) in complex with an mRNA export factor, ALYREF, providing a structural basis for understanding how CBC regulates gene expression.

      Strengths:<br /> The authors successfully modeled the N-terminal region and the RRM domain of ALYREF (residues 1-183) within the CBC-ALYREF structure, which revealed that both the NCBP1 and NCBP2 subunits of the CBC interact with the RRM domain of ALYREF. Further mutagenesis and pull-down studies provided additional evidence for the observed CBC-ALYREF interface. Additionally, the authors engaged in a comprehensive discussion regarding other cellular complexes containing CBC and/or ALYREF components. They proposed potential models that elucidated coordinated events during mRNA maturation. This study provided good evidence to show how CBC effectively recruits mRNA export factor machinery, enhancing our understanding of how CBC regulates gene expression during mRNA transcription, splicing, and export.

      Weaknesses:<br /> There are no in vivo or in vitro functional data to validate and support the structural observations and the proposed models in this study. Cryo-EM data processing and structural representation need to be strengthened.

    3. Reviewer #3 (Public Review):

      Summary:<br /> The authors carried out structural and biochemical studies to investigate the multiple functions of CBC and ALYREF in RNA metabolism.

      Strengths:<br /> For the structural study part, the authors successfully revealed how NCBP1 and NCBP2 subunits interact with mALYREF (residues 1-155). Their binding interface was then confirmed by biochemical assays (mutagenesis and pull-down assays) presented in this study.

      Weaknesses:<br /> The authors did not provide functional data to support their proposed models. The authors should include more details regarding the workflow of their cryo-EM data processing in the figure.

    1. Reviewer #1 (Public Review):

      Summary:<br /> Kinase inhibitors represent a highly valuable class of drugs as evidenced by their continued clinical success. The target landscape of kinase targeting small molecules can be leveraged to alter multiple phenotypes with increasing complexity that broadly aligns with increasing target promiscuity. This 'tools and resources' contribution provides a starting point for researchers interested in aligning kinase inhibitor activity with cytokine/chemokine stimulated signal transduction networks.

      Strengths:<br /> KinCytE is a forward-thinking database that yields hypothesis-generating options for researchers interested in pharmacologically modulating cytokine/chemokine signaling.

      Weaknesses:<br /> As a 'tools and resources' contribution, the primary (potential) weakness will be the authors' willingness to update and improve the tool. KinCytE will require frequent updating to better inform users in terms of contextual cytokine/chemokine stimulated signaling and the target landscape of those agents that are included as options.

    2. Reviewer #2 (Public Review):

      Summary:<br /> In this manuscript, "KinCytE- a Kinase to Cytokine Explorer to Identify Molecular Regulators and Potential Therapeutic", the authors present a web resource, KinCytE, that lets researchers search for kinase inhibitors that have been shown to affect cytokine and chemokine release and signaling networks. I think it's a valuable resource that has a lot of potential and could be very useful in deciding on statistical analysis that might precede lab experiments.

      Opportunities:<br /> With the release of the manuscript and the code base in place, I hope the authors continue to build upon the platform, perhaps by increasing the number of cell types that are probed (beyond macrophages). Additionally, when new drug-response data becomes available, perhaps it can be used to further validate the findings. Overall, I see this as a great project that can evolve.

      Strengths:<br /> The site contains valuable content, and the structure is such that growing that content should be possible.

      Weaknesses:<br /> The resource is based only on macrophage experiments. It would be nice to have other cell types investigated, but I'm sure that will be remedied with some time.

    1. Reviewer #1 (Public Review):

      Summary:

      This work describes a new method for sequence-based remote homology detection. Such methods are essential for the annotation of uncharacterized proteins and for studies of protein evolution.

      Strengths:

      The main strength and novelty of the proposed approach lies in the idea of combining state-of-the-art sequence-based (HHpred and HMMER) and structure-based (Foldseek) homology detection methods with recent developments in the field of protein language models (the ESM2 model was used). The authors show that features extracted from high-dimensional, information-rich ESM2 sequence embeddings can be suitable for efficient use with the aforementioned tools.

      The reduced features take the form of amino acid occurrence probability matrices estimated from ESM2 masked-token predictions, or structural descriptors predicted by a modified variant of the ESM2 model. However, we believe that these should not be called "embeddings" or "representations". This is because they don't come directly from any layer of these networks, but rather from their final predictions.
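
      To illustrate the distinction drawn above, the following sketch shows how a per-position amino acid probability matrix can be obtained from output logits rather than from a hidden layer. It assumes the logits over the 20 standard amino acids were already produced by masking each position in turn; no model call is shown, and the toy logits and function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def softmax(logits):
    """Numerically stable softmax over one position's logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def profile_from_logits(per_position_logits):
    """Turn an L x 20 table of masked-token logits into an L x 20
    probability profile (one row per sequence position)."""
    return [softmax(row) for row in per_position_logits]

# Toy logits for a two-residue sequence (made up for illustration):
# position 0 favours 'A', position 1 favours 'W'.
toy_logits = [
    [5.0 if aa == "A" else 0.0 for aa in AMINO_ACIDS],
    [3.0 if aa == "W" else 0.0 for aa in AMINO_ACIDS],
]
profile = profile_from_logits(toy_logits)
most_likely = "".join(AMINO_ACIDS[row.index(max(row))] for row in profile)
```

      Each row of such a profile is a model prediction, which is why the reviewers distinguish it from an embedding taken from an internal layer.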

      The benchmarks presented suggest that the approach improves sensitivity even at very low sequence identities <20%. The method is also expected to be faster because it does not require the computation of multiple sequence alignments (MSAs) for profile calculation or structure prediction.

      Weaknesses:

      The benchmarking of the method is very limited and lacks comparison with other methods. Without additional benchmarks, it is impossible to say whether the proposed approach really allows remote homology detection and how much improvement the discussed method brings over tools that are currently considered state-of-the-art.

    2. Reviewer #2 (Public Review):

      Summary:

      The authors present a number of exploratory applications of current protein representations for remote homology search. They first fine-tune a language model to predict structural alphabets from sequence and demonstrate using these predicted structural alphabets for fast remote homology search both on their own and by building HMM profiles from them. They also demonstrate the use of residue-level language model amino acid predicted probabilities to build HMM profiles. These three implementations are compared to traditional profile-based remote homology search.

      Strengths:

      - Predicting structural alphabets from a sequence is novel and valuable, with another approach (ProstT5) also released in the same time frame further demonstrating its application for the remote homology search task.<br /> - Using these new representations in established and battle-tested workflows such as MMSeqs, HMMER, and HHBlits is a great way to allow researchers to have access to the state-of-the-art methods for their task.<br /> - Given the exponential growth of data in a number of protein resources, approaches that allow for the preparation of searchable datasets and enable fast search are of high relevance.

      Weaknesses:

      - The authors fine-tuned ESM-2 3B to predict 3Di sequences and presented the fine-tuned model ESM-2 3B 3Di with a claimed accuracy of 64% compared to a test set of 3Di sequences derived from AlphaFold2 predicted structures. However, the description of this test set is missing, and I would expect repeating some of the benchmarking efforts described in the Foldseek manuscript as this accuracy value is hard to interpret on its own.<br /> - Given the availability of predicted structure data in AFDB, I would expect to see a comparison between the searches of predicted 3Di sequences and the "true" 3Di sequences derived from these predicted structures. This comparison would substantiate the innovation claimed in the manuscript, demonstrating the potential of conducting new searches solely based on sequence data on a structural database.<br /> - The profile HMMs built from predicted 3Di appear to perform sub-optimally, and those from the ESM-2 3B predicted probabilities also don't seem to improve traditional HMM results significantly. The HHBlits results depicted in lines 5 and 6 in the figure are not discussed at all, and a comparison with traditional HHBlits is missing. With these results and presentation, the advantages of pLM profile-based searches are not clear, and more justification over traditional methods is needed.<br /> - Figure 3 and its associated text are hard to follow due to the abundance of colors and abbreviations used. One figure attempting to explain multiple distinct points adds to the confusion. Suggestion: Splitting the figure into two panels comparing (A) Foldseek-derived searches (lines 7-10) and (B) language-model derived searches (lines 3-6) to traditional methods could enhance clarity. Different scatter markers could also help follow the plots more easily.<br /> - The justification for using Foldseek without amino acids (3Di-only mode) is not clear. Its utility should be described, or it should be omitted for clarity.<br /> - Figure 2 is not described, and it is unclear what to read from it.
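
      The 64% figure discussed in the first point above is hard to interpret partly because its definition is not given. A minimal sketch of one plausible reading, per-residue identity between a predicted and a reference 3Di string, follows; this definition is an assumption, not the manuscript's stated metric, and the two toy strings are invented.

```python
def per_residue_accuracy(predicted, reference):
    """Fraction of positions where the predicted 3Di state equals the
    reference state. Both strings must be non-empty and equally long."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)

# Hypothetical 8-residue example: 6 of 8 states agree.
predicted_3di = "DVVLLCPP"
reference_3di = "DVVLVCPQ"
accuracy = per_residue_accuracy(predicted_3di, reference_3di)
```

      Under this reading, the value also depends on whether it is averaged per residue or per protein and on how the test set was separated from the training data, which is the missing description the reviewer asks for.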

    1. Reviewer #1 (Public Review):

      Summary:<br /> There has been substantial prior work trying to understand the transcriptional control of proteasome expression as an adaptive response to proteasome inhibition. This field has been mired by fierce debates over the role of the protease Ddi2 in activating the transcription factor Nrf1/NFE2L1. As the authors of this manuscript point out, most of the previous research centers on the continuous treatment of cells with proteasome inhibitors rather than a brief pulse of inhibition that better models the situation when these drugs are used clinically. The authors find that the initial recovery of proteasome activity is independent of Ddi2 and involves a mechanism distinct from transcription. The authors intriguingly point to a model in which the assembly of proteasomes is regulated. If true, this would be a significant finding, but for now, this model remains more speculative.

      Strengths:<br /> The pulsed treatment of proteasome inhibitors is a strength of this lab that few others use. It better mimics the clinical use of these inhibitors and allows for a more detailed analysis of the initial response to inhibition. The authors have used multiple different clones of Ddi2 knockouts and siRNA against Ddi2 to rule out the necessity of Ddi2 in the early production of proteasomes when cells are treated with proteasome inhibitors, establishing a thorough knockout approach while also avoiding compensatory mutations. These experiments are well controlled, showing both the levels of Ddi2 upon knockout or knockdown and demonstrating that cleavage of Nrf1, one of two known targets of Ddi2, is impaired. However, it should be noted that even in the knockout, residual bands for Ddi2 remain. Since these HAP1 cells only have one copy of the Ddi2 gene, it is possible that this other band could be Ddi1, a very similar paralogue. If so, the conclusions of Ddi2-like activity with Ddi1 must be tempered and rely more on the data with Nrf1 knockdowns.

      This article sensitively monitors the recovery of proteasome function with the β5 activity assay and the production of new proteasome transcripts by Q-PCR. This precision, coupled with detailed analysis of the timing, is a strength that pointed to a more rapid recovery than transcription alone could explain.

      Weaknesses:<br /> This paper's major weakness is the difficulty in establishing the authors' model that assembly is regulating this process. They do a convincing job demonstrating that activity recovers before transcription. The evidence that translation is not affected depends entirely on the polysome RNA profiling from two replicates. Clearer and orthogonal data would help establish this finding. The stability of subunits is interesting and important in its own right. However, the clustering of proteins is somewhat unusual. The authors include PSMB8, an immuno-proteasome subunit that is not regulated by Nrf1. The proteins highlighted in green are an unusual assortment of alternative activators (PSME1-3), a ubiquitin-binding protein (ADRM1), and proteasome chaperones (PSMG1-2). Similarly, the purple proteins are not just proteins in the 19S regulatory particle but also assembly chaperones. However, these labeling issues do not detract from the conclusions of this figure.

      In short, the authors establish that Ddi2 is not necessary for the initial, non-transcriptional, recovery of proteasome activity after a pulse of proteasome inhibition.

      It is not clear what clinical impact this work will have. Although it models the pulse of proteasome inhibition more faithfully, it only looks at a single pulse rather than multiple treatments. Thus, ruling out Ddi2's importance for clinical benefit may be premature. More significantly, this work suggests that the assembly of proteasomes might be a regulated process that merits substantial follow-up.

    2. Reviewer #2 (Public Review):

      Summary:<br /> In this work, Ibtisam and Kisselev explore the role of DDI2 in proteasome function recovery after a clinically relevant pulse dosing using different proteasome inhibitors and their corresponding PK properties. The authors report that despite the lack of NRF1 activation by DDI2 there was no difference in recovery from pulsed proteasome inhibition observed in DDI2 KO cells as compared to WT controls suggesting that DDI2 is not required for recovery in this system. They further show that transcription of the proteasome subunits is initiated only after partial recovery of proteasome activity is already observed suggesting that non-transcriptional mechanisms might be also involved. The authors further show that translation inhibition blocked the recovery from proteasome inhibitors.

      Strengths:<br /> Overall, it is very important and informative to use a pulse treatment type approach (mimicking the PK properties of the drugs) to explore the biology of PIs as used in this study. The authors also provide convincing data that DDI2 is not required for proteasome activity recovery post-PI pulse treatment in the systems they explored.

      Weaknesses:<br /> Many of the other conclusions are not supported by the data in the current form of the manuscript and are too speculative and ignore the major findings in the field that can present alternative mechanisms. In particular, the authors discuss the "levels" of the proteasomes post-PI treatment without measuring the actual protein level of the individual subunits or the different assembled proteasome complexes.

    3. Reviewer #3 (Public Review):

      Summary:<br /> In their manuscript "Recovery of proteasome activity in cells pulse-treated with proteasome inhibitors is independent of DDI2", Ibtisam and Kisselev investigate proteasome recovery in HAP1 cells, either WT or DDI2 KO, upon inhibition of the proteasome with bortezomib or carfilzomib. The authors argue that proteasome recovery is independent of DDI2 as it is independent of de novo proteasome subunit synthesis. They argue recovery is dependent on the assembly of already synthesized proteasome subunits.

      Strengths:<br /> The findings are important as they provide insight into a transcriptionally-independent proteasome stress recovery that is likely applicable across distinct cellular subtypes. Comparable proteasome recovery early on (<12 hours) from proteasomal inhibition in DDI2 KO cell lines was already noted in other manuscripts, including Chen et al, suggesting that this phenomenon is applicable to other histotypes.

      Weaknesses:<br /> Some of the conclusions are not adequately supported by the data, and how generalizable these findings are is unclear. In particular, there is concern regarding the status of the ubiquitin-proteasome system in the HAP1 cell line that was used for these studies. In a previously published model system, a dependency on DDI2 and NRF1 was clearly demonstrated and this pathway was critical for late (12-24 hours) proteasome recovery as well as cell viability. The model system used here (HAP1 cells) seems completely independent of DDI2 both for proteasome recovery and viability as curves are substantially overlapping. It would be important to assess how the baseline proteasome activity in HAP1 cells compares to other cell lines and model systems, as these cells may be largely independent of proteasome degradation and their synthetic load on the pathway may be very modest.

      It would also be relevant to look at later time points of proteasome recovery, as one would expect DDI2 to play a role later on in the recovery of the proteasome. The authors may have missed that time point, as cells do not appear to recover close to 100% proteasome activity by 24 hours, not even when the smallest concentration of carfilzomib is used.

      A critical experiment to look at de novo proteasome assembly was not carried out, leaving the data hypothetical.

      Finally, the authors leverage HAP1 cells for their work and should be mindful of not generalizing findings or disputing other authors' conclusions in the absence of adequate experiments to support their hypothesis.

    1. Reviewer #3 (Public Review):

      Youssef et al. have used a range of markers to identify cancer stem cells (CSCs) in patients with oral cancers. CSCs were identified in lab conditions and were often linked to the invasiveness of cancers. The authors found a combination of markers convincingly linked to known biology and found cells expressing them in the invading cancers.<br /> The major weakness of the paper is on the technical side. There isn't enough description as to how they discriminated between CSCs inside the tumour and those invading its surroundings. Similarly, as the information is presented, it is not clear why artificial intelligence was needed to enhance the accuracy of the method linking CSCs to cancer invasion (and ultimately deadly metastasis to other organs).

    2. Reviewer #1 (Public Review):

      This is a valuable study that convincingly demonstrates that quantification of EpCAM+/CD24+/Vimentin+ cells in the stroma of human oral cancers followed by machine learning algorithms can be used as a prognostic indicator of metastasis.

      This manuscript explores the utility of detecting a population of EpCAM+/CD24+/Vimentin+ cells in the stroma of human oral cancers as a prognostic indicator of metastasis. This follows work from the group showing that these cells manifest EMT plasticity. The authors used standard analyses and then machine learning algorithms on a test cohort of 24 patients and then a validation cohort of 60. Overall the staining seems clean, and the presence of these cells does seem to be predictive in a cohort of oral cancer patients.

      The authors have addressed previous comments, adding additional patients and streamlining the work to focus on one hypothesis.

      An additional validation set would enhance the work.

      The authors should include clinical data for all samples used.

    3. Reviewer #2 (Public Review):

      It is recommended to use a blind sample test to determine the specimen's status using the AI they developed.<br /> Would these markers promote tumorigenesis or metastasis if tested in vivo?<br /> The article would be very valuable in the future to promote using AI to predict disease status and facilitate cancer screening.<br /> Much more improvement is required for data validation and presentation.

    1. Reviewer #1 (Public Review):

      The authors aimed to understand whether the superficial, retinorecipient layers of the mouse superior colliculus (sSC) participate in figure-ground segregation and object recognition. To address this question, they use a combination of optogenetic perturbations of sSC and recordings. These data are consistent with SC being causally involved in object recognition. This would be useful information for the field and likely to be cited. However, I have several concerns regarding their conclusions.

      A significant limitation of this study is methodological. The major novelty is the effect of optogenetic silencing, because the recordings are largely correlative, but the optogenetic silencing approach lacks appropriate controls for the effects of the optogenetic excitation light. The authors acknowledge that the optogenetic light is a potential confound, but attempt to address this by shielding the fiber to eliminate light leak and strobing a blue LED in the arena. The former does not account for the effects of excitation light scattering intracerebrally (during optogenetic experiments, intracerebral scattering causes the eyes to light up), and for the latter, there is no way to compare the intensity or qualia of the externally strobed LED and the intracerebral light. The proper control would be a cohort of mice lacking channelrhodopsin expression in sSC. Regardless, it is essential to acknowledge this potential confound.

      Relatedly, as the authors note, there are GABAergic projection neurons in sSC that may be driving these effects via gain of function. This is a significant concern that has limited the widespread adoption of this approach in sSC despite its popularity in studies in cortex. Indeed, one recently published study of behavioral functions of deep SC found that activating inhibitory neurons actually caused paradoxical behavioral effects consistent with gain of function in the targeted hemisphere, due to the effects of long-range inhibitory projections on the other SC hemisphere. Given the presence of inhibitory projections in sSC, it would be preferable to use an orthogonal method for silencing and at least to thoroughly acknowledge these concerns and cite these recent studies.

      A minor point is that although activation of GABAergic neurons in sSC is expected to cause inhibition of neighboring neurons, I would expect channelrhodopsin-expressing GABAergic cells to show an increase in firing during optogenetic excitation. However, it seems that none of the cells plotted (assuming each point in Supplementary Fig 4D is a cell, which the legend does not specify) had such an increase. Do these extracellular recordings not detect inhibitory neurons well?

      Finally, the relationship between these stimuli and objects is not entirely clear. The authors acknowledge this but it would be worthwhile to devote more attention to this point. In effect, as the authors note, the gray screen and sinusoidal grating do not have any sharp edges on the screen, whereas each of the behaviorally relevant stimuli will create a sharp, step-like edge on the screen. Whether edge detection is truly object detection or simply a variant of more general visual detection is unclear.

    2. Reviewer #2 (Public Review):

      The goal of this study is to show that the superficial superior colliculus (sSC) of mouse signals figure-ground differences defined by contrast, orientation, and phase, and that these signals are necessary for the animal to detect such figure-ground differences. By inhibiting sSC while the animals perform a figure-ground detection task, the study shows that detection performance decreases when sSC activity is suppressed during the onset of the visual stimulus. The study then intends to show that sSC neurons exhibit surround suppression based on orientation differences, and that surround suppression is stronger when the animal detects the correct location of the figure on the background.

      The major strength of this study is the use of a behavioural paradigm to test detection performance of figure-ground stimuli while manipulating neural activity in the sSC during different times after stimulus onset. This paradigm would show whether activity in the sSC is relevant for performing the task. Secondly, the study collected data to confirm previous findings: sSC neurons exhibit orientation specific surround suppression. Additionally, it is impressive that the authors were able to train mice to generalize their task performance across different stimulus categories (figure-ground differences in orientation and phase). This should be highlighted as it may inform future studies.

      The study has, however, methodological and analytical weaknesses so that the stated conclusions are not supported by the presented results.

      1) Optogenetic inhibition is not limited to sSC (even expression may not be limited)<br /> About 30% of inhibitory neurons in the sSC project to other areas, e.g. ventral LGN, parabigeminal nucleus and pretectum (Whyland et al, 2019, see ref in manuscript). This means that these areas receive direct inhibition when inhibitory sSC neurons are optogenetically stimulated. This fact is mentioned in the discussion but the consequences and implications for the results are ignored. This is a major flaw of the optogenetic experiments of this study. Additionally, no evidence is given that opsin expression was limited to the superficial layers (except for one histological slice), which the authors acknowledge in line 285. Deeper layers may have other inhibitory neurons with long-range projections.<br /> The finding that sSC neurons show no figure-ground modulation for phase while the optogenetic manipulation has behavioural effects may be an indication for other areas being affected by the optogenetic manipulation.

      2) Could other behavioural variables explain the results?<br /> a) Are there any task events other than the visual stimuli that the mice could use to make their decisions? The authors state the use of a custom made lick spout but it is not clear how this spout works, i.e. how do mechanics of the spout deliver water to the right versus the left output and could the mouse perceive these mechanics?<br /> b) Could the different neural responses to figure versus ground shown in Fig 2I-J and Fig 3B be explained by behaviours varying between the trial types, e.g. by early lick movements (which are conceivable even if the spout is not present), eye movements or changes in pupil-linked arousal? A behavioural difference seems even more likely to occur between hit and error/miss trials (Fig 4). If these behaviours were not measured, the possibility of behavioural modulation should be discussed.

      3) What is the behavioural strategy of the animals?<br /> Only licks beyond 200 ms after stimulus onset determine the choice of the animal because "mice made early random licks" from 0 to 200 ms. To better understand the behavioural strategies of the animals we need to see their behavioural data, i.e. left and right licks aligned to stimulus onset. It would be particularly interesting to see how number and latency of licks changes during optogenetic manipulation.

      4) Data relating to misses should be included in analyses to provide a complete picture of behaviour and neural responses<br /> a) In the optogenetic manipulations, an increase in misses seems to dominate the decreased accuracy (please, explain when a response was counted as a miss). A separate analysis of miss trials may be more robust than of error trials and also offers a different interpretation of the data, namely that the mouse did not see the stimulus rather than perceiving the figure on the opposite side. However, if the mice reduced their lick rate in general during optogenetic stimulation, this begs the question whether their motor performance was affected by optogenetic manipulation. Can this possibility be excluded?<br /> b) Related to Fig 4, it would be equally interesting to see how FGM changes during misses. Do the changes support the observations for error trials?

      5) Statistical tests do not support the conclusions, are missing or inadequate<br /> a) In Fig 1E, accuracy is significantly affected at only 1-2 time points in each task, specifically either the 1st and 3rd or the 2nd time point. How do the authors interpret these results? If inhibition starting at the 2nd time point has no significant effects, why would it be significant when inhibition starts later (at the 3rd time)? Furthermore, given that all other starting points of laser stimulation have no significant effects, there is no reason to trust the latency of inhibition effects based on mostly insignificant data points. This analysis in its current form should be removed, including a comparison of latencies between tasks, which was not tested for significance. It may be more meaningful to analyse accuracy for each animal separately. This may reduce variability.<br /> b) Analyses regarding the difference in neural response to figure and ground (Fig 2I-J, Fig 3B, Fig 4B, Fig 5C) would be more convincing and informative if the differences were analysed on the level of single neurons in response to the same orientation within their RF (or at the location where the figure is presented, for edge-RF neurons). A histogram of these differences would show how many neurons are affected and how large the effect is in single neurons.<br /> c) All statistical tests performed across neurons should account for dependencies due to simultaneous recordings (dependency on session) and due to recordings in the same animal (dependency on animal). This can be done in most cases by using linear mixed-effects models.<br /> d) There was no significant difference between model weights (Fig 3D), so the statement in line 210 (RF-edge neurons had higher weights) should be removed.<br /> e) Fig 4B compares FGM during correct and error trials. This comparison has to be performed with the same set of neurons in correct and error trials (not the case for orientation). 
Again, the most compelling and informative comparison would be on the level of single neurons: response difference between figure and ground (same visual features at figure position) during hits versus errors.<br /> f) There is no evidence that FGM for phase was different between hit and error trials as stated in line 234.<br /> g) It is not clear why and how the mixed linear effects model was used pooling data across tasks (Fig 4C and Fig 5D). Different neurons were recorded for each task, so the sample points (neurons) are not affected by both task effects (orientation and phase). Each task should be analysed separately.<br /> h) Bonferroni correction in Fig 1E should correct multiple comparisons across time points, not across tasks (see Table 1).<br /> i) What is the reason to perform some tests one-tailed, others two-tailed?
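
      Point h) above can be illustrated with a minimal sketch, assuming that the family of tests to correct is the set of laser-onset time points within a single task. The function name and the p-values are hypothetical and are not taken from the manuscript or its Table 1.

```python
def bonferroni_across_timepoints(p_values, alpha=0.05):
    """Bonferroni-adjust one task's p-values, one per laser-onset time point.

    Returns the adjusted p-values (capped at 1.0) and a list of flags that
    mark which time points stay significant at the family-wise level alpha.
    """
    m = len(p_values)  # number of comparisons = number of time points
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p_adj < alpha for p_adj in adjusted]
    return adjusted, reject


# Hypothetical p-values for five laser-onset times in a single task.
example_p = [0.004, 0.030, 0.012, 0.200, 0.600]
adjusted_p, significant = bonferroni_across_timepoints(example_p)
```

      With these made-up numbers, only the first time point remains significant after correction, which shows how correcting across time points rather than across tasks can change which onsets are reported as effective.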

      6) The results relating to "multisensory neurons" are ambiguous regarding their interpretation (if significant at all) and seem unrelated to the goal of the study. It is particularly likely that behaviours like licking or other movements cause the response differences between figure and ground.

      7) What depth were neurons recorded from (Fig 3 and 4)?

    3. Reviewer #3 (Public Review):

      The authors used optogenetic manipulations and electrophysiology recordings to study the causal role and the coding properties of the superficial part of the mouse Superior Colliculus (SCs) during figure detection tasks. The authors previously reported that figure-ground perception relies on V1 activity (Kirchberger et al. 2021) and pointed out that silencing V1 reduced the accuracy of the mice, although performance remained above chance level. Therefore, the visual information necessary for this task could be processed via alternative pathways. In this study, the authors investigated the SCs specifically, using a similar approach and analysis as in Kirchberger et al. 2021. Optogenetic silencing of visual neurons in the SCs impaired accuracy in all 3 versions of the figure detection task: contrast, orientation, and phase. Electrophysiology recordings revealed that SCs neurons are figure-ground modulated, but only by contrast- and orientation-based figures. They show that visually responsive SCs neurons reflect behavioral performance in the orientation-based figure task. The authors' conclusion is that the SCs is involved in the figure detection task.

      Overall, this study provides evidence that the mouse SCs is involved in a figure detection task and codes for task-related events. The authors heroically compared results between 3 different versions of the figure-based detection task. The logic of the study flows through the manuscript and the authors prepared a detailed description of methods. However, my main concern is with 1) the amount of data used to make the key arguments, and 2) the interpretation of results. The key findings of this study (figure-ground modulations in SCs) could be a result of visual cortical feedback to the SCs during the task, or of pupil diameter changes. Unfortunately, the authors did not rule out these possibilities.

      Still, this study can be relevant to a general neuroscience audience, and results could be more convincing if the authors could clarify:

      1) Optogenetic inactivation<br /> - The demonstrated impact of laser stimulation on neural activity is not satisfactory (Supplementary Figure 1). The method seems to be insufficient to fully silence neurons. Electrophysiology control recordings of inactivation were performed in anesthetized mice, which does not give a fair estimate of the effect in the awake state. This raises a major question: how effective is the inactivation during the task?<br /> - Could the authors provide more details on whether laser stimulation affects only visual units or all sampled units? How many units were recorded, and how many show positive and negative laser modulation? How local is the inactivation effect? Where was the silicon probe placed in relation to AAV expression and the optical fiber position?

      2) Number of sessions and units<br /> - Inactivation during the phase task (Figure 1E) has a significantly larger effect on behavior at 66 ms after stimulus onset. How can the authors explain this? Could this result be biased by one animal/session, or by a low number of trials for this condition? There is no information about the number of trials or sessions from individual animals. Adding an example of a single animal's performance, and sessions for individual mice, could clarify the results in Figure 1.

      - Figure 2H shows an example neuron with an effect in the figure detection task based on phase difference, but Figure 2I/J (population response) shows there is no effect. Overall, the conclusion is that SCs neurons are not modulated by a phase-defined object. It seems that the number of mice, and hence units, is smaller in the phase-detection task compared to the two other tasks. How many single units are modulated in each version of the task? How big is the FGM effect on single-neuron responses (could the authors provide values in spikes/s)?

      - One task is dropped from the analysis, although one of the main points of the paper is to compare responses in SCs across different versions of the figure detection task. Figures 3-5 focus on only two tasks because there is not enough data for the contrast-based figure task.

      3) Figure-ground modulation in SCs<br /> - How is neural activity correlated with pupil size, movement (e.g. whisking or face movement), or jaw movement (preparation to lick)? Can the activity of FGM neurons in SCs be explained by these behavioral variables?<br /> - Could the authors describe in more detail how they measure pupil position and diameter, by showing raw data and pupil size aligned to task events?<br /> - How does pupil diameter change between tasks? Small pupil changes can affect the responses of visual neurons, and this could be an explanation of the FGM effect in SCs. Can the authors rule out this possibility, for example by showing pupil size and changes in position at stimulus onset in the different tasks?<br /> - In the discussion, the authors mention that the modulation of V1 could be transferred to SCs through the direct projection. Moreover, animals perform above chance in both inactivation experiments (V1 and SC), which could also be an effect of geniculate projections to HVAs (e.g. Sincich et al. 2004). Could the authors discuss the different possibilities?

      4) The interpretation of multisensory neurons is not clear. In Figure 5B, there is an example neuron with two response peaks. The authors speculate about the (pre-motor) activity, but there is no clear measurement showing a "multisensory" response of these neurons. Could these responses be related to the movement of the lick spout towards the mouth of the mouse (500 ms after the presentation of the stimulus)? Moreover, the number of "multisensory" units is very low (5 units, and 8 units).

    1. Joint Public Review:

      The assembly of the apical cytoskeleton of epithelial cells, i.e. the terminal web and microvilli (MV), requires precise control of actin dynamics and non-muscle myosin II (NM M2) contractility. Previous work from the Bretscher lab (Zaman et al, 2021) revealed a connection between ERM protein (ezrin) phosphorylation by LOK/SLK kinases and NM M2 activity and showed that ezrin negatively regulates RhoA. Here the authors now identify the missing link between ezrin and RhoA activity - the GAP ARHGAP18. Binding of ARHGAP18 to the ezrin FERM domain localizes its activity to the site of MV formation, maintaining optimal levels of active RhoA to turn on the ezrin kinases LOK/SLK while preventing NM M2 activity (via reduced ROCK activity) within the growing MV. The results here establish that an ARHGAP18-ezrin interaction serves to tightly localize RhoA activity, promoting optimized signalling for MV formation.

      The results from several complementary approaches strongly support the identification of ARHGAP18 as a critical component of a negative feedback loop that relies on interaction with ezrin for highly localized control of RhoA-GTP levels. The work is thoughtful and systematic. The results now bring into focus an elegant mechanism for controlling the formation of microvilli that relies on formation of a complex of key players - ezrin that is required for microvilli formation, LOK/SLK kinases that open and activate ezrin at the membrane and ARHGAP18 that downregulates RhoA, the GTPase that activates LOK/SLK and NM M2.<br /> The findings also suggest interesting possibilities for a similar mode of control in the building of related cellular protrusions, i.e. filopodia and stereocilia.

      There are a few questions remaining about the results. One concerns the strength of the ARHGAP18-ezrin FERM domain interaction. Also, the authors propose that activation of non-muscle Myo2 accounts for increased apical stiffness and that myosin filaments are present within microvilli in cells lacking ARHGAP. The distribution of the NM 2B heavy chain versus the pMLC seems at odds with the first proposition and the localization results don't quite seem to support the authors' conclusion about the relocalization of NM 2B within MV. These are straightforward issues that the authors should be able to clarify or address.

    1. Reviewer #1 (Public Review):

      Summary:<br /> The paper is an attempt to explain a geographic paradox between infection prevalence and antimalarial resistance emergence. The authors developed a compartmental model that importantly contains antigenic strain diversity and in turn antigen-specific immunity. They find a negative correlation between parasite prevalence and the frequency of resistance emergence and validate this result using empirical data on chloroquine-resistance. Overall, the authors conclude that strain diversity is a key player in explaining observed patterns of resistance evolution across different geographic regions.

      The authors pose and address the following specific questions:

      1. Does strain diversity modulate the equilibrium resistance frequency given different transmission intensities?<br /> 2. Does strain diversity modulate the equilibrium resistance frequency and its changes following drug withdrawal?<br /> 3. Does the model explain biogeographic patterns of drug resistance evolution?

      Strengths:<br /> The model built by the authors is novel. As emphasized in the manuscript, many factors (e.g., drug usage, vectorial capacity, population immunity) have been explored in models attempting to explain resistance emergence, but strain diversity (and strain-specific immunity) has not been explicitly included and thus explored. This is an interesting oversight in previous models, given the vast antigenic diversity of Plasmodium falciparum (the most common human malaria parasite) and its potential to "drive key differences in epidemiological features".

      The model also accounts for multiple infections, which is a key feature of malarial infections, with individuals often infected with either multiple Plasmodium species or multiple strains of the same species. Accounting for multiple infections is critical when considering resistance emergence, as with multiple infections there is within-host competition which will mediate the fitness of resistant genotypes. Overall, the model is an interesting combination of a classic epidemiological model (e.g., SIR) and a population genetics model.

      In terms of major model innovations, the model also directly links selection pressure via drug administration with local transmission dynamics. This is accomplished by the interaction between strain-specific immunity, generalized immunity, and host immune response.

      Weaknesses:<br /> In several places, the explanation of the results (i.e., why are we seeing this result?) is underdeveloped. For example, under the section "Response to drug policy change", it is stated that (according to the model) low diversity scenarios show the least decline in resistant genotype frequency after drug withdrawal; however, it is not explained how this result emerges mechanistically. Without an explicit connection to the workings of the model, it can be difficult to gauge whether the result(s) seen are specific to the model itself or likely to be more generalizable.

      The authors emphasize several model limitations, including the specification of resistance by a single locus (thus not addressing the importance of recombination should resistance be specified by more than one locus); the assumption that parasites are independently and randomly distributed among hosts (contrary to empirical evidence); and the assumption of a random association between the resistant genotype and antigenic diversity. However, each of these limitations is addressed in the discussion.

      Did the authors achieve their goals? Did the results support their conclusion?

      Returning to the questions posed by the authors:

      1. Does strain diversity modulate the equilibrium resistance frequency given different transmission intensities? Yes. The authors demonstrate a negative relationship between prevalence/strain diversity and resistance frequency (Figure 2).

      2. Does strain diversity modulate the equilibrium resistance frequency and its changes following drug withdrawal? Yes. The authors find that, under resistance invasion and some level of drug treatment, resistance frequency decreased with the number of strains (Figure 4). The authors also find that lower strain diversity results in a slower decline in resistant genotypes after drug withdrawal and higher equilibrium resistance frequency (Figure 6).

      3. Does the model explain biogeographic patterns of drug resistance evolution? Yes. The authors find that their full model (which includes strain-specific immunity) produces the empirically observed negative relationship between resistance and prevalence/strain diversity, while a model only incorporating generalised immunity does not (Figure 8).

      Utility of work to others and relevance within and beyond the field?<br /> This work is important because antimalarial drug resistance has been an ongoing issue of concern for much of the 20th century and now the 21st century. Further, this resistance emergence is not equitably distributed across biogeographic regions, with South America and Southeast Asia experiencing much of the burden of this resistance emergence. Not only can widespread resistant strains be traced back to these two relatively low-transmission regions, but these strains remain at high frequency even after drug treatment ceases.

    2. Reviewer #2 (Public Review):

      Summary:<br /> The evolution of resistance to antimalarial drugs follows a seemingly counterintuitive pattern, in which resistant strains typically originate in regions where malaria prevalence is relatively low. Previous investigations have suggested that frequent exposures in high-prevalence regions produce high levels of partial immunity in the host population, leading to subclinical infections that go untreated. These subclinical infections serve as refuges for sensitive strains, maintaining them in the population. Prior investigations have supported this hypothesis; however, many of them excluded important dynamics, and the results cannot be generalized. The authors have taken a novel approach using a deterministic model that includes both general and adaptive immunity. They find that high levels of population immunity produce refuges, maintaining the sensitive strains and allowing them to outcompete resistant strains. While general population immunity contributed, adaptive immunity is key to reproducing empirical patterns. These results are robust across a range of fitness costs, treatment rates, and resistance efficacies. They demonstrate that future investigations cannot overlook adaptive immunity and antigenic diversity.

      Strengths:<br /> Overall, this is a very nice paper that makes a significant contribution to the field. It is well-framed within the body of literature and achieves its goal of providing a generalizable, unifying explanation for otherwise disparate investigations. As such, this work will likely serve as a foundation for future investigations. The approach is elegant and rigorous, with results that are supported across a broad range of parameters.

      Weaknesses:<br /> Although the title states that the authors describe resistance invasion, they do not support or even explore this claim. As they state in the discussion (line 351), this work predicts the equilibrium state and doesn't address temporal patterns. While refuges in partially immune hosts may maintain resistance in a population, they do not account for the patterns of resistance spread, such as the rapid spread of chloroquine resistance in Africa once it was introduced from Asia.

      As the authors state in the discussion, the evolution of compensatory mutations that negate the cost of resistance is possible, and in vitro experiments have found evidence of such. It appears that their results are dependent on there being a cost, but the lower range of the cost parameter space was not explored.

      The use of a deterministic, compartmental model may be a structural weakness. This means that selection alone guides the fixation of new mutations on a semi-homogenous adaptive landscape. In reality, there are two severe bottlenecks in the transmission cycle of Plasmodium spp., introducing a substantial force of stochasticity via genetic drift. The well-mixed nature of this type of model is also likely to have affected the results. In reality, within-host selection is highly heterogeneous, strains are not found with equal frequency either in the population or within hosts, and there will be some linkage between the strain and a resistance mutation, at least at first. Of course, there is no recourse for that at this stage, but it is something that should be considered in future investigations.

      The authors mention the observation that patterns of resistance in high-prevalence Papua New Guinea seem to be more similar to Southeast Asia, perhaps because of the low strain diversity in Papua New Guinea. However, they do not investigate that parameter space here. If they did and were able to replicate that observation, not only would that strengthen this work, it could profoundly shape research to come.

    1. Reviewer #1 (Public Review):

      Summary<br /> This work contains 3 sections. The first section describes how protein domains with SQ motifs can increase the abundance of a lacZ reporter in yeast. The authors call this phenomenon autonomous protein expression-enhancing activity, and this finding is well supported. The authors show evidence that this increase in protein abundance and enzymatic activity is not due to changes in plasmid copy number or mRNA abundance, and that this phenomenon is not affected by mutants in translational quality control. It was not completely clear whether the increased protein abundance is due to increased translation or to increased protein stability.

      In section 2, the authors performed mutagenesis of three N-terminal domains to study how protein sequence changes protein stability and enzymatic activity of the fusions. These data are very interesting, but this section needs more interpretation. It is not clear if the effect is due to the number of S/T/Q/N amino acids or due to the number of phosphorylation sites.

      In section 3, the authors undertake an extensive computational analysis of amino acid runs in 27 species. Many aspects of this section are fascinating to an expert reader. They identify regions with poly-X tracks. These data were not normalized correctly: I think that a null expectation for how often poly-X tracks occur should be built for each species based on the underlying prevalence of amino acids in that species. As a result, I believe that the claim is not well supported by the data.

      Strengths<br /> This work is about an interesting topic and contains stimulating bioinformatics analysis. The first two sections, where the authors investigate how S/T/Q/N abundance modulates protein expression level, are well supported by the data. The bioinformatics analysis of Q abundance in ciliate proteomes is fascinating. There are some ciliates that have repurposed stop codons to code for Q. The authors find that in these proteomes, Q-runs are greatly expanded. They offer interesting speculations on how this expansion might impact protein function.

      Weakness<br /> At this time, the manuscript is disorganized and difficult to read. An expert in the field, who will not be distracted by the disorganization, will find some very interesting results included. In particular, the order of the introduction does not match the rest of the paper.

      In the first and second sections, where the authors investigate how S/T/Q/N abundance modulates protein expression levels, it is unclear if the effect is due to the number of phosphorylation sites or the number of S/T/Q/N residues. The authors also do not discuss if the N-end rule for protein stability applies to the lacZ reporter or the fusion proteins.

      The most interesting part of the paper is an exploration of S/T/Q/N-rich regions and other repetitive AA runs in 27 proteomes, particularly ciliates. However, this analysis is missing a critical control that makes it nearly impossible to evaluate the importance of the findings. The authors find the abundance of different amino acid runs in various proteomes. They also report the background abundance of each amino acid. They do not use this background abundance to normalize the runs of amino acids to create a null expectation from each proteome. For example, it has been clear for some time (Ruff, 2017; Ruff et al., 2016) that Drosophila contains a very high background of Q's in the proteome and it is necessary to control for this background abundance when finding runs of Q's. The authors could easily address this problem with the data and analysis they have already collected. However, at this time, without this normalization, I am hesitant to trust the lists of proteins with long runs of amino acid and the ensuing GO enrichment analysis.
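
      The normalization requested above can be sketched as follows. This is a minimal illustration, assuming an i.i.d. null model in which every position is the residue of interest with probability equal to its proteome-wide frequency; it is not the authors' pipeline or the method of Ruff et al., and all function names and the toy sequences are hypothetical. A shuffled-sequence null would be a reasonable alternative.

```python
def count_runs(sequence, residue, min_len):
    """Count maximal runs of `residue` that are at least `min_len` long."""
    runs, current = 0, 0
    for aa in sequence:
        if aa == residue:
            current += 1
        else:
            if current >= min_len:
                runs += 1
            current = 0
    if current >= min_len:  # run that extends to the end of the sequence
        runs += 1
    return runs


def expected_runs(length, freq, min_len):
    """Expected number of maximal runs >= min_len under the i.i.d. null.

    A qualifying run starts at position 0 with probability freq**min_len and
    at each later position with probability (1 - freq) * freq**min_len.
    """
    if length < min_len:
        return 0.0
    return freq ** min_len * (1 + (length - min_len) * (1 - freq))


def run_enrichment(proteome, residue, min_len):
    """Return (observed, expected) run counts for one proteome."""
    total = sum(len(s) for s in proteome)
    freq = sum(s.count(residue) for s in proteome) / total  # background
    observed = sum(count_runs(s, residue, min_len) for s in proteome)
    expected = sum(expected_runs(len(s), freq, min_len) for s in proteome)
    return observed, expected


# Toy two-protein "proteome", for illustration only.
toy_proteome = ["MQQQQQAS", "MASTNQKL"]
observed, expected = run_enrichment(toy_proteome, "Q", 4)
```

      Comparing the observed count with the expected count for each species (for example as a ratio) would show whether long Q runs exceed what the background Q frequency alone predicts, which is the control the review asks for in Q-rich proteomes such as Drosophila.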

      Ruff KM. 2017. Washington University in St.<br /> Ruff KM, Holehouse AS, Richardson MGO, Pappu RV. 2016. Proteomic and Biophysical Analysis of Polar Tracts. Biophys J 110:556a.

    2. Reviewer #2 (Public Review):

      Summary:<br /> This study seeks to understand the connection between protein sequence and function in disordered regions enriched in polar amino acids (specifically Q, N, S and T). While the authors suggest that specific motifs facilitate protein-enhancing activities, their findings are correlative, and the evidence is incomplete. Similarly, the authors propose that the re-assignment of stop codons to glutamine-encoding codons underlies the greater use of glutamine in a subset of ciliates, but again, the conclusions here are, at best, correlative. The authors perform extensive bioinformatic analysis, with detailed (albeit somewhat ad hoc) discussion on a number of proteins. Overall, the results presented here are interesting, but are unable to exclude competing hypotheses.

      Strengths:<br /> Following up on previous work, the authors wish to uncover a mechanism associated with poly-Q and SCD motifs explaining proposed protein expression-enhancing activities. They note that these motifs often occur in IDRs and hypothesize that structural plasticity could be capitalized upon as a mechanism of diversification in evolution. To investigate this further, they employ bioinformatics to investigate the sequence features of proteomes of 27 eukaryotes. They deepen their sequence space exploration uncovering sub-phylum-specific features associated with species in which a stop-codon substitution has occurred. The authors propose this stop-codon substitution underlies an expansion of poly-Q repeats and increased glutamine distribution.

      Weaknesses:<br /> The preprint provides extensive, detailed, and entirely unnecessary background information throughout, hampering reading and making it difficult to understand the ideas being proposed.<br /> The introduction provides a large amount of detailed background that appears entirely irrelevant for the paper. In many places there are detailed discussions of specific proteins that are likely of interest to the authors, yet without context these do not enhance the paper for the reader.

      The paper uses many unnecessary, new, or redefined acronyms, which makes reading difficult. As examples: (1) Prion forming domains (PFDs). Do the authors mean prion-like domains (PLDs), an established term with an empirical definition from the PLAAC algorithm? If yes, they should say this. If not, they must define what a prion-forming domain is formally. (2) SCD is already an acronym in the IDP field (meaning sequence charge decoration) - the authors should avoid this as their chosen acronym for Serine(S) / threonine (T)-glutamine (Q) cluster domains. Moreover, do we really need another acronym here? (We do not.) (3) Protein expression-enhancing (PEE) - just say expression-enhancing, there is no need for an acronym here.

      The results suggest autonomous protein expression-enhancing activities of regions of multiple proteins containing Q-rich and SCD motifs. Their definition of expression-enhancing activities is vague and the evidence they provide to support the claim is weak. While their previous work may support their claim with more evidence, it should be explained in more detail. The assay they choose is a fusion reporter measuring beta-galactosidase activity and tracking expression levels. Given the presented data they have shown that they can drive the expression of their reporters and that beta gal remains active, in addition to the increase in expression of fusion reporter during the stress response. They have not detailed what their control and mock treatment is, which makes complete understanding of their experimental approach difficult. Furthermore, their nuclear localization signal on the tag could be influencing the degradation kinetics or sequestering the reporter, leading to its accumulation and the appearance of enhanced expression. Their evidence refuting ubiquitin-mediated degradation does not have a convincing control.

      Based on the experimental results, the authors then go on to perform bioinformatic analysis of SCD proteins and polyX proteins. Unfortunately, there is no clear hypothesis for what is being tested; there is a vague sense of investigating polyX/SCD regions, but I did not find the connection between the first and second sections compelling (especially given polar-rich regions have been shown to engage in many different functions). As such, this bioinformatic analysis largely presents as many lists of percentages without any meaningful interpretation. The bioinformatics analysis lacks any kind of rigorous statistical tests, making it difficult to evaluate the conclusions drawn.

      The methods section is severely lacking. Specifically, many of the methods require the reader to read many other papers. While referencing prior work is of course, important, the authors should ensure the methods in this paper provide the details needed to allow a reader to evaluate the work being presented. As it stands, this is not the case.

      Overall, my major concern with this work is that the authors make two central claims in this paper (as per the Discussion).

      The authors claim that Q-rich motifs enhance protein expression. The implication here is that Q-rich motif IDRs are special, but this is not tested. As such, they cannot exclude the competing hypothesis ("N-terminal disordered regions enhance expression"). The authors also do not explore the possibility that this effect is in part/entirely driven by mRNA-level effects (see Verma et al., Nat Comms 2019). As such, while these observations are interesting, they feel preliminary and, in my opinion, cannot be used to draw hard conclusions on how N-terminal IDR sequence features influence protein expression. This does not mean the authors are necessarily wrong, but from the data presented here, I do not believe strong conclusions can be drawn.

      That re-assignment of stop codons to Q increases proteome-wide Q usage. I was unable to understand what result led the authors to this conclusion. My reading of the results is that a subset of ciliates has re-assigned UAA and UAG from the stop codon to Q. Those ciliates have more polyQ-containing proteins. However, they also have more polyN-containing proteins and proteins enriched in S/T-Q clusters. Surely if this were a stop-codon-dependent effect, we'd ONLY see an enhancement in Q-richness, not a corresponding enhancement in all polar-rich IDR frequencies? It seems the better working hypothesis is that free-floating ciliate proteomes are enriched in polar amino acids compared to sessile ciliates. Regardless, the absence of any kind of statistical analysis makes it hard to draw strong conclusions here.

    1. Reviewer #3 (Public Review):

      Summary:<br /> The paper "Unveiling the signaling network of FLT3-ITD AML improves drug sensitivity prediction" reports the combination of prior knowledge signaling networks, multiparametric cell-based data on the activation status of 14 crucial proteins emblematic of the cell state downstream of FLT3 obtained under a variety of perturbation conditions and Boolean logic modeling, to gain mechanistic insight into drug resistance in acute myeloid leukemia patients carrying the internal tandem duplication in the FLT3 receptor tyrosine kinase and predict drug combinations that may reverse pharmacoresistant phenotypes. Interestingly, the utility of the approach was validated in vitro, and also using mutational and expression data from 14 patients with FLT3-ITD positive acute myeloid leukemia to generate patient-specific Boolean models.

      Strengths:<br /> The model predictions were positively validated in vitro: it was predicted that the combined inhibition of JNK and FLT3 may reverse resistance to tyrosine kinase inhibitors, which was confirmed in an appropriate FLT3 cell model by comparing the effects on apoptosis and proliferation of a JNK inhibitor and midostaurin vs. midostaurin alone.

      Whereas the study does have some complexity, readability is enhanced by the inclusion of a section that summarizes the study design, plus a summary figure. Availability of data as supplementary material is also a high point.

      Weaknesses:<br /> Some aspects of the methodology are not properly described (for instance, no methodological description has been provided regarding the clustering procedure that led to Figs. 2C and 2D).

      It is not clear in the manuscript whether the patients gave their consent to the use of their data in this study, or whether the study was approved by an ethics committee. These are very important points that should be made explicit in the main text of the paper.

      The authors claim that some of the predictions of their models were later confirmed in the follow-up of some of the 14 patients, but it is not crystal clear whether the models helped the physicians to make any decisions on tailored therapeutic interventions, or if this has been just a retrospective exercise and the predictions of the models coincide with (some of) the clinical observations in a rather limited group of patients. Since the paper presents this as additional validation of the models' ability to guide personalized treatment decisions, it would be very important to clarify this point and expand the presentation of the results (comparison of observations vs. model predictions).

    2. Reviewer #1 (Public Review):

      The authors deploy a combination of their own previously developed computational methods and databases (SIGNOR and CellNOptR) to model the FLT3 signaling landscape in AML and identify synergistic drug combinations that may overcome the resistance of AML cells harboring ITD mutations in the TKI domain of FLT3 to FLT3 inhibitors. I did not closely evaluate the details of these computational models since they are outside of my area of expertise and have been previously published. The manuscript has significant issues with data interpretation and clarity, as detailed below, which, in my view, call into question the main conclusions of the paper.

      The authors train the model by including perturbation data where TKI-resistant and TKI-sensitive cells are treated with various inhibitors and the activity (i.e. phosphorylation levels) of the key downstream nodes are evaluated. Specifically, in the Results section (p. 6) they state "TKIs sensitive and resistant cells were subjected to 16 experimental conditions, including TNFa and IGF1 stimulation, the presence or absence of the FLT3 inhibitor, midostaurin, and in combination with six small-molecule inhibitors targeting crucial kinases in our PKN (p38, JNK, PI3K, mTOR, MEK1/2 and GSK3)". I would appreciate more details on which specific inhibitors and concentrations were used for this experiment. More importantly, I was very puzzled by the fact that this training dataset appears to contain, among other conditions, the combination of midostaurin with JNK inhibition, i.e. the very combination of drugs that the authors later present as being predicted by their model to have a synergistic effect. Unless my interpretation of this is incorrect, it appears to be a "self-fulfilling prophecy", i.e. an inappropriate use of the same data in training and verification/test datasets.
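
      The overlap concern above can be checked mechanically. The following is a minimal sketch, assuming the training conditions and the model-predicted combinations are available as lists of drug names; the function name and the example lists are hypothetical illustrations, not taken from the manuscript.

```python
def leaked_conditions(training_conditions, predicted_combinations):
    """Return predicted drug combinations that already appear in the training set.

    Each condition is an iterable of drug names; order within a condition is ignored.
    """
    train = {frozenset(condition) for condition in training_conditions}
    return [combo for combo in predicted_combinations if frozenset(combo) in train]


# Hypothetical condition lists, for illustration only.
training = [
    ("midostaurin",),
    ("midostaurin", "JNK inhibitor"),
    ("midostaurin", "MEK1/2 inhibitor"),
]
predicted = [("JNK inhibitor", "midostaurin")]

overlap = leaked_conditions(training, predicted)
print(overlap)
```

      A non-empty result would indicate that a "predicted" combination was already among the conditions used to fit the model, so its experimental confirmation could not count as independent validation.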

      My most significant criticism is that the proof-of-principle experiment evaluating the combination effects of midostaurin and SP600125 in FLT3-ITD-TKD cell line model does not appear to show any synergism, in my view. The authors' interpretation of the data is that the addition of SP600125 to midostaurin rescues midostaurin resistance and results in increased apoptosis and decreased viability of the midostaurin-resistant cells. Indeed, they write on p.9: "Strikingly, the combined treatment of JNK inhibitor (SP600125) and midostaurin (PKC412) significantly increased the percentage of FLT3ITD-TKD cells in apoptosis (Fig. 4D). Consistently, in these experimental conditions, we observed a significant reduction of proliferating FLT3ITD- TKD cells versus cells treated with midostaurin alone (Fig. 4E)." However, looking at Figs 4D and 4E, it appears that the effects of the midostaurin/SP600125 combination are virtually identical to SP600125 alone, and midostaurin provides no additional benefit. No p-values are provided to compare midostaurin+SP600125 to SP600125 alone but there seems to be no appreciable difference between the two by eye. In addition, the evaluation of synergism (versus additive effects) requires the use of specialized mathematical models (see for example Duarte and Vale, 2022). That said, I do not appreciate even an additive effect of midostaurin combined with SP600125 in the data presented.
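
      As an illustration of what a formal synergy assessment involves, the sketch below uses Bliss independence, one common reference model; it is not necessarily the approach of the reference cited above. The effect values are invented placeholders, not read from Figs 4D or 4E.

```python
def bliss_expected(effect_a, effect_b):
    """Expected combined fractional effect (0..1) if two drugs act independently."""
    for value in (effect_a, effect_b):
        if not 0.0 <= value <= 1.0:
            raise ValueError("effects must be fractions between 0 and 1")
    return effect_a + effect_b - effect_a * effect_b


def bliss_excess(effect_a, effect_b, effect_combo):
    """Observed minus expected effect: > 0 suggests synergy, < 0 antagonism."""
    return effect_combo - bliss_expected(effect_a, effect_b)


# Hypothetical fractions of apoptotic cells, for illustration only.
midostaurin_alone = 0.10
sp600125_alone = 0.50
combination = 0.52

excess = bliss_excess(midostaurin_alone, sp600125_alone, combination)
print(f"Bliss excess: {excess:+.2f}")
```

      With placeholder numbers like these, where the combination is barely distinguishable from the JNK inhibitor alone, the Bliss excess is negative, so the data would not support a claim of synergy or even of additivity.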

      In my view, there are significant issues with clarity and detail throughout the manuscript. For example, additional details and improved clarity are needed, in my view, with respect to the design and readouts of the signaling perturbation experiments (Methods, p. 15 and Fig 2B legend). For example, the Fig 2B legend states: "Schematic representation of the experimental design: FLT3 ITD-JMD and FLT3 ITD-JMD cells were cultured in starvation medium (w/o FBS) overnight and treated with selected kinase inhibitors for 90 minutes and IGF1 and TNFa for 10 minutes. Control cells are starved and treated with PKC412 for 90 minutes, while "untreated" cells are treated with IGF1 100ng/ml and TNFa 10ng/ml with PKC412 for 90 minutes.", which does not make sense to me. The "untreated" cells appear to be treated with more agents than the control cells. The logic behind cytokine stimulation is not adequately explained and it is not entirely clear to me whether the cytokines were used alone or in combination. Fig 2B is quite confusing overall, and it is not clear to me what the horizontal axis (i.e. columns of "experimental conditions", as opposed to "treatments") represents. The Method section states "Key cell signaling players were analyzed through the X-Map Luminex technology: we measured the analytes included in the MILLIPLEX assays" but the identities of the evaluated proteins are not given in the Methods. At the same time, the Results section states "TKIs sensitive and resistant cells were subjected to 16 experimental conditions" but these conditions do not appear to be listed (except in Supplementary data; and Fig 2B lists 9 conditions, not 16). In my subjective view, the manuscript would benefit from a clearer explanation and depiction of the experimental details and inhibitors used in the main text of the paper, as opposed to various Supplemental files/figures. 
      The lack of clarity on what exactly the experimental conditions were makes the interpretation of Fig 2 very challenging. In the same vein, in the PCA analysis (Fig 2C) there seems to be no reference to the cytokine stimulation status while the authors claim that PC2 stratifies cells according to IGF1 vs TNFalpha. There are numerous other examples of incomplete or confusing legends and descriptions which, in my view, need to be addressed to make the paper more accessible.

      I am not sure that I see significant value in the patient-specific logic models because they are not supported by empirical evidence. Treating primary cells from AML patients with relevant drug combinations would be a feasible and convincing way to validate the computational models and evaluate their potential benefit in the clinical setting.

    3. Reviewer #2 (Public Review):

      Summary:<br /> This manuscript by Latini et al describes a methodology to develop Boolean-based predictive logic models that can be applied to uncover altered protein/signalling networks in cancer cells and discover potential new therapeutic targets. As a proof-of-concept, they have implemented their strategy on a hematopoietic cell line engineered to express one of two types of FLT3 internal tandem mutations (FLT3-ITD) found in patients, FLT3-ITD-TKD (which are less sensitive to tyrosine kinase inhibitors/TKIs) and FLT3-ITD-JMD (which are more sensitive to TKIs).

      Strengths:<br /> This useful work could potentially represent a step forward towards personalised targeted therapy, by describing a methodology using Boolean-based predictive logic models to uncover altered protein/signalling networks within cancer cells. However, the weaknesses highlighted below severely limit the extent of any conclusions that can be drawn from the results.

      Weaknesses:<br /> While the highly theoretical approach proposed by the authors is interesting, the potential relevance of their overall conclusions is severely undermined by a lack of validation of their predicted results in real-world data. Their predictive logic models are built upon a set of poorly-explained initial conditions, drawn from data generated in vitro from an engineered cell line, and no attempt was made to validate the predictions in independent settings. This is compounded by a lack of sufficient experimental detail or clear explanations at different steps. These concerns considerably temper one's enthusiasm about the conclusions that could be drawn from the manuscript. Some specific concerns include:

      1. It remains unclear how robust the logic models are, or conversely, how affected they might be by specific initial conditions or priors that are chosen. The authors fail to explain the rationale underlying their input conditions at various points. For example:<br /> - at the start of the manuscript, they assert that they begin with a pre-PKN that contains "76 nodes and 193 edges", though this is then ostensibly refined with additional new edges (as outlined in Fig 2A). However, neither the rationale for adding these edges nor model performance comparisons against the basal model are presented, precluding an evaluation of whether this model is better.

      - At a later step (relevant to Fig S4 and Fig 3), they develop separate PKNs, for each of the mutation models, that contain "206 [or] 208 nodes" and "756 [or] 782 edges", without explaining how these seemingly arbitrary initial conditions were arrived at. Their relation to the original parameters in the previous model is also not investigated, raising concerns about model over-fitting and calling into question the general applicability of their proposed approach. The authors need to provide a clearer explanation of the logic underlying some of these initial parameter selections, and also investigate the biological/functional overlap between these sets of genes (nodes).

      2. There is concern about the experimental data underpinning the models that were generated, further compounded by the lack of a clear explanation of the logic. For example, data concerning the status of signalling changes as a result of perturbation appears to be generated from multiplex LUMINEX assays using phosphorylation-specific antibodies against just 14 "sentinel" proteins. However, very little detail is provided about the rationale underlying how these 14 were chosen to be "sentinels" (and why not just 13, or 15, or any other number, for that matter?). How reliable are the antibodies used to query the phosphorylation status? What are the signal thresholds and linear ranges for these assays, and how would these impact the performance/reliability of the logic models that are generated from them?

      In addition, there are publicly available quantitative proteomics datasets from FLT3-mutant cell lines and primary samples treated with TKIs. At the very least, these should have been used by the authors to independently validate their models, selection of initial parameters, and signal performance of their antibody-based assays, to name a few unvalidated, yet critical, parameters.

      3. There is an overwhelming reliance on theoretical predictions without taking advantage of real-world validation of their findings. For example, the authors identified a set of primary AML samples with relevant mutations (Fig 5) that could potentially have provided a valuable experimental validation platform for their predictions of effective drug combination. Yet, they have performed Boolean simulations of the predicted effects, a perplexing instance of adding theoretical predictions on top of a theoretical prediction!

      Additionally, there are datasets of drug sensitivity on primary AML samples where mutational data is also known (for example, from the BEAT-AML consortia), that could be queried for independent validation of the authors' models.

      4. There are additional examples of insufficient experimental detail that preclude a fuller appreciation of the relevance of the work. For example, it is alluded that RNA-sequencing was performed on a subset of patients, but the entire methodological section detailing the RNA-seq amounts to just 3 lines! It is unclear which samples were selected for sequencing or where the data has been deposited (or might be available for the community - there are resources for restricted/controlled access to deidentified genomics/transcriptomics data).

      Similarly, in the "combinatory treatment inference" methods, it states "...we computed the steady state of each cell line best model....." and "Then we inferred the activity of "apoptosis" and "proliferation" phenotypes", without explaining the details of how these were done. The outcomes of these methods are directly relevant to Fig 4, but with such sparse methodological detail, it is difficult to independently assess the validity of the presented data.
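
      For readers unfamiliar with the kind of computation that would need to be documented here, the following is a minimal sketch of finding a steady state of a Boolean network by synchronous updating. The four-node wiring is an invented toy example, not the authors' model, and the update scheme is an assumption, since the manuscript does not specify one.

```python
def steady_state(rules, state, max_steps=100):
    """Apply synchronous Boolean updates until the state stops changing.

    rules maps each node to a function of the current state dict.
    Returns the fixed-point state, or None if none is reached (e.g. oscillation).
    """
    for _ in range(max_steps):
        new_state = {node: bool(rule(state)) for node, rule in rules.items()}
        if new_state == state:
            return new_state
        state = new_state
    return None


# Toy wiring, hypothetical and for illustration only.
rules = {
    "FLT3": lambda s: s["FLT3"],           # input node keeps its value
    "JNK": lambda s: s["FLT3"],            # activated by FLT3
    "apoptosis": lambda s: not s["JNK"],   # repressed by JNK
    "proliferation": lambda s: s["JNK"],   # promoted by JNK
}
start = {"FLT3": True, "JNK": False, "apoptosis": False, "proliferation": False}

untreated = steady_state(rules, start)

# Simulate an inhibitor by clamping its target node to False.
inhibited_rules = dict(rules, JNK=lambda s: False)
treated = steady_state(inhibited_rules, start)
print(untreated, treated)
```

      Reporting the update scheme, the initial states, and how phenotype nodes such as "apoptosis" are read out from the steady state is the level of detail needed to assess the results in Fig 4 independently.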

      Overall, the theoretical nature of the work is hampered by a lack of real-world validation, and insufficient methodological details limit a fuller appreciation of the overall relevance of this work.

    1. Joint Public Review:

      Lujan et al make a significant contribution to the field by elucidating the essential role of TGN46 in cargo sorting and soluble protein secretion. TGN46 is a prominent TGN protein that cycles to the plasma membrane and it has been used as a TGN marker for many years, but its function has been a fundamental mystery.

      In parallel, it remains unclear how most secreted proteins are targeted from the Golgi to the cell surface. These molecules do not contain conserved sequence motifs or post-translational modifications such as those found on lysosomal hydrolases. Cargo receptors for these secreted proteins have remained elusive.

      Therefore, these investigations are likely to have a significant influence on the field.

      To gain insight into the molecular role of TGN46 in sorting, they systematically test the impact of the luminal, transmembrane, and cytosolic domains. Importantly, and against current thinking, they demonstrate that the luminal domain of TGN46 facilitates sorting. Interestingly, neither the cytosolic domain nor the length of the transmembrane domain of TGN46 plays a role in cargo export. The effects of TGN46 depletion are specific, as membrane-associated VSVG remains unaffected.

      Interestingly, the TGN46 luminal domain also plays an important role in the intracellular and intra-Golgi localization of TGN46, and it contains a positive signal for Golgi export in CARTS. Rigorous, well-performed experiments support these conclusions.

      A speculative part of the manuscript, with some accompanying experimental data, proposes that the luminal domain of TGN46 forms biomolecular condensates that help to capture cargo proteins for export.

      One important point to discuss is that the effects of TGN46 KO are partial, suggesting that TGN46 stimulates the Golgi export of PAUF but is not essential for this process. The incomplete block is apparent in Fig 1 and in Fig 5D.

    1. Reviewer #1 (Public Review):

      The manuscript by Lin et al describes a wide biophysical survey of the molecular mechanisms underlying full length BTK regulation. This is a continuation of this lab's excellent work on deciphering the myriad levels of regulation of BTKs downstream of their activation by plasma membrane localised receptors.

      The manuscript uses a synergy of cryo EM, HDX-MS and mutational analysis to delve into how the accessory domains modify the activity of the kinase domain. The manuscript essentially has three main novel insights into BTK regulation.

      1. Cryo EM and SAXS show that the PHTH region is dynamic compared to the conserved Src module.<br /> 2. A 2nd generation tethered PH-kinase construct crystal of BTK reveals a unique orientation of the PH domain relative to the kinase domain that is different from previous structures.<br /> 3. A new structure of the kinase domain dimer shows how trans-phosphorylation can be achieved.

      Excitingly, this structural work allows for the generation of a model of how BTK can act as a strict coincidence sensor for both the activated BCR complex and PIP3 before it obtains full activity. To my eye the most exciting result of this work is the description of how the PH domain can inhibit activity once the SH3/SH2 domain is disengaged, allowing for an additional level of regulatory control.

      I have very few experimental concerns as the methods and figures are well described and clear. The authors appear to be saying that the previously solved PH domain-kinase interface is but one of many possible inhibitory conformations that can be adopted.

    2. Reviewer #2 (Public Review):

      In this study, multiple biophysical techniques were employed to investigate the activation mechanism of BTK, a multi-domain non-receptor protein kinase. Previous studies have elucidated the inhibitory effects of the SH3 and SH2 domains on the kinase and the potential activation mechanism involving the membrane-bound PIP3 inducing transient dimerization of the PH-TH domain, which binds to lipids.

      The primary focus of the present study was on three new constructs: a full-length BTK construct, a construct where the PH-TH domain is connected to the kinase domain, and a construct featuring a kinase domain with a phosphomimetic at the autophosphorylation site Y551. The authors aimed to provide new insights into the autoinhibition and allosteric control of BTK.

      The study reports that SAXS analysis of the full-length BTK protein construct, along with cryoEM visualization of the PH-TH domain, supports a model in which the N-terminal PH-TH domain exists in a conformational ensemble surrounding a compact/autoinhibited SH3-SH2-kinase core. This finding is interesting because it contradicts previous models proposing that each globular domain is tightly packed within the core.

      Furthermore, the authors present a model for an inhibitory interaction between the N-lobe of the kinase and the PH-TH domain. This model is based on a study using a tethered complex with a longer tether than a previously reported construct where the PH-TH domain was tightly attached to the kinase domain (ref 5). The authors argue that the new structure is relevant. However, this assertion requires further explanation and discussion, particularly considering that the functional assays used to assess the impact of mutating residues within the PH-TH/kinase domain contradict the results of the previous study (ref 5).

      Additionally, the study presents the structure of the kinase domain with swapped activation loops in a dimeric form, representing a previously unseen structure along the trans-phosphorylation pathway. This structure holds potential relevance. To better understand its significance, employing a structure/function approach like the one described for the PH-TH/kinase domain interface would be beneficial.

      Overall, this study contributes to our understanding of the activation mechanism of BTK and sheds light on the autoinhibition and allosteric control of this protein kinase. It presents new structural insights and proposes novel models that challenge previous understandings.

    3. Reviewer #3 (Public Review):

      Yin-wei Lin et al set out to visualize the inactive conformation of full length Bruton's Tyrosine Kinase (BTK), a molecule that has evaded high resolution structural studies in its full length form to this date. An open question in the field is how the Pleckstrin Homology-Tec Homology (PHTH) domain inhibits BTK activity, with multiple competing models in the field. The authors used a complementary set of biophysical techniques combined with well thought out stabilizing mutations to obtain structural insights into BTK regulation in its full length form. They were able to crystallize the full length construct of BTK, but unfortunately the PHTH was not resolved, yielding a structure similar to those previously obtained in the field. The investigation of the same construct by SAXS yielded an elongated structural model, consistent with previous SAXS studies. Using cryo-EM the authors obtained a low resolution model for the FL BTK with a loosely connected density assigned to the dynamic PHTH around the compact SH2-SH3-Kinase Domain (KD) core. To gain further molecular insights into PHTH-KD interactions the authors followed a previously reported strategy and generated a fusion of PHTH-KD with a longer linker, yielding a crystal structure with a novel PHTH-KD interface which they tested in biochemical assays. Lastly, Yin-wei Lin et al crystallized the BTK KD in a novel partially active state in a "face to face" dimer in which the kinases exchange activation loops that, although partially disordered, are theoretically perfectly positioned for trans-phosphorylation. Overall this represents a valiant effort to gain molecular insights into what clearly is a dynamic regulatory motif on BTK and is a valuable addition to the field.

      I think the authors addressed all the comments that I had during the initial round of review. The only thing I can think of that would strengthen the paper is to add a supplemental figure/table with the results of unbiased SITUS fitting rather than just saying that it is close to manual fitting. Additionally, SITUS outputs not just one best solution but all the top fits, and a significant difference in cross correlation between the best fit and the second best fit is usually indicative of a true fit. As the authors already ran SITUS and colores they have this data, and I think a supplemental table with cross correlations for the top 3 fits for each of their maps would make their EM fitting more convincing and would not be hard to produce.

      Lastly, it seems like both the authors and I agree that the cryoEM reconstructions do not correspond to the resolutions reported by the FSC. This point in no way changes any of the conclusions of the paper; however, I can't help but feel guilty that some student who is not in the field will look at these EM maps in the future and think that this is what 7 Å reconstructions should look like. If the authors could add a sentence, maybe somewhere in the methods, indicating that the FSC curves may be overly optimistic and that the secondary structure features expected at these resolutions are not present, that would be great.

    1. Reviewer #1 (Public Review):

      Summary:<br /> This study is valuable because it suggests that changes in glycan metabolism in chondrocytes are involved in the initiation of cartilage degeneration and early OA via hypertrophic differentiation of chondrocytes, a finding that may lead to the discovery of future OA markers. However, more robust results would be obtained by analyzing the mechanisms and pathways by which changes in glycosylation lead to cartilage degeneration.

      Strengths:<br /> This study is important because it indicates that glycan metabolism may be associated with pre-OA and may lead to the elucidation of the cause and diagnosis of pre-OA.

      Weaknesses:<br /> More robust results would be obtained by analyzing the mechanism by which cartilage degeneration induced by changes in glycometabolism occurs.

    2. Reviewer #2 (Public Review):

      Summary:<br /> This paper consists of mostly descriptive data, judged from alpha-mannosidase-treated samples, in which they found an increase in core fucose, a product of Fut 8.

      Strengths:<br /> This paper is interesting in the clinical field, but unfortunately, the data is mostly descriptive and does not have a significant impact on the scientific community in general.

      Weaknesses:<br /> If core fucose is increased, at least the target glycan molecules of core fucose should be evaluated. They also found an increase in NO, suggesting that inflammatory processes also play an important role in OA in addition to glycan changes.<br /> It has already been reported that core fucose is decreased by administration of alpha-mannosidase inhibitors. Therefore, it is expected that alpha-mannosidase administration increases core fucose.

    3. Reviewer #3 (Public Review):

      Summary:<br /> In the manuscript "Articular cartilage corefucosylation regulates tissue resilience in osteoarthritis", the authors investigate the glycan structural changes in the context of pre-OA conditions. By mainly conducting animal experiments and glycomic analysis, this study clarified the molecular mechanism of N-glycan core fucosylation and Fut8 expression in the extracellular matrix resilience and unrecoverable cartilage degeneration. Lastly, a comprehensive glycan analysis of human OA cartilage verified the hypothesis.

      Strengths:<br /> Generally, this manuscript is well structured with rigorous logic and clear language. This study is valuable and important in the early diagnosis of OA patients in the clinic, which is a great challenge nowadays.

      Weaknesses:<br /> I recommend minor revisions:

      1. I would suggest the authors prepare an illustrative scheme for the whole study, to explain the complex mechanism and also to summarize the results.

      2. The text in several figures, including but not limited to Figures 2A-C, Figures 3A and C, Figure 4B, and Figures 5A and D, is too small to read. I would suggest the authors remake these images.

      3. The paper is generally readable, but the language could be polished a bit. Several writing errors should be caught during a careful check.

      4. As several species and OA models were used in this study, it would be better if the authors could explain the reasons behind these choices.

    1. Reviewer #1 (Public Review):

      Summary:<br /> This paper describes the dynamics of the nuclear substructure called PML Nucleolar Association (PNA) in response to DNA damage on ribosomal DNA (rDNA) repeats. The authors showed that the PNA with rDNA repeats is induced by the inhibition of topoisomerases and RNA polymerase I and that the PNA formation is modulated by RAD51, and thus by homologous recombination. Artificially induced DNA double-strand breaks (DSBs) in rDNA repeats stimulate the formation of PNA with DSB markers. This DSB-triggered PNA formation is regulated by DSB repair pathways.

      Strengths:<br /> This paper illustrates a unique DNA damage-induced sub-nuclear structure containing the PML body, which is specifically associated with the nucleolus. Moreover, the dynamics of this PML Nucleolar Association (PNA) require topoisomerases and RNA polymerase I and are modulated by RAD51-mediated homologous recombination and non-homologous end-joining. This study reveals a unique mode of regulation of DSB repair at rDNA repeats associated with this membrane-less subnuclear structure.

      Weaknesses:<br /> Although the PNA formation on rDNA repeats is nicely shown by cytological analysis, the biological significance of PNA in DSB repair is not fully addressed.

    2. Reviewer #2 (Public Review):

      In this manuscript, the authors aim to study the PML-nucleolar associations (PNAs) induced by different genotoxic stresses and to determine the underlying molecular mechanisms.

      First, using a diverse set of genotoxic stress conditions (affecting topoisomerases, RNA Pol I, rRNA processing, and DNA replication), the authors found that the inhibition of topoisomerases and RNA Polymerase I produces the highest PNA formation, associated with p53 stabilization, gamma-H2AX, and PAF49 segregation. It was further demonstrated that the Rad51-mediated HR pathway, but not the NHEJ pathway, is associated with the PNA formation. Immuno-FISH assays show that doxorubicin induces DSBs (53BP1 foci) in rDNA and PNA interactions with rDNA/DJ regions. Furthermore, the endonuclease I-PpoI induced DSBs at a defined location in rDNA and led to PNAs.

      Most claims by the authors are supported by the data provided. However, the weaknesses/concerns below may need to be addressed to improve the quality of the study.

      1) The Top2B toxin doxorubicin had the highest degree of elevating PNAs; however, Top2B knockdown had almost no noticeable effects on PNAs. How can the different phenotypes from targeting Top2B be reconciled?

      2) To test the role of Rad51 and DNA-PKcs in the PNA formation, the Rad51 inhibitor B02 and the DNA-PKcs inhibitor NU-7441 were used in the study. To further exclude possible off-target effects of B02 and NU-7441, siRNA-mediated knockdown of Rad51 and DNA-PKcs would be an appropriate complementary approach to the pharmacological inhibitor approach.

      3) Several previous studies have shown the activation of the nucleolar ATM-mediated DNA damage response pathway by I-PpoI-induced DSBs in rDNA. What is the role of nucleolar ATM in the regulation of PNAs?

    3. Reviewer #3 (Public Review):

      Summary:<br /> Hornofova et al examined interactions between the nucleolus and promyelocytic leukemia nuclear bodies (PML-NBs) termed PML-nucleolar associations (PNAs). PNAs are found in a minor subset of cells, exist within distinct morphological subcategories, and are induced by cellular stressors including genotoxic damage. A systematic pharmacological investigation identified that compounds that inhibit RNA Polymerase 1 (RNAPI) and/or topoisomerase 1 or 2A caused the greatest proportion of cells with PNAs. A specific RAD51 inhibitor (B02) impacted the number of cells exhibiting PNAs and PNA morphology. Genetic double-strand break (DSB) induction within the rDNA locus also induced PNA structures that were more prevalent when non-homologous end joining (NHEJ) was inhibited.

      Strengths:<br /> PNAs are morphologically distinct and readily visualized. The imaging data are high quality, and rDNA is amenable to studying nuclear dynamics. Specific induction of rDNA damage is a strong addition to the non-specific pharmacological damage characterized early in the manuscript. These data nicely demonstrate that rDNA double-strand breaks underlie PNA formation. Figure 1 is a comprehensive examination and presents a compelling argument that RNAPI and/or TOP1, TOP2A inhibition promote PNA structures.

      Weaknesses:<br /> The data are limited to fixed fluorescent microscopy of structures present in a minority of cells. Data are occasionally qualitative and/or based upon interpretation of dynamic events extrapolated from fixed imaging. This study would benefit from live imaging that captures PNA dynamics.

      Cell cycle and cell division are not considered. Double-strand break repair is cell cycle dependent, and most experiments occur over days of treatment and recovery. It is unclear if the cultures are proliferating, or which cell cycle phase the cells are in at the time of analysis. It is also unclear if PNAs are repeatedly dissociating and reforming each cell division.

      The relationship of PNA morphologies (bowl, funnel, balloon, and PML-NDS) also remains unclear. It is possible that PNAs mature/progress through the distinct morphologies, and that morphological presentation is a readout of repair or damage in the rDNA locus. However, this is not formally addressed.

      An I-PpoI-targeted sequence within the rDNA locus suggests 3D structural rearrangement following damage. An orthogonal approach measuring rDNA 3D architecture would benefit comprehension. Following I-PpoI induction, it is possible that cells arrest in a G1 state. This may explain why targeting NHEJ has a greater impact on the number of 53BP1 foci and should be investigated.

      Conclusions: PNAs are a phenomenon of biological significance and understanding that significance is of value. More work is required to advance knowledge in this area. The authors may wish to examine the literature on APBs (Alt-associated PML-NBs), which are similar structures where telomeres associate with PML-NBs in a specific subset of cancers. It is possible that APBs and PNAs share similar biology, and prior efforts on APBs may help guide future PNA studies.

    1. Reviewer #1 (Public Review):

      In this study, Chen et al. used super-resolution microscopy on T47D cells to investigate the cell surface distribution of hGHR and hPRLR in steady-state and in response to ligand stimulation. The initial findings of this study suggest both PRL and GH stimulation lead to a decrease in GH receptors but an increase in the PRLR on the cell surface. A subset of both receptors co-localize in close proximity and may form heteromers. Moreover, the study revealed that the box 1 region in GHR plays an essential role in the regulation of its interaction with the PRLR, and the box 1 region in the PRLR is involved in the PRL-induced downregulation of the GHR. The most innovative aspect of this study is the super-resolution microscopy methodology that permits the analysis of proteins on the level of single molecules, and another notable advance is the generation of T47D cells that lack the PRLR and GHR. The questions after reading this manuscript are what novel insights have been gained that significantly go beyond what was already known about the interaction of these receptors and, more importantly, what are the physiological implications of these findings? The proposed significance of the results in the last paragraph of the Discussion section is speculative since none of the receptor interactions have been investigated in TNBC cell lines. Moreover, no physiological experiments were conducted using the PRLR and GHR knockout T47D cells to provide biological relevance for the receptor heteromers. The proposed role of JAK2 in the cell surface distribution and association of both receptors as stated in the title was only derived from the analysis of box 1 domain receptor mutants. A knockout of JAK2 was not conducted to assess heteromer formation.

      There are additional points that require the authors' attention:

      1. Except for some investigation of γ2A-JAK2 cells, most of the experiments in this study were conducted on a single breast cancer cell line. In terms of rigor and reproducibility, this is somewhat borderline. The CRISPR/Cas9 mutant T47D cells were not used for rescue experiments with the corresponding full-length receptors and the box1 mutants. A missed opportunity is the lack of an investigation correlating the number of receptors with physiological changes upon ligand stimulation (e.g., cellular clustering, proliferation, downstream signaling strength).

      2. An obvious shortcoming of the study that was not discussed seems to be that the main methodology used in this study (super-resolution microscopy) does not distinguish the presence of various isoforms of the PRLR on the cell surface. Is it possible that the ligand stimulation changes the ratio between different isoforms? Which isoforms besides the long form may be involved in heteromer formation, presumably all that can bind JAK2?

      3. Changes in the ligand-inducible activation of JAK2 and STAT5 were not investigated in the T47D knockout models for the PRLR and GHR. Another missed opportunity is the use of super-resolution microscopy to validate the knockouts at the single-cell level and to assess how each knockout affects the distribution of the other receptor, which is still expressed.

      4. Why does the binding of PRL not cause a similar decrease (internalization and downregulation) of the PRLR, and instead, an increase in cell surface localization? This seems to be contrary to previous observations in MCF-7 cells (J Biol Chem. 2005 October 7; 280(40): 33909-33916).

      5. Some figures and illustrations are of poor quality and were put together without paying attention to detail. For example, in Fig 5A, the GHR was cut off, possibly to omit other nonspecific bands, and the WB images look 'washed out'. In Figs 5B and 5D, the labels are not in one line over the bars, and what is the point of showing all individual data points when the bar graphs with all annotations and SD lines are disappearing? As done for the γ2A cells, the illustrations in 5B-5E should indicate what cell lines were used. There are no loading controls in Fig 5F; is there any protein in the first lane? There are also no loading controls in Figs 6B and 6H.

      6. The proximity ligation method was not described in the M&M section of the manuscript.

    2. Reviewer #2 (Public Review):

      Summary:<br /> Chen Chen et al. investigated the interaction between GHR and PRLR at the cell surface using STORM-type super-resolution microscopy, proximity ligation assay, and mutagenesis. They found that GH and PRL change the surface expression of GHR and PRLR. Upon stimulation, the hGHR cluster size significantly increases in a transient manner, whereas changes in hPRLR occur more slowly. In their previous publication, the authors found that hGHR and hPRLR co-immunoprecipitate in the absence of ligands. Based on that finding and the observations here, the authors examined colocalization of hGHR and hPRLR in clusters with proximity ligation assays and found that the receptors form complexes on the surface of T47D cells, and that these complexes respond differently to the ligands. Remarkably, the experiments in cells lacking either hGHR or hPRLR showed that PRLR is necessary for the reduction of surface hGHR induced by PRL. Studies with truncation or deletion mutants of hPRLR suggest the box 1 region in hPRLR plays a critical role in stabilizing the hGHR-hPRLR complexes. This region contains the JAK2 binding site, and the authors show that binding of JAK2 to hGHR is also required for hPRLR-mediated regulation of hGHR surface expression. Cytokine receptors have important, broad-ranging roles in regulating cellular and physiological processes. Therefore, the new findings described here will significantly expand our understanding of the structure-function relationship that drives a core signalling mechanism in cell biology.

      Strengths:<br /> I particularly appreciate that the authors used different angles to examine the mechanism of GHR-PRLR interaction and that they also checked the conclusions with CRISPR/Cas9 technology and with a cellular reconstitution system.

      Weaknesses:<br /> I could not fully evaluate some of the data, mainly because several details on acquisition and analysis are lacking. It would be useful to know what the background signal was in dSTORM and how the authors distinguished the specific signal from unspecific background fluorescence, which can be quite prominent in these experiments. Typically, one would evaluate the signal coming from antibodies randomly bound to a substrate around the cells to determine the switching properties of the dyes in their buffer and the average number of localisations representing one antibody. This would help evaluate if GHR or PRLR appeared as monomers or multimers in the plasma membrane before stimulation, which is currently a matter of debate. It would also provide better support for the model proposed in Figure 8. Since many of the findings in this work come from the evaluation of localisation clusters, an image showing actual localisations would help support the main conclusions. I believe that the dSTORM images in Figures 1 and 2 are density maps, although this was not explicitly stated. Alexa 568 and Alexa 647 typically give a very different number of localisations, and this is also dependent on the concentration of BME. Did the authors take that into account when interpreting the results and creating the model in Figures 2 and 8? I believe that including this information is important as findings in this paper heavily rely on the number of localisations detected under different conditions. Including information on proximity labelling and CRISPR/Cas9 in the methods section would help with the reproducibility of these findings by other groups.
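
      The calibration suggested above can be illustrated with a minimal sketch. It assumes that localisation counts from isolated antibodies bound to the substrate are available and that the number of localisations in a cluster scales roughly linearly with the number of antibodies. The function names and all numbers are hypothetical and are not taken from the manuscript; because Alexa 568 and Alexa 647 blink differently, a separate calibration per dye would presumably be needed.

```python
from statistics import mean


def localisations_per_antibody(isolated_counts):
    """Average number of localisations produced by one isolated antibody."""
    return mean(isolated_counts)


def estimated_antibodies(cluster_counts, per_antibody):
    """Rough antibody count per cluster, assuming linear scaling."""
    return [count / per_antibody for count in cluster_counts]


# Hypothetical calibration data: localisations from single antibodies
# randomly bound to the substrate around the cells.
isolated = [8, 12, 10, 9, 11]
per_ab = localisations_per_antibody(isolated)

# Hypothetical localisation counts for three membrane clusters.
clusters = [10, 21, 39]
estimates = estimated_antibodies(clusters, per_ab)
```

      Under these assumptions, an estimate close to 1 would be compatible with a single labelled receptor, whereas clearly larger values would point to multimers or dense clusters.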

    3. Reviewer #3 (Public Review):

      Summary:<br /> The authors are interested in the relative importance of PRL versus GH and their interactive signaling in breast cancer. After examining GHR-PRLR interactions in response to ligands, they suggest that a reduction in cell surface GHR in response to PRL may be a mechanism whereby PRL can sometimes be protective against breast cancer.

      Strengths:<br /> The strengths of the study include the interesting question being addressed and the application of multiple complementary techniques, including dSTORM, which is technically very challenging, especially when using double labeling. Thus, dSTORM is used to show co-clustering of GHR and PRLR, and, in response to PRL, rapid internalization of GHR and increased cell surface PRLR. Proximity ligation assays demonstrate that some GHR and PRLR are within 40 nm (≈ 4 plasma membranes) of each other and that upon ligand stimulation, they move apart. Intact receptor knockin and knockout approaches and receptor constructs without the Jak2 binding domain demonstrate a) a requirement for the PRLR for there to be PRL-driven internalization of GHR, and b) that Jak2-PRLR interactions are necessary for the stability of the GHR-PRLR colocalizations.

      Weaknesses:<br /> The manuscript suffers from a lack of detail, which in places makes it difficult to evaluate the data and would make it very difficult for the results to be replicated by others. In addition, the manuscript would very much benefit from a full discussion of the limitations of the study. For example, the manuscript is written as if there is only one form of the PRLR while the anti-PRLR antibody used for dSTORM would also recognize the intermediate form and short forms 1a and 1b on the T47D cells. Given the very different roles of these other PRLR forms in breast cancer (Dufau, Vonderhaar, Clevenger, Walker and other labs), this limitation should at the very least be discussed. Similarly, the manuscript is written as if Jak2 essentially only signals through STAT5 but Jak2 is involved in multiple other signaling pathways from the multiple PRLRs, including the long form. Also, while there are papers suggesting that PRL can be protective in breast cancer, the majority of publications in this area find that PRL promotes breast cancer. How then would the authors interpret the effect of PRL on GHR in light of all those non-protective results?

    1. Reviewer #1 (Public Review):

      Summary:<br /> The authors started by stimulating the PBMCs in bulk, then encapsulated single cells in droplets to monitor the secreted cytokines in each droplet for the next 4 hours. The secreted cytokines are bound by fluorescently labeled detection antibodies. At the same time, the cytokines can be captured by the capture antibodies that are immobilized on the magnetic beads. Under the magnetic field, the magnetic beads line up in the middle of the droplet along with the bound fluorescent antibodies. This effectively enriches the fluorescent antibody in the middle of the droplet, producing a higher fluorescent signal there than the background signal in the rest of the droplet. The authors can measure three cytokines in parallel in each droplet.

      Strengths:<br /> The authors observed heterogeneous cytokine secretion dynamics, which they have also reported in their previous paper.

      Weaknesses:<br /> Since they used PBMCs without other assays to confirm the cell subtypes, I am not sure whether any of the heterogeneity detected in the secretion of the 6 cytokines can be related back to biology. In addition, because the two panels were measured on separate cells, I am not sure it is meaningful to make any comparisons between the two panels.

    2. Reviewer #2 (Public Review):

      Summary:<br /> In their manuscript titled "Stimulation-induced cytokine polyfunctionality as a dynamic concept," the authors investigate the dynamic nature of polyfunctional cytokine responses to established stimulants. The authors use their previously published single-cell encapsulation droplet-microfluidic platform to analyse the response of peripheral blood mononuclear cells (PBMCs) to different stimulants and measure the secretion dynamics of individual cytokines. This assay shows that polyfunctionality in cytokine responses is a complex but short-lived phenomenon that decreases with prolonged stimulation times. The study finds that polyfunctional cells predominantly display elevated cytokine concentrations with similar secretion patterns but higher secretion levels compared to their monocytokine-secreting counterparts. The method is promising to analyse the correlation between the secretion dynamics of different cytokines in primary samples and heterogeneous cell populations.

      Strengths:<br /> This method provides single-cell-resolved and dynamic cytokine concentration information, which might be used to identify "fingerprints" of secretion patterns for selected cytokines. When extending the available data to more than one donor, this might be the basis for a diagnostic tool. The combination of established droplet microfluidics with an epi-fluorescence microscope-based readout makes it convincing that the method is transferable to other labs. Specifically, the dynamic analysis of cytokine concentrations is interesting, and the differences or similarities in secretion timepoints might be missed with end-point methods. The authors convincingly show that they detect up to three different cytokines in single cells.

      Weaknesses:<br /> The conclusions of the study are based on samples from a single donor, which makes the conclusions on secretion patterns difficult to interpret. The choice of cytokines is explained, but the justification of the groupings of the antibodies into the two panels is missing. It would further be helpful to discuss how the single cell incubation might affect the secretion dynamics vs. the influence of co-culture of all cell types during the 24 h activation. The authors compare average secretion rates and levels. However, the right panel in Fig. 6 looks like there might be two different populations of mono- or polyfunctional cells that have two secretion rates. As the authors have single-cell data, I would find the separation into these populations more meaningful than comparing the mean values. In line with this comment, comparing the mean values for these cytokines instead of the mean of the populations with distinct secretion properties might actually show stronger differences than the authors report here. Is the plateau of the cytokine concentration caused by the fluorescence signal saturating the camera, saturation of the magnetic beads, exhaustion of the fluorescent antibodies, or constant cytokine concentrations? The high number of non-CSCs and the limited number of droplets decrease the statistical power of the method. The authors discuss their choice to use PBMCs and not solely T cells, but this aspect is missing in the discussion.

    1. Joint Public Review:

      The manuscript by Budinska et al. investigated whether morphological heterogeneity may have an impact on gene-expression profiles and conventional molecular signatures applied to bulk CRC tissues. The authors performed whole-transcriptome microarray profiling of macro-dissected morphotype-specific tumor regions, bulk tumor, and surrounding normal and stromal tissues to support their claims. The paper is interesting as it provides a putative morphological approach through which clinicians might improve the performance of molecular signatures and consequently predict the clinical response of patients with better accuracy. In the updated version, the authors have improved the manuscript and addressed several unresolved concerns, such as patient selection and tumor area selection, to justify their claims. The findings of the manuscript may have the potential to be translated into clinical practice for CRC.

    1. Reviewer #1 (Public Review):

      Ps observed 24 objects and were asked which afforded particular actions (14 action types). Affordances for each object were represented by a 14-item vector, values reflecting the percentage of Ps who agreed on a particular action being afforded by the object. An affordance similarity matrix was generated which reflected similarity in affordances between pairs of objects. Two clusters emerged, reflecting correlations between affordance ratings in objects smaller than body size and larger than body size. These clusters did not correlate themselves. There was a trough in similarity ratings between objects ~105 cm and ~130 cm, arguably reflecting the body size boundary. The authors subsequently provide some evidence that this clear demarcation is not simply an incidental reflection of body size, but likely causally related. This evidence comes in the flavour of requiring Ps to imagine themselves as small as a cat or as large as an elephant and showing a predicted shift in the affordance boundary. The manuscript further demonstrates that ChatGPT (theoretically interesting because it's trained on language alone without sensorimotor information; trained now on words rather than images) showed a similar boundary.
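
      To make the analysis described above concrete, the following is a minimal sketch of how an affordance similarity matrix could be computed. It assumes that similarity is the Pearson correlation between the objects' affordance vectors, which is only one plausible reading of the "correlation matrices"; the object names, the four affordances, and all values are hypothetical toy data rather than the study's 24 objects and 14 actions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def similarity_matrix(profiles):
    """Pairwise correlation of affordance vectors, keyed by object name."""
    names = list(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in names}
            for a in names}


# Hypothetical toy data: proportion of participants agreeing that each
# object affords (grasp, throw, sit, enter).
profiles = {
    "ball":     [0.95, 0.90, 0.05, 0.00],
    "cup":      [0.98, 0.40, 0.02, 0.00],
    "sofa":     [0.05, 0.00, 0.97, 0.10],
    "airplane": [0.00, 0.00, 0.60, 0.95],
}
sim = similarity_matrix(profiles)
```

      In this toy example, the two small objects correlate positively with each other and negatively with the sofa, which is the kind of block structure the two reported clusters would produce.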

      The authors also conducted a small MRI study task where Ps decided whether a probe action was affordable (graspable?) and created a congruency factor according to the answer (yes/no). There was an effect of congruency in the posterior fusiform and superior parietal lobule for objects within body size range, but not outside. No effects in LOC or M1.

      The major strength of this manuscript in my opinion is the methodological novelty. I felt the correlation matrices were a clever method for demonstrating these demarcations, the imagination manipulation was also exciting, and the ChatGPT analysis provided excellent food for thought. These findings are important for our understanding of the interactions between action and perception, and hence for researchers from a range of domains of cognitive neuroscience.

      The major elements that limit the conclusions, and that I'd recommend addressing in a revision, include justification of the 80% of Ps removed for the imagination analysis, and consideration that an MRI study with 12 Ps in this context can really only provide pilot data. I'd also encourage the authors to consider theoretically how else this study could really have turned out and therefore the nature of the theoretical progress.

      Specifics:<br /> 1. The main behavioural work appears well-powered (>500 Ps). This sample reduces to 100 for the imagination study, after removing Ps whose imagined heights fell within the human range (100-200 cm). Why 100-200 cm? 100 cm is pretty short for an adult. Removing 80% of data feels like conclusions from the imagination study should be made with caution.

      2. There are only 12 Ps in the MRI study, which I think should mean the null effects are not interpreted. I would not interpret these data as demonstrating a difference between SPL and LOC/M1, but rather that some analyses happened to fall over the significance threshold and others did not.

      3. I found the MRI ROI selection and definition a little arbitrary and not really justified, which rendered me even more cautious of the results. Why these particular sensory and motor regions? Why M1 and not PMC or SMA? Why SPL and not other parietal regions? Relatedly, ROIs were defined by thresholding pF and LOC at "around 70%" and SPL and M1 "around 80%", and it is unclear how and why these (different) thresholds were determined.

      4. Discussion and theoretical implications. The authors discuss that the MRI results are consistent with the idea we only represent affordances within body size range. But the interpretation of the behavioural correlation matrices was that there was this similarity also for objects larger than body size, but forming a distinct cluster. I therefore found the interpretation of the MRI data inconsistent with the behavioural findings.

      5. In the discussion, the authors outline how this work is consistent with the idea that conceptual and linguistic knowledge is grounded in sensorimotor systems. But then reference Barsalou. My understanding of Barsalou is the proposition of a connectionist architecture for conceptual representation. I did not think sensorimotor representation was privileged, but rather that all information communicates with all other to constitute a concept.

      6. More generally, I believe that the impact and implications of this study would be clearer for the reader if the authors could properly entertain an alternative concerning how objects may be represented. Of course, the authors were going to demonstrate that objects more similar in size afforded more similar actions. It was impossible that Ps would ever have responded that aeroplanes afford grasping and balls afford sitting, for instance. What do the authors now believe about object representation that they did not believe before they conducted the study? Which accounts of object representation are now less likely?

    2. Reviewer #2 (Public Review):

      Summary<br /> In this work, the authors seek to test a version of an old idea, which is that our perception of the world and our understanding of the objects in it are deeply influenced by the nature of our bodies and the kinds of behaviours and actions that those objects afford. The studies presented here muster three kinds of evidence for a discontinuity in the encoding of objects, with a mental "border" between objects roughly of human body scale or smaller, which tend to relate to similar kinds of actions that are yet distinct from the kinds of actions implied by human-or-larger scale objects. This is demonstrated through observers' judgments of the kinds of actions different objects afford; through similar questioning of AI large-language models (LLMs); and through a neuroimaging study examining how brain regions implicated in object understanding make distinctions between kinds of objects at human and larger-than-human scales.

      Strengths <br /> The authors address questions of longstanding interest in the cognitive neurosciences -- namely how we encode and interact with the many diverse kinds of objects we see and use in daily life. A key strength of the work lies in the application of multiple approaches, as noted in the summary. Examining the correlations among kinds of objects, with respect to their suitability for different action kinds, is novel, as are the complementary tests of judgments made by LLMs.

      Weaknesses <br /> A limitation of the tests of LLMs may be that it is not always known what kinds of training material was used to build these models, leading to a possible "black box" problem. Further, presuming that those models are largely trained on previous human-written material, it may not necessarily be theoretically telling that the "judgments" of these models about action-object pairs show human-like discontinuities. Indeed, verbal descriptions of actions are very likely to mainly refer to typical human behaviour, and so the finding that these models demonstrate an affordance discontinuity may simply reflect those statistics, rather than evidence that affordance boundaries can arise independently even without "organism-environment interactions" as the authors claim here.

      The authors include a clever manipulation in which participants are asked to judge action-object pairs, having first adopted the imagined size of either a cat or an elephant, showing that the discontinuity in similarity judgments effectively moved to a new boundary closer to the imagined scale than the veridical human scale. The dynamic nature of the discontinuity suggests a different interpretation of the authors' main findings. It may be that action affordance is not a dimension that stably characterises the long-term representation of object kinds, as suggested by the authors' interpretation of their brain findings, for example. Rather these may be computed more dynamically, "on the fly" in response to direct questions (as here) or perhaps during actual action behaviours with objects in the real world.

    3. Reviewer #3 (Public Review):

      Summary:<br /> Feng et al. test the hypothesis that human body size constrains the perception of object affordances, whereby only objects that are smaller than the body size will be perceived as useful and manipulable parts of the environment, whereas larger objects will be perceived as "less interesting components."

      To test this idea, the study employs a multi-method approach consisting of three parts:

      In the first part, human observers classify a set of 24 objects that vary systematically in size (e.g., ball, piano, airplane) based on 14 different affordances (e.g., sit, throw, grasp). Based on the average agreement of ratings across participants, the authors compute the similarity of affordance profiles between all object pairs. They report evidence for two homogenous object clusters that are separated based on their size with the boundary between clusters roughly coinciding with the average human body size. In follow-up experiments, the authors show that this boundary is larger/smaller in separate groups of participants who are instructed to imagine themselves as an elephant/cat.

      In the second part, the authors ask different large language models (LLMs) to provide ratings for the same set of objects and affordances and conduct equivalent analyses on the obtained data. Some, but not all, of the models produce patterns of ratings that appear to show similar boundary effects, though less pronounced and at a different boundary size than in humans.

      In the third part, the authors conduct an fMRI experiment. Human observers are presented with four different objects of different sizes and asked if these objects afford a small set of specific actions. Affordances are either congruent or incongruent with objects. Contrasting brain activity on incongruent trials against brain activity on congruent trials yields significant effects in regions within the ventral and dorsal visual stream, but only for small objects and not for large objects.

      The authors interpret their findings as support for their hypothesis that human body size constrains object perception. They further conclude that this effect is cognitively penetrable, and only partly relies on sensorimotor interaction with the environment (and partly on linguistic abilities).

      Strengths:<br /> The authors examine an interesting and relevant question and articulate a plausible (though somewhat underspecified) hypothesis that certainly seems worth testing. Providing more detailed insights into how object affordances shape perception would be highly desirable. Their method of analyzing similarity ratings between sets of objects seems useful and the multi-method approach is quite original and interesting.

      Weaknesses:<br /> The study presents several shortcomings that clearly weaken the link between the obtained evidence and the drawn conclusions. Below I outline my concerns in no particular order:

      1) Even after several readings, it is not entirely clear to me what the authors are proposing and to what extent the conducted work actually speaks to this. In the introduction, the authors write that they seek to test if body size serves not merely as a reference for object manipulation but also "plays a pivotal role in shaping the representation of objects." This motivation seems rather vague and it is not clear to me how it could be falsified.<br /> Similarly, in the discussion, the authors write that large objects do not receive "proper affordance representation," and are "not the range of objects with which the animal is intrinsically inclined to interact, but probably considered a less interesting component of the environment." This statement seems similarly vague and completely beyond the collected data, which did not assess object discriminability or motivational values.<br /> Overall, the lack of theoretical precision makes it difficult to judge the appropriateness of the approaches and the persuasiveness of the obtained results. This is partly due to the fact that the authors do not spell out all of their theoretical assumptions in the introduction but insert new "speculations" to motivate the corresponding parts of the results section. I would strongly suggest clarifying the theoretical rationale and explaining in more detail how the chosen experiments allow them to test falsifiable predictions.

      2) The authors used only a very small set of objects and affordances in their study and they do not describe in sufficient detail how these stimuli were selected. This renders the results rather exploratory and clearly limits their potential to discover general principles of human perception. Much larger sets of objects and affordances and explicit data-driven approaches for their selection would provide a far more convincing approach and allow the authors to rule out that their results are just a consequence of the selected set of objects and actions.

      3) Relatedly, the authors could be more thorough in ruling out potential alternative explanations. Object size likely correlates with other variables that could shape human similarity judgments and the estimated boundary is quite broad (depending on the method, either between 80 and 150 cm or between 105 and 130 cm). More precise estimates of the boundary and more rigorous tests of alternative explanations would add a lot to strengthen the authors' interpretation.

      4) Even though the division of the set of objects into two homogenous clusters appears defensible, based on visual inspection of the results, the authors should consider using more formal analysis to justify their interpretation of the data. A variety of metrics exist for cluster analysis (e.g., variation of information, silhouette values) and solutions are typically justified by convergent evidence across different metrics. I would recommend the authors consider using a more formal approach to their cluster definition using some of those metrics.

      5) While I appreciate the manipulation of imagined body size, as a way to solidify the link between body size and affordance perception, I find it unfortunate that this is implemented in a between-subjects design, as this clearly leaves open the possibility of pre-existing differences between groups. I certainly disagree with the authors' statement that their findings suggest "a causal link between body size and affordance perception."

      6) The use of LLMs in the current study is not clearly motivated and I find it hard to understand what exactly the authors are trying to test through their inclusion. As noted above, I think that the authors should discuss the putative roles of conceptual knowledge, language, and sensorimotor experience already in the introduction to avoid ambiguity about the derived predictions and the chosen methodology. As it currently stands, I find it hard to discern how the presence of perceptual boundaries in LLMs could constitute evidence for affordance-based perception.

      7) Along the same lines, the fMRI study also provides very limited evidence to support the authors' claims. The use of congruency effects as a way of probing affordance perception is not well motivated. What exactly can we infer from the fact a region may be more active when an object is paired with an activity that the object doesn't afford? The claim that "only the affordances of objects within the range of body size were represented in the brain" certainly seems far beyond the data.

      Importantly (related to my comments under 2) above), the very small set of objects and affordances in this experiment heavily complicates any conclusions about object size being the crucial variable determining the occurrence of congruency effects.

      I would also suggest providing a more comprehensive illustration of the results (including the effects of CONGRUENCY, OBJECT SIZE, and their interaction at the whole-brain level).

      Overall, I consider the main conclusions of the paper to be far beyond the reported data. Articulating a clearer theoretical framework with more specific hypotheses as well as conducting more principled analyses on more comprehensive data sets could help the authors obtain stronger tests of their ideas.
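
      As an illustration of the more formal cluster evaluation recommended in point 4) above, the following is a minimal sketch of per-item silhouette values computed from a pairwise distance matrix. It assumes that a distance such as one minus the affordance correlation is available for every object pair; the four-item matrix and the cluster labels are hypothetical and do not come from the manuscript.

```python
def silhouette_values(dist, labels):
    """Per-item silhouette s = (b - a) / max(a, b).

    a is the mean distance to the other members of the item's own cluster,
    b is the smallest mean distance to the members of any other cluster.
    Assumes at least two clusters are present.
    """
    n = len(labels)
    scores = []
    for i in range(n):
        own = [dist[i][j] for j in range(n) if j != i and labels[j] == labels[i]]
        if not own:  # singleton cluster: silhouette is conventionally 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return scores


# Hypothetical distances (e.g., 1 - correlation) among four objects.
dist = [
    [0.0, 0.2, 1.4, 1.5],
    [0.2, 0.0, 1.3, 1.6],
    [1.4, 1.3, 0.0, 0.3],
    [1.5, 1.6, 0.3, 0.0],
]
by_size = silhouette_values(dist, ["small", "small", "large", "large"])
shuffled = silhouette_values(dist, ["x", "y", "x", "y"])
```

      A size-based partition that yields clearly higher silhouette values than alternative partitions, ideally alongside agreement from a second metric, would give the two-cluster interpretation firmer support than visual inspection alone.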

    1. Reviewer #1 (Public Review):

      Summary:

      Sex differences in liver gene expression and function have previously been proposed to be caused by sex differences in the pattern of growth hormone (GH) secretion by the pituitary, which are established by the effects of testicular hormones that act on the hypothalamus perinatally to masculinize control of pituitary GH secretion beginning at puberty and for the rest of the animal's life. The Waxman lab has previously implicated GH control of STAT5 as a critical event leading to a masculine pattern of gene expression. The present study separates male-biased regulatory sites associated with the male-biased genes into different classes based on their responsiveness to the cyclic male pattern of STAT5 activity, and investigates DNase hypersensitivity sites (DHS) of different classes showing cyclic sex-bias or not. It further reports on the binding of transcription factors to STAT5-sensitive DHS, and involvement of specific histone marks at these sites. The study argues that STAT5 is the proximate factor regulating chromatin accessibility in about 1/3 of male-biased DHS that are sexually differentiated by GH secretion. The authors propose the pulsatile GH secretion as a novel proximate mechanism of regulating chromatin accessibility to cause sex differences.

      Strengths:

      The study offers new insight into the effects of hypophysectomy and injection of GH on different classes of sex-biased genes in mouse liver. The results support the general conclusion of the authors. Cyclic secretion of other hormones (for example, estrous secretion of estrogens and progesterone) is well known to cause sex differences in multiple organs in rodents, and it will be interesting to assess if these cyclic secretions induce similar changes in chromatin accessibility causing female tissue gene expression to differ from that of males.

      Weaknesses:

      The authors argue for two major mechanisms controlling sexual bias in liver gene expression, and analyze in depth one of these mechanisms. The focus is on the group of DHS (about 1/3 of all male-biased DHS) in which the sex bias is controlled by cyclic secretion of growth hormone (GH) in males, compared to static and low growth hormone in adult females. The sex difference in pituitary secretion of GH is induced by permanent effects of androgens acting on the hypothalamus perinatally. The study would be improved by further discussion of the mechanistic relationship between this class of sex-biased DHS and the other 2/3 of liver DHS that also show male-biased accessibility but whose chromatin does not respond directly to GH-stimulated STAT5. Previous studies, including those in the Waxman lab (PMIDs: 26959237, 18974276, 35396276), suggest that castration of males or gonadectomy of both sexes eliminates most sex differences in mRNA expression in mouse liver, and/or that androgens such as DHT or testosterone administered in adulthood potentially reverse the effects of gonadectomy and/or masculinize liver gene expression. It is not clear from the present discussion whether the GH/STAT5 cyclic effects that masculinize chromatin status require the presence of androgens in adulthood to masculinize pituitary GH secretion. Are there analyses of the present (or past) data that might provide evidence about a dual role for GH and androgen acting on the same genes? For example, are sex-biased DHS bound by androgen-dependent factors, or do they show other signs of androgen sensitivity? Are histone marks associated with DHS regulated by androgens? Moreover, it would help if the authors indicated whether they believe that the "constitutive" static sex differences in the larger 2/3 set of male-biased DHS are the result of "constitutive" (but variable) action of testicular androgens in adulthood. Although the present study is nicely focused on the GH pulse-sensitive DHS, is there mechanistic overlap in sex-biasing mechanisms with the larger static class of sex-biased liver DHS?

    2. Reviewer #2 (Public Review):

      Summary:

      The present work addresses the mechanisms linking the sex-dependent temporal GH secretion patterns to the robust sex differences in chromatin accessibility and transcription factor binding that ultimately regulate sexually dimorphic liver gene expression. Using DNase-seq analysis, genomic sites hypersensitive to cleavage by DNase I (DNase hypersensitive sites, DHS) were studied in hepatocytes from male and female mice. DHS in the genome correspond to accessible chromatin regions and encompass key regulatory elements, including enhancers, promoters, insulators, and silencers, often flanked by specific histone modifications, and all of these players were described in different settings of GH action. Importantly, the dynamics of sex-dependent and sex-independent chromatin accessibility linked to STAT5 binding were evaluated. For that purpose, hepatic samples from mice were divided into STAT5-high and STAT5-low binding groups by EMSA screening. With this information, changes in DHS related to STAT5 binding were calculated in both sexes, giving an approximation of chromatin opening in response to STAT5, or alternatively to hypophysectomy or a single GH pulse. More than 800 male-biased DHS (from a total of more than 70000 DHS) were identified in the STAT5-high groups, implying that a plasma GH pulse activates STAT5 and evokes a dynamic cycle of male liver chromatin opening and closing at sites that comprised 31% of all male-biased DHS. This proves that the pulsatility of plasma GH stimulation confers significant male bias in chromatin accessibility, and STAT5 binding, at a fraction of the genomic sites linked to sex-biased liver gene expression and liver disease. As a proof of concept, the authors show that a single physiological replacement dose or pulse of GH given to hypophysectomized mice recapitulates, within 30 min, the pulsatile re-opening of chromatin seen in pituitary-intact male mouse liver.

      In another male-biased DHS set (69% of male-biased DHS), chromatin accessibility was static, that is, unchanged across the peaks and valleys of GH-induced liver STAT5 activity, and mapped to a set of target genes and processes distinct from, though sometimes overlapping with, those of the dynamic male-biased DHS.

      In view of these distinct dynamic and static DHS in males, authors evaluated key epigenetic features distinguishing the dynamic STAT5-driven mechanism of chromatin opening from that of static male-biased DHS, which are constitutively open in the male liver but closed in the female liver. The analysis of histone marks enriched at each class of sex-biased DHS indicated exquisite differences in the epigenetic mechanisms that mediate sex-specific gene repression in each sex. For example, H3K27me3 and H3K9me3, two widely used repressive histone marks, are used in a unique way in each sex to enforce sex differences in chromatin states at sex-biased DHS.

      Finally, the work recapitulates and explains the classifications of sex dimorphic genes made in previous works. Sex-biased and pituitary hormone-dependent DHS act as regulatory elements with a positive enhancer potential, to induce or maintain gene expression in the intact liver by sustaining an open chromatin state, as in the case of class I male-biased DHS and class I male-biased genes in the male liver. Conversely, DHS may participate in the inhibition of gene expression by maintaining a closed chromatin state, as in the case of class II male-biased DHS and class II female-biased genes in male liver.

      These results as a whole present a complex mechanism by which GH regulates the sexual dimorphism of liver genes in order to cope with the metabolic needs of each sex. In a complete story, the information on chromatin accessibility, histone modification, and transcription factor binding was integrated to elucidate the complex patterns of transcriptional regulation, which is sexually dimorphic in the liver.

      Strengths:

      The work presents a novel insight into the fundamental underlying epigenetic mechanisms of sex-biased gene regulation.

      Results are supported by numerous Tables and Supplementary Tables with the raw data, which have the advantage that they may be reanalyzed in the future to test new hypotheses.

      Weaknesses:

      It is a complicated work to analyze, even though the main messages are clearly conveyed.

    1. Joint Public Review:

      The manuscript by Mitra and coworkers analyses the functional role of Orai in the excitability of central dopaminergic neurons in Drosophila. The authors show that a dominant-negative mutant of Orai (OraiE180A) significantly alters the gene expression profile of flight-promoting dopaminergic neurons (fpDANs). Among them, OraiE180A attenuates the expression of Set2 and enhances that of E(z) shifting the level of epigenetic signatures that modulate gene expression. The present results also demonstrate that Set2 expression via Orai involves the transcription factor Trl. The Orai-Trl-Set2 pathway modulates the expression of VGCC, which, in turn, are involved in dopamine release. The topic investigated is interesting and timely and the study is carefully performed and technically sound.

      The reviewers appreciate the authors' efforts to revise the manuscript in order to address many of their concerns. Nevertheless, there remain a few important issues:

      1) The main issue relates to Set2, and how STIM1 expression rescues Set2-dependent functions in Set2 KO flies. If Set2 is downstream of STIM1, how would STIM1 over-expression rescue a Set2-dependent effect?

      2) There is still no characterization of SOCE in fpDANs from flies expressing native Orai or the dominant negative OraiE180A mutant.

      3) The revised version does not include an analysis of the STIM:Orai stoichiometry, which has been demonstrated to be essential for SOCE.

    1. Reviewer #1 (Public Review):

      Hyperactivation of mTOR signaling causes epilepsy. It has long been assumed that this occurs through overactivation of mTORC1, since treatment with the mTORC1 inhibitor rapamycin suppresses seizures in multiple animal models. However, the recent finding that genetic inhibition of mTORC1 via Raptor deletion did not stop seizures while inhibition of mTORC2 did, challenged this view (Chen et al, Nat Med, 2019). In the present study, the authors tested whether mTORC1 or mTORC2 inhibition alone was sufficient to block the disease phenotypes in a model of somatic Pten loss-of-function (a negative regulator of mTOR). They found that inactivation of either mTORC1 or mTORC2 alone normalized brain pathology but did not prevent seizures, whereas dual inactivation of mTORC1 and mTORC2 prevented seizures. As the functions of mTORC1 versus mTORC2 in epilepsy remain unclear, this study provides important insight into the roles of mTORC1 and mTORC2 in epilepsy caused by Pten loss and adds to the emerging body of evidence supporting a role for both complexes in the disease development.

      Strengths:

      The animal models and the experimental design employed in this study allow for a direct comparison between the effects of mTORC1, mTORC2, and mTORC1/mTORC2 inactivation (i.e., same animal background, same strategy and timing of gene inactivation, same brain region, etc.). Additionally, the conclusions on brain epileptic activity are supported by analysis of multiple EEG parameters, including seizure frequencies, sharp wave discharges, interictal spiking, and total power analyses.

      Weaknesses:

      The sample size of the study is small and does not allow for the assessment of whether mTORC1 or mTORC2 inactivation reduces seizure frequency or incidence. This is a limitation of the study.

      The authors describe that they inactivated mTORC1 and mTORC2 in a new model of somatic Pten loss-of-function in the cortex. This is slightly misleading since Cre expression was found both in the cortex and the underlying hippocampus, as shown in Figure 1. Throughout the manuscript, they provide supporting histological data from the cortex. However, since Pten loss-of-function in the hippocampus can lead to hippocampal overgrowth and seizures, data showing the impact of the genetic rescue in the hippocampus would further strengthen the claim that neither mTORC1 nor mTORC2 inactivation prevents seizures.

      Some of the methods for the EEG seizure analysis are unclear. The authors describe that for control and Pten-Raptor-Rictor LOF animals, all 10-second epochs in which signal amplitude exceeded 400 μV at two time-points at least 1 second apart were manually reviewed, whereas, for the Pten LOF, Pten-Raptor LOF, and Pten-Rictor LOF animals, at least 100 of the highest-amplitude traces were manually reviewed. Does this mean that not all flagged epochs were reviewed? This could potentially lead to missed seizures. Additionally, the inclusion of how many consecutive hours were recorded among the ~150 hours of recording per animal would help readers with the interpretation of the data.
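      To make the ambiguity concrete, the flagging criterion described above can be sketched as follows. This is only an illustrative reading of the stated rule, not the authors' pipeline: the function name `flag_epochs`, the use of absolute amplitude, the non-overlapping epoch windows, and the sampling rate in the example are all assumptions.

```python
import numpy as np

def flag_epochs(signal_uv, fs, epoch_s=10.0, threshold_uv=400.0, min_gap_s=1.0):
    """Return indices of epochs that would be flagged for manual review.

    An epoch is flagged when at least two samples exceed the amplitude
    threshold and are separated by at least `min_gap_s` seconds.
    Assumptions (not from the manuscript): absolute amplitude is used,
    epochs are non-overlapping, and a trailing partial epoch is ignored.
    """
    signal_uv = np.asarray(signal_uv, dtype=float)
    samples_per_epoch = int(round(epoch_s * fs))
    n_epochs = len(signal_uv) // samples_per_epoch
    flagged = []
    for i in range(n_epochs):
        epoch = signal_uv[i * samples_per_epoch:(i + 1) * samples_per_epoch]
        above = np.flatnonzero(np.abs(epoch) > threshold_uv)
        # A pair of supra-threshold samples >= min_gap_s apart exists
        # exactly when the first and last such samples are that far apart.
        if above.size >= 2 and (above[-1] - above[0]) / fs >= min_gap_s:
            flagged.append(i)
    return flagged

# Hypothetical 30 s trace sampled at 100 Hz (three 10 s epochs).
fs = 100
trace = np.zeros(30 * fs)
trace[100] = 500.0                        # epoch 0: one crossing only
trace[1050], trace[1300] = 450.0, -600.0  # epoch 1: two crossings 2.5 s apart
trace[2010], trace[2050] = 500.0, 500.0   # epoch 2: two crossings 0.4 s apart
print(flag_epochs(trace, fs))
```

      Under this reading, every flagged epoch would be reviewed for the control and Pten-Raptor-Rictor LOF groups, whereas reviewing only the 100 highest-amplitude traces in the other groups would correspond to ranking the flagged epochs and truncating the list, which is where seizures could be missed.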

      Finally, it is surprising that mTORC2 inactivation completely rescued cortical thickness since such pathological phenotypes are thought to be conserved down the mTORC1 pathway. Additional comments on these findings in the Discussion would be interesting and useful to the readers.

    2. Reviewer #2 (Public Review):

      Summary:

      The study by Cullen et al presents intriguing data regarding the contribution of mTOR complex 1 (mTORC1) versus mTORC2 or both in Pten-null-induced macrocephaly and epileptiform activity. The role of mTORC2 in mTORopathies, and in particular Pten loss-of-function (LOF)-induced pathology and seizures, is understudied and controversial. In addition, recent data provided evidence against the role of mTORC1 in Pten LOF-induced seizures. To address these controversies and the contribution of these mTOR complexes in Pten LOF-induced pathology and seizures, the authors injected an AAV9-Cre into the cortex of conditional single, double, and triple transgenic mice at postnatal day 0 to remove Pten, Pten+Raptor or Rictor, and Pten+Raptor+Rictor. Raptor and Rictor are essential binding partners of mTORC1 and mTORC2, respectively. One major finding is that despite preventing mild macrocephaly and increased cell size, Raptor knockout (KO, decreased mTORC1 activity) did not prevent the occurrence of seizures or reduce the rate of SWD events, and aggravated seizure duration. Similarly, Rictor KO (decreased mTORC2 activity) partially prevented mild macrocephaly and increased cell size but did not prevent the occurrence of seizures and did not affect seizure duration. However, Rictor KO reduced the rate of SWD events. Finally, the pathology and seizure/SWD activity were fully prevented in the double KO. These data suggest the contribution of both increased mTORC1 and mTORC2 in the pathology and epileptic activity of Pten LOF mice, emphasizing the importance of blocking both complexes for seizure treatment. Whether these data apply to other mTORopathies due to Tsc1, Tsc2, mTOR, AKT or other gene variants remains to be examined.

      Strengths:

      The strengths are as follows: 1) the authors address an important and controversial question that has clinical application; 2) the study uses a reliable and relatively easy method to KO specific genes in cortical neurons, based on AAV9 injections in pups; 3) the authors perform careful video-EEG analyses correlated with some aspects of cellular pathology.

      Weaknesses:

      The study nevertheless has a few weaknesses: 1) the conclusions are perhaps a bit overstated. The data do not show that increased mTORC1 or mTORC2 are sufficient to cause epilepsy. However, the data clearly show that both increased mTORC1 and mTORC2 activity contribute to the pathology and seizure activity and as such are necessary for seizures to occur. 2) the data related to the EEG would benefit from having more mice. Adding more mice would have helped determine whether there was a decrease in seizure activity with the Rictor or Raptor KO. 3) it would have been interesting to examine the impact of mTORC2 and mTORC1 overexpression related to point #1 above.

    3. Reviewer #3 (Public Review):

      Summary: This study investigated the role of mTORC1 and 2 in a mouse model of developmental epilepsy which simulates epilepsy in cortical malformations. Given that loss of function of genes such as PTEN activates mTORC1, and this activation is considered to be excessive in cortical malformations, the authors asked whether inactivating mTORC1 and 2 would ameliorate the seizures and malformation in the mouse model. The work is highly significant because a new mouse model is used where Raptor and Rictor, which regulate mTORC1 and 2 respectively, were inactivated in one hemisphere of the cortex. The work is also significant because the deletion of both Raptor and Rictor improved the epilepsy and malformation. In the mouse model, the seizures were generalized or there were spike-wave discharges (SWD). They also examined the interictal EEG. The malformation was manifested by increased cortical thickness and soma size.

      Strengths: The presentation and writing are strong. The quality of data is strong. The data support the conclusions for the most part. The results are significant: Generalized seizures and SWDs were reduced when both mTORC1 and 2 were inactivated but not when only one was inactivated.

      Weaknesses: One of the limitations is that it is not clear whether the area of cortex where Raptor or Rictor were affected was the same in each animal. Also, it is not clear which cortical cells were measured for soma size. Another limitation is that the hippocampus was affected as well as the cortex. One does not know the role of cortex vs. hippocampus. Any discussion about that would be good to add. It would also be useful to know if Raptor and Rictor are in glia, blood vessels, etc.

    1. Reviewer #1 (Public Review):

      In "Resting-state alterations in behavioral variant frontotemporal dementia are related to the distribution of monoamine and GABA neurotransmitter systems" by Hahn et al, the authors investigate the association between structural and functional alterations in bvFTD and neurotransmitter systems. The authors take this a step further and also relate functional activation reductions in bvFTD to mRNA expression levels of neurotransmitter systems, and clinical/behavioural measures of the bvFTD subjects. The authors find significant associations between fALFF bvFTD maps and serotonin, dopamine, noradrenaline, and GABAa receptors/transporters, demonstrating a link between specific neurotransmitter systems and functional alterations in bvFTD. They successfully achieve their aim of finding neurotransmitter systems that may subserve functional changes in bvFTD. This is strengthened by the finding that receptor-fALFF correspondence is correlated with performance on cognitive tests across individuals. This multimodal approach is important for informing clinical interventions in bvFTD and the authors nicely demonstrate a link between functional changes in bvFTD, receptor systems, and cognition. In my opinion, the primary weakness of the study is that the effects are small, although I wonder whether this is related to the fact that some of the neurotransmitter receptor maps have small sample size and low sensitivity in the cortex.

    2. Reviewer #2 (Public Review):

      The aim of this study was to relate functional alterations in patients with bvFTD to neurotransmitter maps provided by the JuSpace toolbox in order to better understand the underlying pathological mechanisms of this disease.

      A strength of the study is the novelty of this aim. Weaknesses include the different fMRI parameters used at each centre and the need for a better explanation of some methodological choices, as well as a better description of the JuSpace toolbox.

      The authors have achieved their aims and the results seem to support some conclusions, although the results should be interpreted in light of a potential lack of proper control for multiple comparisons.

      This work will increase the use of approaches that relate brain abnormalities to neurotransmitters and transcriptomics.

      There is an increasing trend to assess the correspondence between neuroimaging alterations and detailed information of neurotransmitters across the brain. This work represents this trend and adds to an increasing body of work doing the same with transcriptomics.

    3. Reviewer #3 (Public Review):

      This manuscript analyzed resting state functional MRI metrics related to behavioral variant frontotemporal dementia (bvFTD) for associations with patterns of neurotransmitter system receptor distribution, patterns of neurotransmitter-related gene expression, and profiles of performance on neuropsychological test battery items.

      The overarching goal of the work was to assess whether these analyses point to selective vulnerability of some neurotransmitter systems in the symptomatology of bvFTD. The manuscript reports that reductions in fMRI measures of local brain functional activity in bvFTD followed the distribution of specific neurotransmitter systems. No similar findings were identified for MRI-based gray matter volume measurements.

      Strengths of the manuscript include its leveraging of publicly available tools for large-scale regional brain mRNA profiles and neurotransmitter receptor distributions. An additional positive step for the literature involves further development of the concept that biomarkers of disruptions to specific functionally-connected networks may guide specific treatment strategies (as a corollary to this work, related to neurotransmitter system disruption) in neurodegenerative disease.

      A weakness of the manuscript is that it is not able to directly address the main literature gap described in the Introduction -- namely, whether there is specific vulnerability of certain neuronal types versus others in bvFTD, or whether broader network/region-based neurodegeneration is the driver (and happens to include some selective neurotransmitter-related disruptions). In other words, if "A" is a biomarker of bvFTD, "A" has a partial correlation with "B", and the "AB" correlation has a partial correlation with "C", it seems too far a leap to conclude that "B" (in this case, profiles/distributions of neurotransmitter systems) is the central figure in the cascade.

    1. Joint Public Review:

      The authors clearly state the current mystery surrounding transcriptional regulation of ACE2 expression, and how SARS-CoV-2 infection might impact this regulation. Several medications have been identified as impacting the gene expression of ACE2, such as colchicine. However, the mechanism behind this regulation of ACE2 gene expression is currently unknown, yet worth investigating. Indeed, getting to know the mechanism behind the transcriptional regulation of ACE2 might lead to the development of therapies targeting this expression in order to attenuate COVID-19 severity.

      In order to gain insight into the regulation of ACE2 expression by SARS-CoV-2, the authors used a luciferase reporter-based assay to investigate a range of signaling pathways. The authors found that ACE2 expression is upregulated by SARS-CoV-2 infection via activation of transcription factor Sp1 and inhibition of HNF4α through the PI3K/AKT pathway. This led to the discovery that inhibition of Sp1 using mithramycin A reduces SARS-CoV-2 infection in vitro and in an animal model.

      Strengths

      - The authors used an elegant design for their investigation. Based on a broad luciferase-based assay, and keeping in mind the opposite effects of SARS-CoV-2 infection and colchicine administration on the expression of ACE2, they identified transcription factors as potential candidates for regulating ACE2 expression.
      - Throughout the several experiments performed, the antagonizing effects of SARS-CoV-2 infection and colchicine on the identified transcription factors (Sp1 and HNF4α) are consistent and therefore strengthen the conclusions.

      Weaknesses

      - For the in vitro work, only one cell line is used in this article: HPAEpiC cells, an immortalized human cell line derived from alveolar epithelial type II cells. This limits the generalizability of the results obtained in this study, as SARS-CoV-2 is known to infect several kinds of cells.
      - From the results of two separate experiments (colchicine leading to reduced ACE2 expression in HPAEpiC cells and colchicine leading to reduced SARS-CoV-2 replication in HPAEpiC cells), the authors infer that inhibition of ACE2 expression by colchicine suppresses SARS-CoV-2 infection. However, their experiments do not explicitly prove this hypothesis and do not establish how much this reduced ACE2 expression contributes to the colchicine antiviral effect they observed, as other mechanisms may play a (bigger) role in producing this effect.
      - The authors refer to colchicine as a drug leading to mortality benefit when used as treatment for COVID-19 (lines 101-105). However, whether colchicine is beneficial in COVID-19 is unclear. For instance, the randomized controlled trial by the RECOVERY Collaborative Group (Lancet Respir Med 2021), which included more than 11,000 patients, did not find benefit from colchicine in patients admitted to hospital with COVID-19. The authors refer to the review of Drosos et al to infer benefit of colchicine in COVID-19; however, this review ignores the numerous trials contradicting this (as also stated in a letter from Finsterer in response to this review). The meta-analysis by Elshafei to which the authors refer was published before the largest RCT, by the RECOVERY Group, was published.
      - The authors did not have a pathologist blinded to the infection/treatment state of the animals score the samples obtained in the animal experiments, which could have introduced bias in these results.

      These results add to the existing knowledge that the characteristics of ACE2 (its functionality and abundance) in the respiratory tract are pivotal to understanding infection by SARS-CoV-2. The authors' conclusions are supported by the results. The identification of the two transcription factors influenced by SARS-CoV-2 infection is valuable, but needs further research to assess whether their effect on ACE2 expression is also seen in cell types other than the one assessed by the authors. More in-depth research will have to follow to assess if and how targeting the identified transcription factors could ultimately benefit patients with COVID-19.

    1. Reviewer #2 (Public Review):

      Prior results established that Lepr, Calcr, and Cck neurons are non-overlapping neuronal populations in the NTS that individually suppress food intake when activated. This paper examines the consequences of activating or inhibiting two or three of these populations simultaneously. Activating two or three populations reduces food intake and body weight more than activating each individually. Activation of Lepr and/or Calcr neurons is not aversive based on the conditioned taste aversion test, whereas activating all three is aversive by this test, indicating that aversion due to Cck neuron activation is dominant. Vertical sleeve gastrectomy (VSG) causes weight loss, but inhibiting each of these neuronal populations individually, or all three of them together, does not prevent weight loss. Overall, this paper provides a solid set of results but does not provide mechanistic insight into any of the phenomena examined.

    1. Reviewer #2 (Public Review):

      Summary:

      In this study, Wilmot et al. ran a series of experiments to describe a dopaminergic projection from LC to dHPC, and its functional role in trace fear conditioning (TFC). Using fiber photometry in LC, they show convincingly that the activity of LC TH neurons is increased to both cues and footshock, and that this increases with acquisition of TFC, and decreases during extinction of this association. Projections from LC to dHPC show a similar pattern of activity, and dopamine release (measured by the fluorescent sensor GRAB-DA) is also comparable to calcium activity from LC. While the authors do show that activity at the dopamine D1R/D5R is necessary for TFC, a direct test of the necessity of dopamine release from LC during TFC is not shown.

      Strengths:

      • The authors clearly and effectively show that the LC-dHPC projection is activated by an aversive outcome (i.e. shock), and that activity in this pathway changes in response to learning about a neutral cue that predicts this shock (i.e. TFC). Furthermore, they show that increased dopamine release in dHPC can be observed if LC is chemogenetically activated. A critical role for dopamine receptors (but not β- and ⍺-adrenergic receptors) in TFC was demonstrated, and intra-dHPC injection of a D1R/D5R antagonist blocks this learning. Finally, dopamine release (measured by GRAB-DA) in dHPC was shown to also occur during trace fear conditioning.

      • The authors have conclusively shown that activity at the dopamine receptors in the dHPC during trace fear conditioning follows the same pattern as calcium activity recorded in LC cell bodies and, more importantly, in the axonal projections from LC to dHPC. This is very good evidence that this pathway is recruited during TFC.

      Weaknesses:

      • The claim that dopamine release in dHPC is caused by LC neurons is not directly tested. Unfortunately, the most critical experiment for the claim that dopamine release comes from LC during conditioning was not performed. A lack of dopamine signal in dHPC caused by inhibition of LC during TFC would show this. It is indeed an interesting observation that chemogenetic activation of LC causes dopamine release in the dHPC. However, in the absence of concurrent VTA inhibition or lesion, it remains a possibility that the dopamine release is mediated through indirect actions on other dopamine-expressing neurons. The authors do a good job of arguing against this interpretation in the discussion, and the literature seems appropriate for this. However, the title is still an overstatement of the data presented in this study.

      • The primary alternative interpretations of the phasic activation experiment are that the effect depends only on stimulation at the cue events (both onset and offset), or only on stimulation at the shock. Thus this experiment would benefit from additional data showing either a no-shock control, to show that enhanced activity of the LC to the tone is not inherently aversive, or manipulations applied to the tone but not to the shock.

      • Specificity of the GRAB-NE and GRAB-DA sensors should be justified either through additional experiments testing the alternative antagonist (i.e. GRAB-NE CNO+eticlopride / GRAB-DA CNO+yohimbine) or through additional citations that have tested this already. It is critical for the claims of the paper to show that these sensors are specific to dopamine or norepinephrine.

      • The role of dopamine in prediction error was tested through a series of conditions whereby the shock was presented either signaled (i.e. predicted), or not. However, another way that prediction error is signaled is through the absence of an expected outcome. Admittedly it might not be possible to observe a decrease in dopamine signaling with this methodology.

      • The difference between Fig. 6E and 6H needs to be clarified. What is shown in Fig. 6E is that the response to the shock decreases through experience (i.e. by the 10th trial). However, in Fig. 6H, there is no difference between signaled and unsignaled shock, but this is during conditioning, and not after learning (based on my understanding of the methods, line 482).

      • Unless I missed it, at no point in the manuscript is the number of subjects described. Please add the n per experiment within each section describing each experiment in the methods (Behavioral procedures). Some more details in the photometry statistical analysis would be helpful. For example, what is the n per group for every data set that is presented? How many trials per analysis?

    2. Reviewer #1 (Public Review):

      Summary:

      The authors investigate the role of the noradrenergic nucleus Locus Coeruleus (LC) in hippocampally-dependent learning and memory processes. The two stated aims of these experiments are to distinguish between 'tonic and phasic' activity and release in LC neurons and to determine the relative contribution of noradrenaline and dopamine, released from LC terminals, during learning. To address these questions, the investigators used a trace conditioning protocol (a behavior that is well established to be dependent on the hippocampus), coupled with a genetically based toolbox of sensors allowing measurements and manipulation of cell-type specific populations of neurons.

      This includes photometric imaging of neuronal activity within the LC through calcium signaling (Fig 1B) and in the hippocampal target site (Fig 3F), photostimulation of monoamine-containing neurons in the LC (Fig 4B), and measurement of extracellular dopamine and noradrenaline in the hippocampus with fluorescent sensors (GRABs) (Fig 5B). The study was complemented by a pharmacological approach to demonstrate that dopamine, and not noradrenaline, was essential for learning this task.

      Results show that the calcium signal in the LC increased in response to tone or footshock in an intensity-dependent manner (Fig 1C,D,E,F). LC responses can be conditioned, and conditioned responses are of higher amplitude than the responses to the to-be-conditioned stimulus (Fig 2D). These results replicate sparse data gleaned over the past four decades using single and multiple-unit electrophysiological recording in LC in rats and monkeys. Calcium imaging of LC axonal projections in the hippocampus showed a small but significant increase in response to tone onset and offset and to shock during conditioning.

      Gain-of-function experiments show that enhancing a weak tone stimulus by phasic activation of LC through photostimulation during conditioning facilitated subsequent memory performance (Fig 4D).

      Fluorescent sensors demonstrated the release of both Noradrenaline and Dopamine in the hippocampus in response to activation of LC.

      Using conventional pharmacology the essential role of dopamine was confirmed in the learning of this trace conditioning task, corroborating previous reports of hippocampal dopamine involvement in spatial learning.

      Strengths:<br /> The experiments confirm many of the results of the past four decades from unit recordings from the LC in behaving rats and monkeys. The available data are sparse, due to the difficulty of recording from this tiny pontine nucleus; the reports emanate from only a few laboratories. Given the large amount of theorizing based on sparse data, it is important that the observations concerning the environmental contingencies driving the activity of LC be corroborated.

      That dopamine is released from LC terminals in the forebrain has been known for 20 years (Devoto 2004), but this was largely ignored until recently when a few laboratories demonstrated the functional importance of this projection in hippocampal-dependent learning. The present corroboration should lend further credence and promote further studies of the factors governing this release of dopamine from LC terminals, into specific forebrain regions.

      Weaknesses:<br /> --One criticism the authors have made of previous studies was that they have not distinguished between 'tonic' and 'phasic' LC activity and could not demonstrate 'time-locked phasic firing'. This has not been achieved in the present report, as an examination of Fig 1 C,D and 2 C,D shows. Previous reports using unit recording in rats and monkeys clearly show that the latency of LC 'phasic' responses to salient or behaviorally relevant stimuli is in the range of tens of milliseconds, with a very short duration, often followed by a long-lasting inhibition. This kind of temporal precision concerning the phasic response cannot be gleaned from the time scale shown in the Figures (assuming the time scale is in seconds). We can discern a long-lasting increase in tonic firing level for the more salient stimuli (Fig 1C), although the authors state in the discussion that "we did not observe obvious changes in tonic LC-HPC activity". This calcium imaging methodology as used in the present experiments can give us a general idea of the temporal relation of LC response to the stimulus, but apparently does not afford the millisecond resolution necessary to capture a phasic response, at least as the data are presented in the Figures.

      --Much of the data presented here can be regarded as 'proof of concept', i.e. demonstrating that photometric imaging of calcium signalling yields similar results concerning LC responses to salient or behaviorally relevant stimuli as have been previously reported using electrophysiological unit recording. The role of dopamine as the principal player in hippocampal-dependent learning also corroborates previous reports.

      -- No attempt was made to address the important current question of the modular organisation of Locus Coeruleus, although the authors recognize the importance of this question and propose future experiments using their methodology to record simultaneously in several LC projection sites.

    3. Reviewer #3 (Public Review):

      Summary:<br /> The manuscript examines an important question, namely how the brain associates events spaced in time. It uses a variety of neural methods including fiber photometry as well as area-specific and pathway-silencing methods with the exquisite dissociation of norepinephrine and dopamine. The data show that neurons in the locus coeruleus (LC) respond to auditory cue onset, offset, and shock. These responses are stronger if the cue is paired with shock in a trace procedure. Optogenetic stimulation similar to the neural response captured by fiber photometry enhances associative learning. LC terminals in the dorsal hippocampus also showed phasic responses during fear conditioning and drove dopamine and norepinephrine responses. Pharmacological methods revealed that dopamine and not norepinephrine is critical for fear learning.

      Strengths:<br /> The examination of the neural signal to different tone intensities, different shock intensities, repeated tone presentation (habituation), and conditioning, offers an unprecedented account of the neural signal to non-associative and associative processes. This kind of deconstruction of the elements of conditioning offers a strong account of how the brain processes the stimuli used and their interaction during learning.

      Excellent use of data acquired with fiber photometry in the optogenetic interrogation study.

      The use of pharmacology to disentangle dopamine and norepinephrine was excellent.

      Weaknesses:<br /> While the optogenetic study was lovely, a control using the same stimulation but delivered at different time points would have been a good addition to show how critical the neural signal at tone onset, tone offset, and shock is.

      Justification for the focus on D1 receptors was lacking.

      The manuscript provides convincing evidence that the neural signal is not an error-correcting one by including a predicted (by a tone) and unpredicted shock. One possibility is that perhaps the unpredicted shock could be predicted by the context. Some clarification on the behavioural procedures would help understand if indeed the unsignaled shock could be predicted by the context or not.

    1. Reviewer #3 (Public Review):

      Summary:<br /> The authors set out to characterize the anatomical connectivity profile and the functional responses of chandelier cells (ChCs) in the mouse primary visual cortex. Using retrograde rabies tracing, optogenetics, and in vitro electrophysiology, they found that the primary source of input to ChCs are local layer 5 pyramidal cells, as well as long-range thalamic and cortical connections. ChCs provided input to local layer 2/3 pyramidal neurons, but did not receive reciprocal connections.

      With two-photon calcium imaging recordings during passive viewing of drifting gratings, the authors showed that ChCs exhibit weakly selective visual responses, high correlations within their own population, and strong responses during periods of arousal (assessed by locomotion and pupil size). These results were replicated and extended in experiments with natural images and prediction of receptive field structure using a convolutional neural network.

      Furthermore, the authors employed a learned visuomotor task in a virtual corridor to show that ChCs exhibit strong responses to mismatches between visual flow and locomotion, locomotion-related activation (similar to what was shown above), and visually-evoked suppression. They also showed the existence of two clusters of pyramidal neurons with functionally different responses - a cluster with "classically visual" responses and a cluster with locomotion- and mismatch-driven responses (the latter more correlated with ChCs). Comparing naive and trained mice, the authors found that visual responses of ChCs are suppressed following task learning, accompanied by a shortening of the axon initial segment (AIS) of pyramidal cells and an increase in the proportion of AIS contacted by ChCs. However, additional controls would be required to identify which component(s) of the experimental paradigm led to the functional and anatomical changes observed.

      Finally, using a chemogenetic inactivation of ChCs, the authors propose weak connectivity to pyramidal cells (due to small effects in pyramidal cell activity). However, these results are not unequivocally supported, as the baseline activity of ChCs before inactivation is considerably lower, suggesting a potentially confounding homeostatic plasticity mechanism might already be operating.

      Strengths:<br /> The authors bring a comprehensive, state-of-the-art methodology to bear, including rabies tracing, in vivo two-photon calcium imaging, in vitro electrophysiology, optogenetics and chemogenetics, and deep neural networks. Their analyses and statistical tests are sound and for the most part, support their claims. Their results are in line with previous findings and extend them to the primary visual cortex.

      Weaknesses:<br /> - Some of the results (e.g. arousal-related responses) are not entirely surprising given that similar results exist in other cortical areas.

      - Control analyses regarding locomotion patterns before and after learning the task (Figure 5), and additional control experiments to identify whether functional and anatomical changes following task learning were due to learning, repeated visual exposure, exposure to reward, or visuomotor experience would strengthen the claims made.

      - The strength of the results of the chemogenetics experiment is impacted by the lower baseline activity of ChCs that express the KORD receptor. At present, it is not possible to exclude the presence of homeostatic plasticity in the network *before* the inactivation takes place.

    2. Reviewer #1 (Public Review):

      Overall, the experiments are well-designed and the results of the study are exciting. We have one major concern, as well as a few minor comments that are detailed in the following.

      Major:<br /> 1. The authors suggest that "Visuomotor experience induces functional and structural plasticity of chandelier cells". One puzzling thing here, however, is that mice constantly experience visuomotor coupling throughout life which is not different from experience in the virtual tunnel. Why do the authors think that the coupled experience in the VR induces stronger experience-dependent changes than the coupled experience in the home cage? Could this be a time-dependent effect (e.g. arousal levels could systematically decrease with the number of head-fixed VR sessions)? The control experiment here would be to have a group of mice that experience similar visual flow without coupling between movement and visual flow feedback. Either change would be experience-dependent of course, but having the "visuomotor experience dependent" in the title might be a bit strong given the lack of control for that. We would suggest changing the pitch of the manuscript to one of the conclusions the authors can make cleanly (e.g. Figure 4).

      Minor:<br /> 2. "ChCs shape the communication hierarchy of cortical networks providing visual and contextual information." We are not sure what this means.

      3. "respond to locomotion and visuomotor mismatch, indicating arousal-related activity" This is not clear. We think we understand what the authors mean but would suggest rephrasing.

      4. "based on morphological properties revealed that 87% (287/329) of labeled neurons were ChCs" Please specify the morphological properties used for the classification somewhere in the methods.

      5. We may have missed this - in the patch clamp experiment (Fig.1 H-K), please add information about how many mice/slices these experiments were performed in.

      6. "These findings suggest that the rabies-labeled L1-4 neurons providing monosynaptic input to ChCs are predominantly inhibitory neurons". We are not sure this conclusion is warranted given the sparse set of neurons labelled and the low number of cells recorded in the paired patch experiment. We would suggest properly testing (e.g. stain for GABA on the rabies data) or rephrasing.

      7. Figure 2E. A direct comparison of dF/F across different cell types can be subject to a problematic interpretation. The transfer function from spikes to calcium can be different from cell type to cell type. Additionally, the two cell populations have been marked with different constructs (despite the fact that it's the same GECI) further reducing the reliability of dF/F comparisons. We would recommend using a different representation here that does not rely on a direct comparison of dF/F responses (e.g. like the "response strength" used in Figure 3B). Assuming calcium dynamics are different in ChCs and PyCs - this similarity in calcium response is likely a coincidence.

      8. If ChCs are more strongly driven by locomotion and arousal, then it's a bit counterintuitive that at the beginning of the visual corridor when locomotion speed consistently increases, the activity of ChCs consistently decreases. This does not appear to be driven by suppression by visual stimuli as it is present also in the first and last 20cm of the tunnel where there are no visual stimuli. How do the authors explain this?

      9. The authors mention that "ChC responses underwent sensory-evoked plasticity during the repeated visual exposure, even though the visual stimuli were different from those encountered during training in the virtual tunnel". How would this work? And would this mean all visual responses are reduced? What is special about the visual experience in the virtual tunnel? It does not inherently differ from visual experience in the home cage, given that the test stimuli (full field gratings) are different from both.

      10. Just as a point to consider for future experiments: For the open-loop control experiments, the visual flow is constant (20cm/s) - ideally, this would be a replay of the running speed the mouse previously generated to match statistics.

      11. We would recommend specifying the parameters used for neuropil correction in the methods section.

      12. If we understand correctly, the F0 used for the dF/F calculation is different from that used for division. Why is this?

      13. Authors compare neuronal responses using "baseline-corrected average". Please specify the parameters of the baseline correction (i.e. what is used as baseline here).

    3. Reviewer #2 (Public Review):

      Summary:<br /> Seignette et al. investigated the potential roles of axo-axonic (chandelier) cells (ChCs) in a sensory system, namely visual processing. As introduced by the authors, the function of the axo-axonic cell type has remained (and still is) somewhat mysterious. Seignette and colleagues leveraged the development of a transgenic mouse line selective for ChCs, and applied a very wide range of techniques (transsynaptic rabies tracing, optogenetic input activation, in vitro electrophysiology, 2-photon recording in vivo, behavior and chemogenetic manipulations) to precisely determine the contribution of ChCs to the primary visual cortex network.

      The main findings are 1) the identification of synaptic inputs to ChCs, with a majority of local, deep layer principal neurons (PN), 2) the demonstration that ChCs are strongly and synchronously activated by visual stimuli with low specificity in naive animals, 3) the recruitment of ChCs by arousal/visuomotor mismatch, 4) the induction of functional and structural plasticity at the ChC-PN module, and 5) the weak disinhibition of PNs induced by ChC silencing. All these findings are strongly supported by experimental data and thoroughly compared to available evidence.

      Strengths:<br /> This article reports an impressive range of very demanding experiments, which were well executed and analyzed, and are presented in a very clear and balanced manner. Moreover, the manuscript is well-written throughout, making it appealing to future readers. It has also been a pleasure to review this article.

      In sum, this is an impressive study and an excellent manuscript, that presents no major flaws.

      Notably, this study is one of the first studies to report on the activities and potential roles of axo-axonic cells in an active, integrated brain process, beyond locomotion as reported and published in V1. This type of research was much awaited in the fields of interneuron and vision research.

      Weaknesses:<br /> There are no fundamental weaknesses; the remaining points mainly concern the presentation of the main results.

      The main weakness may be that the different sections appear somewhat disconnected conceptually.

      Additionally, some parts deserve a more in-depth clarification/simplification of concepts and analytic methods for scientists outside the subfield of V1 research. Indeed, this paper will be of key interest to researchers of various backgrounds.

    1. Reviewer #2 (Public Review):

      In the revised manuscript, the authors aim to investigate brain-wide activation patterns following administration of the anesthetics ketamine and isoflurane, and conduct comparative analysis of these patterns to understand shared and distinct mechanisms of these two anesthetics. To this end, they perform Fos immunohistochemistry in perfused brain sections to label active nuclei, use a custom pipeline to register images to the ABA framework and quantify Fos+ nuclei, and perform multiple complementary analyses to compare activation patterns across groups. This is an interesting line of research and a tour de force in brain-wide Fos quantification.

      I appreciate many of the changes that were made in the revised manuscript, including FDR correction and transparency in showing their results with and without transformation. However, several key issues described in our first review have not been addressed.

      1-Aside from issues with their data transformation (see below), (a) I think they have some interesting Fos counts data in Figures 4B and 5B that indicate shared and distinct activation patterns after KET vs. ISO based anesthesia. These data are far closer to the raw data than PC analyses and need to be described and analyzed in the first figures long before figures with the more abstracted PC analyses. In other words, you need to show the concrete raw data before describing the highly transformed and abstracted PC analyses. (b) This gets to the main point that when selecting brain areas for follow up analyses, these should be chosen based on the concrete Fos counts data, not the highly transformed and abstracted PC analyses.

      2-Now, the choice of data transformation for Fos counts is the most significant problem. First, the authors show in the response letter that not using this transformation (region density/brain density) leads to no clustering. However, they also showed the region-densities without transformation (which we appreciate) and it looks like overall Fos levels in the control group Home (ISO) are an order of magnitude (~10-fold) higher than those in the control group Saline (KET) across all regions shown. This large difference seems unlikely to be due to a biologically driven effect and seems more likely to be due to a technical issue, such as differences in staining or imaging between experiments. Was the Homecage-ISO experiment or at least the Fos labeling and imaging performed at the same time as for the Saline-Ketamine experiment? Please state the answer to this question in the Results section one way or the other.

      3-Second, they need to deal with this large difference in overall staining or imaging for these two (Home/ISO and Saline/KET) experiments more directly; their current normalization choice does not really account for the large overall differences in mean values and variability in Fos counts (e.g. due to labeling and imaging differences).

      3a-I think one option (not perfect but I think better than the current normalization choice) could be z-scoring each treatment to its respective control. They can analyze these z-scored data first, and then in later figures show PC analyses of these data and assess whether the two treatments separate on PC1/2. And if they don't separate, then they don't separate, and you have to go with these results.

      3b-Alternatively, they need to figure out the overall intensity distributions from the different runs (if that is the main reason for the markedly different counts) and adjust their thresholds for Fos-positive cell detection based on this. I would expect that the saline and HC groups should have similar levels of activation, so they could use these as the 'control' group to determine a Fos-positive intensity threshold that gets applied to the corresponding 'treatment' group.

      3c- If neither 3a nor 3b is an option then they need to show the outcomes of their analysis when using the untransformed data in the main figures (the untransformed data plots in their responses to reviewer are currently not in the main or supplementary figs) and discuss these as well.
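
      As an illustration of option 3a above, the following is a minimal sketch of z-scoring each treatment group against its own control group, region by region. The function name `zscore_to_control`, the array layout (animals in rows, regions in columns), and all numbers are hypothetical and are not taken from the manuscript; the sketch only shows that a uniform batch-level scale factor (such as the ~10-fold difference between the two control groups) cancels out under this normalization.

```python
import numpy as np

def zscore_to_control(treated, control):
    """Z-score treated animals against their own control group, per region.

    treated, control: arrays of shape (n_animals, n_regions) holding
    Fos densities. Both layouts are assumptions made for this sketch.
    """
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    mu = control.mean(axis=0)          # per-region control mean
    sd = control.std(axis=0, ddof=1)   # per-region control sample SD
    return (treated - mu) / sd

# Hypothetical densities for two regions (columns).
saline = np.array([[10.0, 20.0], [14.0, 24.0], [12.0, 22.0]])
ket = np.array([[22.0, 22.0], [32.0, 26.0]])

# Simulate a second experiment whose staining/imaging run yields
# 10-fold higher counts overall, in control and treated alike.
home = saline * 10
iso = ket * 10

z_ket = zscore_to_control(ket, saline)
z_iso = zscore_to_control(iso, home)
```

      In this toy case `z_ket` and `z_iso` are identical, because the multiplicative run effect scales the control mean and SD by the same factor. A real analysis would still need to handle regions where the control SD is zero or estimated from very few animals.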

    2. Reviewer #1 (Public Review):

      Overall, the manuscript has been improved by addressing some of the concerns; however, I am still very confused about the data analysis because of the data transformation (relative %fos), the fact that some graphs only show regions that are significant, and the interpretation of the PCA, which I find inappropriate. Moreover, many answers in the rebuttal did not make it into the final manuscript, and the limitations raised by the reviewers are not discussed either.

      1a. The addition of the EEG/EMG is useful; however, this information is not discussed. For instance, there are differences in EEG/EMG between the two groups (only Ket significantly increased delta/theta power, and only ISO decreased EMG power). These results should be discussed, as well as the limitation of not having physiological measures of anesthesia to control for the anesthesia depth.<br /> 1b. The possibility that the differences in fos observed may be due to the doses used should be discussed.<br /> 1c. The possibility that the differences in fos observed may be due to the kinetics of the anesthetics used should be discussed.

      2b. I am confused because Fig 2C seems to show a significant decrease in %fos in the hypothalamus, midbrain and cerebellum after KET, while the authors responded that "in our analysis, we did not detect regions with significant downregulation when comparing anesthetized mice with controls." Moreover, the new figure in the rebuttal in response to reviewer 2 suggests that Ket increases Fos in almost every single region (green vs blue), which is not the conclusion of the paper.

      3. There are still critical misinterpretations of the PCA analysis. For instance, it is mentioned that "KET is associated with the activation of cortical regions (as evidenced by positive PC1 coefficients in MOB, AON, MO, ACA, and ORB) and the inhibition of subcortical areas (indicated by negative coefficients)" as well as "KET displays cortical activation and subcortical inhibition, whereas ISO shows a contrasting preference, activating the cerebral nucleus (CNU) and the hypothalamus while inhibiting cortical areas. To reduce inter-individual variability." These interpretations are in complete contradiction with the answer 2b above that there was no region that had decreased Fos by either anesthetic.

      4. I still do not understand the rationale for the use of that metric. The use of a % of total Fos makes the data for each region dependent on the data of the other regions, which wrongly leads to the conclusion that some regions are inhibited when the raw data show that they are not. Moreover, the interdependence of the variables (relative density) may affect the covariance structure on which the PCA relies. Why not use the PCA on the logarithm of the raw data, or on a relative density compared to the control group on a region-per-region basis instead of the whole brain?

      Fig. 2B: it's unclear to me why the regions are connected by a line. Such representation is normally used for time series/within-subject series. What is the rationale for the order of the regions and the use of the line? The line connecting randomly organized regions is meaningless and confusing.

      Fig 6A. The correlation matrices are difficult to interpret because of the low resolution and arbitrary order of brain regions. I recommend using hierarchical clustering and/or a combination of hierarchical clustering and anatomical organization (e.g. PMID: 31937658). While it is difficult to add the name of the regions on the graph, I recommend providing supplementary figures with large high-resolution figures with the name of each brain region so the reader can actually identify the correlation between specific brain regions and the whole brain.

      Rationale for Metric Choice: Note that I do not dispute the choice of the log which is appropriate, it is the choice of using the relative density that I am questioning.

      5. I am still having difficulties understanding Fig. 3.<br /> Panel A: The lack of identification for the dots in panel A makes it impossible to understand which regions are relevant.<br /> Panel B: what is the metric that the up/down arrow summarizes? Fos density? Relative density? PC1/2?<br /> Panel C: it's unclear to me why the regions are connected by a line. Such representation is normally used for time series/within-subject series. What is the rationale for the order of the regions?
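
      The concern raised in point 4 can be illustrated with a small numerical sketch. It assumes a simplified form of the metric (each region's density divided by the sum over regions) and uses made-up densities for three regions; none of these values come from the manuscript. Only one region changes in the raw data, yet the other two appear to decrease once the data are expressed as a fraction of the total.

```python
import numpy as np

# Hypothetical raw Fos densities for three regions.
raw_control = np.array([10.0, 10.0, 10.0])
raw_treated = np.array([40.0, 10.0, 10.0])  # only region 0 increases

# Simplified "relative density": each region as a fraction of the total.
rel_control = raw_control / raw_control.sum()
rel_treated = raw_treated / raw_treated.sum()

# Change in relative density: regions 1 and 2 look "inhibited"
# although their raw densities are unchanged.
rel_change = rel_treated - rel_control

# Region-by-region log2 fold change on the raw data, relative to control.
log2_fc = np.log2(raw_treated / raw_control)
```

      Here `rel_change` is negative for regions 1 and 2, whereas `log2_fc` is zero for both, which is the kind of apparent inhibition that a per-region comparison against the control group would avoid.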

    3. Reviewer #3 (Public Review):

      The present study presents a comprehensive exploration of the distinct impacts of Isoflurane and Ketamine on c-Fos expression throughout the brain. To understand the varying responses across individual brain regions to each anesthetic, the researchers employ principal component analysis (PCA) and c-Fos-based functional network analysis. The methodology employed in this research is both methodical and expansive. Notably, the utilization of a custom software package to align and analyze brain images for c-Fos positive cells stands out as an impressive addition to their approach. This innovative technique enables effective quantification of neural activity and enhances our understanding of how anesthetic drugs influence brain networks as a whole.

      The primary novelty of this paper lies in the comparative analysis of two anesthetics, Ketamine and Isoflurane, and their respective impacts on brain-wide c-Fos expression. The study reveals the distinct pathways through which these anesthetics induce loss of consciousness. Ketamine primarily influences the cerebral cortex, while Isoflurane targets subcortical brain regions. This finding highlights the differing mechanisms of action employed by these two anesthetics: a top-down approach for Ketamine and a bottom-up mechanism for Isoflurane. Furthermore, this study uncovers commonly activated brain regions under both anesthetics, advancing our knowledge about the mechanisms underlying general anesthesia.

    1. Reviewer #3 (Public Review):

      Summary:<br /> The study conducted by Ouasti et al. is an elegant investigation of fission yeast CAF-1, employing a diverse array of technologies to dissect its functions and their interdependence. These functions play a critical role in specifying interactions vital for DNA replication, heterochromatin maintenance, and DNA damage repair, and their dynamics involve multiple interactions. The authors have extensively utilized various in vitro and in vivo tools to validate their model and emphasize the dynamic nature of this complex.

      Strengths:<br /> Their work is supported by robust experimental data from multiple techniques, including NMR and SAXS, which validate their molecular model. They conducted in vitro interactions using EMSA and isothermal microcalorimetry, in vitro histone deposition using Xenopus high-speed egg extract, and systematically generated and tested various genetic mutants for functionality in in vivo assays. They successfully delineated domain-specific functions using in vitro assays and could validate their roles to a large extent using genetic mutants. One significant revelation from this study is the unfolded nature of the acidic domain, observed to fold when binding to histones. Additionally, the authors also elucidated the role of the long KER helix in mediating DNA binding and enhancing the association of CAF-1 with PCNA. The paper effectively addresses its primary objective and is strong.

      Weaknesses:<br /> A few relatively minor unresolved aspects persist, which, if clarified or experimentally addressed by the authors, could further bolster the study.

      1. The precise function of the WHD domain remains elusive. Its deletion does not result in DNA damage accumulation or defects in heterochromatin maintenance. This raises questions about the biological significance of this domain and whether it is dispensable. While in vitro assays revealed defects in chromatin assembly using this mutant (Figure 5), confirming these phenotypes through in vivo assays would provide additional assurance that the lack of function is not simply due to the in vitro system lacking PTMs or other regulatory factors.

      2. The observation of increased Pcf2-gfp foci in pcf1-ED* cells, particularly in mono-nucleated (G2-phase) and bi-nucleated cells with septum marks (S-phase), might suggest the presence of replication stress. This could imply incomplete replication in specific regions, leading to the persistence of Caf1-ED*-PCNA factories throughout the cell cycle. To further confirm this, detecting accumulated single-stranded DNA (ssDNA) regions outside of S-phase using RPA as an ssDNA marker could be informative.

      3. Moreover, considering the authors' strong assertion of histone binding defects in ED* through in vitro assays (Figure 2d and S2a), these claims could be further substantiated, especially considering that some degree of histone deposition might still persist in vivo in the ED* mutant (Figure 7d, viable though growth defective double ED*+hip1D mutants). For example, an approach akin to the one employed in Fig. 6a (FLAG-IPs of various Pcf1-FLAG-tagged mutants) could also enable a comparison of the association of different mutants with histones and PCNA, providing a more thorough validation of their findings.

      4. It would be valuable for the authors to speculate on the necessity of having disordered regions in CAF1. Specifically, exploring the overall distribution of these domains within disordered/unfolded structures could provide insightful perspectives. Additionally, it's intriguing to note that the significant disparities observed among mutants (ED*, PIP*, and KER*) in in vitro assays seem to become more generic in vivo, except for the indispensability of the WHD-domain. Could these disordered regions potentially play a crucial role in the phase separation of replication factories? Considering these questions could offer valuable insights into the underlying mechanisms at play.

    2. Reviewer #1 (Public Review):

      Summary:<br /> This paper makes important contributions to the structural analysis of the DNA replication-linked nucleosome assembly machine termed Chromatin Assembly Factor-1 (CAF-1). The authors focus on the interplay of domains that bind DNA, histones, and replication clamp protein PCNA.

      Strengths:<br /> The authors analyze soluble complexes containing full-length versions of all three fission yeast CAF-1 subunits, an important accomplishment given that many previous structural and biophysical studies have focused on truncated complexes. New data here supports previous experiments indicating that the KER domain is a long alpha helix that binds DNA. Via NMR, the authors discover structural changes at the histone binding site, defined here with high resolution. Most strikingly, the experiments here show that for the S. pombe CAF-1 complex, the WHD domain at the C-terminus of the large subunit lacks DNA binding activity observed in the human and budding yeast homologs, indicating a surprising divergence in the evolution of this complex. Together, these are important contributions to the understanding of how the CAF-1 complex works.

      Weaknesses:<br /> 1. There are some aspects of the experimentation that are incompletely described:

      In the SEC data (Fig. S1C) it appears that Pcf1 in the absence of other proteins forms three major peaks. Two are labeled as "1a" (eluting at ~8 mL) and "1b" (~10-11 mL). It appears that Pcf1 alone or in complex with either or both of the other two subunits forms two different high molecular weight complexes (e.g. 4a/4b, 5a/5b, 6a/6b). There is also a third peak in the analysis of Pcf1 alone, which isn't named here, eluting at ~14 mL, overlapping the peaks labeled 2a, 4c, and 5c.

      The text describing these different macromolecular complexes seems incomplete (p. 3, lines 32-33): "When isolated, both Pcf2 and Pcf3 are monomeric while Pcf1 forms large soluble oligomers". Which of the three Pcf1-alone peaks are oligomers, and how do we know? What is the third peak? The gel analysis across these chromatograms should be shown.

      More importantly, was a particular SEC peak of the three-subunit CAF-1 complex (i.e. 4a or 4b) characterized in the further experimentation, or were the data obtained from the input material prior to the separation of the different peaks? If the latter, how might this have affected the results? Do the forms inter-convert spontaneously?

      2. Given the strong structural prediction about the roles of residues L359 and F380 (Fig. 2f), these should be mutated to determine effects on histone binding.

      3. Could it be that the apparent lack of histone deposition by the delta-WHD mutant complex occurs because this mutant complex is unstable when added to the Xenopus extract?

    3. Reviewer #2 (Public Review):

      Summary:<br /> The authors describe the structure-function relationship of domains in S. pombe CAF-1, which promotes DNA replication-coupled deposition of histone H3-H4 dimers. The authors nicely show that the intrinsically disordered ED domain binds to histone H3-H4, that the KER domain binds to DNA, and that, in addition to the PIP box, the KER domain also contributes to PCNA binding. The ED and KER domains as well as the WHD domain are essential for nucleosome assembly in vitro. The ED and KER domains and the PIP box are important for the maintenance of heterochromatin.

      Strengths:<br /> The combination of structural analysis using NMR and AlphaFold2 modeling with biophysical and biochemical analysis provides strong evidence for the roles of the different domains of spPCF-1, the large subunit of SpCAF-1, in binding to histone H3-H4, DNA, and PCNA. The conclusion is further supported by genetic analysis of the various pcf1 mutants. The large amount of data provided in the paper supports the authors' conclusions very well.

    1. Reviewer #3 (Public Review):

      Light energy drives photosynthesis. However, excessive light can damage (i.e., photo-damage) and thus inactivate the photosynthetic process. A major target site of photo-damage is photosystem II (PSII). In particular, one component of PSII, the reaction center protein D1, is very susceptible to photo-damage; however, this protein is maintained efficiently by an elaborate multi-step PSII-D1 turnover/repair cycle. Two proteases, FtsH and Deg, are known to contribute to this process by degrading photo-damaged D1 protein processively and endoproteolytically, respectively. In this manuscript, Kato et al. propose an additional, early step in the D1 degradation/repair pathway. They propose that tryptophan oxidation at the N-terminus of D1 may be one of the key oxidations in PSII repair, leading to processive degradation of D1 by FtsH. Both their data and their arguments are very compelling.

      The D1 protein repair/degradation pathway in its simplest form can be defined essentially by five steps: (1) migration of the damaged PSII core complex to the stroma thylakoid, (2) partial disassembly of the PSII core monomer, (3) access of the protease that degrades damaged D1, (4) concomitant D1 synthesis, and (5) reassembly of PSII into the grana thylakoid. An enormous amount of work has already been done to define and characterize these various steps. Kato et al., in this manuscript, propose a very early yet novel critical step in D1 protein turnover in which tryptophan (Trp) oxidation in PSII core proteins influences D1 degradation mediated by FtsH.

      Using a variety of approaches, such as mass spectrometry (Table 1), site-directed mutagenesis (Figures 2-4), D1 degradation assays (Figures 3 and 4), and simulation modeling (Figure 5), Kato et al. provide both strong evidence and reasonable arguments that an N-terminal Trp oxidation is likely to be a 'key' oxidative post-translational modification (OPTM) that is involved in triggering D1 degradation and thus activating the PSII repair pathway. Consequently, from their accumulated data, the authors propose a scenario in which the unraveling of the N-terminus of the D1 protein, facilitated by Trp oxidation, plays a critical 'recognition' role in alerting the plant that the D1 protein is photo-damaged and thus kick-starting the processive degradation pathway, possibly initiated by FtsH. Coincidentally, Forsman and Eaton-Rye (Biochemistry 2021, 60, 1, 53-63), working with the thermophilic cyanobacterium Thermosynechococcus vulcanus, showed that when the N-terminal DE-loop of the D1 protein is photo-damaged, a disruption of the interaction between the PsbT subunit and D1 occurs, which may serve as a signal for PSII to undergo repair following photodamage. While the activation of the processive degradation pathways in Chlamydomonas and Thermosynechococcus vulcanus has significant mechanistic differences, it is interesting to note and speculate that the stability of the N-terminus of their respective D1 proteins seems to play a critical role in 'signaling' the PSII repair system to be activated and initiate repair. But it's complicated. For instance, significant Trp oxidation also occurs on the lumen side of other PSII subunits, which may also play a significant role in activating the repair processes. Indeed, Kato et al. (Photosynthesis Research volume 126, pages 409-416 (2015)) proposed a two-step model whereby the primary event is disruption of a Mn-cluster in PSII on the lumen side.
A secondary event is damage to D1 caused by energy that is absorbed by chlorophyll. But models adapt, change, and get updated. And the data provided by Kato et al. in this manuscript give us a unique glimpse/snapshot into the importance of the stability of the N-terminus during photo-damage and its role in D1 turnover. For instance, the authors' use of site-directed mutagenesis of Trp residues undergoing OPTM in the D1 protein, coupled with their D1 degradation assays (Figures 3 and 4), provides evidence that Trp oxidation (in particular the oxidation of Trp14) in coordination with FtsH results in the degradation of D1 protein. Indeed, their D1 degradation assays coupled with the use of an ftsh mutant provide further significant support that Trp14 oxidation and FtsH activity are strongly linked. But for FtsH to degrade D1 protein, it needs to gain access to photo-damaged D1. FtsH access to D1 is achieved by having CP43 partially dissociate from the PSII complex. Hence, the authors also addressed the possibility that Trp oxidation may play a role in CP43 disassembly from the PSII complex, thereby giving FtsH access to D1. Using a site-directed mutagenesis approach, they showed that Trp oxidation in CP43 appeared to have little impact on PSII repair (Supplemental Figure S6). This result suggests that D1-Trp14 oxidation plays a role in D1 turnover that occurs after CP43 disassembly from the PSII complex. Alternatively, the authors cannot exclude the possibility that D1-Trp14 oxidation in some way facilitates CP43 dissociation. Further investigation is needed on this point. However, D1-Trp14 oxidation is causing an internal disruption of the D1 protein, possibly at the N-terminus of the protein. Consequently, the role of Trp14 oxidation in disrupting the stability of the N-terminal domain of the D1 protein was analyzed computationally.
Using a molecular dynamics approach (Figure 5), the authors attempted to create a mechanistic model to explain why, when D1 protein Trp14 undergoes oxidation, the N-terminal domain of the D1 protein becomes unraveled. Specifically, the authors propose that the interaction between D1 protein Trp14 and PsbI Ser25 becomes disrupted upon oxidation of Trp14. Consequently, the authors concluded from their molecular dynamics simulation analysis that "the increased fluctuation of the first α-helix of D1 would give a chance to recognize the photo-damaged D1 by FtsH protease". Hence, the authors' experimental and computational approaches employed here develop a compelling early-stage repair model that integrates 1) Trp14 oxidation, 2) FtsH activation, and 3) D1 turnover being initiated at its N-terminal domain. However, a word of caution should be emphasized here. This model is just a snapshot of the very early stages of the D1 protein turnover process. The data presented here give us just a small glimpse into the unique relationship between Trp oxidation of the D1 protein and the significant N-terminal structural changes it may trigger, which both signal and provide an opportunity for FtsH to begin protease digestion of the D1 protein. However, the authors go to great lengths in their discussion section not to overstate the role of Trp14 oxidation alone in the complicated process of D1 turnover. The authors certainly recognize that there are a lot of moving parts involved in D1 turnover. And while Trp14 oxidation is the major focus of this paper, the authors show in Supplemental Fig S4 the structural positions of various additional oxidized Trp residues in the Thermosynechococcus vulcanus PSII core proteins. Indeed, this figure shows that the majority of oxidized Trps are located on the luminal side of the PSII complex, clustered around the oxygen-evolving complex.
So, while oxidized Trp14 may be involved in the early stages of D1 turnover certainly oxidized Trps on the lumen side are also more than likely playing a role in D1 turnover as well. To untangle this complex process will require additional research.

      Nevertheless, identifying and characterizing the role of oxidative modification of tryptophan (Trp) residues, in particular, Trp14, in the PSII core provides another critical step in an already intricate multi-step process of D1 protein turnover during photo-damage.

    2. Reviewer #1 (Public Review):

      This manuscript tried to answer a long-standing question in an important research topic. I read it with great interest. The quality of the science is high, and the text is clearly written. The conclusion is exciting. However, I feel that the phenotype of the transgenic line may be explained by an alternative idea. At least, the results should be more carefully discussed.

      Specific comments:

      1) Stability or activity (Fv/Fm) was not affected in PSII with the W14F mutation in D1. If W14F really represents the status of PSII with oxidized D1, what is the reason for the degradation of almost normal D1?

      2) To focus on the PSII in which W14 is oxidized, this research depends on the W14F mutant lines. It is critical how exactly the W-to-F substitution mimics the oxidized W. The authors tried to show it in Figure 5. Because of the technical difficulty, it may be unfair to request more evidence. But the paper would be more convincing with the results directly monitoring the oxidized D1 to be recognized by FtsH.

      3) Figure 3. If the F14 mimics the oxidized W14 and is sensed by FtsH, I would expect the degradation of D1 even under the growth light. The actual result suggests that W14F mutation partially modifies the structure of D1 under high light and this structural modification of D1 is sensed by FtsH. Namely, high light may induce another event which is recognized by FtsH. The W14F is just an enhancer.

    3. Reviewer #2 (Public Review):

      In their manuscript, Kato et al. investigate a key aspect of membrane protein quality control in plant photosynthesis. They study the turnover of plant photosystem II (PSII), a hetero-oligomeric membrane protein complex that undertakes the crucial light-driven water oxidation reaction in photosynthesis. The formidable water oxidation reaction makes PSII prone to photooxidative damage. The PSII repair cycle is a protein repair pathway that replaces the photodamaged reaction center protein D1 with a new copy. The manuscript addresses an important question in the PSII repair cycle: how is the damaged D1 protein recognized and selectively degraded by the membrane-bound ATP-dependent zinc metalloprotease FtsH in a processive manner? The authors show that oxidative post-translational modification (OPTM) of the D1 N-terminus is likely critical for the proper recognition and degradation of the damaged D1 by FtsH. The authors use a wide range of approaches and techniques to test their hypothesis that the singlet oxygen (1O2)-mediated oxidation of the tryptophan 14 (W14) residue of D1 to N-formylkynurenine (NFK) facilitates the selective degradation of damaged D1. Overall, the authors propose an interesting new hypothesis for D1 degradation, and their hypothesis is supported by most of the experimental data provided. The study certainly addresses an elusive aspect of PSII turnover, and the data provided go some way in explaining the light-induced D1 turnover. However, some of the data are correlative and do not provide mechanistic insight. A rigorous demonstration of OPTM as a marker for D1 degradation is yet to be made, in my opinion. Some strengths and weaknesses of the study are summarized below:

      Strengths:

      1. In support of their hypothesis, the authors find that FtsH mutants of Arabidopsis have increased OPTM, especially the formation of NFK at multiple Trp residues of D1 including the W14; a site-directed mutation of W14 to phenylalanine (W14F), mimicking NFK, results in accelerated D1 degradation in Chlamydomonas; accelerated D1 degradation of W14F mutant is mitigated in an ftsH1 mutant background of Chlamydomonas; and that the W14F mutation augmented the interaction between FtsH and the D1 substrate.

      2. The authors raise the intriguing possibility that the OPTM disrupts the hydrogen bonding between the W14 residue of D1 and the serine 25 (S25) of PsbI. According to the authors, this leads to an increased fluctuation of the D1 N-terminal tail and, as a consequence, recognition and binding of the photodamaged D1 by the protease. This is an interesting hypothesis, and the authors provide some molecular dynamics simulation data in support of it. If this hypothesis is further supported, it represents a significant advancement.

      3. The interdisciplinary experimental approach is certainly a strength of the study. The authors have successfully combined mass spectrometric analysis with several biochemical assays and molecular dynamics simulation. These, together with the generation of transplastomic algal cell lines, have enabled a clear test of the role of Trp oxidation in selective D1 degradation.

      4. Trp oxidative modification as a degradation signal has precedent in chloroplasts. The authors cite the case of 1O2 sensor protein EXECUTER 1 (EX1), whose degradation by FtsH2, the same protease that degrades D1, requires prior oxidation of a Trp residue. The earlier observation of an attenuated degradation of a truncated D1 protein lacking the N-terminal tail is also consistent with authors' suggestion of the importance of the D1 N-terminus recognition by FtsH. It is also noteworthy that in light of the current study, D1 phosphorylation is unlikely to be a marker for degradation as posited by earlier studies.

      Weaknesses:

      1. The study lacks some data that would have made the conclusions more rigorous and convincing. It is unclear why the level of Trp oxidation was not analyzed in the Chlamydomonas ftsH 1-1 mutant as done for the var 2 mutant. Increased oxidation of W14 OPTM in Chlamydomonas ftsH 1-1 is a key prediction of the hypothesis. It is also unclear to me what the rationale is for showing D1-FtsH interaction data only for the double mutant but not for the single mutant (W14F). Why is the FtsH pulldown of D2 not statistically significant (p value ≤ 0.1)? Wouldn't one expect FtsH to pull down the RC47 complex containing D1, D2, and RC47? Probing the RC47 level would have been useful in settling this. A key proposition of the authors is that the hydrogen bonding between D1 W14 and S25 of PsbI is disrupted by the oxidative modification of W14. Can this hypothesis be further tested by replacing the S25 of PsbI with Ala, for example?

      2. Although most of the work described is in vivo analysis, which is desirable, some in vitro degradation assays would have strengthened the conclusions. An in vitro degradation assay using recombinant FtsH and a synthetic peptide encompassing the D1 N-terminus, with and without OPTM, would test the enhanced D1 degradation that the authors predict. This would also help to discern whether CP43 detachment alone is sufficient for D1 degradation, as suggested for cyanobacteria.

      3. The rationale for analyzing a single oxidative modification (W14) as a D1 degradation signal is unclear. The D1 N-terminus is modified at multiple sites. Please see Mckenzie and Puthiyaveetil, bioRxiv May 04 2023. Also, why is modification by only 1O2 considered while superoxide and hydroxyl radicals can equally damage D1?

      4. The D1 degradation assay seems not repeatable for the W14F mutant. High light minus CAM results in Fig. 3 shows a statistically significant decrease in D1 levels for W14F at multiple time points but the same assay in Fig. 4a does not produce a statistically significant decrease at 90 min of incubation. Why is this? Accelerated D1 degradation in the Phe mutant under high light is key evidence that the authors cite in support of their hypothesis.

      5. The description of results is at times not nuanced enough. For example, lines 116-117 state "The oxidation levels in Trp-14 and Trp-314 increased 1.8-fold and 1.4-fold in var2 compared to the wild type, respectively (Fig. 1c)", while an inspection of the figure reveals that modification at W314 is significant only for NFK and not for KYN and OIA. Likewise, the authors write that the CP43 mutant W353F has no growth phenotype under high light, but Figure S6 reveals otherwise. The slow growth of this mutant is in line with the earlier observation made by Anderson et al., 2002. In lines 162-163, the authors talk about unchanged electron transport in some site-directed mutants and cite Fig. 2c, but this figure only shows a chl fluorescence trace and nothing else.

      6. The authors rightly discuss an alternate hypothesis that the simple disassembly of the monomeric core into RC47 and CP43 alone may be sufficient for selective D1 degradation as in cyanobacteria. This hypothesis cannot yet be ruled out completely given the lack of some in vitro degradation data as mentioned in point 2. Oxidative protein modification indeed drives the disassembly of the monomeric core (Mckenzie and Puthiyaveetil, bioRxiv May 04 2023).

    1. Reviewer #2 (Public Review):

      Summary:<br /> This work tests the hypothesis that water coordination in WNK kinases is linked to allosteric control of activity. It is proposed that dimeric WNK is inactive and bound to some conserved water molecules, and that monomerization/activation involves departure of these waters. New data here include a crystal structure of monomeric WNK1 which shows missing waters compared to the dimeric structure, in support of the hypothesis. Mutant proteins of a different isozyme (WNK3) designed to disrupt water coordination were produced, and activity and quaternary structure were measured. The results with WNK3 do not clearly support or refute the hypothesis as there is no systematic correlation between mutations designed to disrupt water coordination and activity or quaternary structure.

      Strengths:<br /> The most interesting result presented here is that P1 crystals of WNK1 convert to P21 in the presence of PEG400 and still diffract (rather than being destroyed as the crystal contacts change, as one would expect). All of the assays for activity and osmolyte sensing are carried out well.

      Weaknesses:<br /> The rationale for using WNK3 for the mutagenesis study is that it is more sensitive to osmotic pressure than WNK1. I think that WNK1 would have been a better platform because of the direct correlation to the structural work leading to the hypothesis being tested. All of the crystallographic work is WNK1; it is not logical to jump to WNK3 without other practical considerations.

      Osmolyte sensing was tested by measuring ATP consumption as a function of PEG400 (Figure 6). Data for the subset of mutants analyzed by this assay showed increasing activity. It is not clear why the same collection of mutant proteins analyzed in the experiments of Figure 5 was not also measured for osmolyte sensing in Figure 6.

      The last set of data presented uses light scattering to test whether the WNK3 mutant proteins exhibit quaternary structural changes consistent with the monomer/dimer hypothesis. If they did, one would expect a higher degree of monomer for those that are activated by mutation, and a lower amount of monomer (like wt) for those that are not. Instead, one of the mutant proteins that showed the most chloride inhibition (Y346F) had a quaternary structure similar to the wt protein, and others have similar monomer/dimer mixtures but distinct chloride inhibition profiles (K307A and M301A). I don't see how the light scattering data contribute to this story other than to refute the hypothesis by showing a lack of correlation between quaternary structure, water binding, and activity. This is another reason why the disconnect between WNK1 and WNK3 could be a problem. All of the detailed structural work with WNK1 must be assumed with WNK3; perhaps the light scattering data are contradicting this assumption?

    2. Reviewer #1 (Public Review):

      Summary:<br /> This manuscript addresses the regulation of the osmosensing protein kinases, WNK1 and WNK3. Prior work by the authors has shown that these enzymes are activated by PEG400 or ethylene glycol and inhibited by chloride ion, and that activation is associated with a conformational transition from dimer to monomer. In X-ray structures of the WNK1/SA inactive dimer, a water-mediated hydrogen bond network was observed between the catalytic loop (CL) and the activation loop (AL), named CWN1. This led to the proposal that bound water may be part of the osmosensing mechanism.

      The current study carries this work further by applying PEG400 to crystals of dimeric WNK1/SA. This results in a change in kinase conformation and space group, along with 4-9 fewer waters in CWN1 and the complete disappearance of another water cluster (CWN2) located at the dimer interface. Six conserved residues lining the CWN1 pocket in WNK3 are mutated to determine effects on activity and inhibition by chloride ion (measured by AL autophosphorylation) and monomer-dimer interconversion (light scattering).

      The results show that two mutants (E314Q/A in WNK3) at a site central to the water cluster result in increased kinase activity (autophosphorylation), and increased SLS, interpreted as aggregation. Three sites (D279A, Y346F, M301A) inhibit kinase activity with varying effects on oligomerization - Y346A and M301A retain monomer-dimer ratios similar to WT while D279N promotes aggregation. K236A and K307A show activity and monomer:dimer ratios similar to WT. Selected mutants (E314Q, D279N, Y346F) and WT appear to retain osmosensitivity with comparable activation by PEG400.

      The study concludes that osmolytes may activate the kinase by removing waters from the CWN1 and CWN2 clusters, suggesting that waters might be considered allosteric ligands that promote the inactive structure of WNKs. The differing effects of mutations may be ascribed to disruption of the water networks as well as inhibitory perturbations at the active site.

      Strengths:<br /> This study presents a novel and unique function for bound water, and its potential role to explain osmosensory regulation. The mechanism is innovative and the new structures and mutational data presented by the work will be useful for further investigations of the mechanisms that enable cells to respond to osmotic pressure.

      Weaknesses:<br /> Given that all mutants tested showed the same degree of activation by PEG400, it seemed possible that PEG400 might be an allosteric activator of WNK1/3 through direct binding interactions. Perhaps PEG400 eliminates CWN1/2 waters by inducing conformational changes so that water loss is an effect not a cause of activation. To address this it would be helpful to comment on whether new electron densities appeared in the X-ray structure of WNK1/SA/PEG400 that might reflect PEG400 interactions with chains A or B. It would also be helpful to discuss any experiments that might have been done in previous work to examine the direct binding of glycerol and other osmolytes to WNKs.

      The study would benefit from a deeper discussion about how to reconcile the different effects of mutations. For example, wouldn't most or all of the mutations be expected to disrupt the water network, and relieve the proposed autoinhibition? This seemed especially true for some of the residues, like Y420(Y346), D353(D279), and K310(K236), which based on Fig 3 appeared to interact with waters that were removed by PEG400.

      Alternatively, perhaps the waters in CWN2 are more important for maintaining the autoinhibited structure. This possibility would be useful to discuss, and perhaps comment on what may be known about the energetic contributions of bound water towards stabilizing dimers.

      It would also be useful to comment on why aggregation of E319Q/A shouldn't inhibit kinase activity instead of activating it.

      The X-ray work was done entirely with WNK1 while the mutational work was done entirely with WNK3. Therefore, a simple explanation for the disconnect between structure and mutations might be that WNK1 and WNK3 differ enough that predictions from the structure of one are not applicable to mutations of the other. It would be helpful to describe past work comparing the structure and regulation of WNK1 and WNK3 that support the assumption of their interchangeability.

    1. Reviewer #2 (Public Review):

      Summary:

      The manuscript's main claim is that the absence of H2-O, a component of the MHC II presentation pathway, promotes regulatory T cell development and function.

      Unfortunately, the submitted material is not sufficient for proper evaluation of the manuscript, both in terms of the significance of the findings and the strength of the supporting evidence.

      Major issues include:

      - the scRNAseq (shown in Fig. 5) is too rudimentary to allow any conclusion. Statements in the text (eg "Principle Component Analysis (PCA) of the normalized scRNA-seq data identified 11 distinct CD4 T cell clusters", line 166) suggest that additional expertise should be leveraged for these analyses.

      - Most flow cytometry data (Figs. 1 and 2) shows marginal (at best) differences on y-axis truncated bar graphs, with no original data plot, gating strategies, etc., severely challenging conclusions drawn from this data.

    2. Reviewer #1 (Public Review):

      The non-classical MHCII-like protein H2-M is essential for the loading of peptides on MHCII. The discovery that DM was partnered with a second MHCII-like protein, H2-O, which squelched or modified its activity, was confounding. It was immediately speculated that H2-O likely diminished self-peptide presentation. This led to the hypothesis that H2-O was involved in preventing unwanted CD4 T cell activation, thereby making autoimmunity less likely. 25 years of analysis of H2-O deficient mice have, indeed, shown that the self-peptide repertoire in the absence of H2-O is modestly altered. Demonstrating that autoimmunity results from this altered peptide repertoire has been decidedly less convincing. Old mice are reported to have increased serum anti-nuclear antibody titers, but mice prone to type 1 diabetes (T1D) and systemic lupus erythematosus (SLE) were not impacted by the loss of H2-O (Lee et al, 2021). Induction of the multiple sclerosis-like disease, EAE, in mice was also shown by Lee et al 2021 not to be impacted, although in a previous paper (Welsh et al 2020), the authors of this current manuscript suggest otherwise. Unfortunately, these discrepancies are not acknowledged by the authors, and the papers are, for the most part, not referenced.

      In addition to antigen-presenting cells, H2-O is also found in MHCII-expressing medullary epithelial cells, suggesting it might play a role in T-cell selection. Direct data to support this idea, however, has, at most, shown a minimal impact. In this manuscript, the authors follow up on their previous paper (Welsh et al, 2020) to further evaluate changes to T cell development. The conclusions are that H2-O impacts Treg development and changes the frequency and homeostasis of CD4 T cells. Although these would be interesting results, the data analysis is flawed, the presentation is incomplete, and the conclusions are exaggerated.

      The T-cell development analyses shown in Figs. 1 and 2 use the discovery from the Hogquist lab (Breed et al 2019) that thymocytes destined for clonal deletion can be differentiated from those still "auditioning" for selection by FACS for expression of cleaved caspase 3. Detection relies on complex FACS analysis that requires the exclusion of multiple populations, followed by accurate gating on CD5+TCRb+ cells (see Hogquist Fig. 1A). The authors apparently neglected to use the essential gating steps, and instead used only CD4 and CCR7 expression (Fig. 1A). This deviation from the Hogquist approach makes interpretation of Figs 1 and 2 meaningless. Even if this is an oversight in the description of the experiments, key conclusions are drawn from minimal changes to CD69 expression. CD69 is expressed as a continuum in the thymus (a "shoulder"), making gating somewhat subjective and prone to variation from experiment to experiment. At a minimum, FACS data should be shown to indicate how these changes were measured, and variations from mouse to mouse should be plotted, with statistics. FACS data also need to be shown to define how the complex semi-mature, M1, and M2 populations (see Hogquist Fig. 2), from which key conclusions are drawn, were gated.

      To make the data more robust, 1) cell numbers must be included for all experiments;

      2) rather than normalizing results to "the average H2-O WT levels", the actual data should be included;

      3) figures should be more completely labeled/described;

      4) FACS gating strategies should be clearly laid out (again, see Hogquist for examples). Furthermore, efforts must be made to explain why results are so different from analyses of H2-O deficient mice that have been published by many other groups. For example, the reported "dramatic increase in the proportion of CD3+CD4+ T cells" is not consistent with previous reports starting with Lars Karlsson's initial report (Liljedahl et al 1998). Extensive spontaneous activation of CD4 T cells has also not been reported in other papers that have studied these mice. Again, the paper is not placed in the context of the long, very thorough analysis of both the H2-O deficient mice and the study of H2-O/DO and H2-M/DM in general.

    1. Reviewer #1 (Public Review):

      Summary:<br /> Glaser et al present ExA-SPIM, a light-sheet microscope platform with large volumetric coverage (Field of view 85mm^2, working distance 35mm), designed to image expanded mouse brains in their entirety. The authors also present an expansion method optimized for whole mouse brains and an acquisition software suite. The microscope is employed in imaging an expanded mouse brain, the macaque motor cortex, and human brain slices of white matter.

      This is impressive work and represents a leap over existing light-sheet microscopes. As an example, it offers a fivefold higher resolution than mesoSPIM (https://mesospim.org/), a popular platform for imaging large cleared samples. Thus while this work is rooted in optical engineering, it manifests a huge step forward and has the potential to become an important tool in the neurosciences.

      Strengths:<br /> -ExA-SPIM features an exceptional combination of field of view, working distance, resolution, and throughput.

      -An expanded mouse brain can be acquired with only 15 tiles, lowering the burden on computational stitching. That the brain does not need to be mechanically sectioned is also seen as an important capability.

      -The image data is compelling, and tracing of neurons has been performed. This demonstrates the potential of the microscope platform.

      Weaknesses:<br /> -There is a general question about the scaling laws of lenses, and expansion microscopy, which in my opinion remained unanswered: In the context of whole brain imaging, a larger expansion factor requires a microscope system with larger volumetric coverage, which in turn will have lower resolution (Figure 1B). So what is optimal? Could one alternatively image a cleared (non-expanded) brain with a high-resolution ASLM system (Chakraborty, Tonmoy, Nature Methods 2019, potentially upgraded with custom objectives) and get a similar effective resolution as the authors get with expansion? This is not meant to diminish the achievement, but it was unclear if the gains in resolution from the expansion factor are traded off by the scaling laws of current optical systems.

      -It was unclear if 300 nm lateral and 800 nm axial resolution is enough for many questions in neuroscience. Segmenting spines, distinguishing pre- and postsynaptic densities, or tracing densely labeled neurons might be challenging. A discussion about the necessary resolution levels in neuroscience would be appreciated.

      -Would it be possible to characterize the aberrations that might be still present after whole brain expansion? One approach could be to image small fluorescent nanospheres behind the expanded brain and recover the pupil function via phase retrieval. But even full width half maximum (FWHM) measurements of the nanospheres' images would give some idea of the magnitude of the aberrations.
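
      The FWHM measurement suggested in the last point can be sketched as follows. This is only an illustration, not the authors' analysis: the function name `fwhm`, the linear interpolation of the half-maximum crossings, and the synthetic Gaussian bead profile (sigma = 100 nm, 10 nm sampling) are all assumptions, and the profile is assumed to be background-subtracted and single-peaked.

```python
import math


def fwhm(positions, intensities):
    """Full width at half maximum of a single-peaked 1D intensity profile.

    Half-maximum crossings are found by linear interpolation between
    neighbouring samples. Background is assumed to be subtracted already.
    """
    if len(positions) != len(intensities) or len(positions) < 3:
        raise ValueError("need at least three matching samples")
    peak = max(range(len(intensities)), key=intensities.__getitem__)
    half = intensities[peak] / 2.0

    def crossing(i_in, i_out):
        # i_in is at or above half maximum, its neighbour i_out is below it.
        x0, y0 = positions[i_in], intensities[i_in]
        x1, y1 = positions[i_out], intensities[i_out]
        return x0 + (half - y0) * (x1 - x0) / (y1 - y0)

    # Walk outwards from the peak while samples stay at or above half maximum.
    left = peak
    while left > 0 and intensities[left - 1] >= half:
        left -= 1
    right = peak
    while right < len(intensities) - 1 and intensities[right + 1] >= half:
        right += 1
    if left == 0 or right == len(intensities) - 1:
        raise ValueError("profile does not drop below half maximum on both sides")
    return crossing(right, right + 1) - crossing(left, left - 1)


# Synthetic bead profile (hypothetical values): Gaussian with sigma = 100 nm.
# For a Gaussian, FWHM = 2 * sqrt(2 * ln 2) * sigma, roughly 2.355 * sigma.
xs_nm = [10.0 * i for i in range(-50, 51)]
ys = [math.exp(-x * x / (2 * 100.0 ** 2)) for x in xs_nm]
width_nm = fwhm(xs_nm, ys)
print(round(width_nm, 1))
```

      Comparing such widths for beads imaged with and without the expanded brain in the light path would give a first estimate of the magnitude of the aberrations, short of a full phase-retrieval analysis.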

    2. Reviewer #2 (Public Review):

      Summary:<br /> In this manuscript, Glaser et al. describe a new selective plane illumination microscope designed to image a large field of view that is optimized for expanded and cleared tissue samples. For the most part, the microscope design follows a standard formula that is common among many systems (e.g. Keller PJ et al Science 2008, Pitrone PG et al. Nature Methods 2013, Dean KM et al. Biophys J 2015, and Voigt FF et al. Nature Methods 2019). The primary conceptual and technical novelty is to use a detection objective from the metrology industry that has a large field of view and a large area camera. The authors characterize the system resolution, field curvature, and chromatic focal shift by measuring fluorescent beads in a hydrogel and then show example images of expanded samples from mouse, macaque, and human brain tissue.

      Strengths:<br /> I commend the authors for making all of the documentation, models, and acquisition software openly accessible and believe that this will help assist others who would like to replicate the instrument. I anticipate that the protocols for imaging large expanded tissues (such as an entire mouse brain) will also be useful to the community.

      Weaknesses:<br /> The characterization of the instrument needs to be improved to validate the claims. If the manuscript claims that the instrument allows for robust automated neuronal tracing, then this should be included in the data.

    1. Reviewer #1 (Public Review):

      Summary:<br /> This publication applies 3D super-resolution STORM imaging to understanding the role of developmental neural activity in the clustering of retinal inputs to the mouse dorsal lateral geniculate nucleus (dLGN). The authors argue that retinal ganglion cell (RGC) synaptic boutons start forming clusters early in postnatal development (P2). They then argue that these clusters contribute to eye-specific segregation of retinal inputs by activity-dependent stabilization of nearby boutons from the same eye. The data provided is N=3 animals for each condition of P2, P4, and P8 animals in wild-type mice and in mice where early patterns of structured retinal activity are blocked.

      Strengths:<br /> The 3D STORM imaging of pre- and postsynaptic elements provides convincing high-resolution localization of synapses.

      The experimental design of comparing ipsilateral and contralateral RGC axon boutons in a region of the dLGN that is known to become contralateral is elegant. The design makes it possible to relate fixed time point structural data to a known outcome of activity-dependent remodeling.

      Weaknesses:<br /> Based on previous literature, it is known that synapse density, synapse clustering, and synaptic specificity increase during postnatal development. Previous work has also shown that both the changes in synaptic clustering and synaptic specificity are affected by retinal activity. The data and analysis provided by the authors add little unambiguous evidence that advances this understanding.

      General problem 1: Most of the statistical analysis is limited to ANOVA comparison of axons from the contralateral and ipsilateral retina in the contralateral dLGN. The hypothesis that ipsilateral and contralateral axons would be statistically identical in the contralateral dLGN is not a plausible hypothesis so rejecting the hypothesis with P < X does not advance the authors' arguments beyond what was already known.

      General problem 2: Most of the interpretation of data is qualitative. While error bars are provided, these error bars are not used to draw conclusions. Given the small sample size (N=3), there is a large degree of uncertainty regarding the magnitude of changes (synapse size, number, specificity). The authors base their conclusions on the averages of these values when the likely degree of uncertainty could allow for the opposite interpretation.

      General problem 3: Two of the four results sections depend on using the frequency of single active zone vGlut2 clusters near multiple active zone vGlut2 as a proxy for synaptic stabilization of the single active zone vGlut2 clusters by the multiple active zone vGlut2 clusters. The authors argue that the increased frequency of same-eye single active zone clusters relative to opposite-eye single active zone clusters means that multiple active zone vGlut2 clusters are selectively stabilizing single active zone clusters. There are other plausible explanations for this observation that are not eliminated. An increased frequency of nearby single active zone clusters would also occur if RGC axons form more than one synapse in the dLGN. Eye-specific segregation is, by definition, a relative increase in the frequency of nearby boutons from the same eye. The authors were, therefore, guaranteed to observe a non-random relationship between boutons from the same eye. The authors do compare their measures to a random model, but I could not find a description of the model. I would expect that the model would need to account for RGC arbor size, arbor structure, bouton number, and segregation independent of multi-active-zone vGlut2 clusters. The most common randomization for the type of analysis described here, a shift in the positions of single-active zone boutons, would not be adequate.

      In discussing the claimed cluster-induced stabilization of nearby boutons, the authors state that the specificity increases with age due to activity-dependent refinement. Their quantification does not support an increase in specificity with age. In fact, the high degree of clustering "specificity" they observe at P2 argues for the trivial same axon explanation.

      Analysis of specific claims:

      Result Section 1

      Most of the figures show mean, error bars, and asterisks, but not the three data points from which these statistics are derived. Large changes in variance from condition to condition suggest that displaying the data points would provide more useful information.

      Claim 1: Contralateral density increases more than ipsilateral in the contralateral region over the course of development. This claim is supported by the qualitative comparison of means and error bars in Figure 2D. The argument could be made quantitative by providing a confidence interval for synapse density increase for dominant and non-dominant synapse density. A confidence interval could then be generated for the difference in this change between the two groups. Currently, the most striking effect is a big difference in variance between P4 and P8 for dominant eye complex synapses. Given that N=3, I assume there is one extreme outlier here.

      Claim 2: The fraction of multiple-active zone vGlut2 clusters increases with age. This claim is weakly supported by a qualitative reading of panel 1E. The error bars overlap so it is difficult to know what the range of possible increases could be. In the text, the authors report mean differences without confidence intervals (or any other statistics). The reported results should, therefore, be interpreted as a description of their three mice and not as evidence about mice in general.
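
      The kind of interval requested in Claims 1 and 2 can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' method: the function `mean_difference_ci` and the per-animal values are hypothetical, and the interval is a pooled-variance Student's t interval, which assumes equal variances in the two groups. Given the unequal variances noted above, a Welch-type interval may be more appropriate.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t for 1..10 degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def mean_difference_ci(group_a, group_b):
    """95% CI for mean(group_b) - mean(group_a), pooled-variance t interval.

    Returns (difference, lower bound, upper bound).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    if n_a < 2 or n_b < 2 or df not in T_CRIT_95:
        raise ValueError("group sizes not covered by this sketch")
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = statistics.mean(group_b) - statistics.mean(group_a)
    half_width = T_CRIT_95[df] * std_err
    return diff, diff - half_width, diff + half_width


# Hypothetical per-animal synapse densities (arbitrary units), N = 3 per age.
density_p4 = [1.0, 2.0, 3.0]
density_p8 = [4.0, 5.0, 6.0]
diff, lower, upper = mean_difference_ci(density_p4, density_p8)
print(f"increase = {diff:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

      With three animals per group the critical value is large (2.776 for 4 degrees of freedom), so the interval is wide even when the means look well separated. That width is the uncertainty that a qualitative reading of means and error bars leaves out.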

      Figure S1. Panel A makes the point that the study could not be done without STORM by comparing the STORM images to "Conventional" images. The images are over-saturated low-resolution images. A reasonable comparison would be to a high-quality confocal image acquired with a high NA objective (~1.4) and low laser power (PSF ~ 0.2 x 0.2 x 0.6 um) that was acquired over the same amount of time it takes to acquire a STORM volume.

      Result section 2.

      Claim 1: The ipsi/contra (in contra LGN) difference in VGluT2 cluster volume increases with development. While there are many p-values listed, the main point is not directly quantified. A reasonable way to quantify the relative increase in volume could be in the form: the non-dominant volumes were 75%-95%(?) of the dominant volume at P2 and 60%-80% (?) at P8. The difference in change was -5 to 15%(?).

      Claim 2: Complex synapses (vGlut2 clusters with multiple active zones) represent clusters of simple synapses and not single large boutons with multiple active zones. The authors argue that because vGlut2 cluster volume scales roughly linearly with active zone number, the vGlut2 clusters are composed of multiple boutons each containing a single active zone. Their analysis does not rule out the (known to be true) possibility that RGC bouton sizes are much larger in boutons with multiple active zones. The correlation of volume and active zone number, by itself, does not resolve the issue. A good argument for multiple boutons might be that the variance is smallest in clusters with 4 active zones (looks like it in the plot) since they would be the average of four active zones to vesicle pool ratios. It is very likely that the multi-active zone vGlut2 clusters represent some clustering and some multi-synaptic boutons. The reference cited by the authors as evidence for the presence of single active zone boutons in young tissue does not rule out the existence of multiple active zone boutons.

      Several arguments are made that depend on the interpretation of "not statistically significant" (n.s.) meaning that "two groups are the same" instead of "we don't know if they are different". This interpretation is incorrect and materially impacts the conclusions.

      Several arguments are made that interpret statistical significance for one group and a lack of statistical significance for another group meaning that the effect was bigger in the first group. This interpretation is incorrect and materially impacts the conclusions.

      Result Section 3.

      Claim 1: Complex synapses stabilize simple synapses. There are alternative explanations (mentioned above) for the observed clustering that negate the conclusions. 1) Boutons from the same axon tend to be found near one another. 2) Any form of eye-specific segregation would produce non-random associations in the analysis as performed. The authors compare each observation to a random model, but I cannot determine from the text if the model adequately accounts for alternative explanations.

      The authors claim that specificity increases over time. Figure 3b (middle) shows that the number of synapses near complex synapses might increase with time (needs confidence interval for effect size), but does not show that specificity (original relative to randomized) increases with time. The fact that nearby simple synapse density is always (P2) very different from random suggests a primarily non-activity-dependent explanation. The simplest explanation is that same-side boutons could be from the same axon whereas different-side axons could not be.

      Claim 2: vGlut2 clusters more than 1.5 um away from multi-active zone vGlut2 clusters are not statistically significantly different in size than vGlut2 clusters within 1.5 um of multi-active zone vGlut2 clusters. Therefore "activity-dependent synapse stabilization mechanisms do not impact simple synapse vesicle pool size". The specific measure of 1.5 um from multi-active zone vGlut2 clusters does not represent all possible synapse stabilization mechanisms.

      Result Section 4.

      Claim: The proximity of complex synapses with nearby simple synapses to other complex synapses with nearby simple synapses from the same eye is used to argue that activity is responsible for all this clustering.

      It is difficult to derive anything from the quantification besides 'not-random'. That is a problem because we already know that axons from the left and right eye segregate during the period being studied. All the measures in Section 4 are influenced by eye-specific segregation. Given this known bias, demonstrating a non-random relationship (P<br /> The results can be stated as: If you are a contralateral complex synapse, contralateral complex synapses that are also close to contralateral simple synapses will, on average, be slightly closer to you than contralateral complex synapses that are not close to contralateral simple synapses. That would be true if there is any eye-specific segregation (which there is).

      It is an overinterpretation of the data to claim that the lack of a clear correlation between vGlut2 cluster volume and distance to vGlut2 clusters with multiple active zones provides support for the claim that "presynaptic protein organization is not influenced by mechanisms governing synaptic clustering".

    2. Reviewer #2 (Public Review):

      Summary:<br /> In this manuscript, Zhang and Speer examine changes in the spatial organization of synaptic proteins during eye-specific segregation, a developmental period when axons from the two eyes initially mingle and gradually segregate into eye-specific regions of the dorsal lateral geniculate. The authors use STORM microscopy and immunostain presynaptic (VGluT2, Bassoon) and postsynaptic (Homer) proteins to identify synaptic release sites. Activity-dependent changes in this spatial organization are identified by comparing the β2KO mice to WT mice. They describe two types of presynaptic organization based on Bassoon clustering, the complex and the simple synapse. By analyzing the relative densities and distances between these proteins over age, the authors conclude that the complex synapses promote the clustering of simple synapses nearby to form the future mature glomerular synaptic structure.

      Strengths:<br /> The data presented is of good quality and provides an unprecedented view at high resolution of the presynaptic components of the retinogeniculate synapse during active developmental remodeling. This approach offers an advance over previous mouse EM studies of this synapse because the CTB label allows identification of the eye from which the presynaptic terminal arises. Using this approach, the authors find that simple synapses cluster close to complex synapses over age and that complex synapse density increases with age.

      Weaknesses:<br /> From these data, the authors conclude that the complex synapse serves to "promote clustering of like-eye synapses and prohibit synapse clustering from the opposite eye". However, the authors show no causal data to support these ideas. There are a number of issues that the authors should consider:

      1. Clustering of retinal synapses is in part due to the fact that retinal inputs synapse on the proximal dendrites. With increased synaptogenesis, there will be increased density of retinal terminals that are closely localized. And with development, perhaps simple synapses mature into complex synapses. Simple synapses may also represent ones that are in the process of being eliminated as previously described by Campbell and Shatz, JNeurosci 1992 (consider citing). Can the authors distinguish these scenarios from the ones that they conclude?

      2. The argument that "complex" synapses are the aggregate of "simple" synapses (Fig 2, S2) is not convincing.

      3. The authors use the β2KO mice to assess changes in the organization of synaptic proteins in retinal terminals that have disrupted retinal waves. However, β2-nAChRs are also expressed in the dLGN and other areas of the brain and glutamatergic synapse development has been reported in the CNS independent of the disruption in retinal waves. This issue should be considered when interpreting the total reduced retinal synapse density in the dLGN of the mutant.

      4. Outside of a total synapse density difference between WT and β2KO mice, the changes in the spatial organization of synaptic proteins over development do not seem that different. In fact % simple synapses near complex synapses from the non-dominant eye in the mutant is not that different from WT at P8 (Fig 3C), an age when eye-specific segregation is very different between the genotypes. Can the authors explain this discrepancy?

      5. The authors use nomenclature that has been previously used and associated with other aspects of retinogeniculate properties. For example, the phrases "simple" and "complex" synapses have been used to describe single boutons or aggregates of boutons from numerous retinal axons, whereas in this manuscript the phrases are used to describe vesicle clusters/release sites with no knowledge of whether they are from single or multiple boutons. Likewise, the word "glomerulus" has been used in the context of the retinogeniculate synapse to refer to a specific pattern of bouton aggregates that involves inhibitory and neuromodulatory inputs. It is not clear how the release sites described by the authors fit in this picture. Finally, the word "punishment" is associated with a body of literature regarding the immune system and retinogeniculate refinement, which is not addressed in this study. This double use of the phrases can lead to confusion in the field and should be resolved by clear definitions of how they are used in the current study.

    3. Reviewer #3 (Public Review):

      Summary:<br /> This manuscript is a follow-up to a recent study of synaptic development based on a powerful data set that combines anterograde labeling, immunofluorescence labeling of synaptic proteins, and STORM imaging (Cell Reports 2023). Specifically, they use anti-Vglut2 label to determine the size of the presynaptic structure (which they describe as the vesicle pool size), anti-Bassoon to label a number of active zones, and anti-Homer to identify postsynaptic densities. In their previous study, they compared the detailed synaptic structure across the development of synapses made with contra-projecting vs ipsi-projecting RGCs and compared this developmental profile with a mouse model with reduced retinal waves. In this study, they produce a new analysis on the same data set in which they classify synapses into "complex" vs. "simple" and assess the number and spacing of these synapses. From these measurements, they make conclusions regarding the processes that lead to synapse competition/stabilization.

      Strengths:<br /> This is a fantastic data set for describing the structural details of synapse development in a part of the brain undergoing activity-dependent synaptic rearrangements. The fact that they can differentiate eye of origin is also a plus.

      Weaknesses:<br /> The lack of details provided for the classification scheme as well as the interpretation of small effect sizes limit the interpretations that can be made based on these findings.

      1. The criteria used to classify synapses as simple vs. complex are critical for all of the analysis in this study. Therefore, these criteria should be much more explicit and tested for robustness. As stated in the methods, classification is based on the number of active zones, which are designated by the number of Bassoon clusters associated with a Vglut2 cluster (line 697). A second part of the criteria is the size of the presynaptic terminal as assayed by "greater Vglut2 signal" (line 116). So how are these thresholds determined? For Bassoon clusters, is one voxel sufficient? Two? If it's one, how often do they see a Bassoon-positive voxel with no Vglut2 cluster, which may therefore represent "noise"? No distribution of Bassoon volumes is provided that might be the basis for selecting this number of sites. Unfortunately, the images are not helpful. For example, does P8 WT in Figure 1B have 7 or 2? According to Figure 2C, it appears the numbers are closer to 2-4.

      The Vglut volume measurements also do not seem to provide a clear criterion. Figure 2 shows that the distributions of Vglut2 cluster volumes for complex and for simple synapses are significantly overlapping.

      The authors need to clarify the quantitative approach used for this classification strategy and test how sensitive the results of the study are to the choices made in it.

      2. Effect sizes are quite small and all comparisons are made on medians of distributions. This leads to n=3 biological replicates for all comparisons. Hence this small n may lead to significant results based on ANOVAs/t-tests, but the statistical power of these effects is quite weak. To accurately represent the variance in their data, the authors should show all three data points for each category (with a SD error bar when possible). They should also include the number of synapses in each category (e.g. the numerators in Figure 1D and the denominators for Figure 1E). For other figures, there are additional statistical questions described below.

      3. The authors need to add a caveat regarding their classification of synapses as "complex" vs. "simple" since this is a terminology that already exists in the field and it is not clear that these STORM images are measuring the same thing. For example, in EM studies, "complex" refers to multiple RGCs converging on the same single postsynaptic site. The authors here acknowledge that they cannot assign different AZs to different RGCs so this comparison is an assumption. In Figure 2 they argue this is a good assumption based on the finding that the Vglut volume/active zone is constant and therefore each represents a single RGC. However, the authors should acknowledge that they are actually seeing quite different percentages than those in EM studies. For example, in Monavarfeshani et al, eLife 2018, there were no complex synapses found at P8. (Note this study also found many more complex vs. simple synapses in the adult - 70% vs. the 20% found in the current study - but this difference could be a developmental effect). In the future, the authors may want to take another data set in the adult dLGN to make a direct comparison based on numbers and see if their classification method for complex/simple maps onto the one that currently exists in the literature.

      4. Figure 3 assays the relative distribution of simple vs. complex synapses. They found that a larger percentage of simple synapses were within 1.5 microns of complex synapses than you would expect by chance for both ipsi and contra projecting RGCs, and hence conclude that complex synapses are sites of synaptic clustering. In contrast, there was no clustering of ipsi-simple to contra-complex synapses and vice versa. The authors also argue that this clustering decreases between P4 and P8 for ipsi projecting RGCs.

      This analysis needs much more rigor before any conclusions can be drawn. First, the authors need to justify the 1.5-micron criterion for clustering and show how robust their results are to variations in this distance. Second, these age effects need to be tested for statistical significance with an ANOVA (all the stats presented are pairwise comparisons to means expected by random distributions at each age). Finally, the authors should consider what n's to use here - is it still grouped by biological replicate? Why not use individual synapses across mice? If they do biological replicates, then they should again show error bars for each data point in their biological replicates. And they should include the number of synapses that went into these measurements in the caption.

      5. Line 211-212 - the authors conclude that the absence of clustered ipsi-simple synapses indicates a failure to stabilize (Figure 3). Yet, the link between this measurement and synapse stabilization is not clear. In particular, the conclusion that "isolated" synapses are the ones that will be eliminated seems to be countered by their finding in Figure 3D/E which shows that there is no difference in vesicle pool volume between near and far synapses. If isolated synapses are indeed the ones that fail to stabilize by P8, wouldn't you expect them to be weaker/have fewer vesicles? Also, it's hard to tell if there is an age-dependent effect since the data presented in Figures 3D/E are merged across ages.
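
      The robustness check requested in point 4 (sensitivity of the clustering result to the 1.5-micron distance) could look like the sketch below. It is an illustration only: the function names, the brute-force distance search, and the coordinates are made up for the example and are not taken from the manuscript.

```python
import math


def fraction_near(simple_xyz, complex_xyz, radius_um):
    """Fraction of simple synapses with a complex synapse within radius_um."""
    if not simple_xyz:
        raise ValueError("no simple synapses given")
    near = sum(
        1 for s in simple_xyz
        if any(math.dist(s, c) <= radius_um for c in complex_xyz)
    )
    return near / len(simple_xyz)


def sweep_radius(simple_xyz, complex_xyz, radii_um):
    """Repeat the proximity measurement over a range of distance criteria."""
    return {r: fraction_near(simple_xyz, complex_xyz, r) for r in radii_um}


# Hypothetical synapse coordinates in micrometres.
complex_xyz = [(0.0, 0.0, 0.0)]
simple_xyz = [(0.5, 0.0, 0.0), (1.2, 0.0, 0.0), (1.8, 0.0, 0.0), (3.0, 0.0, 0.0)]
sweep = sweep_radius(simple_xyz, complex_xyz, [1.0, 1.5, 2.0])
for radius, fraction in sweep.items():
    print(radius, fraction)
```

      Running the same sweep on the observed data and on the randomized data, per animal, would show whether the reported difference from chance holds across distance criteria or appears only near 1.5 microns.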

    1. Reviewer #1 (Public Review):

      Summary:<br /> The work of Muller and colleagues concerns the question of where we place our feet when passing uneven terrain, in particular how we trade-off path length against the steepness of each single step. The authors find that paths are chosen that are consistently less steep and deviate from the straight line more than an average random path, suggesting that participants indeed trade-off steepness for path length. They show that this might be related to biomechanical properties, specifically the leg length of the walkers. In addition, they show using a neural network model that participants could choose the footholds based on their sensory (visual) information about depth.

      Strengths:<br /> The work is a natural continuation of some of the researchers' earlier work that related the immediately following steps to gaze [17]. Methodologically, the work is very impressive and presents a further step forward towards understanding real-world locomotion and its interaction with sampling visual information. While some of the results may seem somewhat trivial in hindsight (as always in this kind of study), I still think this is a very important approach to understanding locomotion in the wild better.

      Weaknesses:<br /> The manuscript as it stands has several issues with the reporting of the results and the statistics. In particular, it is hard to assess the inter-individual variability, as some of the data are aggregated across individuals, while in other cases only central tendencies (means or medians) are reported without providing measures of variability; this is critical, in particular as N=9 is a rather small sample size. It would also be helpful to see the actual data for some of the information merely described in the text (e.g., the dependence of ΔH on path length). When reporting statistical analyses, test statistics and degrees of freedom should be given (or other variants that unambiguously describe the analysis). The CNN analysis chosen to link the step data to visual sampling (gaze and depth features) should be motivated more clearly, and it should describe how training and test sets were generated and separated for this analysis. There are also some parts of figures, where it is unclear what is shown or where units are missing. The details are listed in the private review section, as I believe that all of these issues can be fixed in principle without additional experiments.

    2. Reviewer #2 (Public Review):

      Summary:<br /> This manuscript examines how humans walk over uneven terrain using vision to decide where to step. There is a huge lack of evidence about this because the vast majority of locomotion studies have focused on steady, well-controlled conditions, and not on decisions made in the real world. The author team has already made great advances in this topic, but there has been no practical way to map 3D terrain features in naturalistic environments. They have now developed a way to integrate such measurements along with gaze and step tracking, which allows quantitative evaluation of the proposed trade-offs between stepping vertically onto vs. stepping around obstacles, along with how far people look to decide where to step.

      Strengths:<br /> 1. I am impressed by the overarching outlook of the researchers. They seek to understand human decision-making in real-world locomotion tasks, a topic of obvious relevance to the human condition but not often examined in research. The field has been biased toward well-controlled studies, which have scientific advantages but also serious limitations. A well-controlled study may eliminate human decisions and favor steady or periodic motions in laboratory conditions that facilitate reliable and repeatable data collection. The present study discards all of these usually-favorable factors for rather uncontrolled conditions, yet still finds a way to explore real-world behaviors in a quantitative manner. It is an ambitious and forward-thinking approach, used to tackle an ecologically relevant question.

      2. There are serious technical challenges to a study of this kind. It is true that there are existing solutions for motion tracking, eye tracking, and most recently, 3D terrain mapping. However most of the solutions do not have turn-key simplicity and require significant technical expertise. To integrate multiple such solutions together is even more challenging. The authors are to be commended on the technical integration here.

      3. In the absence of prior studies on this issue, it was necessary to invent new analysis methods to go with the new experimental measures. This is non-trivial and places an added burden on the authors to communicate the new methods. It's harder to be at the forefront in the choice of topic, technical experimental techniques, and analysis methods all at once.

      Weaknesses:<br /> 1. I am predisposed to agree with all of the major conclusions, which seem reasonable and likely to be correct. Ignoring that bias, I was confused by much of the analysis. There is an argument that the chosen paths were not random, based on a comparison of probability distributions that I could not understand. There are plots described as "turn probability vs. X" where the axes are unlabeled and the data range above 1. I hope the authors can provide a clearer description to support the findings. This manuscript stands to be cited well as THE evidence for looking ahead to plan steps, but that is only meaningful if others can understand (and ultimately replicate) the evidence.

      2. I wish a bit more and simpler data could be provided. It is great that step parameter distributions are shown, but I am left wondering how this compares to level walking. The distributions also seem to use absolute values for slope and direction, for understandable reasons, but that also probably skews the actual distribution. Presumably, there should be (and is) a peak at zero slope and zero direction, but absolute values mean that non-zero steps may appear approximately doubled in frequency, compared to separate positive and negative. I would hope to see actual distributions, which moreover are likely not independent and probably have a covariance structure. The covariance might help with the argument that steps are not random, and might even be an easy way to suggest the trade-off between turning and stepping vertically. This is not to disregard the present use of absolute values but to suggest some basic summary of the data before taking that step.
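      To illustrate the concern about absolute values, here is a minimal sketch using synthetic data. It assumes signed slope and heading change follow a correlated bivariate normal distribution; the covariance matrix and sample size are hypothetical and are not taken from the manuscript. It shows that folding a symmetric distribution roughly doubles the apparent density of non-zero values, and that the sign of the covariance is lost.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical signed step data (illustrative numbers, not from the study):
# slope and heading change with a negative correlation.
cov = np.array([[1.0, -0.4],
                [-0.4, 1.0]])
steps = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=100_000)
slope, turn = steps[:, 0], steps[:, 1]


def density_in_bin(x, lo, hi):
    """Fraction of samples falling in [lo, hi), divided by the bin width."""
    return np.mean((x >= lo) & (x < hi)) / (hi - lo)


# Density near +1 for signed slopes versus absolute slopes.
signed_density = density_in_bin(slope, 0.9, 1.1)
folded_density = density_in_bin(np.abs(slope), 0.9, 1.1)
fold_ratio = folded_density / signed_density  # close to 2 when the distribution is symmetric

# Covariance before and after taking absolute values.
signed_cov = np.cov(slope, turn)[0, 1]
abs_cov = np.cov(np.abs(slope), np.abs(turn))[0, 1]
```

      In this synthetic case the signed covariance is negative while the covariance of the absolute values is small and positive, which is why a summary of the signed data would be informative before folding.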

      3. Along these same lines, the manuscript could do more to enable others to digest and go further with the approach, and to facilitate interpretability of results. I like the use of a neural network to demonstrate the predictiveness of stepping, but aside from above-chance probability, what else can inform us about what visual data drives that? Similarly, the step distributions and height-turn trade-off curves are somewhat opaque and do not make it easy to envision further efforts by others, for example, people who want to model locomotion. For that, clearer (and perhaps simpler) measures would be helpful.

      I am absolutely in support of this manuscript and expect it to have a high impact. I do feel that it could benefit from clarification of the analysis and how it supports the conclusions.

    3. Reviewer #3 (Public Review):

      Summary:<br /> The systematic way in which path selection is parametrically investigated is the main contribution.

      Strengths:<br /> The authors have developed an impressive workflow to study gait and gaze in natural terrain.

      Weaknesses:<br /> 1. The training and validation data of the CNN are not fully explained, making it unclear whether the data tell us anything about the visual features used to guide steering.

      It is not clear how or on what data the network was trained (training vs. validation vs. un-peeked test data), and the choices made are not justified. There is no discussion of possible overfitting. The network could simply be learning, e.g., specific rock arrangements. If the network is overfitting, the "features" it uses could be very artefactual, pixel-level patterns and not the kinds of "features" the human reader immediately has in mind.
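      One way to guard against the network memorising specific rock arrangements is to hold out whole terrain segments rather than individual samples. The sketch below is an illustration under assumptions: the function name `split_by_group`, the segment identifiers, and the 20% test fraction are hypothetical and do not describe the authors' procedure.

```python
import numpy as np


def split_by_group(group_ids, test_fraction=0.2, seed=0):
    """Return boolean train/test masks with no group shared between them."""
    rng = np.random.default_rng(seed)
    groups = np.unique(group_ids)
    rng.shuffle(groups)  # randomise which groups are held out
    n_test = max(1, int(round(test_fraction * len(groups))))
    test_groups = groups[:n_test]
    test_mask = np.isin(group_ids, test_groups)
    return ~test_mask, test_mask


# Hypothetical example: 1,000 image patches drawn from 10 terrain segments.
segment_ids = np.repeat(np.arange(10), 100)
train_mask, test_mask = split_by_group(segment_ids)
```

      Above-chance performance on segments the network never saw during training would be more convincing evidence that it relies on generalisable visual features.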

      2. The use of descriptive terminology should be made systematic.

      Specifically, the following terms are used without giving a single, clear definition for them: path, step, step location, foot plant, foothold, future foothold, foot location, future foot location, foot position.

      I think some terms are being used interchangeably. I would really highly recommend a diagrammatic cartoon sketch, showing the definitions of all these terms in a single figure, and then sticking to them in the main text.

      3. More coverage of alternative interpretations, and less interpretation in the abstract and introduction, would be prudent.

      The authors discuss path selection very much on the basis of energetic costs and gait stability. At the least, mention should be made of other plausible parameters the participants might be optimizing (or of the possibility that they are merely satisficing).

      That is, it is taken as "given" that energetic cost is the major driver of path selection in your task, and that the relevant perception relies on internal models. Neither of these is a priori obvious nor is it as far as I can tell shown by the data (optimizing other variables, satisficing behavior, or online "direct perception" cannot be ruled out).

    1. Reviewer #1 (Public Review):

      Summary:<br /> The authors aim to consider the effects of phonotactics on the effectiveness of memory reactivation during sleep. They have created artificial words that are either typical or atypical and showed that reactivation improves memory for the latter but not the former.

      Strengths:<br /> This is an interesting design and a creative way of manipulating memory strength and typicality. In addition, the spectral analysis on both the wakefulness data and the sleep data is well done. The article is clearly written and provides a relevant and comprehensive review of the literature and of how the results contribute to it.

      Weaknesses:<br /> 1. Unlike most research involving artificial language or language in general, the task engaged in this manuscript did not require (or test) learning of meaning or translation. Instead, the artificial words were arbitrarily categorised and memory was tested for that categorisation. This somewhat limits the interpretation of the results as they pertain to language science, and qualifies comparisons with other language-related sleep studies that the manuscript builds on.

      2. The details of the behavioural task are hard to understand as described in the manuscript. Specifically, I wasn't able to understand when words were to be responded to with the left or right button. What were the instructions? Were half of the words randomly paired with left and half with right and then half of each rewarded and half unrewarded? Or was the task to know if a word was rewarded or not and right/left responses reflected the participants' guesses as to the reward (yes/no)? Please explain this fully in the methods, but also briefly in the caption to Figure 1 (e.g., panel C) and in the Results section.

      3. Relatedly, it is unclear how reward or lack thereof would translate cleanly into a categorisation of hits/misses/correct rejections/false alarms, as explained in the text and shown in Figure 1D. If the item was of the non-rewarded class and the participant got it correct, they avoided loss. Why would that be considered a correct rejection, as the text suggests? It is no less of a hit than the rewarded-correct; it is just that the trial was set up in a way that limits gains. This seems to mix together signal detection nomenclature (in which reward is uniform and there are two options, one of which is correct and one isn't) and loss-aversion types of studies (in which reward is different for two types of stimuli, but for each type you can have H/M/CR/FA separably). Again, it might all stem from me not understanding the task, but at the very least this requires an extended explanation. Once the authors address this, they should also update Fig 1D. This complexity makes the results relatively hard to interpret and the merit of the manuscript hard to assess. Unless there are strong hypotheses about reward's impact on memory (which, as far as I can see, are not at the core of the paper), there should be no difference in the manner in which the currently labelled "hits" and "CR" are deemed - both are correct memories. Treating them differently may have implications on the d', which is the main memory measure in the paper, and possibly on measures of decision bias that are used as well.
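      For reference, the standard signal-detection definition is d' = z(hit rate) - z(false-alarm rate). The sketch below computes it alongside plain proportion correct, which treats "hits" and "correct rejections" identically. The log-linear correction and the function names are assumptions of this sketch; the manuscript's exact computation may differ.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each cell) keeps both rates away
    from 0 and 1; this correction is an assumption of the sketch.
    """
    z = NormalDist().inv_cdf
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return z(hit_rate) - z(fa_rate)


def proportion_correct(hits, misses, false_alarms, correct_rejections):
    """Accuracy that counts hits and correct rejections as equally correct."""
    total = hits + misses + false_alarms + correct_rejections
    return (hits + correct_rejections) / total
```

      Because d' depends on which trials are assigned to the "signal" and "noise" classes, how rewarded and unrewarded items map onto those classes needs to be stated explicitly.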

      4. The study starts off with a sample size of N=39 but excludes 17 participants for some crucial analyses. This is a high number, and it's not entirely clear from the text whether exclusion criteria were pre-registered or decided upon before looking at the data. Having said that, some criteria seem very reasonable (e.g., excluding participants who were not fully exposed to words during sleep). It would still be helpful to see that the trend remains when including all participants who had sufficient exposure during sleep. Also, please carefully mention for each analysis what the N was.

      5. Relatedly, the final N is low for a between-subjects study (N=11 per group). This is adequately mentioned as a limitation, but since it does qualify the results, it seemed important to mention it in the public review.

      6. The linguistic statistics used for establishing the artificial words are all based on American English, and are therefore in misalignment with the spoken language of the participants (which was German). The authors should address this limitation and discuss possible differences between the languages. Also, if the authors checked whether participants were fluent in English they should report these results and possibly consider them in their analyses. In all fairness, the behavioural effects presented in Figure 2A are convincing, providing a valuable manipulation test.

      7. With regard to the higher probability of nested spindles for the high- vs low-PP cueing conditions, the authors should try and explore whether what the results show is a general increase for spindles altogether (as has been reported in the past to be correlated with TMR benefit and sleep more generally) or a specific increase in nested spindles (with no significant change in the absolute numbers of post-cue spindles). In both cases, the results would be interesting, but differentiating the two is necessary in order to make the claim that nesting is what increased rather than spindle density altogether, regardless of the SW phase.

    2. Reviewer #2 (Public Review):

      Summary:<br /> The work by Klaassen & Rasch investigates the influence of word learning difficulty on sleep-associated consolidation and reactivation. They elicited reactivation during sleep by applying targeted memory reactivation (TMR) and manipulated word learning difficulty by creating words more similar (easy) or more dissimilar (difficult) to our language. In one group of participants, they applied TMR of easy words and in another group of participants, they applied TMR of difficult words (between-subjects design). They showed that TMR leads to higher memory benefits in the easy compared to the difficult word group. On a neural level, they showed an increase in spindle power (in the up-state of an evoked response) when easy words were presented during sleep.

      Strengths:<br /> The authors investigate a research question relevant to the field, that is, which experiences are actually consolidated during sleep. To address this question, they developed an innovative task and manipulated difficulty in an elegant way.

      Overall, the paper is clearly structured, and results and methods are described in an understandable way. The analysis approach is solid.

      Weaknesses:<br /> 1. Sample size<br /> For a between-subjects design, the sample size is too small (N = 22). The main finding (also found in the title "Difficulty in artificial word learning impacts targeted memory reactivation") is based on an independent samples t-test with 11 participants/group.

      The authors explicitly mention the small sample size and the between-subjects design as a limitation in their discussion. Nevertheless, making meaningful inferences based on studies with such a small sample size is difficult, if not impossible.

      2. Choice of task<br /> Even though the task itself is innovative, there would have been tasks better suited to address the research question. The main disadvantage the task and the operationalisation of memory performance (d') have is that single-trial performance cannot be calculated. Consequently, choosing individual items for TMR is not possible.

      Additionally, TMR of low vs. high difficulty is conducted between subjects (and independently of pre-sleep memory performance) which is a consequence of the task design.

      The motivation for why this task has been used is missing in the paper.

    3. Reviewer #3 (Public Review):

      Summary:<br /> In this study, the authors investigated the effects of targeted memory reactivation (TMR) during sleep on memory retention for artificial words with varying levels of phonotactical similarity to real words. The authors report that the high phonotactic probability (PP) words showed a more pronounced EEG alpha decrease during encoding and were more easily learned than the low PP words. Following TMR during sleep, participants who had been cued with the high PP words showed a change in memory for those words that was significantly greater than 0, whilst no such difference was found in the other conditions. Accordingly, the authors report higher EEG spindle band power during slow-wave up-states for the high PP as compared to low PP TMR trials. Overall, the authors conclude that artificial words that are easier to learn benefit more from TMR than those which are difficult to learn.

      Strengths:<br /> 1. The authors have carefully designed the artificial stimuli to investigate the effectiveness of TMR on words that are easy to learn and difficult to learn due to their levels of similarity with prior word-sound knowledge. Their approach of varying the level of phonotactic probability gives them better control over phonotactical familiarity than would be possible in a natural language, and thus allows them to disentangle which properties of word learning contribute to TMR success.

      2. The use of EEG during wakeful encoding and sleep TMR sheds new light on the neural correlates of high PP vs. low PP both during wakeful encoding and cue-induced retrieval during sleep.

      Weaknesses:<br /> 1. The present analyses are based on a small sample and comparisons between participants. Considering that the TMR benefits are based on changes in memory categorization between participants, it could be argued that the individuals in the high PP group were more susceptible to TMR than those in the low PP group for reasons other than the phonotactic probabilities of the stimuli (e.g., these individuals might be more attentive to sounds in the environment during sleep). While the authors acknowledge the small sample size and between-subjects comparison as a limitation, a discussion of an alternative interpretation of the data is missing.

      2. While the one-tailed comparison between the high PP condition and 0 is significant, the ANOVA comparing the four conditions (between subjects: cued/non-cued, within-subjects: high/low PP) does not show a significant effect. With a non-significant interaction, I would consider it statistically inappropriate to conduct post-hoc tests comparing the conditions against each other. Furthermore, it is unclear whether the p-values reported for the t-tests have been corrected for multiple comparisons. Thus, these findings should be interpreted with caution.
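      As an illustration of what a correction for multiple comparisons would involve, here is a minimal sketch of the Holm-Bonferroni step-down adjustment. The p-values are hypothetical placeholders, not values from the manuscript, and the choice of Holm's method is an assumption of this sketch.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity of the adjusted values.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values for four comparisons (placeholders only).
raw_p = [0.03, 0.20, 0.45, 0.60]
adjusted_p = holm_adjust(raw_p)
```

      With these placeholder values, the smallest raw p-value of 0.03 is no longer below 0.05 after adjustment, which illustrates why reporting whether a correction was applied matters.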

      3. With the assumption that the artificial words in the study have different levels of phonotactic similarity to prior word-sound knowledge, it was surprising to find that the phonotactic probabilities were calculated based on an American English lexicon whilst the participants were German speakers. While it may be the case that the between-language lexicons overlap, it would be reassuring to see some evidence of this, as the level of phonotactic probability is a key manipulation in the study.

      4. Another manipulation in the study is that participants learn whether the words are linked to a monetary reward or not; however, the rationale for this manipulation is unclear. For instance, it is unclear whether the authors expect the reward to interact with the TMR effects.

    1. Reviewer #1 (Public Review):

      Summary:

      Flavonoids are abundant in plant-based foods. They have been widely recognized for their health-promoting properties. There is increasing evidence that the effects of dietary flavonoids depend on their metabolism by gut bacteria, which can enhance, reduce or otherwise alter the flavonoids' bioactivities. On the other hand, little is known regarding the enzymes and species that can utilize flavonoids as metabolic substrates.

      In the current manuscript, the authors analyzed whether the degradation of dietary flavonoids by gut bacteria can be predicted. In contrast to plants, bacteria do not contain obvious degradation enzymes.

      Strengths:

      To predict such enzymes with broad substrate specificity (enzyme promiscuity), the authors optimized/modified a bioinformatic tool that predicts whether a gut bacterial enzyme could catalyze a flavonoid reaction, based on the chemical similarity between the enzyme's native reaction and known flavonoid reactions in plants.<br /> They predicted such enzyme activities in genomes of bacteria that had been shown to occur in the human gut. Then, they cultivated selected bacteria with the predicted enzymatic activities and showed that they can indeed degrade parts of these flavonoids. Combining the bioinformatic predictions with mass spectrometry, they identified a metabolization pathway of the flavonoid tilianin that spanned multiple species, i.e., Bifidobacterium longum subsp. animalis, Blautia coccoides, and Flavonifractor plautii. Lastly, the authors showed that tilianin metabolites exhibit protective effects against H2O2 through reactive oxygen species scavenging activity and thus improve the viability of a neuronal cell line, while the parent compound, tilianin, was ineffective. This protective effect might be due to gut microbiota-dependent physiological effects of dietary flavonoids.

      Weaknesses:

      1) To confirm the bioinformatic-based predictions the authors used in vitro culture experiments and LC-MS experiments. Although these in vitro experiments clearly add value to the bioinformatic prediction, they fall short of providing firm evidence for the predictions because they do not show whether the predicted enzymes really catalyze the predicted reactions. In theory, there could be other enzymes not identified bioinformatically that catalyze the reactions.

      2) It is not clear how the authors selected the bacterial species. Did they analyze metagenome sequences or hundreds of genomes of gut bacteria? Did they analyze bacteria isolated from the gut or rather type strains? What about other bacterial species in the gut? Do they also encode relevant enzymes? If yes, how many do? This needs to be clarified.

      3) The reported data on E. coli are difficult to understand. Does E. coli have a different degradation pathway leading to the observed disappearance of tilianin?

    2. Reviewer #2 (Public Review):

      The manuscript deals with an interesting topic in metabolism: the so-called underground metabolism enabled by enzymes with broad substrate specificity. This is mainly relevant in secondary metabolism. The authors deal, in particular, with the conversion of flavonoids, which have health-promoting effects. They present an algorithm for predicting the moonlighting activities of enzymes, which must be given as inputs. Moreover, the authors performed experiments on the antioxidant activities of the flavonoids under study.

      My focus was on the bioinformatics part. Overall, the bioinformatics part is not a major scientific achievement in my eyes, or it is too poorly described to see its merits. There may be difficulties understanding the presented algorithm.

      Comments:

      The prediction algorithm should be explained much better. Although the manuscript is quite long, it does not describe the approaches sufficiently well. It is quite hard to read.

      As far as I can see, the method was only tested with a small sample of different flavonoid substances.

      Major comments<br /> (1) I see the following contradiction. Lines 18/19: "As flavonoids are not natural substrates of gut bacterial enzymes" and lines 76/77: "commensal gut microorganisms do not have specialized enzymes that utilize flavonoids as their native substrates" versus lines 72-74: "flavonoids ..., which makes them available to be metabolized". How can they be metabolized given what is said in the first two phrases?<br /> (2) It should be explained better what is meant by "reaction class" (e.g. in lines 97 and 99). Is this the same as the EC number (in the Enzyme Catalogue)? The term "reaction class" is indeed used in the KEGG database. On the webpage<br /> https://www.genome.jp/brite/br08204<br /> it seems indeed as if the terms "reaction class" and EC number are somehow equivalent. However, the term "RClass RC00392" in line 557 of the manuscript points to a difference in meaning.<br /> (3) The prediction algorithm should be explained much better. For example, in the Figure showing the workflow, it is shown that an EC number should be given as input. However, if we search for enzymes which could potentially degrade a given flavonoid, we may not know any suitable EC number. Line 122: "To match a given enzyme with its non-native polyphenolic substrates..." However, where can we take the enzyme name/EC number from? Moreover, given that it is assumed that the reaction is performed by underground metabolism, should the enzyme given as input come from another organism, for example, a plant?<br /> (4) Lines 521-523: "Our prediction tool can take either a single enzyme in the form of Enzyme Commission (EC) number (e.g. "ec:2.1.1.75"), or a KEGG organism-identifier (e.g. "cpv") or a consortium, a list of different organism-identifiers, as input." I do not understand the wording "or a consortium". According to the Figure showing the workflow, it should read "and a consortium".<br /> (5) In the Materials and Methods section, the KEGG PATHWAY database is mentioned. This comes somewhat out of the blue. What is the connection to the "reaction class" concept in KEGG? Or is the PATHWAY database only used for extracting the negative controls?<br /> (6) Lines 142-143: "Our analysis shows that RClass-based similarity can predict the correct reactions for known flavonoid-metabolizing enzymes". How do the authors know that the results are correct? If it is easy to check, then I assume the test whether a given enzyme is able to catalyze reactions with flavonoids can be done manually in KEGG, so that a computer algorithm is unnecessary.<br /> (7) Elaborating on the previous point - I have the impression that the algorithm is a rather simple search routine for finding reactions in the KEGG database that match certain criteria. This might be a helpful tool to save time in comparison to doing the search manually. However, at least the bioinformatics part of the paper is not a major scientific achievement as far as I can see.<br /> (8) It is not sufficiently clear whether the prediction algorithm only works for the example shown in the top figure (tilianin, acacetin etc), which would be quite a restricted application, or for many or even all flavonoids. In line 565, the authors say: "our tabulated 312 unique flavonoids", while in the upper part of the MS, e.g. in lines 26 and 109, only the pathway starting from tilianin is mentioned.<br /> (9) In which programming language was the algorithm implemented?<br /> (10) The connection between the theoretical and experimental parts of the paper is not fully clear. Some of the experiments serve to test the predictions, which is fine. The experiments on free radicals, however, seem to be somewhat unrelated.

    1. Reviewer #1 (Public Review):

      Summary:<br /> In this highly ambitious paper, Breen and Deffner used a multi-pronged approach to generate novel insights on how differences between male and female birds in their learning strategies might relate to patterns of invasion and spread into new geographic and urban areas.

      The empirical results, drawn from data available in online archives, showed that while males and females are similar in their initial efficiency at learning a standard color-food association (e.g., color X = food; color Y = no food), when the associations are switched (now, color Y = food; color X = no food), males are more efficient than females at adjusting to the new situation (i.e., faster at 'reversal learning'). Clearly, if animals live in an unstable world, where associations between cues (e.g., color) and what is good versus bad might change unpredictably, it is important to be good at reversal learning. In these grackles, males tend to disperse into new areas before females. It is thus fascinating that males appear to be better than females at reversal learning. Importantly, to gain a better understanding of underlying learning mechanisms, the authors use a Bayesian learning model to assess the relative role of two mechanisms (each governed by a single parameter) that might contribute to differences in learning. They find that what they term 'risk sensitive' learning is the key to explaining the differences in reversal learning. Males tend to exhibit higher risk sensitivity, which explains their faster reversal learning. The authors then tested the validity of their empirical results by running agent-based simulations where 10,000 computer-simulated 'birds' were asked to make feeding choices using the learning parameters estimated from real birds. Perhaps not surprisingly, the computer birds exhibited learning patterns that were strikingly similar to the real birds. Finally, the authors ran evolutionary algorithms that simulate evolution by natural selection where the key traits that can evolve are the two learning parameters. They find that under conditions that might be common in urban environments, high risk sensitivity is indeed favored.

      Strengths:<br /> The paper addresses a critically important issue in the modern world. Clearly, some organisms (some species, some individuals) are adjusting well and thriving in the modern, human-altered world, while others are doing poorly. Understanding how organisms cope with human-induced environmental change, and why some are particularly good at adjusting to change is thus an important question.

      The comparison of male versus female reversal learning across three populations that differ in years since they were first invaded by grackles is one of few, perhaps the first in any species, to address this important issue experimentally.

      Using a combination of experimental results, statistical simulations, and evolutionary modeling is a powerful method for elucidating novel insights.

      Weaknesses:<br /> The match between the broader conceptual background involving range expansion, urbanization, and sex-biased dispersal and learning, and the actual comparison of three urban populations along a range expansion gradient was somewhat confusing. The fact that three populations were compared along a range expansion gradient implies an expectation that they might differ because they are at very different points in a range expansion. Indeed, the predicted differences between males and females are largely couched in terms of population differences based on their 'location' along the range-expansion gradient. However, the fact that they are all urban areas suggests that one might not expect the populations to differ. In addition, the evolutionary model suggests that all animals, male or female, living in urban environments (that the authors suggest are stable but unpredictable) should exhibit high risk sensitivity. Given that all grackles, male and female, in all populations, are both living in urban environments and likely come from an urban background, should males and females differ in their learning behavior? Clarification would be useful.

      Reinforcement learning mechanisms:<br /> Although the authors' title, abstract, and conclusions emphasize the importance of variation in 'risk sensitivity', most readers in this field will very possibly misunderstand what this means biologically. Both the authors' use of the term 'risk sensitivity' and their statistical methods for measuring this concept have potential problems.

      First, most behavioral ecologists think of risk as predation risk which is not considered in this paper. Secondarily, some might think of risk as uncertainty. Here, as discussed in more detail below, the 'risk sensitivity' parameter basically influences how strongly an option's attractiveness affects the animal's choice of that option. They say that this is in line with foraging theory (Stephens and Krebs 2019) where sensitivity means seeking higher expected payoffs based on prior experience. To me, this sounds like 'reward sensitivity', but not what most think of as 'risk sensitivity'. This problem can be easily fixed by changing the name of the term.

      In addition, however, the parameter does not measure sensitivity to rewards per se - rewards are not in equation 2. As noted above, instead, equation 2 addresses the sensitivity of choice to the attraction score which can be sensitive to rewards, though in complex ways depending on the updating parameter. Second, equations 1 and 2 involve one specific assumption about how sensitivity to rewards vs. to attraction influences the probability of choosing an option. In essence, the authors split the translation from rewards to behavioral choices into 2 steps. Step 1 is how strongly rewards influence an option's attractiveness and step 2 is how strongly attractiveness influences the actual choice to use that option. The equation for step 1 is linear whereas the equation for step 2 has an exponential component. Whether a relationship is linear or exponential can clearly have a major effect on how parameter values influence outcomes. Is there a justification for the form of these equations? The analyses suggest that the exponential component provides a better explanation than the linear component for the difference between males and females in the sequence of choices made by birds, but translating that to the concepts of information updating versus reward sensitivity is unclear. As noted above, the authors' equation for reward sensitivity does not actually include rewards explicitly, but instead only responds to rewards if the rewards influence attraction scores. The more strongly recent rewards drive an update of attraction scores, the more strongly they also influence food choices. While this is intuitively reasonable, I am skeptical about the authors' biological/cognitive conclusions that are couched in terms of words (updating rate and risk sensitivity) that readers will likely interpret as concepts that, in my view, do not actually concur with what the models and analyses address.

      To emphasize, while the authors imply that their analyses separate the updating rate from 'risk sensitivity', both the 'updating parameter' and the 'risk sensitivity' parameter influence both the strength of updating and the sensitivity to reward payoffs in the sense of altering the tendency to prefer an option based on recent experience with payoffs. As noted in the previous paragraph, the main difference between the two parameters is whether they relate to behaviour linearly versus with an exponential component.

      Overall, while the statistical analyses based on equations (1) and (2) seem to have identified something interesting about two steps underlying learning patterns, to maximize the valuable conceptual impact that these analyses have for the field, more thinking is required to better understand the biological meaning of how these two parameters relate to observed behaviours, and the 'risk sensitivity' parameter needs to be re-named.
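      For readers without the manuscript at hand, a common formulation of such a two-parameter reinforcement learning model is sketched below. This is an assumption for illustration: the symbols (attraction A, payoff π, updating rate φ, sensitivity λ) are chosen here and may not match the authors' equations (1) and (2) term by term. The first expression is linear in the updating parameter, and the second contains the exponential component discussed above.

```latex
% Step 1 (linear): attraction to the chosen option i is updated toward the payoff \pi
A_{i,t+1} = (1 - \phi)\, A_{i,t} + \phi\, \pi_{i,t}

% Step 2 (exponential): attractions are converted into choice probabilities
P(i)_{t+1} = \frac{\exp\left(\lambda A_{i,t+1}\right)}{\sum_{j} \exp\left(\lambda A_{j,t+1}\right)}
```

      Under this formulation, payoffs enter the choice probability only through the attraction scores, which is the point made above about the sensitivity parameter not responding to rewards directly.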

      Agent-based simulations:<br /> The authors estimated two learning parameters based on the behaviour of real birds, and then ran simulations to see whether computer 'birds' that base their choices on those learning parameters return behaviours that, on average, mirror the behaviour of the real birds. This exercise is clearly circular. In old-style, statistical terms, I suppose this means that the R-square of the statistical model is good. A more insightful use of the simulations would be to identify situations where the simulation does not do as well in mirroring behaviour that it is designed to mirror.
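      To make the circularity concrete, here is a minimal sketch of a forward simulation of one computer 'bird' on a two-option reversal task. It assumes a linear attraction update and a softmax choice rule; the initial attraction values, trial counts, and function names are hypothetical and are not taken from the manuscript.

```python
import math
import random


def choose(attractions, lam, rng):
    """Softmax choice between two options given their attraction scores."""
    weights = [math.exp(lam * a) for a in attractions]
    return 0 if rng.random() < weights[0] / sum(weights) else 1


def simulate_agent(phi, lam, trials_per_phase=100, seed=0):
    """Return the fraction of correct choices in the reversal phase."""
    rng = random.Random(seed)
    attractions = [0.1, 0.1]  # initial attraction scores (assumed)
    correct_in_reversal = 0
    # Option 0 is rewarded in the initial phase, option 1 after the reversal.
    for phase, rewarded in enumerate((0, 1)):
        for _ in range(trials_per_phase):
            choice = choose(attractions, lam, rng)
            payoff = 1.0 if choice == rewarded else 0.0
            # Linear update of the chosen option's attraction toward the payoff.
            attractions[choice] = (1 - phi) * attractions[choice] + phi * payoff
            if phase == 1 and choice == rewarded:
                correct_in_reversal += 1
    return correct_in_reversal / trials_per_phase


reversal_accuracy = simulate_agent(phi=0.1, lam=5.0, seed=1)
```

      Because the parameters fed into such a simulation are fitted to the observed choices, agreement between simulated and real behaviour mainly checks model adequacy; the informative cases are those where the simulated agents fail to reproduce the data.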

    2. Reviewer #2 (Public Review):

      Summary:<br /> The study is titled "Leading an urban invasion: risk-sensitive learning is a winning strategy", and consists of three different parts. First, the authors analyse data on initial and reversal learning in Grackles confronted with a foraging task, derived from three populations labeled as "core", "middle" and "edge" in relation to the invasion front. The suggested difference between study populations does not surface, but the authors do find moderate support for a difference between male and female individuals. Secondly, the authors confirm that the proposed mechanism can actually generate patterns such as those observed in the Grackle data. In the third part, the authors present an evolutionary model, in which they show that learning strategies as observed in male Grackles do evolve in what they regard as conditions present in urban environments.

      Strengths:<br /> The manuscript's strength is that it combines real learning data collected across different populations of the Great-tailed grackle (Quiscalus mexicanus) with theoretical approaches to better understand the processes with which grackles learn and how such learning processes might be advantageous during range expansion. Furthermore, the authors also take sex into account revealing that males, the dispersing sex, show moderately better reversal learning through higher reward-payoff sensitivity. I also find it refreshing to see that the authors took the time to preregister their study to improve transparency, especially regarding data analysis.

      Weaknesses:<br /> One major weakness of this manuscript is that the authors are working with quite low sample sizes when the populations are considered separately: edge (11 males & 8 females), middle (4 males & 4 females), and core (17 males & 5 females). When all populations are pooled, I think the sample size is sufficient to answer the questions regarding sex differences in learning performance and which learning processes grackles might use, but it is insufficient when the different populations are taken into account.

      Another weakness of this manuscript is that it does not set up the background well in the introduction. Firstly, are grackles urban dwellers in their natural range and expand by colonising urban habitats because they are adapted to it? The introduction also fails to mention why urban habitats are special and why we expect them to be more challenging for animals to inhabit. If we consider that one of their main questions is related to how learning processes might help individuals deal with a challenging urban habitat, then this should be properly introduced.

      Also, the authors provide a single example of how learning can differ between populations from more urban and more natural habitats. The authors also label the urban dwellers as the invaders, which might be the case for grackles but is not necessarily true for other species, such as the Indian rock agama in the example, which is native to the area of study. Also, the authors need to be aware that only male lizards were tested in this study. I suggest being a bit clearer about what has been found across different studies looking at: (1) differences across individuals from invasive and native populations of invasive species and (2) differences across individuals from natural and urban populations.

      Finally, the introduction is very much written with regard to the interaction between learning and dispersal, i.e. the 'invasion front' theme. The authors lay out four predictions, the most important of which is No. 4: "Such sex-mediated differences in learning to be more pronounced in grackles living at the edge, rather than the intermediate and/or core region of their range." The authors, however, never return to this prediction, at least not in a transparent way that clearly states that this pattern was not found. The model looking at the evolution of risk-sensitive learning in urban environments is based on the assumption that urban and natural environments "differ along two key ecological axes: environmental stability 𝑢 (How often does optimal behaviour change?) and environmental stochasticity 𝑠 (How often does optimal behaviour fail to pay off?). Urban environments are generally characterised as both stable (lower 𝑢) and stochastic (higher 𝑠)". Even though it is generally assumed that urban environments differ from natural environments, the authors' assumption is just one way of looking at these differences, which have generally not been confirmed and are highly debated. Additionally, it is not clear how this result relates to the rest of the paper: the three populations are distinguished according to their relation to the invasion front, not with respect to a gradient of urbanization, and further do not show a meaningful difference in learning behaviour, possibly due to the low sample sizes mentioned above.

      In conclusion, the manuscript was well written and for the most part easy to follow. The format of having the results before the methods makes it a bit harder to follow because the reader is not fully aware of the methods at the time the results are presented. It would, therefore, be important to more clearly delineate the different parts and purposes. Is this article about the interaction between urban invasion, dispersal, and learning? Or about the correct identification of learning mechanisms? Or about how learning mechanisms evolve in urban and natural environments? Maybe this article can harbor all three, but the borders need to be clear. The authors need to be transparent about what has and especially what has not been found, and be careful to not overstate their case.

    1. Reviewer #1 (Public Review):

      There are a number of outstanding questions concerning how cohesin turnover on DNA is controlled by various accessory factors and how such turnover is controlled by post-translational modification. In this paper, Nasmyth et al. perform a series of AlphaFold structure predictions that aim to address several of these outstanding questions. Their structure predictions suggest that the release factor WAPL forms a ternary complex with PDS5 and SA/SCC3. This ternary complex appears to be able to bind the N-terminal end of SCC1, suggesting how formation of such a complex could stabilize an open state of the cohesin ring. Additional calculations suggest how the Eco/ESCO acetyltransferases and Sororin engage the SMC3 head domain presumably to protect against WAPL-mediated release.

      This work thus demonstrates the power of AF prediction methods and how they can lead to a number of interesting and testable hypotheses that can transform our understanding of cohesin regulation. These findings require orthogonal experimental validation, but the authors argue convincingly that such validation should not be a pre-requisite to publication.

      In their revised version, the authors did not systematically include model confidence scores, and it therefore remains difficult for the reader to evaluate the reliability of the models obtained. The authors correctly point out that such metrics are available on figshare. It is therefore possible to obtain such information. The caveat is that it is left to the user to identify and extract the relevant information. While they claim that they have labeled N- and C-termini in their figures, no such labeling can be seen in the revised version. Addition of such labels, at least for some of the figures, would help the user to navigate the models.

      The authors have now updated figure legends to indicate which protein is referred to by the chain labels shown in PAE plots.

      It is exciting to see AF-multimer predictions being applied to cohesin. As some of the reported interactions are not universally conserved and some involve relatively small interfaces, the possibility arises that these interfaces show poor or borderline confidence scores. As some of these interfaces map to mutants that have previously been obtained by hypothesis-free genetic screens and mutational analyses, they nevertheless appear valid. Thus, an important point to make is that even interfaces that show modest confidence scores may turn out to be valid, while others may not.

    2. Reviewer #2 (Public Review):

      The ATPase protein machine cohesin shapes the genome by loop extrusion and holds sister chromatids together by topological entrapment. When executing these functions, cohesin is tightly regulated by multiple cofactors, such as Scc2/Nipbl, Pds5, Wapl, and Eco1/Esco1/2, and it undergoes dynamic conformational changes with ATP binding and hydrolysis. The mechanisms by which cohesin extrudes DNA loops and mediates sister-chromatid cohesion are still not understood. A major reason for the lack of understanding of cohesin dynamics and regulation is the failure to capture the structures of intact cohesin in different nucleotide-bound states and in complex with various regulators. So far, only the structures of ATP-state cohesin bound to NIPBL and DNA have been experimentally determined.

      In this manuscript, Nasmyth et al. made use of the powerful protein structure prediction tool, AlphaFold2 (AF), to predict the models of tens of cohesin subcomplexes from different species. The results provide important insight into how the Smc3-Scc1 DNA exiting gate is opened, how Pds5 and Wapl maintain the opened gate, how Pds5 and Scc3/SA recruit different cofactors, how Eco1 and Sororin antagonize Wapl, and how Scc2/Nipbl interacts with Scc3/SA. The models are for the most part consistent with published mutations in these proteins that affect cohesin's functions in vitro and in vivo and raise testable hypotheses of cohesin dynamics and regulation. This study also serves as an example of how to use AF to build models of protein complexes that involve the docking of flexible regions to globular domains.

    1. Joint Public Review:

      The study as a concept is well designed, although there is still one issue I see in the methodology.

      I still have concerns about their attempt to combine data at different scales. While the use of point data is great, it limits the sample size, so they have included district- to country-level data to increase it. To obtain an overall estimate for each district/state/country, they take 10 random sample points. This would be a suitable method if the primates were evenly distributed across the district/state/country. In reality they are not, so random point sampling is not a reasonable way to estimate the environmental variables in relation to the macaques. For example, if you took 10 random points in a mountainous country to estimate altitude, you would end up with a large number; but if all the animals of interest lived on the coast, that average altitude would be meaningless in relation to the animals, as they all live at low altitude. The fact that the model relies less on highly variable components and more on less variable components is not really relevant, as the district/state/country measurements have no real meaning in relation to the distribution of macaques.
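      The altitude example can be made concrete with a small simulation. This is a minimal sketch with entirely hypothetical numbers (a country whose altitude ranges uniformly from 0 to 4000 m, and animals confined to a coastal strip below 100 m). The function name `simulate_altitude_bias` is an assumption for illustration and does not reflect the manuscript's actual data or model.

```python
import random

def simulate_altitude_bias(n_points=10, n_animals=200, seed=0):
    """Compare mean altitude at random points with mean altitude where animals live."""
    rng = random.Random(seed)
    # Hypothetical country: altitude at a random location is uniform in 0-4000 m.
    random_point_altitudes = [rng.uniform(0, 4000) for _ in range(n_points)]
    # Hypothetical animals: all occurrences lie in a coastal strip below 100 m.
    animal_altitudes = [rng.uniform(0, 100) for _ in range(n_animals)]
    mean_random = sum(random_point_altitudes) / n_points
    mean_animals = sum(animal_altitudes) / n_animals
    return mean_random, mean_animals

mean_random, mean_animals = simulate_altitude_bias()
print(f"random points: {mean_random:.0f} m, animal locations: {mean_animals:.0f} m")
```

      In this toy setting the random-point average says little about the conditions the animals actually experience, which is the concern raised above.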

      A simple possible way forward could be to run the model without the district/state/country samples and see what the outcome is. If the outcome is similar then the random point method may be viable (but if it gives the same outcome as ignoring those samples then you don't need the district/state/country samples). If you get a totally different outcome then it should raise concerns about using the district/state/country samples.

      This paper is a really nice piece of work and is a valuable contribution but the district/state/country sample issue really needs to be addressed.

    1. Reviewer #1 (Public Review):

      Summary:<br /> The ciliary rootlet is a structure associated with the ciliary basal body (centriole) with beautiful striation observed by electron microscopy. It has been known for more than a century, but its function and protein arrangement are still unknown. This work reconstructed the near-atomic resolution 3D structure of the rootlet using cryo-electron tomography, discovered a number of interesting filamentous structures inside, and built a molecular model of the rootlet.

      Strengths:<br /> The authors exploited the current capabilities of cryo-ET and used it appropriately to describe the 3D structure of the rootlet. They carefully conducted subtomogram averaging and classification, which enabled an unprecedentedly detailed view of this structure. The dual use of (nearly) intact rootlets from cilia and extracted (demembraned) rootlets enabled them to describe with confidence how D1/D2/A bands form periodic structures and cross with longitudinal filaments, which are likely coiled-coil.

      Weaknesses:<br /> Some more clarifications are needed. This reviewer believes that the authors can address them.

    2. Reviewer #2 (Public Review):

      Summary:<br /> This work performs structural analysis on isolated or purified rootlets.

      Strengths:<br /> To date, most studies of this cellular assembly have been from fluorescence microscopy, conventional TEM methods, or through biochemical analysis of constituents. It is clearly a challenging target for structural analysis due to its complexity and heterogeneity. The authors combine observations from cryo-electron tomograms, automated segmentations, subtomogram averaging, and previous data from the literature to present an overall model of how the rootlet is organised.

      Their model will serve as a jumping-off point for future studies, and as such it is something of considerable value and interest.

      Weaknesses:<br /> It is speculative but is presented as such, and is well-reasoned, plausible, and thorough.

    1. Reviewer #1 (Public Review):

      The authors set out to define the molecular basis for LP as the origin of BRCA1-deficient breast cancers. They showed that LPs have the highest level of replicative stress, and hypothesise that this may account for their tendency to transform. They went on to identify ELF3 as a candidate driver of LP transformation and showed that ELF3 expression is up-regulated in response to replicative stress as well as BRCA1 deficiency. They went on to show that ELF3 inactivation led to a higher level of DNA damage, which may result from compromised replicative stress responses.

      While the manuscript supports the interesting idea wherein ELF3 may fuel LP cell transformation, it remains obscure how ELF3 promotes cell tolerance to DNA damage. Interestingly, the authors proposed that ELF3 suppresses excessive genomic instability, but I do not see any evidence that supports this claim. In fact, one might think that genomic instability is key to cell transformation.

    2. Reviewer #2 (Public Review):

      Summary:<br /> The manuscript focuses on a persistent question of why germline mutations in BRCA1 which impair homology-directed repair of DNA double-strand breaks predispose to primarily breast and ovarian cancers but not other tissues. The authors propose that replication stress is elevated in the luminal progenitor (LP) cells and apply the gene signature from Dreyer et al as a measure of replication stress in populations of cells selected by FACS previously (published by Lim et al.) and suggest an enrichment of replication stress among the LP cells. This is followed by single-cell RNA seq data from a small number of breast tissues from a small number of BRCA1 mutation carriers but the pathogenic variants are not listed. The authors perform an elegant analysis of the effects of BRCA1 knockdown in MCF10A cells, but these cells are not considered a model of LP cells.

      Overall, the manuscript suffers from significant gaps and leaps in logic among the datasets used. The connection to luminal progenitor cells is not adequately established because the models used are not representative of this population of cells. Therefore, the central hypothesis is not sufficiently justified.

      Strengths:<br /> The inducible knockdown of BRCA1 provided compelling data pointing to an upregulation of ELF3 in this setting as well as a small number of other genes. It would be useful to discuss the other genes for completeness and explain the logic for focusing on ELF3. Nonetheless, the connection with ELF3 is reasonable. The authors provide significant data showing a role for ELF3 in breast epithelial cells and its role in cell survival.

      Weaknesses:<br /> The initial observations in primary breast cells have small sample sizes. The mutations in BRCA1 seem to be presumed to be all the same, but we know that pathogenic variants differ among individuals and range from missense mutations affecting interactions with one critical partner to large-scale truncations of the protein.

      The figure legends are missing critical details that make it difficult for the reader to evaluate the data. The data support the notion that ELF3 may participate in relieving replication stress, but does not appear to be limited to LP cells as proposed in the hypothesis.

    1. Reviewer #1 (Public Review):

      Summary:<br /> The authors analyzed 102 human embryos in order to address outstanding questions about human lower spinal development and secondary neural tube formation. Through whole embryo imaging and histologic analysis, they provide exceptional quantification of the timing of posterior neuropore closure, rate of lower spinal somite formation, and formation and regression of the human "tail." Their analysis also provides convincing qualitative evidence of the cellular and molecular mechanisms at play during lower spinal development by identifying the presence of caspase-dependent programmed cell death and the dynamic expression of FGF8/WNT3A within the elongating embryo. Interestingly, they identify multiple polarized lumens within the site of secondary neural tube formation and add a solid argument for the mode of formation of this structure; however, in its current state, the evidence for a conclusive morphogenetic mechanism remains elusive. Finally, the authors provide a substantial review of the existing publications related to human lower spinal development, creating an excellent reference and demonstrating the importance of continuing to utilize each of these precious samples for furthering our understanding of human development.

      Strengths:<br /> This manuscript provides an excellent window into the key morphogenetic events of human caudal neural tube formation. Figures 1 and 2 provide beautiful images and quantification of the developmental events, enabling comparison to models that are currently in use, including model organisms and the developing spinal organoid field. The characterization of somite development and later regression is particularly important.

      Next, the authors addressed current questions regarding the molecular pathways present during the elongation of the embryo and later regression of the tail structure. The in situ hybridization experiments in Figures 5 and 6 demonstrate important evidence for a maintained neuromesodermal progenitor pool of stem cells that promote axial elongation. Additionally, the identification of caspase-dependent cell death within the human tail provides an explanation for the mechanism of this regression, especially given the notable lack of presence of any gross necrosis.

      Finally, as mentioned above, the non-trivial collection and review of the existing human secondary neural tube and body formation literature is an important tool and organizes and synthesizes ~ 100 years of observations from precious human samples.

      Weaknesses:<br /> While there are no glaringly incorrect claims from the authors, several of the conclusions could benefit from a form of quantification to support their observations:

      1) The identification of the proximal to distal degeneration of the tailgut within the human tail is difficult to distinguish with the current images present in Figure 3. A picture within a picture of the area containing the tail gut could be provided to prominently demonstrate the cellular architecture. Additionally, quantification of the localization of apoptosis would strongly support this observation, as well as provide a visualization of the tail's regression overall. For example, a graph plotting the number of apoptotic cells versus the rostral to caudal locations of the transverse sections while accounting for the CS stage of each analyzed embryo could be created; this could even be further broken down by region of tail, for example, tailgut, ventral ectodermal ridge, somite, etc.

      2) The identification of the mode of formation of the secondary neural tube is probably the most interesting question to be addressed, however, Figure 7's evidence is not completely satisfying in its current form. While I agree that it is unlikely that multiple polarization foci form within the most caudal part of the tail and coalesce more rostrally, I am equally unsure that a single polarization would form rostrally and then split and re-coalesce as it moves caudally, as is currently depicted by 7B.

      Multiple groups have recently shown the influence of geometric confinement on neuroectoderm and its ability to polarize and form a singular central lumen (Karzbrun 2021, Knight 2018), or the inverse situation of a lack of confinement resulting in the presence of multiple lumens. The tapering of the diameter of the tail and its shared perimeter and curvature with the polarization bears a striking resemblance to this controlled confinement. An interesting quantification to depict would include the number of lumens versus the transverse section diameter and CS stage to see if there is any correlation between embryo size and the number of multiple polarizations. Anecdotally, the fusion of multiple polarizations/lumens tends to occur often in these human organoid-type platforms, while splitting to multiple lumens as the tissues mature does not. Other supplements to Figure 7 could include 3D renderings of lumens of interest as depicted in Catala 2021, especially if it demonstrates the re-coalescence as seen in 7B.

      The non-pathologic presence of multiple polarizations in human tails compared to the rodent pathogenic counterpart is interesting given that rodents obviously maintain this appendage while it is lost in humans.

      3) Of potential interest is the process of junctional neurulation describing the mechanistic joining of the primary and secondary neural tube, which has recently been explored in chick embryos and demonstrated to have relevance to human disease (Dady 2014, Eibach 2017, Kim 2021). While it is clear this paper's goal does not center on the relationship between primary and secondary neurulation, such a mechanism may be relevant to the authors' interpretation of their observations of lumen coalescence. I wonder if the embryos studied provide any evidence to support junctional neurulation.

    2. Reviewer #2 (Public Review):

      Summary:<br /> This study utilizes a large series of neurulation-stage human embryos to address several questions about the similarities and differences between human neurulation and model systems such as the chicken and rodent.

      Strengths:<br /> The number of specimens utilized for the analysis provides robustness to the findings.

      Weaknesses:<br /> It is not clear how the gestational age of the specimens was determined or how that can be known with certainty. There is no information given in the methods on this. With this in mind, bunching the samples at 2-day intervals in Figure 1J will lead to inaccuracies in assessing the rate of somite formation. The rate is presented as a major difference between specimens and organoids in the abstract, but as a similar result in the Results section. The data supporting either of these statements are not convincing.

      Whenever possible, give the numbers of specimens that had the described findings. For example, in Figure 2C - how many embryos were examined with the massive rounded end at CS13? Apoptosis in Figures 3 and 4?

      For Figure 2I-K, it would be informative to superimpose the individual data points on the box plots distinguishing males from females, as in Figure 1I.

      Is it possible to quantitate apoptosis and proliferation data?

      The TUNEL staining in Figure 3 is difficult to make out.

      Additional improvements to the presentation of figures, writing, and quantification of results are suggested.